Hi,

First of all, I want to thank you for the great Apache Drill. Thank you!

I'm testing Apache Drill 1.6.0 against several RDBMSs and I always run into performance issues. My latest tests are against MSSQL (SQL Server 2008 on Windows Server 2008 R2) with JDBC driver 4.1, as recommended (I also tested with jTDS, with the same results). I noticed that while the query was executing, MSSQL was waiting on ASYNC_NETWORK_IO until almost the end of the query's execution.

I ran the Drill tests on machines in the same network as the databases and on the MSSQL machine itself, always with the same results.

Notes: I looked at the profiler and noticed that the major operation was the Hash_join operator. I also increased the JVM memory and the machine memory, always with the same results. I also tested on Linux, both with a drillbit and with drill-embedded.

My questions to the community are:

Are there any best practices for using Apache Drill with an RDBMS that will increase its performance?
Can/should I apply partition pruning in this case?
Do you have any suggestions on how to debug this issue further?

Query example:

use siag_dre_asec2.dbo;

SELECT Evento.ChvEUnidadeUtilizadora AS uu,
       TO_CHAR((CAST(lanc.DtLancamento AS DATE)),'yyyyMMdd') AS KDtLancamento,
       ESTRUTURA0.ChvP AS ECHVP4,
       THIS_.ValCredito AS MVALCREDITO5,
       THIS_.ChvP AS MCHVP1,
       THIS_.ChvECategoria AS MCHVECATEGORIA9,
       THIS_.ChvEEstrutura AS MCHVEESTRUTURA2,
       THIS_.ChvP AS MCHVP8,
       THIS_.ValDebito AS MVALDEBITO6,
       ESTRUTURA0.DesigEstrutura AS EDESIGESTRUTURA3,
       CATEGORIA7.ChvP AS CCHVP10
FROM Movimento THIS_
INNER JOIN Lancamento AS lanc ON lanc.ChvP = THIS_.ChvELancamento
INNER JOIN Evento ON Evento.ChvP = lanc.ChvEEvento
INNER JOIN CATEGORIA CATEGORIA7 ON THIS_.ChvECategoria = CATEGORIA7.ChvP
INNER JOIN Estrutura ESTRUTURA0 ON THIS_.ChvEEstrutura = ESTRUTURA0.ChvP
WHERE CATEGORIA7.SiglaCategoria = 'CPBSQ31'

This query takes less than a second in MSSQL and 120 s with Apache Drill.

Thank you for your support.

Miguel
