Hi,

I'm looking into Drill, to use it as an in-memory db for data that I currently have in a SQL Server db. I connected through a SQL Server JDBC storage plugin, and my test query ran for about 2 sec. The same query run directly on SQL Server took 0.15 sec.
I then ran a "create table" to write the data out as Parquet files and queried them through the dfs plugin. That query ran for 0.5 sec (after caching; the first run is about 3 sec). I also tried "REFRESH TABLE METADATA", but it didn't change anything.

My test query is:

    select sum(f.Sales), p.`Product Category`
    from dfs.tmp.`/Demo/Facts/` f
    join dfs.tmp.`/Demo/Product/` p on p.productKey = f.productKey
    group by p.`Product Category`;

The Facts table has 422,833 rows and Product has 606. The result set is 4 rows.

This was done running Drill locally (embedded) on a Windows machine. I also tried a Linux machine, but the results were even slower. I didn't configure anything, just used the install as-is.

Am I doing something wrong? Is an RDBMS going to be faster anyway? I read about Drill's performance and I feel I'm not getting there.

Timings:
SQL Server: 0.15 sec
SQL Server through Drill: 2 sec
Parquet in Drill: 0.5 sec

Thank you,
Imbar
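
P.S. In case it helps to reproduce this, the Parquet setup described above can be sketched roughly as follows. The `sqlserver` plugin name and the `dbo.Facts` / `dbo.Product` source tables are placeholders, not the real names, and the exact statements may have differed.

```sql
-- Sketch only: `sqlserver` is a placeholder name for the JDBC storage plugin,
-- and dbo.Facts / dbo.Product are placeholder source tables.

-- Make CTAS write Parquet (this is also Drill's default output format).
ALTER SESSION SET `store.format` = 'parquet';

-- Copy each SQL Server table into Parquet files under the dfs.tmp workspace.
CREATE TABLE dfs.tmp.`/Demo/Facts/` AS
SELECT * FROM sqlserver.dbo.Facts;

CREATE TABLE dfs.tmp.`/Demo/Product/` AS
SELECT * FROM sqlserver.dbo.Product;

-- Build the Parquet metadata cache for each table.
REFRESH TABLE METADATA dfs.tmp.`/Demo/Facts/`;
REFRESH TABLE METADATA dfs.tmp.`/Demo/Product/`;
```

The test query above was then run against the two `dfs.tmp` paths.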
