What does the query plan look like when you run the query against SQL Server through Drill? My guess is that the join isn't being pushed down to SQL Server. If so, you've hit DRILL-4818: the JDBC storage plugin has known limitations that prevent it from generating the optimal query plan in cases like this.
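
To get the plan, you can prefix the query with EXPLAIN PLAN FOR. A rough sketch, assuming the JDBC storage plugin is registered as `sqlserver` and the tables live in the `dbo` schema (both names are placeholders, so substitute whatever your plugin configuration uses):

```sql
-- `sqlserver` and `dbo` are assumed names, not taken from your setup.
EXPLAIN PLAN FOR
SELECT SUM(f.Sales), p.`Product Category`
FROM sqlserver.dbo.Facts f
JOIN sqlserver.dbo.Product p ON p.productKey = f.productKey
GROUP BY p.`Product Category`;
```

If the join were pushed down, I would expect the plan to show a single JDBC operator whose SQL text contains the join. If instead you see two separate JDBC scans with a Drill join and aggregate on top of them, Drill is pulling both tables over and doing the work itself.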
-- Zelaine

On Thu, Aug 11, 2016 at 9:22 AM, imbar marinescu <[email protected]> wrote:

> Hi,
>
> I'm looking into drill, to use it as an in memory db.
> I wanted to handle data that I have in a Sql Server db.
> I connected with an Sql Server jdbc plug in, and my test query ran for
> about 2 sec.
> When running directly from Sql Server it took 0.15 sec.
>
> I ran a "create table" as a parquet file and then tried to query with dfs
> plug in.
> The query ran for 0.5 sec (after caching. first run is about 3 sec).
> Also tried to do "REFRESH TABLE METADATA", but it didn't change anything.
>
> My Test query is:
> select sum(f.Sales), p.`Product Category`
> from dfs.tmp.`/Demo/Facts/` f
> join dfs.tmp.`/Demo/Product/` p on p.productKey = f.productKey
> group by p.`Product Category`;
>
> Facts table has 422,833 rows, product has 606.
> The result set is 4 rows.
>
> This was done running drill locally (embedded) on a windows machine.
> I tried a linux machine, but the results where even slower.
>
> I didn't configure anything, just used the install as-is.
>
> Am I doing something wrong? Is a RDBMS going to be faster anyway?
> I read about the performance and I feel I'm not getting there.
>
> SqlServer: 0.15 sec.
> SqlServer in drill: 2 sec.
> Parquet in drill: 0.5 sec.
>
> Thank you,
> Imbar
>
