Re: Apache drill query plan: cumulative cost

2023-05-04 Thread Prabhakar Bhosale
Thanks James, will check on it. REgards Prabhakar On Thu, May 4, 2023 at 4:36 PM James Turton wrote: > The cost you've found is just an estimate produced by the planner and > not a good number to performance tune with. I recommend reviewing the > fragment and operator metrics written to the que

Re: Apache drill query plan: cumulative cost

2023-05-04 Thread James Turton
The cost you've found is just an estimate produced by the planner and not a good number to performance tune with. I recommend reviewing the fragment and operator metrics written to the query profile after execution. If you are further able to isolate the query so that it's the only thing runnin

Apache drill query plan: cumulative cost

2023-05-04 Thread Prabhakar Bhosale
Hi Team, I am trying to understand what is the IO and network cost of the query? I could find cumulative cost but was not able to interpret it. Please advise or any documentation pointer will help. thx REgards Prabhakar

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-09 Thread Zelaine Fong
er to > > be at the debug level and retry the explain plan? > > > > > > Kunal Khatua > > > > > > > > From: rahul challapalli > > Sent: Tuesday, Mar

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-09 Thread Jinfeng Ni
store.dfs.FileSelection - FileSelection.getStatuses() > > took 0 ms, numFiles: 1 > > > > > > More than 30 secs is unaccounted for. Can you turn on the root logger to > > be at the debug level and retry the explain plan? > > > > >

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-08 Thread Zelaine Fong
you turn on the root logger to > be at the debug level and retry the explain plan? > > > Kunal Khatua > > > ____ > From: rahul challapalli > Sent: Tuesday, March 7, 2017 5:24:43 AM > To: user > Subject: Re: Minimise query plan time

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-07 Thread PROJJWAL SAHA
t; ____ > From: rahul challapalli > Sent: Tuesday, March 7, 2017 5:24:43 AM > To: user > Subject: Re: Minimise query plan time for dfs plugin for local file system > on tsv file > > I did not get a chance to review the log file. > > However th

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-07 Thread Kunal Khatua
ubject: Re: Minimise query plan time for dfs plugin for local file system on tsv file I did not get a chance to review the log file. However the next thing I would try is to make your cluster a single node cluster first and then run the same explain plan query separately on each individual file.

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-07 Thread rahul challapalli
le? Like gautam suggested, it would be good to take a look at >> drillbit.log file (from the foreman node where planning occurred) and the >> query profile as well. >> >> - Rahul >> >> On Mon, Mar 6, 2017 at 9:30 AM, Gautam Parai wrote: >> >> >

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-06 Thread rahul challapalli
> Sent: Monday, March 6, 2017 1:45:38 AM > To: user@drill.apache.org > Subject: Fwd: Minimise query plan time for dfs plugin for local file > system on tsv file > > all, please help me in giving suggestions on what areas i can look into > why the query planning time is taking so

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-06 Thread Gautam Parai
Can you please provide the drillbit.log file? Gautam From: PROJJWAL SAHA Sent: Monday, March 6, 2017 1:45:38 AM To: user@drill.apache.org Subject: Fwd: Minimise query plan time for dfs plugin for local file system on tsv file all, please help me in giving

Fwd: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-06 Thread PROJJWAL SAHA
dfs storage plugin. Query planning time is approx 30 secs Query execution time is apprx 1.5 secs Regards, Projjwal -- Forwarded message -- From: PROJJWAL SAHA Date: Fri, Mar 3, 2017 at 5:06 PM Subject: Minimise query plan time for dfs plugin for local file system on tsv file To

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-05 Thread PROJJWAL SAHA
; > 1GB file, local to the drill cluster. > > > > The 1GB file is split into 10 files of 100 MB each. > > > > As expected I see 11 minor and 2 major fagments. > > > > The drill cluster is 5 nodes cluster with 4 cores, 32 GB each. > > > > > >

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread John Omernik
10 files of 100 MB each. > > > As expected I see 11 minor and 2 major fagments. > > > The drill cluster is 5 nodes cluster with 4 cores, 32 GB each. > > > > > > One observation is that the query plan time is more than 30 seconds. I > > ran > > >

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread PROJJWAL SAHA
GB file, local to the drill cluster. > > The 1GB file is split into 10 files of 100 MB each. > > As expected I see 11 minor and 2 major fagments. > > The drill cluster is 5 nodes cluster with 4 cores, 32 GB each. > > > > One observation is that the query plan tim

Re: Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread Nitin Pawar
the drill cluster. > The 1GB file is split into 10 files of 100 MB each. > As expected I see 11 minor and 2 major fagments. > The drill cluster is 5 nodes cluster with 4 cores, 32 GB each. > > One observation is that the query plan time is more than 30 seconds. I ran > the explai

Minimise query plan time for dfs plugin for local file system on tsv file

2017-03-03 Thread PROJJWAL SAHA
cluster is 5 nodes cluster with 4 cores, 32 GB each. One observation is that the query plan time is more than 30 seconds. I ran the explain plan query to validate this. The query execution time is 2 secs. total time taken is 32secs I wanted to understand how can i minimise the query plan time

Re: query plan caching?

2016-01-14 Thread Neeraja Rentachintala
Query plan caching is typically very useful if you have reporting type of queries where the query patterns are fixed and vary mostly on filter conditions. For adhoc queries or data exploration use cases where there may not be fixed query patterns, this technique is not frequently useful. Similar

Re: query plan caching?

2016-01-14 Thread Andries Engelbrecht
have not seen any specific effort to cache query plans. However Drill is used for analytical purposes the query volumes are substantially lower than transactional RDBMS where query plan caching can be critical. Are you potentially seeing an issue with query volume or query planning time

Re: query plan caching?

2016-01-14 Thread Jason Altekruse
Currently not to my knowledge. Are there queries you are seeing that are taking an abnormally long time to plan? On Thu, Jan 14, 2016 at 6:25 AM, Vince Gonzalez wrote: > Does Drill do any caching of query plans? >

query plan caching?

2016-01-14 Thread Vince Gonzalez
Does Drill do any caching of query plans?

Re: query plan ....

2015-08-25 Thread Hao Zhu
Team, I created a feature request https://issues.apache.org/jira/browse/DRILL-3710 to make the "20" magic number configurable so that we do not need to add junk/duplicate in-list elements. On Tue, Aug 25, 2015 at 4:49 PM, Sungwook Yoon wrote: > This trick definitely helps... > Not as fast as I w

Re: query plan ....

2015-08-25 Thread Sungwook Yoon
This trick definitely helps... Not as fast as I want it to be.. but.. it helps... Sungwook On Tue, Aug 25, 2015 at 4:25 PM, Jinfeng Ni wrote: > Hi Sungwook, > > I looked at the query profile. It did show that the bottleneck is Filter > operator, where Filter operator, > running on 4-node EC2

Re: query plan ....

2015-08-25 Thread Jinfeng Ni
Hi Sungwook, I looked at the query profile. It did show that the bottleneck is Filter operator, where Filter operator, running on 4-node EC2 cluster (?), took about 82 minutes to process total approximately 2.5 billion records. I'm not sure how powerful each Node's CPU. But seems the query is C

Re: query plan ....

2015-08-25 Thread Jinfeng Ni
Hi Sungwook, Can you upload the profile file to a public place? The user mailing list does not allow to attach files. On Tue, Aug 25, 2015 at 10:58 AM, Sungwook Yoon wrote: > > Jacques, > > Here is the physical profile for this query, > I attached the 1000 lines of the tail of the drillbit.lo

Re: query plan ....

2015-08-25 Thread Sungwook Yoon
Jacques, Here is the physical profile for this query, I attached the 1000 lines of the tail of the drillbit.log from the foreman. Also, attached is the picture of the filter minorfragment information. We have plenty of memory and Drill is not using them all. Let me know your thoughts. Thanks,

Re: query plan ....

2015-08-25 Thread Jacques Nadeau
I still think a profile analysis would be good. Can you post the following on gist: - profile for this query - source code for the slow filter operation (turn on debug logging to get this) thanks, Jacques -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, Aug 25, 2015 at 8:16 AM, Sungwook Yo

Re: query plan ....

2015-08-25 Thread Sungwook Yoon
Understood,.. I can see Jacques and Jinfeng's point of view. I would not bother think about this query but some claim that this query ran orders of magnitude faster on some other sql engine... I don't know how.. they got that performance from this query.. Sungwook On Tue, Aug 25, 2015 at

Re: query plan ....

2015-08-25 Thread Jacques Nadeau
CONVERT_FROMUTF8 is a trivial operation. I would be surprised if that was the bottleneck. See here [1] for the implementation. I'm guessing that the problem is simply that the best alternative on this structure would be a non-equality join. The way I look at this query is you have data set like

Re: query plan ....

2015-08-25 Thread Jinfeng Ni
I looked at the two physical plans you posted. Seems they are different : the first one has multiple CONVERT_FROMUTF8() call, while the second does not have. 1. Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), <=(CAST($0):INTEGER, 2013), OR(=(CONVERT_FROMUTF8($1), '39891'), =(CONVERT_FROMUTF8($1

Re: query plan ....

2015-08-25 Thread Sungwook Yoon
Aman, Sorry for slow reply. Here is the SQL query. SELECT count(1) FROM `o` a WHERE (a.DX1 IN ('39891', '4280', '4281', '42820', '42821', '42822', '42823', '42830', '42831', '42832', '42833', '42840', '42841', '42842', '42843', '4289') OR a.DX2 IN ('39891', '4280', '4281', '42820', '42821', '428

Re: query plan ....

2015-08-24 Thread Ted Dunning
On Mon, Aug 24, 2015 at 9:32 PM, Jacques Nadeau wrote: > I think the isue in many cases is not generating the right data. It more > often is generating the right query. The easiest to solve these issues is > to continue to get real world example queries like the one Sunwook provided > above. >

Re: query plan ....

2015-08-24 Thread Ted Dunning
On Mon, Aug 24, 2015 at 9:32 PM, Jacques Nadeau wrote: > Good plug for log-synth. I'm still hoping for a storage plugin... (I > suppose I have to write some docs on the api first :) > I'll write it as soon as I can figure out how. :-)

Re: query plan ....

2015-08-24 Thread Jacques Nadeau
Good plug for log-synth. I'm still hoping for a storage plugin... (I suppose I have to write some docs on the api first :) I think the isue in many cases is not generating the right data. It more often is generating the right query. The easiest to solve these issues is to continue to get real w

Re: query plan ....

2015-08-24 Thread Ted Dunning
Just a plug here for a tool that started as a part of Drill and which might help people build test cases. Log-synth makes it very easy to build realistic data, flat or nested. It creates realistic data such as street addresses, zip codes and SSN's. It also includes lots of tables of auxiliary dat

Re: query plan ....

2015-08-24 Thread Aman Sinha
I was about to say that for IN lists of size 20 or more, Drill uses a more efficient Values operator instead of OR conditions but then realized the OR filter is referencing 4 different columns : $1..$4 and each of those individual lists is less than 20. Sungwook, can you please provide the SQL qu

Re: query plan ....

2015-08-24 Thread Ted Dunning
On Mon, Aug 24, 2015 at 4:50 PM, Sungwook Yoon wrote: > Still, the performance drop down due to OR filtering is just astounding... > That is what query optimizers are for and why getting them to work well is important. The difference in performance that you are observing is not surprising given

Re: query plan ....

2015-08-24 Thread Sungwook Yoon
E9, cumulative cost = {5.281182067E9 rows, 3.1687092402E10 > > cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7674 > > > > > > Sungwook > > > > On Mon, Aug 24, 2015 at 4:18 PM, Jinfeng Ni > wrote: > > > > > Can you post the plan including th

Re: query plan ....

2015-08-24 Thread Jinfeng Ni
cumulative cost = {5.281182067E9 rows, 3.1687092402E10 > cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7674 > > > Sungwook > > On Mon, Aug 24, 2015 at 4:18 PM, Jinfeng Ni wrote: > > > Can you post the plan including the HBaseScan part? I would like check > if > > the fi

Re: query plan ....

2015-08-24 Thread Sungwook Yoon
understand, currently, if part of the filter is pushed > into Scan operator, it will remain in the Filter operator, although the > filter in the Scan should have pruned out the rows which do not qualify for > the filter condition. > > > > > > On Mon,

Re: query plan ....

2015-08-24 Thread Jinfeng Ni
uery, doing something like > > > > a in (v1, v2, v3, v15) > > > > The physical query plan looks like the following. > > > > Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), <=(CAST($0):INTEGER, > > 2013), OR(=(CONVERT_FROMUTF8($1), '39891&#

Re: query plan ....

2015-08-24 Thread Aman Sinha
once. Aman On Mon, Aug 24, 2015 at 1:34 PM, Sungwook Yoon wrote: > Hi, > > I have a query, doing something like > > a in (v1, v2, v3, v15) > > The physical query plan looks like the following. > > Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), &l

query plan ....

2015-08-24 Thread Sungwook Yoon
Hi, I have a query, doing something like a in (v1, v2, v3, v15) The physical query plan looks like the following. Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), <=(CAST($0):INTEGER, 2013), OR(=(CONVERT_FROMUTF8($1), '39891'), =(CONVERT_FROMUTF8($1), '4280

Re: Query plan changes based on field names

2015-04-17 Thread Aman Sinha
> hoping someone might be able to point me in the right direction. > > The behaviour seems extremely strange that a field name could affect a > query plan. Ultimately, the Mongo filter pushdown optimization rule (which > is the real reason this is causing me pain) should handle cas

Query plan changes based on field names

2015-04-17 Thread Adam Gilmore
Hi guys, I raised: https://issues.apache.org/jira/browse/DRILL-2732 I'd be keen to get stuck in and implement a patch for this, but I was hoping someone might be able to point me in the right direction. The behaviour seems extremely strange that a field name could affect a query