I don't see DRILL is transforming the query. Tried with a CSV file. Please let
me know if I am missing something.
00-00 Screen : rowType = RecordType(INTEGER EXPR$0): rowcount = 1.0,
cumulative cost = {3.1 rows, 17.1 cpu, 0.0 io, 0.0 network, 0.0 memory}, id =
260
00-01 Project(EXPR$0=[$0]) : rowType = RecordType(INTEGER EXPR$0):
rowcount = 1.0, cumulative cost = {3.0 rows, 17.0 cpu, 0.0 io, 0.0 network, 0.0
memory}, id = 259
00-02 StreamAgg(group=[{}], EXPR$0=[SUM($0)]) : rowType =
RecordType(INTEGER EXPR$0): rowcount = 1.0, cumulative cost = {3.0 rows, 17.0
cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 258
00-03 Project($f0=[1]) : rowType = RecordType(INTEGER $f0): rowcount =
1.0, cumulative cost = {2.0 rows, 5.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id
= 257
00-04 Scan(groupscan=[EasyGroupScan
[selectionRoot=file:/C:/data/company.csv, numFiles=1, columns=[`*`],
files=[file:/C:/data/company.csv]]]) : rowType = RecordType(): rowcount = 1.0,
cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 256
Thanks,
Sudip
-----Original Message-----
From: Julian Hyde [mailto:[email protected]]
Sent: 16 March 2016 AM 12:50
To: [email protected]
Subject: Re: Optimizing SUM(1) query
Is there any reason why Drill cannot transform SUM(1) to COUNT(*) at an early
stage (i.e. using a logical optimization rule) so that this optimization does
not need to be done for each engine?
> On Mar 15, 2016, at 5:29 AM, Sudip Mukherjee <[email protected]> wrote:
>
> I was trying to have an Optimizer rule for the solr storage plugin that I'm
> working on for this query. Trying to use SOLR field stats for this , so that
> the query is faster..
> Getting the below exception while transforming project to scan. Could you
> please advise?
>
>
> 2016-03-15 08:20:35,149 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman] DEBUG
> o.a.d.e.p.s.h.DefaultSqlHandler - Drill Logical :
> DrillScreenRel: rowcount = 1.0, cumulative cost = {60.1 rows, 320.1
> cpu, 0.0 io, 0.0 network, 176.0 memory}, id = 49
> DrillProjectRel(EXPR$0=[$0]): rowcount = 1.0, cumulative cost = {60.0 rows,
> 320.0 cpu, 0.0 io, 0.0 network, 176.0 memory}, id = 48
> DrillAggregateRel(group=[{}], EXPR$0=[SUM($0)]): rowcount = 1.0,
> cumulative cost = {60.0 rows, 320.0 cpu, 0.0 io, 0.0 network, 176.0 memory},
> id = 46
> DrillProjectRel($f0=[1]): rowcount = 20.0, cumulative cost = {40.0 rows,
> 80.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 44
> DrillScanRel(table=[[solr, ANalert_494]],
> groupscan=[SolrGroupScan [SolrScanSpec=SolrScanSpec
> [solrCoreName=ANalert_494, solrUrl=http://localhost:20000/solr/
> filter=[], solrDocFetchCount=-1, aggreegation=[]], columns=[`*`]]]):
> rowcount = 20.0, cumulative cost = {20.0 rows, 0.0 cpu, 0.0 io, 0.0
> network, 0.0 memory}, id = 26
>
> 2016-03-15 08:20:35,201 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman]
> DEBUG o.a.drill.exec.work.foreman.Foreman -
> 291801ee-33fc-064d-7aff-18391f15ae0e: State change requested PENDING
> --> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception
> during fragment initialization: index (0) must be less than size (0)
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255)
> [drill-java-exec.jar:1.4.0]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> [na:1.8.0_65]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> [na:1.8.0_65]
> at java.lang.Thread.run(Unknown Source) [na:1.8.0_65] Caused by:
> java.lang.IndexOutOfBoundsException: index (0) must be less than size (0)
> at
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
> ~[com.google.guava-guava.jar:na]
> at
> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
> ~[com.google.guava-guava.jar:na]
> at
> com.google.common.collect.EmptyImmutableList.get(EmptyImmutableList.java:80)
> ~[com.google.guava-guava.jar:na]
> at org.apache.calcite.util.Pair$6.get(Pair.java:335)
> ~[org.apache.calcite-calcite-core.jar:1.4.0-drill-r10]
> at
> org.apache.drill.exec.planner.StarColumnHelper.containsStarColumnInProject(StarColumnHelper.java:60)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:138)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:45)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:77)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:45)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.ScreenPrel.accept(ScreenPrel.java:64)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.insertRenameProject(StarColumnConverter.java:72)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:326)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:175)
> ~[drill-java-exec.jar:1.4.0]
> at
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorke
> r.java:197) ~[drill-java-exec.jar:1.4.0]
>
>
> Thanks,
> Sudip
>
> -----Original Message-----
> From: Sudip Mukherjee [mailto:[email protected]]
> Sent: 07 March 2016 PM 04:45
> To: [email protected]
> Subject: RE: Optimizing SUM(1) query
>
> Hi,
>
> Extremely sorry for the delayed response!
>
> Basically , I am experimenting an Apache Solr plugin for Drill and
> tried to use Tableau. On intial selection of a "Table" , I see these
> queries from drill webui profile
>
> SELECT * FROM (SELECT * FROM `solr`.`CSVStringData`) T LIMIT 0
>
> SELECT * FROM (SELECT SUM(1) AS `COL` FROM `solr`.`CSVStringData`
> `CSVStringData` HAVING COUNT(1)>0) T LIMIT 0
>
> I have a dataset of 2,297,451,8 and when I try access that,the tableau screen
> in kind of stuck in "Processing Request" for 5 minutes.
>
> Thanks,
> Sudip
>
> -----Original Message-----
> From: Andries Engelbrecht [mailto:[email protected]]
> Sent: 20 February 2016 AM 03:42
> To: [email protected]
> Subject: Re: Optimizing SUM(1) query
>
> What are you trying to do in Tableau that generates these queries?
>
> Do you have the actual full query that is being generated?
>
> Tableau has a number of customization features, wonder if it is possible to
> optimize those settings to avoid this issue.
>
> --Andries
>
>
>> On Feb 19, 2016, at 10:16 AM, Sudip Mukherjee <[email protected]>
>> wrote:
>>
>> Hi,
>>
>> Have anyone tried optimizing SUM(1) query in drill? Or is it implemented?
>> Getting these query while using Tableau. Mostly probably it is trying to
>> figure out NUMBER_OF_RECORDS.
>>
>> Thanks,
>> Sudip
>>
>>
>>
>> ***************************Legal
>> Disclaimer***************************
>> "This communication may contain confidential and privileged material
>> for the sole use of the intended recipient. Any unauthorized review,
>> use or distribution by others is strictly prohibited. If you have
>> received the message by mistake, please advise the sender by reply email and
>> delete the message. Thank you."
>> *********************************************************************
>> *
>
>
>
>
>
> ***************************Legal Disclaimer***************************
> "This communication may contain confidential and privileged material for the
> sole use of the intended recipient. Any unauthorized review, use or
> distribution by others is strictly prohibited. If you have received the
> message by mistake, please advise the sender by reply email and delete the
> message. Thank you."
> **********************************************************************
>
>
>
>
> ***************************Legal Disclaimer***************************
> "This communication may contain confidential and privileged material
> for the sole use of the intended recipient. Any unauthorized review,
> use or distribution by others is strictly prohibited. If you have
> received the message by mistake, please advise the sender by reply email and
> delete the message. Thank you."
> **********************************************************************
***************************Legal Disclaimer***************************
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**********************************************************************