Is there any reason why Drill cannot transform SUM(1) to COUNT(*) at an early stage (i.e. using a logical optimization rule) so that this optimization does not need to be done for each engine?
> On Mar 15, 2016, at 5:29 AM, Sudip Mukherjee <[email protected]> wrote: > > I was trying to have an Optimizer rule for the solr storage plugin that I'm > working on for this query. Trying to use SOLR field stats for this , so that > the query is faster.. > Getting the below exception while transforming project to scan. Could you > please advise? > > > 2016-03-15 08:20:35,149 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman] DEBUG > o.a.d.e.p.s.h.DefaultSqlHandler - Drill Logical : > DrillScreenRel: rowcount = 1.0, cumulative cost = {60.1 rows, 320.1 cpu, 0.0 > io, 0.0 network, 176.0 memory}, id = 49 > DrillProjectRel(EXPR$0=[$0]): rowcount = 1.0, cumulative cost = {60.0 rows, > 320.0 cpu, 0.0 io, 0.0 network, 176.0 memory}, id = 48 > DrillAggregateRel(group=[{}], EXPR$0=[SUM($0)]): rowcount = 1.0, > cumulative cost = {60.0 rows, 320.0 cpu, 0.0 io, 0.0 network, 176.0 memory}, > id = 46 > DrillProjectRel($f0=[1]): rowcount = 20.0, cumulative cost = {40.0 rows, > 80.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 44 > DrillScanRel(table=[[solr, ANalert_494]], groupscan=[SolrGroupScan > [SolrScanSpec=SolrScanSpec [solrCoreName=ANalert_494, > solrUrl=http://localhost:20000/solr/ filter=[], solrDocFetchCount=-1, > aggreegation=[]], columns=[`*`]]]): rowcount = 20.0, cumulative cost = {20.0 > rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 26 > > 2016-03-15 08:20:35,201 [291801ee-33fc-064d-7aff-18391f15ae0e:foreman] DEBUG > o.a.drill.exec.work.foreman.Foreman - 291801ee-33fc-064d-7aff-18391f15ae0e: > State change requested PENDING --> FAILED > org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception > during fragment initialization: index (0) must be less than size (0) > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) > [drill-java-exec.jar:1.4.0] > at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) > [na:1.8.0_65] > at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) > [na:1.8.0_65] > at java.lang.Thread.run(Unknown Source) [na:1.8.0_65] > Caused by: java.lang.IndexOutOfBoundsException: index (0) must be less than > size (0) > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305) > ~[com.google.guava-guava.jar:na] > at > com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284) > ~[com.google.guava-guava.jar:na] > at > com.google.common.collect.EmptyImmutableList.get(EmptyImmutableList.java:80) > ~[com.google.guava-guava.jar:na] > at org.apache.calcite.util.Pair$6.get(Pair.java:335) > ~[org.apache.calcite-calcite-core.jar:1.4.0-drill-r10] > at > org.apache.drill.exec.planner.StarColumnHelper.containsStarColumnInProject(StarColumnHelper.java:60) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:138) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitProject(StarColumnConverter.java:45) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:77) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.visitScreen(StarColumnConverter.java:45) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.ScreenPrel.accept(ScreenPrel.java:64) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.physical.visitor.StarColumnConverter.insertRenameProject(StarColumnConverter.java:72) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPrel(DefaultSqlHandler.java:326) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:175) > ~[drill-java-exec.jar:1.4.0] > at > org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:197) > ~[drill-java-exec.jar:1.4.0] > > > Thanks, > Sudip > > -----Original Message----- > From: Sudip Mukherjee [mailto:[email protected]] > Sent: 07 March 2016 PM 04:45 > To: [email protected] > Subject: RE: Optimizing SUM(1) query > > Hi, > > Extremely sorry for the delayed response! > > Basically , I am experimenting an Apache Solr plugin for Drill and tried to > use Tableau. On intial selection of a "Table" , I see these queries from > drill webui profile > > SELECT * FROM (SELECT * FROM `solr`.`CSVStringData`) T LIMIT 0 > > SELECT * FROM (SELECT SUM(1) AS `COL` FROM `solr`.`CSVStringData` > `CSVStringData` HAVING COUNT(1)>0) T LIMIT 0 > > I have a dataset of 2,297,451,8 and when I try access that,the tableau screen > in kind of stuck in "Processing Request" for 5 minutes. > > Thanks, > Sudip > > -----Original Message----- > From: Andries Engelbrecht [mailto:[email protected]] > Sent: 20 February 2016 AM 03:42 > To: [email protected] > Subject: Re: Optimizing SUM(1) query > > What are you trying to do in Tableau that generates these queries? > > Do you have the actual full query that is being generated? > > Tableau has a number of customization features, wonder if it is possible to > optimize those settings to avoid this issue. > > --Andries > > >> On Feb 19, 2016, at 10:16 AM, Sudip Mukherjee <[email protected]> >> wrote: >> >> Hi, >> >> Have anyone tried optimizing SUM(1) query in drill? Or is it implemented? >> Getting these query while using Tableau. Mostly probably it is trying to >> figure out NUMBER_OF_RECORDS. >> >> Thanks, >> Sudip >> >> >> >> ***************************Legal Disclaimer*************************** >> "This communication may contain confidential and privileged material >> for the sole use of the intended recipient. Any unauthorized review, >> use or distribution by others is strictly prohibited. If you have >> received the message by mistake, please advise the sender by reply email and >> delete the message. Thank you." >> ********************************************************************** > > > > > > ***************************Legal Disclaimer*************************** > "This communication may contain confidential and privileged material for the > sole use of the intended recipient. Any unauthorized review, use or > distribution by others is strictly prohibited. If you have received the > message by mistake, please advise the sender by reply email and delete the > message. Thank you." > ********************************************************************** > > > > > ***************************Legal Disclaimer*************************** > "This communication may contain confidential and privileged material for the > sole use of the intended recipient. Any unauthorized review, use or > distribution > by others is strictly prohibited. If you have received the message by mistake, > please advise the sender by reply email and delete the message. Thank you." > **********************************************************************
