[jira] [Updated] (DRILL-6657) Unnest reports one batch less than the actual number of batches
[ https://issues.apache.org/jira/browse/DRILL-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6657:
---------------------------------
    Fix Version/s: 1.15.0

> Unnest reports one batch less than the actual number of batches
> ---------------------------------------------------------------
>
>                 Key: DRILL-6657
>                 URL: https://issues.apache.org/jira/browse/DRILL-6657
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Parth Chandra
>            Assignee: Parth Chandra
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Unnest doesn't count the first batch that comes in.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
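The off-by-one described in DRILL-6657 is easy to reproduce in miniature. Below is a self-contained sketch with invented names (this is not Drill's actual unnest implementation): when the first incoming batch is consumed during schema setup and bypasses the counting path, the operator reports one batch fewer than it actually processed.

```java
// Hypothetical illustration of the reported off-by-one. These names are
// invented for this sketch; Drill's unnest operator is structured differently.
class BatchCounter {
    private int batchCount = 0;
    private boolean schemaBuilt = false;

    // Buggy pattern: the first batch is consumed while building the schema
    // and returns before the counter is updated.
    void consumeBuggy() {
        if (!schemaBuilt) {
            schemaBuilt = true; // first batch used for schema, never counted
            return;
        }
        batchCount++;
    }

    // Fixed pattern: every incoming batch is counted, including the one
    // that was used to build the schema.
    void consumeFixed() {
        schemaBuilt = true;
        batchCount++;
    }

    int count() { return batchCount; }
}
```

Feeding three batches through `consumeBuggy()` leaves the count at 2, while `consumeFixed()` reports all 3.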
[jira] [Updated] (DRILL-6653) Unsupported Schema change exception where there is no schema change in lateral Unnest queries
[ https://issues.apache.org/jira/browse/DRILL-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6653:
---------------------------------
    Fix Version/s: 1.15.0

> Unsupported Schema change exception where there is no schema change in
> lateral Unnest queries
> ----------------------------------------------------------------------
>
>                 Key: DRILL-6653
>                 URL: https://issues.apache.org/jira/browse/DRILL-6653
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.14.0
>            Reporter: Kedar Sankar Behera
>            Assignee: Parth Chandra
>            Priority: Major
>             Fix For: 1.15.0
>
>
> Unsupported Schema change exception where there is no schema change.
> DataSet - a single JSON file (sf1)
> Query -
> {code}
> select customer.c_custkey, customer.c_name, sum(orders.totalprice) totalprice
> from customer, lateral (select t.o.o_totalprice as totalprice from
> unnest(customer.c_orders) t(o) order by totalprice limit 10) orders group by
> customer.c_custkey, customer.c_name order by customer.c_custkey limit 50;
> {code}
> Result -
> {code}
> Exception:
> java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema change
> Prior schema :
> BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE]
> New schema :
> BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE]
> Fragment 0:0
> [Error Id: 21d4d646-4e6a-4e4a-ba75-60ba247ddabd on drill191:31010]
>   at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528)
>   at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632)
>   at oadd.org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:207)
>   at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:153)
>   at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:253)
>   at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema change
> Prior schema :
> BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE]
> New schema :
> BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE]
> Fragment 0:0
> [Error Id: 21d4d646-4e6a-4e4a-ba75-60ba247ddabd on drill191:31010]
>   at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>   at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
>   at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
>   at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274)
>   at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244)
>   at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>   at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>   at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.
> {code}
[jira] [Updated] (DRILL-6654) Data verification failure with lateral unnest query having filter in and order by
[ https://issues.apache.org/jira/browse/DRILL-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker updated DRILL-6654:
---------------------------------
    Fix Version/s: 1.15.0

> Data verification failure with lateral unnest query having filter IN and
> order by
> ------------------------------------------------------------------------
>
>                 Key: DRILL-6654
>                 URL: https://issues.apache.org/jira/browse/DRILL-6654
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.14.0
>            Reporter: Kedar Sankar Behera
>            Assignee: Sorabh Hamirwasia
>            Priority: Major
>             Fix For: 1.15.0
>
>         Attachments: Lateral Parquet.pdf, Lateral json.pdf, flatten.pdf
>
>
> Data verification failure with a lateral unnest query having a filter IN and
> order by.
> lateral query -
> {code}
> select customer.c_custkey, customer.c_name, orders.totalprice from customer,
> lateral (select sum(t.o.o_totalprice) as totalprice from
> unnest(customer.c_orders) t(o) WHERE t.o.o_totalprice in
> (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders
> order by customer.c_custkey limit 50;
> {code}
> result -
> {code}
> +------------+---------------------+-------------+
> | c_custkey  | c_name              | totalprice  |
> +------------+---------------------+-------------+
> | 101276     | Customer#000101276  | 82657.72    |
> | 120295     | Customer#000120295  | 266119.96   |
> | 120376     | Customer#000120376  | 180309.76   |
> +------------+---------------------+-------------+
> {code}
> flatten query -
> {code}
> select f.c_custkey, f.c_name, sum(f.o.o_totalprice) from (select c_custkey,
> c_name, flatten(c_orders) as o from customer) f WHERE f.o.o_totalprice in
> (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76) group by
> f.c_custkey, f.c_name order by f.c_custkey limit 50;
> {code}
> result -
> {code}
> +------------+---------------------+------------+
> | c_custkey  | c_name              | EXPR$2     |
> +------------+---------------------+------------+
> | 101276     | Customer#000101276  | 82657.72   |
> | 120376     | Customer#000120376  | 180309.76  |
> +------------+---------------------+------------+
> {code}
> PS: The above results are for Parquet data. The same queries against JSON
> data give an identical result:
> {code}
> +------------+---------------------+------------+
> | c_custkey  | c_name              | EXPR$2     |
> +------------+---------------------+------------+
> | 101276     | Customer#000101276  | 82657.72   |
> | 120376     | Customer#000120376  | 180309.76  |
> +------------+---------------------+------------+
> {code}
[jira] [Assigned] (DRILL-6654) Data verification failure with lateral unnest query having filter in and order by
[ https://issues.apache.org/jira/browse/DRILL-6654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker reassigned DRILL-6654:
------------------------------------
    Assignee: Sorabh Hamirwasia
[jira] [Assigned] (DRILL-6653) Unsupported Schema change exception where there is no schema change in lateral Unnest queries
[ https://issues.apache.org/jira/browse/DRILL-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker reassigned DRILL-6653:
------------------------------------
    Assignee: Sorabh Hamirwasia
[jira] [Assigned] (DRILL-6653) Unsupported Schema change exception where there is no schema change in lateral Unnest queries
[ https://issues.apache.org/jira/browse/DRILL-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pritesh Maker reassigned DRILL-6653:
------------------------------------
    Assignee: Parth Chandra  (was: Sorabh Hamirwasia)
[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed
[ https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564671#comment-16564671 ]

ASF GitHub Bot commented on DRILL-6453:
---------------------------------------

Ben-Zvi commented on a change in pull request #1408: DRILL-6453: Resolve deadlock when reading from build and probe sides simultaneously in HashJoin
URL: https://github.com/apache/drill/pull/1408#discussion_r206742825

File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java

{code}
@@ -248,32 +254,54 @@ protected void buildSchema() throws SchemaChangeException {
     }
   }

-  @Override
-  protected boolean prefetchFirstBatchFromBothSides() {
-    if (leftUpstream != IterOutcome.NONE) {
-      // We can only get data if there is data available
-      leftUpstream = sniffNonEmptyBatch(leftUpstream, LEFT_INDEX, left);
-    }
-
+  private void prefetchFirstBuildBatch() {
     if (rightUpstream != IterOutcome.NONE) {
       // We can only get data if there is data available
       rightUpstream = sniffNonEmptyBatch(rightUpstream, RIGHT_INDEX, right);
     }

     buildSideIsEmpty = rightUpstream == IterOutcome.NONE;
-    if (verifyOutcomeToSetBatchState(leftUpstream, rightUpstream)) {
+    if (rightUpstream == IterOutcome.OUT_OF_MEMORY) {
+      // We reached a termination state
+      state = BatchState.OUT_OF_MEMORY;
+    } else if (rightUpstream == IterOutcome.STOP) {
+      state = BatchState.STOP;
+    } else {
       // For build side, use aggregate i.e. average row width across batches
-      batchMemoryManager.update(LEFT_INDEX, 0);
       batchMemoryManager.update(RIGHT_INDEX, 0, true);
-
-      logger.debug("BATCH_STATS, incoming left: {}", batchMemoryManager.getRecordBatchSizer(LEFT_INDEX));
       logger.debug("BATCH_STATS, incoming right: {}", batchMemoryManager.getRecordBatchSizer(RIGHT_INDEX));

       // Got our first batche(s)
       state = BatchState.FIRST;
+    }
+  }
+
+  /**
+   *
+   * @return True terminate. False continue.
+   */
+  private boolean prefetchFirstProbeBatch() {
{code}

Review comment: To reduce code duplication: can have a generic "prefetch first" method, combining this method with *prefetchFirstBuildBatch()* and returning the upstream. (And set the "empty" boolean after that.)

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> TPC-DS query 72 has regressed
> -----------------------------
>
>                 Key: DRILL-6453
>                 URL: https://issues.apache.org/jira/browse/DRILL-6453
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.14.0
>            Reporter: Khurram Faraaz
>            Assignee: Timothy Farkas
>            Priority: Blocker
>             Fix For: 1.15.0
>
>         Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill,
>                      jstack_29173_June_10_2018.txt, jstack_29173_June_10_2018_b.txt,
>                      jstack_29173_June_10_2018_c.txt, jstack_29173_June_10_2018_d.txt,
>                      jstack_29173_June_10_2018_e.txt
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where it
> was canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On Drill 1.14.0-SNAPSHOT
> commit: 931b43e (TPC-DS query 72 executed successfully on this commit, took
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not VARCHAR type
> On Drill 1.14.0-SNAPSHOT
> commit: 82e1a12
> SF1 parquet data on 4 nodes;
> planner.memory.max_query_memory_per_node = 10737418240
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete; I had to
> cancel it by stopping the Foreman drillbit.
> As a result several minor fragments are reported to be in
> CANCELLATION_REQUESTED state on the UI.
> {noformat}
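Ben-Zvi's review suggestion above (a single generic "prefetch first" helper shared by the build and probe sides, returning the upstream outcome so each caller can set its own "empty" flag) could be sketched as follows. All types here are stubbed stand-ins, and the method and constant names are assumed from the diff rather than taken from the actual HashJoinBatch.

```java
// Stubbed stand-in for Drill's RecordBatch.IterOutcome.
enum IterOutcome { NONE, OK, STOP, OUT_OF_MEMORY }

class PrefetchSketch {
    static final int LEFT_INDEX = 0;   // probe side
    static final int RIGHT_INDEX = 1;  // build side

    // Stub: in Drill this pulls batches until a non-empty batch or a
    // terminal outcome arrives; here it simply reports data available.
    IterOutcome sniffNonEmptyBatch(IterOutcome current, int index) {
        return IterOutcome.OK;
    }

    // One generic helper replacing prefetchFirstBuildBatch() and
    // prefetchFirstProbeBatch(): prefetch only if data may be available,
    // and hand the resulting outcome back to the caller.
    IterOutcome prefetchFirstBatch(IterOutcome upstream, int index) {
        if (upstream != IterOutcome.NONE) {
            // We can only get data if there is data available
            upstream = sniffNonEmptyBatch(upstream, index);
        }
        return upstream;
    }
}
```

The caller would then do, e.g., `rightUpstream = prefetchFirstBatch(rightUpstream, RIGHT_INDEX)` followed by `buildSideIsEmpty = rightUpstream == IterOutcome.NONE`, keeping side-specific state handling out of the shared helper.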
[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed
[ https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564670#comment-16564670 ]

ASF GitHub Bot commented on DRILL-6453:
---------------------------------------

Ben-Zvi commented on a change in pull request #1408: DRILL-6453: Resolve deadlock when reading from build and probe sides simultaneously in HashJoin
URL: https://github.com/apache/drill/pull/1408#discussion_r206743497

File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinBatch.java

{code}
@@ -381,16 +409,14 @@ public HashJoinMemoryCalculator getCalculatorImpl() {

   @Override
   public IterOutcome innerNext() {
-    if (!prefetched) {
+    if (!prefetchedBuild) {
       // If we didn't retrieve our first data hold batch, we need to do it now.
-      prefetched = true;
-      prefetchFirstBatchFromBothSides();
+      prefetchedBuild = true;
+      prefetchFirstBuildBatch();

       // Handle emitting the correct outcome for termination conditions
-      // Use the state set by prefetchFirstBatchFromBothSides to emit the correct termination outcome.
+      // Use the state set by prefetchFirstBuildBatch to emit the correct termination outcome.
{code}

Review comment: Code cleaning: The check below (switch statement) can be done as part of the identical check after executeBuildPhase() (which is skipped anyway if STOP). Also maybe move the "wasKilled" check first thing in innerNext().
[jira] [Assigned] (DRILL-6657) Unnest reports one batch less than the actual number of batches
[ https://issues.apache.org/jira/browse/DRILL-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Parth Chandra reassigned DRILL-6657:
------------------------------------
    Assignee: Parth Chandra
[jira] [Created] (DRILL-6657) Unnest reports one batch less than the actual number of batches
Parth Chandra created DRILL-6657:
------------------------------------

             Summary: Unnest reports one batch less than the actual number of batches
                 Key: DRILL-6657
                 URL: https://issues.apache.org/jira/browse/DRILL-6657
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Parth Chandra


Unnest doesn't count the first batch that comes in.
[jira] [Commented] (DRILL-6655) Require Package Declaration In Checkstyle
[ https://issues.apache.org/jira/browse/DRILL-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564519#comment-16564519 ]

ASF GitHub Bot commented on DRILL-6655:
---------------------------------------

ilooner commented on issue #1412: DRILL-6655: Require package declaration in files.
URL: https://github.com/apache/drill/pull/1412#issuecomment-409404958

    @arina-ielchiieva please review

> Require Package Declaration In Checkstyle
> -----------------------------------------
>
>                 Key: DRILL-6655
>                 URL: https://issues.apache.org/jira/browse/DRILL-6655
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.15.0
>            Reporter: Timothy Farkas
>            Assignee: Timothy Farkas
>            Priority: Major
>             Fix For: 1.15.0
[jira] [Created] (DRILL-6656) Add Regex To Disallow Extra Semicolons In Imports
Timothy Farkas created DRILL-6656:
-------------------------------------

             Summary: Add Regex To Disallow Extra Semicolons In Imports
                 Key: DRILL-6656
                 URL: https://issues.apache.org/jira/browse/DRILL-6656
             Project: Apache Drill
          Issue Type: Improvement
            Reporter: Timothy Farkas
            Assignee: Timothy Farkas
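A rule like the one DRILL-6656 describes is usually expressed with Checkstyle's `RegexpSingleline` check, which flags any line matching a regex. The snippet below is only a sketch; the exact pattern adopted for DRILL-6656 may differ.

```xml
<module name="Checker">
  <!-- Flag import statements terminated by more than one semicolon,
       e.g. "import java.util.List;;" -->
  <module name="RegexpSingleline">
    <property name="format" value="^import\s.*;;"/>
    <property name="message" value="Extra semicolon after import statement."/>
  </module>
</module>
```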
[jira] [Updated] (DRILL-6655) Require Package Declaration In Checkstyle
[ https://issues.apache.org/jira/browse/DRILL-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Farkas updated DRILL-6655:
----------------------------------
    Affects Version/s: 1.15.0
[jira] [Updated] (DRILL-6655) Require Package Declaration In Checkstyle
[ https://issues.apache.org/jira/browse/DRILL-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Farkas updated DRILL-6655:
----------------------------------
    Reviewer: Arina Ielchiieva
[jira] [Updated] (DRILL-6655) Require Package Declaration In Checkstyle
[ https://issues.apache.org/jira/browse/DRILL-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Farkas updated DRILL-6655:
----------------------------------
    Fix Version/s: 1.15.0
[jira] [Commented] (DRILL-6655) Require Package Declaration In Checkstyle
[ https://issues.apache.org/jira/browse/DRILL-6655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564513#comment-16564513 ]

ASF GitHub Bot commented on DRILL-6655:
---------------------------------------

ilooner opened a new pull request #1412: DRILL-6655: Require package declaration in files.
URL: https://github.com/apache/drill/pull/1412

    This checkstyle check prevents errors with package declarations
    (https://issues.apache.org/jira/browse/DRILL-6651) from sneaking in.
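Checkstyle ships a `PackageDeclaration` check for exactly this purpose: it fails any Java file whose package statement is missing (and, by default, any file whose package does not match its directory structure). A minimal configuration sketch follows; the rule actually added in the PR may be wired differently.

```xml
<module name="Checker">
  <module name="TreeWalker">
    <!-- Require a package declaration in every .java file; by default the
         declaration must also match the directory structure. -->
    <module name="PackageDeclaration"/>
  </module>
</module>
```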
[jira] [Created] (DRILL-6655) Require Package Declaration In Checkstyle
Timothy Farkas created DRILL-6655: - Summary: Require Package Declaration In Checkstyle Key: DRILL-6655 URL: https://issues.apache.org/jira/browse/DRILL-6655 Project: Apache Drill Issue Type: Improvement Reporter: Timothy Farkas Assignee: Timothy Farkas -- This message was sent by Atlassian JIRA (v7.6.3#76005)
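For context on what this improvement entails: Checkstyle ships a stock `PackageDeclaration` check that fails any Java file missing a package statement. A minimal sketch of the relevant `checkstyle.xml` fragment is below — the module name comes from Checkstyle's standard distribution, but the exact rule set Drill adopted in PR #1412 may differ.
{code}
<!-- Sketch only: enforce that every Java source file declares a package.
     Uses Checkstyle's built-in PackageDeclaration check; Drill's actual
     checkstyle configuration may wire this up differently. -->
<module name="Checker">
  <module name="TreeWalker">
    <!-- Fails any .java file that lacks a package declaration, which is
         exactly the class of error seen in DRILL-6651 -->
    <module name="PackageDeclaration"/>
  </module>
</module>
{code}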
[jira] [Commented] (DRILL-6651) Compilation error in IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564488#comment-16564488 ] ASF GitHub Bot commented on DRILL-6651: --- ilooner closed pull request #1411: DRILL-6651: Add missing package statements URL: https://github.com/apache/drill/pull/1411 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestComplexColumnInSchema.java b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestComplexColumnInSchema.java index d0977b8f38f..29c223778e5 100644 --- a/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestComplexColumnInSchema.java +++ b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestComplexColumnInSchema.java @@ -15,9 +15,9 @@ * See the License for the specific language governing permissions and * limitations under the License. 
*/ -import org.apache.drill.common.expression.SchemaPath; -import org.apache.drill.exec.store.parquet.ParquetReaderUtility; +package org.apache.drill.exec.store.parquet; +import org.apache.drill.common.expression.SchemaPath; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.Path; import org.apache.parquet.hadoop.metadata.ParquetMetadata; diff --git a/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetReaderUtility.java b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetReaderUtility.java index 4b24212c378..1aab4ab9db7 100644 --- a/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetReaderUtility.java +++ b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetReaderUtility.java @@ -15,7 +15,7 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -import org.apache.drill.exec.store.parquet.ParquetReaderUtility; +package org.apache.drill.exec.store.parquet; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.Path; This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Compilation error in IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. 
this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564489#comment-16564489 ] ASF GitHub Bot commented on DRILL-6650: --- ilooner closed pull request #1410: DRILL-6650: Remove stray semicolon in imports for PrintingResultsListener. URL: https://github.com/apache/drill/pull/1410 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/java-exec/src/test/java/org/apache/drill/test/PrintingResultsListener.java b/exec/java-exec/src/test/java/org/apache/drill/test/PrintingResultsListener.java index f5cd9954056..e1cfaeb8b09 100644 --- a/exec/java-exec/src/test/java/org/apache/drill/test/PrintingResultsListener.java +++ b/exec/java-exec/src/test/java/org/apache/drill/test/PrintingResultsListener.java @@ -16,7 +16,7 @@ * limitations under the License. */ package org.apache.drill.test; -; + import org.apache.drill.common.config.DrillConfig; import org.apache.drill.common.exceptions.UserException; import org.apache.drill.exec.client.LoggingResultsListener; This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.14.0 >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0, 1.15.0 > > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. 
I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Fix Version/s: 1.15.0 > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.14.0 >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0, 1.15.0 > > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Affects Version/s: 1.14.0 > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.14.0 >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Fix Version/s: 1.14.0 > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.14.0 >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6651) Compilation error in IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aman Sinha updated DRILL-6651: -- Labels: ready-to-commit (was: ) > Compilation error in IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aman Sinha updated DRILL-6650: -- Labels: ready-to-commit (was: ) > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > Labels: ready-to-commit > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6618) Unnest changes for implicit column
[ https://issues.apache.org/jira/browse/DRILL-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas resolved DRILL-6618. --- Resolution: Done > Unnest changes for implicit column > -- > > Key: DRILL-6618 > URL: https://issues.apache.org/jira/browse/DRILL-6618 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Parth Chandra >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1) Update unnest to work on entire left incoming instead of row by row > processing. > 2) Update unnest to generate an implicit field (name passed in PopConfig) > with rowId of each output row being generated. The type of implicit field > will be IntVector. > 3) Fix all existing unit tests -- This message was sent by Atlassian JIRA (v7.6.3#76005)
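The implicit rowId column described above is internal to the operator (its name is passed in the PopConfig and it is materialized as an IntVector), so it never appears in user SQL. For illustration, a query of the shape these unnest changes target — assuming the TPC-H-style `customer` table with a nested `c_orders` array used elsewhere in this digest:
{code}
-- Sketch only: a lateral/unnest query of the kind batch-level unnest serves.
-- The implicit rowId correlates each unnested order back to its originating
-- customer row in the left incoming batch; it is not user-visible.
SELECT customer.c_custkey,
       orders.o_totalprice
FROM customer,
     LATERAL (SELECT t.o.o_totalprice
              FROM UNNEST(customer.c_orders) t(o)) orders;
{code}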
[jira] [Commented] (DRILL-6651) Compilation error in IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564457#comment-16564457 ] ASF GitHub Bot commented on DRILL-6651: --- Ben-Zvi opened a new pull request #1411: DRILL-6651: Add missing package statements URL: https://github.com/apache/drill/pull/1411 Added missing "package" stmts and removed thus unused imports (of those packages). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Compilation error in IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6629) BitVector split and transfer does not work correctly for transfer length < 8
[ https://issues.apache.org/jira/browse/DRILL-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthikeyan Manivannan updated DRILL-6629: -- Labels: ready-to-commit (was: ) > BitVector split and transfer does not work correctly for transfer length < 8 > > > Key: DRILL-6629 > URL: https://issues.apache.org/jira/browse/DRILL-6629 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Data Types > Environment: BitVector split and transfer does not work correctly for > transfer length < 8. >Reporter: Karthikeyan Manivannan >Assignee: Karthikeyan Manivannan >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6654) Data verification failure with lateral unnest query having filter in and order by
Kedar Sankar Behera created DRILL-6654: -- Summary: Data verification failure with lateral unnest query having filter in and order by Key: DRILL-6654 URL: https://issues.apache.org/jira/browse/DRILL-6654 Project: Apache Drill Issue Type: Bug Affects Versions: 1.14.0 Reporter: Kedar Sankar Behera Attachments: Lateral Parquet.pdf, Lateral json.pdf, flatten.pdf
Data verification failure with lateral unnest query having an IN filter and order by.
Lateral query -
{code}
select customer.c_custkey, customer.c_name, orders.totalprice
from customer,
     lateral (select sum(t.o.o_totalprice) as totalprice
              from unnest(customer.c_orders) t(o)
              WHERE t.o.o_totalprice in (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)) orders
order by customer.c_custkey limit 50;
{code}
Result -
{code}
+------------+---------------------+-------------+
| c_custkey  | c_name              | totalprice  |
+------------+---------------------+-------------+
| 101276     | Customer#000101276  | 82657.72    |
| 120295     | Customer#000120295  | 266119.96   |
| 120376     | Customer#000120376  | 180309.76   |
+------------+---------------------+-------------+
{code}
Flatten query -
{code}
select f.c_custkey, f.c_name, sum(f.o.o_totalprice)
from (select c_custkey, c_name, flatten(c_orders) as o from customer) f
WHERE f.o.o_totalprice in (89230.03,270087.44,246408.53,82657.72,153941.38,65277.06,180309.76)
group by f.c_custkey, f.c_name
order by f.c_custkey limit 50;
{code}
Result -
{code}
+------------+---------------------+------------+
| c_custkey  | c_name              | EXPR$2     |
+------------+---------------------+------------+
| 101276     | Customer#000101276  | 82657.72   |
| 120376     | Customer#000120376  | 180309.76  |
+------------+---------------------+------------+
{code}
PS - The above results are for Parquet data. The same query on JSON data gives an identical result:
{code}
+------------+---------------------+------------+
| c_custkey  | c_name              | EXPR$2     |
+------------+---------------------+------------+
| 101276     | Customer#000101276  | 82657.72   |
| 120376     | Customer#000120376  | 180309.76  |
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)

[jira] [Created] (DRILL-6653) Unsupported Schema change exception where there is no schema change in lateral Unnest queries
Kedar Sankar Behera created DRILL-6653: -- Summary: Unsupported Schema change exception where there is no schema change in lateral Unnest queries Key: DRILL-6653 URL: https://issues.apache.org/jira/browse/DRILL-6653 Project: Apache Drill Issue Type: Bug Affects Versions: 1.14.0 Reporter: Kedar Sankar Behera Unsupported Schema change exception where there is no schema change DataSet - A single json file(sf1) Query - {code} select customer.c_custkey, customer.c_name, sum(orders.totalprice) totalprice from customer, lateral (select t.o.o_totalprice as totalprice from unnest(customer.c_orders) t(o) order by totalprice limit 10) orders group by customer.c_custkey, customer.c_name order by customer.c_custkey limit 50; {code} Result - {code} Exception: java.sql.SQLException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema change Prior schema : BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE] New schema : BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE] Fragment 0:0 [Error Id: 21d4d646-4e6a-4e4a-ba75-60ba247ddabd on drill191:31010] at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:528) at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:632) at oadd.org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:207) at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:153) at org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:253) at org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema change Prior schema : BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE] New schema : BatchSchema [fields=[[`c_custkey` (VARCHAR:OPTIONAL)], [`c_name` (VARCHAR:OPTIONAL)], [`totalprice` (FLOAT8:OPTIONAL)]], selectionVector=NONE] Fragment 0:0 [Error Id: 21d4d646-4e6a-4e4a-ba75-60ba247ddabd on drill191:31010] at oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) at oadd.org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:274) at oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:244) at oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) at oadd.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) at oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) at oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) at oadd.io.netty.channel.AbstractChan
[jira] [Commented] (DRILL-6651) Compilation error in IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564426#comment-16564426 ] Boaz Ben-Zvi commented on DRILL-6651: - This error may be seen in IntelliJ as well, when set with the IntelliJ IDEA formatter ( https://drill.apache.org/docs/attachments/intellij-idea-settings.jar ). Not sure which specific setting activates this "inspection". > Compilation error in IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6651) Compilation error in IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi updated DRILL-6651: Summary: Compilation error in IDE due to missing package name (was: Compilation error in Eclipse IDE due to missing package name) > Compilation error in IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6652) PartitionLimit changes for Lateral and Unnest
[ https://issues.apache.org/jira/browse/DRILL-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564412#comment-16564412 ] ASF GitHub Bot commented on DRILL-6652: --- sohami commented on issue #1407: DRILL-6652: PartitionLimit changes for Lateral and Unnest URL: https://github.com/apache/drill/pull/1407#issuecomment-409381241 @sohami - Once [PR-1401](https://github.com/apache/drill/pull/1401) is merged in master, we should rebase this PR on top of it and remove the commits from that PR. Also make sure to add a commit to remove all ignores from TestE2EUnnestAndLateral. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > PartitionLimit changes for Lateral and Unnest > - > > Key: DRILL-6652 > URL: https://issues.apache.org/jira/browse/DRILL-6652 > Project: Apache Drill > Issue Type: Task > Components: Execution - Relational Operators, Query Planning & > Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6636) Planner side changes to use PartitionLimitBatch in place of LimitBatch
[ https://issues.apache.org/jira/browse/DRILL-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6636: - Reviewer: Aman Sinha > Planner side changes to use PartitionLimitBatch in place of LimitBatch > -- > > Key: DRILL-6636 > URL: https://issues.apache.org/jira/browse/DRILL-6636 > Project: Apache Drill > Issue Type: Task > Components: Query Planning & Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Hanumath Rao Maduri >Priority: Major > Fix For: 1.15.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6652) PartitionLimit changes for Lateral and Unnest
[ https://issues.apache.org/jira/browse/DRILL-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6652: - Summary: PartitionLimit changes for Lateral and Unnest (was: PartitionLimit changes for Lateral&Unnest) > PartitionLimit changes for Lateral and Unnest > - > > Key: DRILL-6652 > URL: https://issues.apache.org/jira/browse/DRILL-6652 > Project: Apache Drill > Issue Type: Task > Components: Execution - Relational Operators, Query Planning & > Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6635) PartitionLimit for Lateral/Unnest
[ https://issues.apache.org/jira/browse/DRILL-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6635: - Reviewer: Parth Chandra > PartitionLimit for Lateral/Unnest > - > > Key: DRILL-6635 > URL: https://issues.apache.org/jira/browse/DRILL-6635 > Project: Apache Drill > Issue Type: Task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > With batch processing changes in Lateral/Unnest the limit/TopN clause within > Lateral-Unnest subquery will not work as expected since it will impose > limit/TopN across RowId's. We need a new mechanism to apply these operators > at rowId boundary. > For now we are planning to add support for only limit and hence need to have > a new operator PartitionLimit which will get the partitionColumn on which the > limit should be imposed. This will currently only support queries between > lateral and unnest. > For TopN we can still achieve that using combination of Sort and Partition > Limit and later we can figure out how to address it directly within TopN or > is it needed at all. Since the number of rows across EMIT boundary on which > SORT will operate should not be big enough and mostly be done in memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
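The PartitionLimit operator described above is a planner rewrite, not something written directly in SQL. For illustration, a sketch of the subquery shape whose LIMIT it would handle, reusing the `customer`/`c_orders` dataset from the other issues in this digest (the exact plan produced by the implementation may differ):
{code}
-- Sketch only: the LIMIT below must apply per unnested customer (per rowId),
-- not once across the whole unnested stream. With batch processing in
-- Lateral/Unnest, a plain LimitBatch would cut across rowId boundaries;
-- PartitionLimit, keyed on the partition (rowId) column, restores the
-- intended per-row semantics.
SELECT customer.c_custkey, orders.totalprice
FROM customer,
     LATERAL (SELECT t.o.o_totalprice AS totalprice
              FROM UNNEST(customer.c_orders) t(o)
              ORDER BY totalprice
              LIMIT 10) orders;
{code}
As the description notes, TopN inside the subquery can be expressed for now as this Sort-plus-PartitionLimit combination, since the row count within one EMIT boundary is expected to be small enough to sort in memory.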
[jira] [Updated] (DRILL-6652) PartitionLimit changes for Lateral&Unnest
[ https://issues.apache.org/jira/browse/DRILL-6652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-6652: - Summary: PartitionLimit changes for Lateral&Unnest (was: PartitionLimit Handling for Lateral/Unnest) > PartitionLimit changes for Lateral&Unnest > - > > Key: DRILL-6652 > URL: https://issues.apache.org/jira/browse/DRILL-6652 > Project: Apache Drill > Issue Type: Task > Components: Execution - Relational Operators, Query Planning & > Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6652) PartitionLimit Handling for Lateral/Unnest
Sorabh Hamirwasia created DRILL-6652: Summary: PartitionLimit Handling for Lateral/Unnest Key: DRILL-6652 URL: https://issues.apache.org/jira/browse/DRILL-6652 Project: Apache Drill Issue Type: Task Components: Execution - Relational Operators, Query Planning & Optimization Affects Versions: 1.14.0 Reporter: Sorabh Hamirwasia Assignee: Sorabh Hamirwasia -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6651) Compilation error in Eclipse IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi updated DRILL-6651: Fix Version/s: 1.15.0 1.14.0 > Compilation error in Eclipse IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > Fix For: 1.14.0, 1.15.0 > > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (DRILL-6651) Compilation error in Eclipse IDE due to missing package name
[ https://issues.apache.org/jira/browse/DRILL-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Ben-Zvi reassigned DRILL-6651: --- Assignee: Boaz Ben-Zvi > Compilation error in Eclipse IDE due to missing package name > > > Key: DRILL-6651 > URL: https://issues.apache.org/jira/browse/DRILL-6651 > Project: Apache Drill > Issue Type: Bug >Reporter: Aman Sinha >Assignee: Boaz Ben-Zvi >Priority: Major > > I am seeing the following compilation errors in my Eclipse build (only in the > IDE.. this does not happen on the maven command line): > {noformat} > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 > The declared package "" does not match the expected package > "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java > /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line > 1{noformat} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6651) Compilation error in Eclipse IDE due to missing package name
Aman Sinha created DRILL-6651: - Summary: Compilation error in Eclipse IDE due to missing package name Key: DRILL-6651 URL: https://issues.apache.org/jira/browse/DRILL-6651 Project: Apache Drill Issue Type: Bug Reporter: Aman Sinha I am seeing the following compilation errors in my Eclipse build (only in the IDE.. this does not happen on the maven command line): {noformat} The declared package "" does not match the expected package "org.apache.drill.exec.store.parquet" TestComplexColumnInSchema.java /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1 The declared package "" does not match the expected package "org.apache.drill.exec.store.parquet" TestParquetReaderUtility.java /drill-java-exec/src/test/java/org/apache/drill/exec/store/parquet line 1{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
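The fix for the DRILL-6651 errors above is to add the package declaration that Eclipse expects at the top of each affected test file. A minimal sketch follows; the class body is a placeholder invented for illustration, and only the package and class names are taken from the error report:

```java
// Eclipse's error "The declared package "" does not match the expected
// package org.apache.drill.exec.store.parquet" means the file begins with
// no package statement. Adding the declaration that matches the source
// directory resolves it.
package org.apache.drill.exec.store.parquet; // the line that was missing

public class TestComplexColumnInSchema {
  public static void main(String[] args) {
    // Confirm the declared package now matches the expected one.
    System.out.println(TestComplexColumnInSchema.class.getPackageName());
  }
}
```

Maven never notices the omission because javac does not require the package to match the directory layout the way the Eclipse builder does.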
[jira] [Updated] (DRILL-6619) Lateral changes for implicit column
[ https://issues.apache.org/jira/browse/DRILL-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6619: -- Labels: ready-to-commit (was: ) > Lateral changes for implicit column > --- > > Key: DRILL-6619 > URL: https://issues.apache.org/jira/browse/DRILL-6619 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1) Update Lateral to consume right batch such that it can contain rows for > multiple left incoming rows. > 2) Update lateral to exclude the implicit field (name passed in PopConfig) > with rowId from output container. The type of implicit field will be > IntVector. > 3) Fix all existing unit tests -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6618) Unnest changes for implicit column
[ https://issues.apache.org/jira/browse/DRILL-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6618: -- Labels: ready-to-commit (was: ) > Unnest changes for implicit column > -- > > Key: DRILL-6618 > URL: https://issues.apache.org/jira/browse/DRILL-6618 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Parth Chandra >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1) Update unnest to work on entire left incoming instead of row by row > processing. > 2) Update unnest to generate an implicit field (name passed in PopConfig) > with rowId of each output row being generated. The type of implicit field > will be IntVector. > 3) Fix all existing unit tests -- This message was sent by Atlassian JIRA (v7.6.3#76005)
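The batch-at-a-time unnest with an implicit rowId column described in DRILL-6618 can be sketched as follows; plain Java collections stand in for Drill's value vectors, which is a deliberate simplification, and all names here are invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the change: the whole incoming batch is flattened in one pass
// (instead of row-by-row processing), and every output row carries an
// implicit integer rowId (the IntVector field in Drill) identifying the
// left-side row it came from, so a downstream LATERAL join can match
// unnested rows back to their originating rows.
public class UnnestSketch {
  static List<Integer> unnestWithRowId(List<List<String>> incoming, List<String> outValues) {
    List<Integer> rowIds = new ArrayList<>(); // stands in for the IntVector column
    for (int rowId = 0; rowId < incoming.size(); rowId++) {
      for (String v : incoming.get(rowId)) {
        outValues.add(v);   // the unnested value
        rowIds.add(rowId);  // the implicit column
      }
    }
    return rowIds;
  }

  public static void main(String[] args) {
    List<String> out = new ArrayList<>();
    List<Integer> ids = unnestWithRowId(
        Arrays.asList(Arrays.asList("o1", "o2"), Arrays.asList("o3")), out);
    System.out.println(out); // [o1, o2, o3]
    System.out.println(ids); // [0, 0, 1]
  }
}
```

DRILL-6619 is the mirror image on the Lateral side: consume a right batch spanning multiple left rows, join on the rowId, and exclude the implicit field from the output container.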
[jira] [Updated] (DRILL-6617) Planner Side changed to propagate $drill_implicit_field$ information
[ https://issues.apache.org/jira/browse/DRILL-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6617: -- Fix Version/s: 1.15.0 > Planner Side changed to propagate $drill_implicit_field$ information > > > Key: DRILL-6617 > URL: https://issues.apache.org/jira/browse/DRILL-6617 > Project: Apache Drill > Issue Type: Sub-task > Components: Query Planning & Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Hanumath Rao Maduri >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > *+Implement support in planning side for below:+* > 1) Propagate the implicit column $drill_implicit_field$ to both Lateral and > Unnest operator using PopConfig. > 2) Update the expressions for operators between Lateral/Unnest subquery to > use this implicit column. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6617) Planner Side changed to propagate $drill_implicit_field$ information
[ https://issues.apache.org/jira/browse/DRILL-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6617: -- Labels: ready-to-commit (was: ) > Planner Side changed to propagate $drill_implicit_field$ information > > > Key: DRILL-6617 > URL: https://issues.apache.org/jira/browse/DRILL-6617 > Project: Apache Drill > Issue Type: Sub-task > Components: Query Planning & Optimization >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Hanumath Rao Maduri >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > *+Implement support in planning side for below:+* > 1) Propagate the implicit column $drill_implicit_field$ to both Lateral and > Unnest operator using PopConfig. > 2) Update the expressions for operators between Lateral/Unnest subquery to > use this implicit column. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6618) Unnest changes for implicit column
[ https://issues.apache.org/jira/browse/DRILL-6618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6618: -- Fix Version/s: 1.15.0 > Unnest changes for implicit column > -- > > Key: DRILL-6618 > URL: https://issues.apache.org/jira/browse/DRILL-6618 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Parth Chandra >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1) Update unnest to work on entire left incoming instead of row by row > processing. > 2) Update unnest to generate an implicit field (name passed in PopConfig) > with rowId of each output row being generated. The type of implicit field > will be IntVector. > 3) Fix all existing unit tests -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6619) Lateral changes for implicit column
[ https://issues.apache.org/jira/browse/DRILL-6619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6619: -- Fix Version/s: 1.15.0 > Lateral changes for implicit column > --- > > Key: DRILL-6619 > URL: https://issues.apache.org/jira/browse/DRILL-6619 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > 1) Update Lateral to consume right batch such that it can contain rows for > multiple left incoming rows. > 2) Update lateral to exclude the implicit field (name passed in PopConfig) > with rowId from output container. The type of implicit field will be > IntVector. > 3) Fix all existing unit tests -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-6439) Sometimes Travis Times Out
[ https://issues.apache.org/jira/browse/DRILL-6439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas resolved DRILL-6439. --- Resolution: Duplicate > Sometimes Travis Times Out > -- > > Key: DRILL-6439 > URL: https://issues.apache.org/jira/browse/DRILL-6439 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Occasionally Travis builds run a few minutes longer than usual and time out. > {code} > changes detected, packing new archive > . > . > The job exceeded the maximum time limit for jobs, and has been terminated. > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6616) Batch Processing for Lateral/Unnest
[ https://issues.apache.org/jira/browse/DRILL-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6616: -- Labels: ready-to-commit (was: ) > Batch Processing for Lateral/Unnest > --- > > Key: DRILL-6616 > URL: https://issues.apache.org/jira/browse/DRILL-6616 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > Implement the execution and planner side changes for the batch processing > done by lateral and unnest. Based on the prototype we found performance to be > much better as compared to initial row-by-row execution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564302#comment-16564302 ] ASF GitHub Bot commented on DRILL-6589: --- ilooner closed pull request #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java index e5a3746a42f..2d02011dc80 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java @@ -559,6 +559,7 @@ static RuleSet getJoinTransitiveClosureRules() { RuleInstance.DRILL_JOIN_PUSH_TRANSITIVE_PREDICATES_RULE, DrillFilterJoinRules.DRILL_FILTER_INTO_JOIN, RuleInstance.REMOVE_IS_NOT_DISTINCT_FROM_RULE, +DrillFilterAggregateTransposeRule.DRILL_LOGICAL_INSTANCE, RuleInstance.DRILL_FILTER_MERGE_RULE ).build()); } diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillAggregateRelBase.java b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillAggregateRelBase.java index 18103c44c58..cd1f4fa46f1 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillAggregateRelBase.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillAggregateRelBase.java @@ -17,24 +17,23 @@ */ package org.apache.drill.exec.planner.common; -import java.util.List; - +import org.apache.calcite.plan.RelOptCluster; +import org.apache.calcite.plan.RelOptCost; +import 
org.apache.calcite.plan.RelOptPlanner; +import org.apache.calcite.plan.RelTraitSet; +import org.apache.calcite.rel.RelNode; +import org.apache.calcite.rel.core.Aggregate; +import org.apache.calcite.rel.core.AggregateCall; import org.apache.calcite.rel.metadata.RelMetadataQuery; +import org.apache.calcite.util.ImmutableBitSet; import org.apache.drill.exec.ExecConstants; import org.apache.drill.exec.expr.holders.IntHolder; import org.apache.drill.exec.planner.cost.DrillCostBase; import org.apache.drill.exec.planner.cost.DrillCostBase.DrillCostFactory; -import org.apache.calcite.plan.RelOptPlanner; -import org.apache.calcite.rel.core.Aggregate; -import org.apache.calcite.rel.core.AggregateCall; -import org.apache.calcite.rel.InvalidRelException; -import org.apache.calcite.rel.RelNode; -import org.apache.calcite.util.ImmutableBitSet; -import org.apache.calcite.plan.RelOptCluster; -import org.apache.calcite.plan.RelOptCost; -import org.apache.calcite.plan.RelTraitSet; import org.apache.drill.exec.planner.physical.PrelUtil; +import java.util.List; + /** * Base class for logical and physical Aggregations implemented in Drill @@ -42,11 +41,10 @@ public abstract class DrillAggregateRelBase extends Aggregate implements DrillRelNode { public DrillAggregateRelBase(RelOptCluster cluster, RelTraitSet traits, RelNode child, boolean indicator, - ImmutableBitSet groupSet, List groupSets, List aggCalls) throws InvalidRelException { + ImmutableBitSet groupSet, List groupSets, List aggCalls) { super(cluster, traits, child, indicator, groupSet, groupSets, aggCalls); } - /** * Estimate cost of hash agg. 
Called by DrillAggregateRel.computeSelfCost() and HashAggPrel.computeSelfCost() */ diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAggregateRel.java b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAggregateRel.java index 55cd7bfc22a..5a7421b6679 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAggregateRel.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAggregateRel.java @@ -17,11 +17,15 @@ */ package org.apache.drill.exec.planner.logical; -import java.util.List; - +import com.google.common.collect.Lists; import org.apache.calcite.linq4j.Ord; +import org.apache.calcite.plan.RelOptCluster; import org.apache.calcite.plan.RelOptCost; import org.apache.calcite.plan.RelOptPlanner; +import org.apache.calcite.plan.RelTraitSet; +import org.apache.calcite.rel.RelNode; +import org.apache.calcite.rel.core.Aggregate; +import org.apache.calcite.rel.core.AggregateCall; import org.apache.calcite.rel.metadata.RelMetadataQuery; import org.apache.calcite.sql.SqlKind; import org.apache.calcite.sql.type.SqlTypeName; @@ -36,14 +40,8 @@ import org.apache.drill.common.logical.data
[jira] [Commented] (DRILL-6589) Push transitive closure generated predicates past aggregates
[ https://issues.apache.org/jira/browse/DRILL-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564303#comment-16564303 ] ASF GitHub Bot commented on DRILL-6589: --- ilooner commented on issue #1372: DRILL-6589: Push transitive closure predicates past aggregates/projects URL: https://github.com/apache/drill/pull/1372#issuecomment-409348913 @gparai Thanks for rebasing. I must have made a mistake on my end. Everything compiled fine and passed all tests. I have merged it now. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Push transitive closure generated predicates past aggregates > > > Key: DRILL-6589 > URL: https://issues.apache.org/jira/browse/DRILL-6589 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Gautam Kumar Parai >Assignee: Gautam Kumar Parai >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > Here is a sample query that may benefit from this optimization: > SELECT * FROM T1 WHERE a1 = 5 AND a1 IN (SELECT a2 FROM T2); > Here the transitive predicate a2 = 5 would be pushed past the aggregate due > to this optimization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
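The transitive closure idea behind DRILL-6589 can be illustrated in isolation; this is not Drill's planner code, just a self-contained sketch of constant propagation over equality predicates:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Given the filter a1 = 5 and the join condition a1 = a2, the constant 5
// propagates across the equality to yield the derived predicate a2 = 5,
// which the planner can then push below the aggregate on T2. The loop runs
// to a fixed point so chains of equalities (a1 = a2, a2 = a3, ...) are
// handled too.
public class TransitiveClosureSketch {
  static Map<String, Integer> propagate(Map<String, Integer> constants,
                                        Set<String[]> equalities) {
    Map<String, Integer> derived = new HashMap<>(constants);
    boolean changed = true;
    while (changed) {
      changed = false;
      for (String[] eq : equalities) {
        for (int i = 0; i < 2; i++) {
          Integer c = derived.get(eq[i]);
          if (c != null && !derived.containsKey(eq[1 - i])) {
            derived.put(eq[1 - i], c); // copy the constant across the equality
            changed = true;
          }
        }
      }
    }
    return derived;
  }

  public static void main(String[] args) {
    Map<String, Integer> constants = new HashMap<>();
    constants.put("a1", 5);                   // WHERE a1 = 5
    Set<String[]> equalities = new HashSet<>();
    equalities.add(new String[]{"a1", "a2"}); // join condition a1 = a2
    System.out.println(propagate(constants, equalities).get("a2")); // 5
  }
}
```

In the merged PR the actual mechanism is Calcite rule plumbing (adding DrillFilterAggregateTransposeRule.DRILL_LOGICAL_INSTANCE to the transitive-closure rule set) rather than anything like the loop above.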
[jira] [Commented] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564279#comment-16564279 ] ASF GitHub Bot commented on DRILL-6650: --- ilooner commented on issue #1410: DRILL-6650: Remove stray semicolon in imports for PrintingResultsListener. URL: https://github.com/apache/drill/pull/1410#issuecomment-409345038 @amansinha100 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Reviewer: Aman Sinha > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564277#comment-16564277 ] ASF GitHub Bot commented on DRILL-6650: --- ilooner opened a new pull request #1410: DRILL-6650: Remove stray semicolon in imports for PrintingResultsListener. URL: https://github.com/apache/drill/pull/1410 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Description: Having empty import statements with stray semicolons can cause compilation in eclipse to fail. I investigated adding a checkstyle check to prevent empty import declarations, but apparently there is no way to do this. In fact having multiple semicolons in an import statement is technically not supported by the java spec https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. There is also a bug filed for the jdk for not throwing an error in this case https://bugs.openjdk.java.net/browse/JDK-8072390 Since there is no way to automate checking this I am just going to manually remove this error. was: Having empty import statements with stray semicolons can cause compilation in eclipse to fail. I investigated adding a checkstyle check to prevent empty import declarations, but apparently there is no way to do this. In fact having multiple semicolons in an import statement is technically not supported by the java spec https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. Since there is no way to automate checking this I am just going to manually remove this error. > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. 
> There is also a bug filed for the jdk for not throwing an error in this case > https://bugs.openjdk.java.net/browse/JDK-8072390 > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Description: Having empty import statements with stray semicolons can cause compilation in eclipse to fail. I investigated adding a checkstyle check to prevent empty import declarations, but apparently there is no way to do this. In fact having multiple semicolons in an import statement is technically not supported by the java spec https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. Since there is no way to automate checking this I am just going to manually remove this error. was:Having empty statements with stray semicolons can cause compilation in eclipse to fail. To fix this this change removes the empty semicolon statements and forbids them in our checkstyle configuration. > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty import statements with stray semicolons can cause compilation in > eclipse to fail. I investigated adding a checkstyle check to prevent empty > import declarations, but apparently there is no way to do this. In fact > having multiple semicolons in an import statement is technically not > supported by the java spec > https://stackoverflow.com/questions/8125558/eclipse-double-semi-colon-on-an-import. > Since there is no way to automate checking this I am just going to manually > remove this error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6650) Remove Stray Semicolon in Printing Results Listener
[ https://issues.apache.org/jira/browse/DRILL-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-6650: -- Summary: Remove Stray Semicolon in Printing Results Listener (was: Disallow Empty Statements With Stray Semicolons) > Remove Stray Semicolon in Printing Results Listener > --- > > Key: DRILL-6650 > URL: https://issues.apache.org/jira/browse/DRILL-6650 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Major > > Having empty statements with stray semicolons can cause compilation in > eclipse to fail. To fix this, this change removes the empty semicolon > statements and forbids them in our checkstyle configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6650) Disallow Empty Statements With Stray Semicolons
Timothy Farkas created DRILL-6650: - Summary: Disallow Empty Statements With Stray Semicolons Key: DRILL-6650 URL: https://issues.apache.org/jira/browse/DRILL-6650 Project: Apache Drill Issue Type: Improvement Reporter: Timothy Farkas Assignee: Timothy Farkas Having empty statements with stray semicolons can cause compilation in eclipse to fail. To fix this, this change removes the empty semicolon statements and forbids them in our checkstyle configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
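The DRILL-6650 problem is easy to reproduce; the class below is invented for illustration, but the doubled semicolon is exactly the pattern described, a legal-under-javac empty top-level declaration that the Eclipse compiler rejects:

```java
import java.util.List;; // stray second semicolon: javac parses it as an empty
                        // top-level declaration (see JDK-8072390) and accepts
                        // it, while the Eclipse builder reports an error

public class StraySemicolonExample {
  public static void main(String[] args) {
    System.out.println(List.of("compiles", "under", "javac").size());
  }
}
```

Since Checkstyle apparently has no check for empty import-level declarations, the fix was the manual removal of the semicolon rather than an automated guard.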
[jira] [Updated] (DRILL-6649) Query with unnest of column from nested subquery fails
[ https://issues.apache.org/jira/browse/DRILL-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker updated DRILL-6649: - Fix Version/s: 1.15.0 > Query with unnest of column from nested subquery fails > -- > > Key: DRILL-6649 > URL: https://issues.apache.org/jira/browse/DRILL-6649 > Project: Apache Drill > Issue Type: Bug >Reporter: Volodymyr Vysotskyi >Assignee: Volodymyr Vysotskyi >Priority: Major > Fix For: 1.15.0 > > > This query: > {code:sql} > select t.c_name from (select * from cp.`lateraljoin/nested-customer.json` > limit 1) t, unnest(t.orders) t2(o) > {code} > fails with error: > {noformat} > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError > [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[classes/:na] > at > org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761) > [classes/:na] > at > org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325) > [classes/:na] > at > org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221) > [classes/:na] > at > org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83) > [classes/:na] > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) > [classes/:na] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_181] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_181] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181] > Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected > exception during fragment initialization: null > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) > [classes/:na] > ... 
3 common frames omitted > Caused by: java.lang.AssertionError: null > at > org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) > ~[classes/:na] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) > ~[classes/:na] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) > ~[classes/:na] > at > 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3219) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947) > ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] > at > org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928) > ~[calcite-
[jira] [Commented] (DRILL-6616) Batch Processing for Lateral/Unnest
[ https://issues.apache.org/jira/browse/DRILL-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564051#comment-16564051 ] ASF GitHub Bot commented on DRILL-6616: --- amansinha100 commented on issue #1401: DRILL-6616: Batch Processing for Lateral/Unnest URL: https://github.com/apache/drill/pull/1401#issuecomment-409306428 @HanumathRao Updated changes LGTM. +1 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Batch Processing for Lateral/Unnest > --- > > Key: DRILL-6616 > URL: https://issues.apache.org/jira/browse/DRILL-6616 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > Implement the execution and planner side changes for the batch processing > done by lateral and unnest. Based on the prototype we found performance to be > much better as compared to initial row-by-row execution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6649) Query with unnest of column from nested subquery fails
Volodymyr Vysotskyi created DRILL-6649: -- Summary: Query with unnest of column from nested subquery fails Key: DRILL-6649 URL: https://issues.apache.org/jira/browse/DRILL-6649 Project: Apache Drill Issue Type: Bug Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi This query: {code:sql} select t.c_name from (select * from cp.`lateraljoin/nested-customer.json` limit 1) t, unnest(t.orders) t2(o) {code} fails with error: {noformat} org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError [Error Id: 6868e327-ab2c-44a2-ab0c-cf30f4a64349 on user515050-pc:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[classes/:na] at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:761) [classes/:na] at org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325) [classes/:na] at org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221) [classes/:na] at org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83) [classes/:na] at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:293) [classes/:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181] Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: null at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:294) [classes/:na] ... 
3 common frames omitted Caused by: java.lang.AssertionError: null at org.apache.calcite.sql.SqlUnnestOperator.inferReturnType(SqlUnnestOperator.java:80) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.SqlOperator.validateOperands(SqlOperator.java:437) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.UnnestNamespace.validateImpl(UnnestNamespace.java:67) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) ~[classes/:na] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) ~[classes/:na] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3012) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2969) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom(SqlConverter.java:273) ~[classes/:na] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3219) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at 
org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:226) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:903) ~[calcite-core-1.16.0-drill-r6.jar:1.16.0-drill-r6] at org.apache.calcite.sql.val
[jira] [Commented] (DRILL-6616) Batch Processing for Lateral/Unnest
[ https://issues.apache.org/jira/browse/DRILL-6616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564038#comment-16564038 ] ASF GitHub Bot commented on DRILL-6616: --- HanumathRao commented on issue #1401: DRILL-6616: Batch Processing for Lateral/Unnest URL: https://github.com/apache/drill/pull/1401#issuecomment-409302961 @amansinha100 Thank you for the review. I have done the needed changes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Batch Processing for Lateral/Unnest > --- > > Key: DRILL-6616 > URL: https://issues.apache.org/jira/browse/DRILL-6616 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > Implement the execution and planner side changes for the batch processing > done by lateral and unnest. Based on the prototype we found performance to be > much better as compared to initial row-by-row execution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-5796) Filter pruning for multi rowgroup parquet file
[ https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5796: Labels: ready-to-commit (was: ) > Filter pruning for multi rowgroup parquet file > -- > > Key: DRILL-5796 > URL: https://issues.apache.org/jira/browse/DRILL-5796 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Reporter: Damien Profeta >Assignee: Jean-Blas IMBERT >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > Today, filter pruning uses the file name as the partitioning key. This means > you can remove a partition only if the whole file belongs to the same partition. > With Parquet, you can prune with the filter at rowgroup granularity when the rowgroups > partition your dataset, since the unit of work is the rowgroup, not the file. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
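The rowgroup-level pruning DRILL-5796 asks for can be sketched roughly as below. This is a minimal illustration of min/max-statistics pruning, not Drill's actual Parquet metadata API; all class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a row group whose [min, max] statistics cannot
 *  satisfy the filter predicate is skipped without reading any of its rows. */
public class RowGroupPruneSketch {

    /** Min/max statistics for one column of one Parquet row group. */
    static final class RowGroupStats {
        final long min;
        final long max;
        RowGroupStats(long min, long max) { this.min = min; this.max = max; }
    }

    /** Keep only row groups that could contain rows matching col > threshold. */
    static List<RowGroupStats> pruneGreaterThan(List<RowGroupStats> groups, long threshold) {
        List<RowGroupStats> kept = new ArrayList<>();
        for (RowGroupStats g : groups) {
            // If even the group's maximum fails the predicate, no row in the
            // group can match, so the whole row group is pruned.
            if (g.max > threshold) {
                kept.add(g);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = new ArrayList<>();
        groups.add(new RowGroupStats(0, 99));     // pruned for threshold 100
        groups.add(new RowGroupStats(100, 199));  // kept: may contain matches
        groups.add(new RowGroupStats(200, 299));  // kept
        System.out.println(pruneGreaterThan(groups, 100).size()); // prints 2
    }
}
```

The point of the issue is exactly this granularity change: with per-file pruning, a file containing all three groups above could never be skipped; with per-rowgroup pruning, the first group is.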
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563665#comment-16563665 ] ASF GitHub Bot commented on DRILL-6385: --- weijietong commented on issue #1334: DRILL-6385: Support JPPD feature URL: https://github.com/apache/drill/pull/1334#issuecomment-409220807 @amansinha100 The following JIRAs: DRILL-6573, DRILL-6572 would solve what we identified before. @sohami 's comments have been addressed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support JPPD (Join Predicate Push Down) > --- > > Key: DRILL-6385 > URL: https://issues.apache.org/jira/browse/DRILL-6385 > Project: Apache Drill > Issue Type: New Feature > Components: Server, Execution - Flow >Affects Versions: 1.14.0 >Reporter: weijie.tong >Assignee: weijie.tong >Priority: Major > > This feature is to support JPPD (Join Predicate Push Down). It will > benefit HashJoin and Broadcast HashJoin performance by reducing the number > of rows sent across the network and the memory consumed. This feature is > already supported by Impala, which calls it RuntimeFilter > ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]). > The first PR will try to push down a bloom filter from the HashJoin node to > Parquet’s scan node. The proposed basic procedure is described as follows: > # The HashJoin build side accumulates the equal-join condition rows to > construct a bloom filter. Then it sends the bloom filter to the foreman > node. > # The foreman node accepts the bloom filters passively from all the fragments > that have the HashJoin operator. It then aggregates the bloom filters to form > a global bloom filter.
> # The foreman node broadcasts the global bloom filter to all the probe side > scan nodes, which may already have sent out partial data to the hash join > nodes (currently the hash join node will prefetch one batch from both sides). > 4. The scan node accepts the global bloom filter from the foreman node > and uses it to filter the remaining rows. > > To implement the above execution flow, the main new notions are described below: > 1. RuntimeFilter > It’s a filter container which may contain a BloomFilter or a MinMaxFilter. > 2. RuntimeFilterReporter > It wraps the logic to send the hash join’s bloom filter to the foreman. The > serialized bloom filter will be sent out through the data tunnel. This object > will be instantiated by the FragmentExecutor and passed to the > FragmentContext, so the HashJoin operator can obtain it through the > FragmentContext. > 3. RuntimeFilterRequestHandler > It is responsible for accepting a SendRuntimeFilterRequest RPC and stripping the > actual BloomFilter from the network. It then passes this filter to the > WorkerBee’s new interface registerRuntimeFilter. > Another RPC type is BroadcastRuntimeFilterRequest. It will register the > accepted global bloom filter with the WorkerBee via the registerRuntimeFilter > method and then propagate it to the FragmentContext, through which the probe side > scan node can fetch the aggregated bloom filter. > 4. RuntimeFilterManager > The foreman will instantiate a RuntimeFilterManager. It will indirectly get > every RuntimeFilter via the WorkerBee. Once all the BloomFilters have been > accepted and aggregated, it will broadcast the aggregated bloom filter to > all the probe side scan nodes through the data tunnel via a > BroadcastRuntimeFilterRequest RPC. > 5. RuntimeFilterEnableOption > A global option will be added to decide whether to enable this new feature. > > Suggestions and advice are welcome. The related PR will be presented as > soon as possible. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
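The build/aggregate/broadcast flow described in the DRILL-6385 proposal can be sketched as follows. This is a hedged, self-contained illustration of the bloom-filter mechanics only (per-fragment build, foreman-side OR-merge, probe-side membership test); the class and method names are illustrative, not Drill's actual RuntimeFilter classes, and the real implementation serializes the filter over Drill's data tunnel rather than merging in-process.

```java
import java.util.BitSet;

/** Sketch of a runtime join filter: each build-side fragment fills a bloom
 *  filter from its join keys, the foreman ORs the partial filters into a
 *  global one, and the probe-side scan drops rows the filter proves absent. */
public class BloomFilterSketch {
    static final int BITS = 1 << 16;          // filter size (power of two)
    final BitSet bits = new BitSet(BITS);

    // Two cheap hash probes derived from multiplicative hashing of the key.
    private static int h1(long key) { return (int) ((key * 0x9E3779B97F4A7C15L) >>> 48) & (BITS - 1); }
    private static int h2(long key) { return (int) ((key * 0xC2B2AE3D27D4EB4FL) >>> 48) & (BITS - 1); }

    /** Build side: record one equal-join key. */
    void add(long key) { bits.set(h1(key)); bits.set(h2(key)); }

    /** Probe side: false means the key is definitely not on the build side. */
    boolean mightContain(long key) { return bits.get(h1(key)) && bits.get(h2(key)); }

    /** Foreman side: aggregate a fragment's partial filter into this one. */
    void merge(BloomFilterSketch other) { bits.or(other.bits); }

    public static void main(String[] args) {
        BloomFilterSketch fragment1 = new BloomFilterSketch();
        BloomFilterSketch fragment2 = new BloomFilterSketch();
        fragment1.add(42);   // build-side keys seen by fragment 1
        fragment2.add(7);    // build-side keys seen by fragment 2

        BloomFilterSketch global = new BloomFilterSketch();
        global.merge(fragment1);
        global.merge(fragment2);

        // Probe-side scan: every build key passes; absent keys are
        // (with high probability) filtered before reaching the hash join.
        System.out.println(global.mightContain(42) && global.mightContain(7)); // prints true
    }
}
```

The OR-merge is what makes the foreman-side aggregation correct: a key inserted by any fragment sets bits that survive the union, so the global filter never produces a false negative.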
[jira] [Commented] (DRILL-6648) Unknown column types not being handled
[ https://issues.apache.org/jira/browse/DRILL-6648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563655#comment-16563655 ] Arina Ielchiieva commented on DRILL-6648: - This problem occurs in Apache Calcite; please see the JDBC schema package {{org.apache.calcite.adapter.jdbc}}. You'll also see that {{SqlType}}, which holds the full list of supported types, is likewise defined in Calcite. I suggest you file a Jira in Apache Calcite for the fix. I don't think Apache Drill can do anything about it. > Unknown column types not being handled > -- > > Key: DRILL-6648 > URL: https://issues.apache.org/jira/browse/DRILL-6648 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.13.0 >Reporter: Iulian Tirzuman >Priority: Major > Attachments: sql type -150.jpg > > > I've compiled Drill 1.13, installed SQL Server (Microsoft SQL Server 2017 > (RTM-CU9) (KB4341265) - 14.0.3030.27 (X64)) and then created a plugin in > drill to use that SQL Server instance. > When I try to run the query "SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS`" > from drill explorer I get the following error: > ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT > * FROM `INFORMATION_SCHEMA`.`COLUMNS` > [30034]Query execution error. 
Details:[ > SYSTEM ERROR: IllegalArgumentException: Unknown SQL type -150 > Fragment 0:0 > [Error Id: 47625973-9d73-4812-9c91-771d81addc73 on localhost:31010] > ] > at System.Data.Odbc.OdbcConnection.HandleError(OdbcHandle hrHandle, RetCode > retcode) > at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior > behavior, String method, Boolean needReader, Object[] methodArguments, > SQL_API odbcApiMethod) > at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior > behavior, String method, Boolean needReader) > at System.Data.Odbc.OdbcCommand.ExecuteReader(CommandBehavior behavior) > at DrillExplorer.DROdbcProvider.GetStatmentColumns(String in_query) > > I did some debugging and saw that the error is due to the SQL Server > column type sql_variant of table SEQUENCES, column START_VALUE, which has the > dataType -150; the code "SqlType.valueOf(dataType)" fails because it > doesn't contain it. Shouldn't it at least be configurable to fall back to a > default SQL type? > I've also attached a screenshot of the debug session just before the error occurs. > I've used sqljdbc42.jar as the JDBC driver to connect to SQL Server. > I've also compiled and tested the above in Drill 1.14 and the issue is still > present. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6648) Unknown column types not being handled
Iulian Tirzuman created DRILL-6648: -- Summary: Unknown column types not being handled Key: DRILL-6648 URL: https://issues.apache.org/jira/browse/DRILL-6648 Project: Apache Drill Issue Type: Bug Affects Versions: 1.13.0 Reporter: Iulian Tirzuman Attachments: sql type -150.jpg I've compiled Drill 1.13, installed SQL Server (Microsoft SQL Server 2017 (RTM-CU9) (KB4341265) - 14.0.3030.27 (X64)) and then created a plugin in drill to use that SQL Server instance. When I try to run the query "SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS`" from drill explorer I get the following error: ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` [30034]Query execution error. Details:[ SYSTEM ERROR: IllegalArgumentException: Unknown SQL type -150 Fragment 0:0 [Error Id: 47625973-9d73-4812-9c91-771d81addc73 on localhost:31010] ] at System.Data.Odbc.OdbcConnection.HandleError(OdbcHandle hrHandle, RetCode retcode) at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader, Object[] methodArguments, SQL_API odbcApiMethod) at System.Data.Odbc.OdbcCommand.ExecuteReaderObject(CommandBehavior behavior, String method, Boolean needReader) at System.Data.Odbc.OdbcCommand.ExecuteReader(CommandBehavior behavior) at DrillExplorer.DROdbcProvider.GetStatmentColumns(String in_query) I did some debugging and saw that the error is due to the SQL Server column type sql_variant of table SEQUENCES, column START_VALUE, which has the dataType -150; the code "SqlType.valueOf(dataType)" fails because it doesn't contain it. Shouldn't it at least be configurable to fall back to a default SQL type? I've also attached a screenshot of the debug session just before the error occurs. I've used sqljdbc42.jar as the JDBC driver to connect to SQL Server. I've also compiled and tested the above in Drill 1.14 and the issue is still present. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
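The fallback the reporter asks for could look roughly like the sketch below: instead of throwing on an unrecognized JDBC type code (such as SQL Server's sql_variant, which reports -150), map anything unknown to a configurable default. This is an illustrative sketch only; `SqlTypeFallbackSketch`, `KNOWN`, and `resolve` are hypothetical names, not Calcite's actual `SqlType` API.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

/** Sketch of defensive JDBC type resolution with a configurable fallback
 *  instead of an IllegalArgumentException on unknown vendor type codes. */
public class SqlTypeFallbackSketch {
    private static final Map<Integer, String> KNOWN = new HashMap<>();
    static {
        // A few standard java.sql.Types entries, for illustration.
        KNOWN.put(Types.INTEGER, "INTEGER");
        KNOWN.put(Types.VARCHAR, "VARCHAR");
        KNOWN.put(Types.DOUBLE, "DOUBLE");
    }

    /** Resolve a JDBC type code; unknown vendor codes fall back rather than fail. */
    static String resolve(int jdbcTypeCode, String defaultType) {
        return KNOWN.getOrDefault(jdbcTypeCode, defaultType);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Types.INTEGER, "VARCHAR")); // prints INTEGER
        System.out.println(resolve(-150, "VARCHAR"));          // sql_variant falls back, prints VARCHAR
    }
}
```

As the follow-up comment notes, the real lookup lives in Calcite's `SqlType`, so a change along these lines would belong in Calcite rather than in Drill.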
[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)
[ https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16563540#comment-16563540 ] ASF GitHub Bot commented on DRILL-6385: --- weijietong commented on a change in pull request #1334: DRILL-6385: Support JPPD feature URL: https://github.com/apache/drill/pull/1334#discussion_r206498489 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/work/filter/BloomFilterCreator.java ## @@ -0,0 +1,41 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.work.filter; + +import io.netty.buffer.DrillBuf; +import org.apache.drill.exec.memory.BufferAllocator; + +public class BloomFilterCreator { Review comment: This description will be added at the `ScanBatch`'s `applyRuntimeFilter `. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support JPPD (Join Predicate Push Down) > --- > > Key: DRILL-6385 > URL: https://issues.apache.org/jira/browse/DRILL-6385 > Project: Apache Drill > Issue Type: New Feature > Components: Server, Execution - Flow >Affects Versions: 1.14.0 >Reporter: weijie.tong >Assignee: weijie.tong >Priority: Major > > This feature is to support the JPPD (Join Predicate Push Down). It will > benefit the HashJoin ,Broadcast HashJoin performance by reducing the number > of rows to send across the network ,the memory consumed. This feature is > already supported by Impala which calls it RuntimeFilter > ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]). > The first PR will try to push down a bloom filter of HashJoin node to > Parquet’s scan node. The propose basic procedure is described as follow: > # The HashJoin build side accumulate the equal join condition rows to > construct a bloom filter. Then it sends out the bloom filter to the foreman > node. > # The foreman node accept the bloom filters passively from all the fragments > that has the HashJoin operator. It then aggregates the bloom filters to form > a global bloom filter. > # The foreman node broadcasts the global bloom filter to all the probe side > scan nodes which maybe already have send out partial data to the hash join > nodes(currently the hash join node will prefetch one batch from both sides ). > 4. The scan node accepts a global bloom filter from the foreman node. > It will filter the rest rows satisfying the bloom filter. > > To implement above execution flow, some main new notion described as below: > 1. RuntimeFilter > It’s a filter container which may contain BloomFilter or MinMaxFilter. > 2. 
RuntimeFilterReporter > It wraps the logic to send hash join’s bloom filter to the foreman.The > serialized bloom filter will be sent out through the data tunnel.This object > will be instanced by the FragmentExecutor and passed to the > FragmentContext.So the HashJoin operator can obtain it through the > FragmentContext. > 3. RuntimeFilterRequestHandler > It is responsible to accept a SendRuntimeFilterRequest RPC to strip the > actual BloomFilter from the network. It then translates this filter to the > WorkerBee’s new interface registerRuntimeFilter. > Another RPC type is BroadcastRuntimeFilterRequest. It will register the > accepted global bloom filter to the WorkerBee by the registerRuntimeFilter > method and then propagate to the FragmentContext through which the probe side > scan node can fetch the aggregated bloom filter. > 4.RuntimeFilterManager > The foreman will instance a RuntimeFilterManager .It will indirectly get > every RuntimeFilter by the WorkerBee. Once all the BloomFilters have been > accepted and aggregated . It will broadcast the aggregated bloom filter to > all the prob
[jira] [Updated] (DRILL-6634) Add udf module under contrib directory and move some udfs into it
[ https://issues.apache.org/jira/browse/DRILL-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6634: Labels: ready-to-commit (was: ) > Add udf module under contrib directory and move some udfs into it > - > > Key: DRILL-6634 > URL: https://issues.apache.org/jira/browse/DRILL-6634 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.14.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1. Create udf module under contrib. > 2. Move udfs from DRILL-6519, DRILL-5834, DRILL-5634 from exec to new module. > 3. Integrate gis module into new module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6634) Add udf module under contrib directory and move some udfs into it
[ https://issues.apache.org/jira/browse/DRILL-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6634: Reviewer: Paul Rogers > Add udf module under contrib directory and move some udfs into it > - > > Key: DRILL-6634 > URL: https://issues.apache.org/jira/browse/DRILL-6634 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.14.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > 1. Create udf module under contrib. > 2. Move udfs from DRILL-6519, DRILL-5834, DRILL-5634 from exec to new module. > 3. Integrate gis module into new module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)