[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645516#comment-16645516 ] ASF GitHub Bot commented on DRILL-6766: --- sohami closed pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java index 2b9b31783f5..ffcfa78a707 100644 --- a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java +++ b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java @@ -20,6 +20,7 @@ import java.io.IOException; import java.util.List; +import org.apache.drill.shaded.guava.com.google.common.annotations.VisibleForTesting; import org.apache.drill.shaded.guava.com.google.common.collect.Lists; import org.apache.drill.common.exceptions.DrillRuntimeException; import org.apache.drill.common.exceptions.UserException; @@ -71,6 +72,7 @@ import static org.apache.drill.exec.record.RecordBatch.IterOutcome.NONE; import static org.apache.drill.exec.record.RecordBatch.IterOutcome.OK; import static org.apache.drill.exec.record.RecordBatch.IterOutcome.OK_NEW_SCHEMA; +import static org.apache.drill.exec.record.RecordBatch.IterOutcome.STOP; public class StreamingAggBatch extends AbstractRecordBatch { static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StreamingAggBatch.class); @@ -104,7 +106,7 @@ // call to inner next is made. private boolean sendEmit = false; // In the case where we see an OK_NEW_SCHEMA along with the end of a data set // we send out a batch with OK_NEW_SCHEMA first, then in the next iteration, -// we send out an emopty batch with EMIT. +// we send out an empty batch with EMIT. private IterOutcome lastKnownOutcome = OK; // keep track of the outcome from the previous call to incoming.next private boolean firstBatchForSchema = true; // true if the current batch came in with an OK_NEW_SCHEMA private boolean firstBatchForDataSet = true; // true if the current batch is the first batch in a data set @@ -127,7 +129,11 @@ private boolean specialBatchSent = false; private static final int SPECIAL_BATCH_COUNT = 1; - public StreamingAggBatch(StreamingAggregate popConfig, RecordBatch incoming, FragmentContext context) throws OutOfMemoryException { + // TODO: Needs to adapt to batch sizing rather than hardcoded constant value + private int maxOutputRowCount = ValueVector.MAX_ROW_COUNT; + + public StreamingAggBatch(StreamingAggregate popConfig, RecordBatch incoming, FragmentContext context) +throws OutOfMemoryException { super(popConfig, context); this.incoming = incoming; @@ -189,7 +195,7 @@ public IterOutcome innerNext() { // if a special batch has been sent, we have no data in the incoming so exit early if (done || specialBatchSent) { - assert (sendEmit != true); // if special batch sent with emit then flag will not be set + assert (!sendEmit); // if special batch sent with emit then flag will not be set return NONE; } @@ -199,6 +205,7 @@ public IterOutcome innerNext() { first = false; // first is set only in the case when we see a NONE after an empty first (and only) batch sendEmit = false; firstBatchForDataSet = true; + firstBatchForSchema = false; recordCount = 0; specialBatchSent = false; return EMIT; @@ -239,18 +246,17 @@ public IterOutcome innerNext() { done = true; return IterOutcome.STOP; } + firstBatchForSchema = true; break; case EMIT: // if we get an EMIT with an empty batch as the first (and therefore only) batch // we have to do the special handling if (firstBatchForDataSet && popConfig.getKeys().size() == 0 && incoming.getRecordCount() == 0) { constructSpecialBatch(); -firstBatchForDataSet = true; // reset on the next iteration // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645305#comment-16645305 ] ASF GitHub Bot commented on DRILL-6766: --- parthchandra commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r224170040 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggTemplate.java ## @@ -405,16 +414,16 @@ private final void incIndex() { } private final void resetIndex() { -underlyingIndex = -1; -incIndex(); +underlyingIndex = 0; Review comment: Are you sure this is correct? incIndex() increments underlyingIndex and then calls getVectorIndex(underlyingIndex). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.produceOutputBatch(LateralJoinBatch.java:575) >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642288#comment-16642288 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on issue #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#issuecomment-427937236 @Ben-Zvi - Thanks for the review. I have squashed commits and rebased on latest master. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Labels: ready-to-commit > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.produceOutputBatch(LateralJoinBatch.java:575) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.innerNext(LateralJoinBatch.java:238) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:175) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639302#comment-16639302 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222897205 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -298,59 +305,63 @@ public IterOutcome innerNext() { // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the EMIT outcome at the same time. - - IterOutcome finalOutcome = getFinalOutcome(); - return finalOutcome; + return getFinalOutcome(); } firstBatchForDataSet = true; firstBatchForSchema = false; if(first) { first = false; } -if(lastKnownOutcome == OK_NEW_SCHEMA) { - sendEmit = true; +if(returnOutcome == OK_NEW_SCHEMA) { + sendEmit = (aggregator == null) || aggregator.previousBatchProcessed(); Review comment: ideally `aggregator` should not be null here. This is just to avoid NPE and since returned AggOutcome is `RETURN_AND_RESET` with `OK_NEW_SCHEMA` then the next batch should be an empty batch with `EMIT` outcome. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639299#comment-16639299 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222897199 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -281,15 +288,15 @@ public IterOutcome innerNext() { container.setRecordCount(recordCount); logger.debug("Aggregator response {}, records {}", aggOutcome, aggregator.getOutputCount()); // overwrite the outcome variable since we no longer need to remember the first batch outcome -lastKnownOutcome = aggregator.getOutcome(); +IterOutcome returnOutcome = aggregator.getOutcome(); Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639300#comment-16639300 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222897190 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -259,15 +265,16 @@ public IterOutcome innerNext() { throw new IllegalStateException(String.format("unknown outcome %s", lastKnownOutcome)); } } else { - if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone()) { + if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone() +&& aggregator.previousBatchProcessed()) { Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639298#comment-16639298 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222897211 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -298,59 +305,63 @@ public IterOutcome innerNext() { // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the EMIT outcome at the same time. - - IterOutcome finalOutcome = getFinalOutcome(); - return finalOutcome; + return getFinalOutcome(); } firstBatchForDataSet = true; firstBatchForSchema = false; if(first) { first = false; } -if(lastKnownOutcome == OK_NEW_SCHEMA) { - sendEmit = true; +if(returnOutcome == OK_NEW_SCHEMA) { + sendEmit = (aggregator == null) || aggregator.previousBatchProcessed(); } // Release external sort batches after EMIT is seen ExternalSortBatch.releaseBatches(incoming); -return lastKnownOutcome; +lastKnownOutcome = EMIT; +return returnOutcome; case RETURN_OUTCOME: // In case of complex writer expression, vectors would be added to batch run-time. // We have to re-build the schema. if (complexWriters != null) { container.buildSchema(SelectionVectorMode.NONE); } -if (lastKnownOutcome == IterOutcome.NONE ) { +if (returnOutcome == IterOutcome.NONE ) { // we will set the 'done' flag in the next call to innerNext and use the lastKnownOutcome // to determine whether we should set the flag or not. // This is so that if someone calls getRecordCount in between calls to innerNext, we will // return the correct record count (if the done flag is set, we will return 0). if (first) { first = false; +lastKnownOutcome = NONE; return OK_NEW_SCHEMA; } else { +lastKnownOutcome = NONE; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639301#comment-16639301 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222897205 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -298,59 +305,63 @@ public IterOutcome innerNext() { // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the EMIT outcome at the same time. - - IterOutcome finalOutcome = getFinalOutcome(); - return finalOutcome; + return getFinalOutcome(); } firstBatchForDataSet = true; firstBatchForSchema = false; if(first) { first = false; } -if(lastKnownOutcome == OK_NEW_SCHEMA) { - sendEmit = true; +if(returnOutcome == OK_NEW_SCHEMA) { + sendEmit = (aggregator == null) || aggregator.previousBatchProcessed(); Review comment: `aggregator` should ideally not be null here. This is just to avoid NPE and since returned AggOutcome is `RETURN_AND_RESET` with `OK_NEW_SCHEMA` then the next batch should be an empty batch with `EMIT` outcome. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639174#comment-16639174 ] ASF GitHub Bot commented on DRILL-6766: --- Ben-Zvi commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222875690 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -298,59 +305,63 @@ public IterOutcome innerNext() { // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the EMIT outcome at the same time. - - IterOutcome finalOutcome = getFinalOutcome(); - return finalOutcome; + return getFinalOutcome(); } firstBatchForDataSet = true; firstBatchForSchema = false; if(first) { first = false; } -if(lastKnownOutcome == OK_NEW_SCHEMA) { - sendEmit = true; +if(returnOutcome == OK_NEW_SCHEMA) { + sendEmit = (aggregator == null) || aggregator.previousBatchProcessed(); } // Release external sort batches after EMIT is seen ExternalSortBatch.releaseBatches(incoming); -return lastKnownOutcome; +lastKnownOutcome = EMIT; +return returnOutcome; case RETURN_OUTCOME: // In case of complex writer expression, vectors would be added to batch run-time. // We have to re-build the schema. if (complexWriters != null) { container.buildSchema(SelectionVectorMode.NONE); } -if (lastKnownOutcome == IterOutcome.NONE ) { +if (returnOutcome == IterOutcome.NONE ) { // we will set the 'done' flag in the next call to innerNext and use the lastKnownOutcome // to determine whether we should set the flag or not. // This is so that if someone calls getRecordCount in between calls to innerNext, we will // return the correct record count (if the done flag is set, we will return 0). if (first) { first = false; +lastKnownOutcome = NONE; return OK_NEW_SCHEMA; } else { +lastKnownOutcome = NONE; Review comment: Very minor: The `lastKnownOutcome = NONE;` can be called once only, before the if() statement. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639176#comment-16639176 ] ASF GitHub Bot commented on DRILL-6766: --- Ben-Zvi commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222874733 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -281,15 +288,15 @@ public IterOutcome innerNext() { container.setRecordCount(recordCount); logger.debug("Aggregator response {}, records {}", aggOutcome, aggregator.getOutputCount()); // overwrite the outcome variable since we no longer need to remember the first batch outcome -lastKnownOutcome = aggregator.getOutcome(); +IterOutcome returnOutcome = aggregator.getOutcome(); Review comment: Need to modify the comment above this line; like explaining that `lastKnownOutcome` is modified below if needed and as needed, depends on the `aggOutcome` . This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639175#comment-16639175 ] ASF GitHub Bot commented on DRILL-6766: --- Ben-Zvi commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222876377 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -298,59 +305,63 @@ public IterOutcome innerNext() { // If outcome is NONE then we send the special batch in the first iteration and the NONE // outcome in the next iteration. If outcome is EMIT, we can send the special // batch and the EMIT outcome at the same time. - - IterOutcome finalOutcome = getFinalOutcome(); - return finalOutcome; + return getFinalOutcome(); } firstBatchForDataSet = true; firstBatchForSchema = false; if(first) { first = false; } -if(lastKnownOutcome == OK_NEW_SCHEMA) { - sendEmit = true; +if(returnOutcome == OK_NEW_SCHEMA) { + sendEmit = (aggregator == null) || aggregator.previousBatchProcessed(); Review comment: (1) Can be helpful to add a short comment here. (2) When the `aggregator` is null -- is this the case of doing *DISTINCT* ? If so, could the previous batch still be partly unprocessed ? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748)
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639177#comment-16639177 ] ASF GitHub Bot commented on DRILL-6766: --- Ben-Zvi commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222874084 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -259,15 +265,16 @@ public IterOutcome innerNext() { throw new IllegalStateException(String.format("unknown outcome %s", lastKnownOutcome)); } } else { - if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone()) { + if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone() +&& aggregator.previousBatchProcessed()) { Review comment: How about putting this "review comment" as a comment in the code (with some rewording).. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638665#comment-16638665 ] ASF GitHub Bot commented on DRILL-6766: --- sohami opened a new pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490 This PR has 2 commits: 1) Commit 1 just fixes and adds few logging statement mainly in Lateral/Unnest operator 2) Commit 2 fixes the issues where if one incoming batch to streaming agg is not fully consumed to produce an output batch because output batch got full, the unprocessed rows in incoming was discarded and new incoming was pulled by Streaming Agg. The commit fixes this case for both OK and EMIT outcome scenarios and have also added a mechanism to simulate output batch getting full scenario. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.produceOutputBatch(LateralJoinBatch.java:575) >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638672#comment-16638672 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on issue #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#issuecomment-427126018 With this change the jar size of jdbc-all package is increasing beyond limit. Upon checking more it's because of increase in size of 2 files below. Same thing is observed in couple of other PR's. Looks like we have to increase the size of jar as there is no new dependencies are added. _11846 oadd/org/apache/drill/exec/record/AbstractRecordBatch.class 1265 git.properties_ **compared to jdbc-all in apache master:** _11470 oadd/org/apache/drill/exec/record/AbstractRecordBatch.class 783 git.properties_ This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) > ~[drill-shaded-guava-23.0.jar:23.0] > at > org.apache.drill.exec.physical.impl.join.LateralJoinBatch.crossJoinAndOutputRecords(LateralJoinBatch.java:1024) > ~[drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at >
[jira] [Commented] (DRILL-6766) Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed
[ https://issues.apache.org/jira/browse/DRILL-6766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16638673#comment-16638673 ] ASF GitHub Bot commented on DRILL-6766: --- sohami commented on a change in pull request #1490: DRILL-6766: Lateral Unnest query : IllegalStateException - rowId in right batch of lateral is smaller than rowId in left batch being processed URL: https://github.com/apache/drill/pull/1490#discussion_r222782213 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java ## @@ -259,15 +265,16 @@ public IterOutcome innerNext() { throw new IllegalStateException(String.format("unknown outcome %s", lastKnownOutcome)); } } else { - if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone()) { + if ( lastKnownOutcome != NONE && firstBatchForDataSet && !aggregator.isDone() +&& aggregator.previousBatchProcessed()) { Review comment: This is the place where `StreamingAgg` after returning previous output batch doesn't know if the incoming batch still has some rows leftover to process before calling next(). The new added method `previousBatchProcessed` will provide that info now. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Lateral Unnest query : IllegalStateException - rowId in right batch of > lateral is smaller than rowId in left batch being processed > --- > > Key: DRILL-6766 > URL: https://issues.apache.org/jira/browse/DRILL-6766 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.14.0 >Reporter: Kedar Sankar Behera >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.15.0 > > > The error is coming when one batch of streaming agg is split across multiple > output batches. In that case the first output batch is sent downstream and in > subsequent next() StreamingAgg discards the unprocessed rows in previous > incoming and call's next() to upstream again. At this point lateral is > waiting to receive the rows for unprocessed ones from right side and instead > it see's a batch with lower rowId from streaming agg resulting in the > exception. > This is also a regression in previous behavior where without EMIT outcome > support StreamingAgg was handling this case properly. > {code:java} > SELECT ttt.number_segments_in_group from BGFsmall t, LATERAL (select > l.o.`element`.geo_segments.list as ot from unnest(t.geo_onds.list) l(o)) tt, > LATERAL (select count(*) as number_segments_in_group from unnest(tt.ot) > ll(o)) ttt limit 3; > {code} > Log:- > {code:java} > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > IllegalStateException: Unexpected case where rowId 0 in right batch of > lateral is smaller than rowId 32768 in left batch being processed > > Fragment 0:0 > > [Error Id: cfe63aaa-7115-4fab-b0fb-d5098d3b6350 on drill182:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) > ~[drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:360) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:215) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:326) > [drill-java-exec-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.15.0-SNAPSHOT.jar:1.15.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [na:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [na:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] > Caused by: java.lang.IllegalStateException: Unexpected case where rowId 0 in > right batch of lateral is smaller than rowId 32768 in left batch being > processed > at > org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:609) >