[
https://issues.apache.org/jira/browse/DRILL-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703972#comment-15703972
]
Paul Rogers commented on DRILL-5083:
------------------------------------
Looking at {{AbstractRecordBatch.next( )}}: If the operator ended up in the
{{STOP}} state, then {{next()}} will call {{innerNext()}}.
{{MergeJoin.innerNext()}} does a {{status.prepare()}}, which checks both sides
of the join. If either returned {{OK_NEW_SCHEMA}}, then the merge proceeds to
code generation.
What may have happened is that one side of the join sorted and returned its
first batch (which would carry an {{OK_NEW_SCHEMA}} code). The other side, also
a sort, started its sort and ran out of memory. When {{next()}} is called
again on clean-up, the {{OK_NEW_SCHEMA}} still stands, so the other sort is
called again, resulting in another OOM, and the cycle repeats.
Seems there needs to be a clearer marker of failure: operators need to set an
error state when they encounter the (unchecked) {{UserException}}.
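A minimal sketch of what such an error state might look like. This is not
Drill's code: {{StickyFailureSketch}}, {{Operator}}, {{FailingSort}} and the
{{Outcome}} enum are hypothetical stand-ins, and a plain
{{IllegalStateException}} stands in for {{UserException}}. The idea is only
that once {{innerNext()}} has thrown, a later {{next()}} issued during clean-up
reports {{STOP}} instead of re-running the work that failed.

```java
class StickyFailureSketch {

    /** Simplified stand-in for the iterator outcomes; not Drill's real enum. */
    enum Outcome { OK_NEW_SCHEMA, OK, NONE, STOP }

    /** Hypothetical operator base class that remembers its first failure. */
    static abstract class Operator {
        private RuntimeException failure; // set once, never cleared
        int innerCalls = 0;               // counts real attempts, for illustration

        final Outcome next() {
            if (failure != null) {
                // Already failed: do not repeat the work that ran out of memory.
                return Outcome.STOP;
            }
            try {
                innerCalls++;
                return innerNext();
            } catch (RuntimeException e) {
                failure = e; // record the error state before propagating
                throw e;
            }
        }

        abstract Outcome innerNext();
    }

    /** Hypothetical sort whose spill always fails, simulating the injected OOM. */
    static class FailingSort extends Operator {
        @Override
        Outcome innerNext() {
            throw new IllegalStateException("simulated OOM during spill");
        }
    }

    public static void main(String[] args) {
        Operator sort = new FailingSort();
        try {
            sort.next();
        } catch (IllegalStateException e) {
            System.out.println("first call failed: " + e.getMessage());
        }
        // A second call, as made during clean-up, no longer reaches innerNext().
        System.out.println("cleanup call returned: " + sort.next());
        System.out.println("innerNext() invocations: " + sort.innerCalls);
    }
}
```

With a marker like this, the clean-up path could call {{next()}} safely and the
sort would not be retried. Whether the state belongs in each operator or in a
shared base class is left open here.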
> IteratorValidator does not handle RecordIterator cleanup call to next( )
> ------------------------------------------------------------------------
>
> Key: DRILL-5083
> URL: https://issues.apache.org/jira/browse/DRILL-5083
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Priority: Minor
>
> This one is very confusing...
> In a test with a MergeJoin and external sort, operators are stacked something
> like this:
> {code}
> Screen
> - MergeJoin
> - - External Sort
> ...
> {code}
> Using the injector to force an OOM in spill, the external sort threw a
> UserException up the stack. This was handled by:
> {code}
> IteratorValidatorBatchIterator.next( )
> RecordIterator.clearInflightBatches( )
> RecordIterator.close( )
> MergeJoinBatch.close( )
> {code}
> Which does the following:
> {code}
> // Check whether next() should even have been called in current state.
> if (null != exceptionState) {
> throw new IllegalStateException(
> {code}
> But the exceptionState is set, so we end up throwing an
> IllegalStateException during cleanup.
> Seems the code should agree: if {{next( )}} will be called during cleanup,
> then {{next( )}} should gracefully handle that case.