[ 
https://issues.apache.org/jira/browse/DRILL-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703972#comment-15703972
 ] 

Paul Rogers commented on DRILL-5083:
------------------------------------

Looking at {{AbstractRecordBatch.next( )}}: If the operator ended up in the 
{{STOP}} state, then {{next()}} will call {{innerNext()}}.

{{MergeJoin.innerNext()}} does a {{status.prepare()}}, which checks both sides 
of the join. If either returned {{OK_NEW_SCHEMA}}, then the merge proceeds to 
code generation.

What may have happened is that one side of the join sorted and returned its 
first batch (which would have an {{OK_NEW_SCHEMA}} code.) The other side, also 
a sort, started to do its sort and ran out of memory. When {{next()}} is called 
again on clean-up, the {{OK_NEW_SCHEMA}} still stands, and the other sort is 
called again, resulting in another OOM and the cycle repeats.

Seems there needs to be a clearer marker of failure: operators need to set an 
error state when they encounter the (unchecked) {{UserException}}.
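
A rough sketch of the idea, not Drill's actual code: the wrapper below remembers the first unchecked exception thrown by its inner call and refuses to repeat the work when {{next()}} is called again during clean-up. The names {{FailAwareOperator}}, {{Outcome}}, {{isFailed()}} and {{innerCalls()}} are hypothetical, the {{Supplier}} merely stands in for {{innerNext()}}, and returning {{STOP}} once failed is an assumption about what clean-up would want to see.

```java
import java.util.function.Supplier;

/** Hypothetical outcome codes, loosely named after the ones discussed above. */
enum Outcome { OK_NEW_SCHEMA, OK, NONE, STOP }

/** Hypothetical operator wrapper with a sticky error state. */
class FailAwareOperator {
    private final Supplier<Outcome> inner; // stands in for innerNext()
    private RuntimeException failure;      // null while the operator is healthy
    private int innerCalls;                // how often the inner work actually ran

    FailAwareOperator(Supplier<Outcome> inner) {
        this.inner = inner;
    }

    Outcome next() {
        if (failure != null) {
            // Already failed: a clean-up call must not repeat the failing work.
            return Outcome.STOP;
        }
        try {
            innerCalls++;
            return inner.get();
        } catch (RuntimeException e) {
            failure = e; // record the failure before propagating it
            throw e;
        }
    }

    boolean isFailed() {
        return failure != null;
    }

    int innerCalls() {
        return innerCalls;
    }
}
```

With a marker like this, a stale {{OK_NEW_SCHEMA}} from the other side of the join would no longer cause the failed sort to be driven a second time; the caller can check the failed flag instead of inferring state from the last outcome code.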

> IteratorValidator does not handle RecordIterator cleanup call to next( )
> ------------------------------------------------------------------------
>
>                 Key: DRILL-5083
>                 URL: https://issues.apache.org/jira/browse/DRILL-5083
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> This one is very confusing...
> In a test with a MergeJoin and external sort, operators are stacked something 
> like this:
> {code}
> Screen
> - MergeJoin
> - - External Sort
> ...
> {code}
> Using the injector to force an OOM in spill, the external sort threw a 
> UserException up the stack. This was handled by:
> {code}
> IteratorValidatorBatchIterator.next( )
> RecordIterator.clearInflightBatches( )
> RecordIterator.close( )
> MergeJoinBatch.close( )
> {code}
> Which does the following:
> {code}
>       // Check whether next() should even have been called in current state.
>       if (null != exceptionState) {
>         throw new IllegalStateException(
> {code}
> But, the exceptionState is set, so we end up throwing an 
> IllegalStateException during cleanup.
> Seems the code should agree: if {{next( )}} will be called during cleanup, 
> then {{next( )}} should gracefully handle that case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
