Hi Hakim,
RecordIterator will not hold all batches in memory. It holds only the batches
read since the last mark() operation and purges older batches as the join
moves along.

The worst case is when there are many repeating values on the right side,
all of which the iterator has to hold in memory.
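
To illustrate the retention behavior described above, here is a minimal
sketch in Java. It is not Drill's actual RecordIterator: the class
MarkedBatchBuffer, its methods, and the simplification that each batch
carries a single join key are all assumptions made for illustration. It only
shows why distinct keys keep the buffer small while a long run of one key
makes it grow.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, NOT Drill's RecordIterator.
// Each "batch" is modeled as a single int join key (an assumption).
final class MarkedBatchBuffer {
    // Batches retained since the last mark().
    private final Deque<Integer> held = new ArrayDeque<>();
    private int peak = 0;

    // Read the next incoming batch and retain it.
    void next(int batchKey) {
        held.addLast(batchKey);
        peak = Math.max(peak, held.size());
    }

    // Mark the most recent batch; everything read before it is purged.
    void mark() {
        while (held.size() > 1) {
            held.removeFirst();
        }
    }

    int heldBatches() { return held.size(); }

    int peakBatches() { return peak; }

    // Feed right-side batches, marking whenever a new key run starts.
    // Returns the largest number of batches held at any one time.
    static int peakFor(int[] batchKeys) {
        MarkedBatchBuffer buf = new MarkedBatchBuffer();
        boolean first = true;
        int prev = 0;
        for (int key : batchKeys) {
            buf.next(key);
            if (first || key != prev) {
                buf.mark(); // new key: earlier batches are no longer needed
            }
            first = false;
            prev = key;
        }
        return buf.peakBatches();
    }
}
```

In this model, peakFor(new int[] {1, 2, 3, 4}) never holds more than two
batches, while peakFor(new int[] {7, 7, 7, 7}) ends up holding all four,
which corresponds to the worst case mentioned above.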

~ Amit.

On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <[email protected]> wrote:

> Amit,
>
> I am looking at DRILL-4190 where one of the sort operators is hitting its
> allocator limit when it's sending data downstream. This generally happens
> when a downstream operator is holding those batches in memory (e.g. Window
> Operator).
>
> The same query is running fine on 1.2.0 which seems to suggest that the
> recent changes to MergeJoinBatch "may" be causing the issue.
>
> It looks like RecordIterator is holding all incoming batches in a
> TreeRangeMap and, if I'm not mistaken, it doesn't release anything until it's
> closed. Is this correct?
>
> I am not familiar with how merge join used to work before RecordIterator.
> Was it also the case that we held all incoming batches in memory?
>
> Thanks
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
