Ok, thanks. I will add a comment to the JIRA and assign it to you ;)

On Tue, Dec 15, 2015 at 12:02 PM, Amit Hadke <[email protected]> wrote:

> Yup that may be it. I'll add an option to not hold on to left side iterator
> batches.
>
> On Tue, Dec 15, 2015 at 11:56 AM, Abdel Hakim Deneche <[email protected]> wrote:
>
> > RecordIterator.mark() is only called for the right side of the merge join.
> > How about the left side, do we ever release the batches on the left side?
> > In 4190 the sort that runs out of memory is on the left side of the merge.
> >
> > On Tue, Dec 15, 2015 at 11:51 AM, Abdel Hakim Deneche <[email protected]> wrote:
> >
> > > I see, it's in RecordIterator.mark()
> > >
> > > On Tue, Dec 15, 2015 at 11:50 AM, Abdel Hakim Deneche <
> > > [email protected]> wrote:
> > >
> > >> Amit,
> > >>
> > >> Thanks for the prompt answer. Can you point me to where in the code the
> > >> purge is done?
> > >>
> > >>
> > >>
> > >> On Tue, Dec 15, 2015 at 11:42 AM, Amit Hadke <[email protected]>
> > >> wrote:
> > >>
> > >>> Hi Hakim,
> > >>> RecordIterator will not hold all batches in memory. It holds batches from
> > >>> the last mark() operation.
> > >>> It will purge batches as the join moves along.
> > >>>
> > >>> The worst case is when there are lots of repeating values on the right side,
> > >>> which the iterator will hold in memory.
> > >>>
> > >>> ~ Amit.
> > >>>
> > >>> On Tue, Dec 15, 2015 at 11:23 AM, Abdel Hakim Deneche <[email protected]> wrote:
> > >>>
> > >>> > Amit,
> > >>> >
> > >>> > I am looking at DRILL-4190, where one of the sort operators is hitting its
> > >>> > allocator limit when it's sending data downstream. This generally happens
> > >>> > when a downstream operator is holding those batches in memory (e.g. the
> > >>> > Window Operator).
> > >>> >
> > >>> > The same query runs fine on 1.2.0, which seems to suggest that the
> > >>> > recent changes to MergeJoinBatch "may" be causing the issue.
> > >>> >
> > >>> > It looks like RecordIterator is holding all incoming batches in a
> > >>> > TreeRangeMap and, if I'm not mistaken, it doesn't release anything until
> > >>> > it's closed. Is this correct?
> > >>> >
> > >>> > I am not familiar with how merge join used to work before RecordIterator.
> > >>> > Was it also the case that we held all incoming batches in memory?
> > >>> >
> > >>> > Thanks
> > >>> >
> > >>> > --
> > >>> >
> > >>> > Abdelhakim Deneche
> > >>> >
> > >>> > Software Engineer
> > >>> >
> > >>> >   <http://www.mapr.com/>
> > >>> >
> > >>> >
> > >>> > Now Available - Free Hadoop On-Demand Training
> > >>> > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
> > >>> >
> > >>>
>
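
For reference, the mark()/purge behaviour Amit describes above can be sketched roughly as follows. This is only a toy model under stated assumptions, not Drill's RecordIterator: the class MarkingBatchIterator and its methods next(), current(), mark(), reset() and heldCount() are hypothetical names, and a plain list stands in for the TreeRangeMap mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of a batch iterator with mark()/reset(); hypothetical, not Drill's RecordIterator. */
class MarkingBatchIterator<B> {
    private final Iterator<B> upstream;
    private final List<B> held = new ArrayList<>(); // batches kept since the last mark()
    private int pos = -1;                            // index of the current batch within 'held'

    MarkingBatchIterator(Iterator<B> upstream) {
        this.upstream = upstream;
    }

    /** Moves to the next batch: replays a held one after reset(), else pulls from upstream. */
    boolean next() {
        if (pos + 1 < held.size()) {
            pos++;
            return true;
        }
        if (!upstream.hasNext()) {
            return false;
        }
        held.add(upstream.next());
        pos++;
        return true;
    }

    /** Returns the batch the iterator is currently positioned on. */
    B current() {
        return held.get(pos);
    }

    /** Marks the current batch and purges every batch read before it. */
    void mark() {
        if (pos > 0) {
            held.subList(0, pos).clear();
            pos = 0;
        }
    }

    /** Rewinds to the marked batch, which is the oldest batch still held. */
    void reset() {
        if (!held.isEmpty()) {
            pos = 0;
        }
    }

    /** Number of batches currently kept in memory. */
    int heldCount() {
        return held.size();
    }
}
```

In this model, a side that calls mark() as the join advances only keeps the batches read since that mark, so a long run of repeating keys is the worst case. A side that never calls mark() keeps every batch it has read, which is the kind of growth suspected on the left side in this thread; whether Drill's implementation behaves the same way is not confirmed here.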



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


