Stefan,

Can you share the query profile for the query that seems to be running
forever? You won't find it on disk, but you can append .json to the
profile's web URL and save the file.
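
If it is easier to script than to do from the browser, here is a minimal
sketch in Python. The helper names are made up for illustration, and the
host, port and query id in the usage comment are placeholders that assume a
local Drillbit with the web UI on its default port; adjust them to whatever
your profile page URL actually is.

```python
import json
import urllib.request


def profile_json_url(profile_url: str) -> str:
    """Return the profile page URL with .json appended (hypothetical helper)."""
    base = profile_url.rstrip("/")
    # Leave the URL alone if it already points at the JSON form.
    return base if base.endswith(".json") else base + ".json"


def save_profile(profile_url: str, out_path: str) -> None:
    """Download the JSON profile and write it to out_path (needs a running Drillbit)."""
    with urllib.request.urlopen(profile_json_url(profile_url)) as resp:
        profile = json.load(resp)
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(profile, fh, indent=2)


# Usage (placeholder URL, not run here):
# save_profile("http://localhost:8047/profiles/<query-id>", "profile.json")
```

The saved file can then be attached to a reply on the list.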

Thanks

On Fri, May 13, 2016 at 9:55 AM, Stefan Sedich <[email protected]>
wrote:

> Zelaine,
>
> It does, I forgot about those. I will run a test where I filter them out
> and see how I go. In my test with a 12GB heap size it seemed to just sit
> there forever and not finish.
>
>
> Thanks
>
> On Fri, May 13, 2016 at 9:50 AM Zelaine Fong <[email protected]> wrote:
>
> > Stefan,
> >
> > Does your source data contain varchar columns?  We've seen instances
> > where Drill isn't as efficient as it can be when Parquet is dealing with
> > variable-length columns.
> >
> > -- Zelaine
> >
> > On Fri, May 13, 2016 at 9:26 AM, Stefan Sedich <[email protected]>
> > wrote:
> >
> > > Thanks for getting back to me so fast!
> > >
> > > I was just playing with that now. I went up to 8GB and still ran into
> > > it. I am trying to go higher to see if I can find the sweet spot, but I
> > > only have 16GB total RAM on this laptop :)
> > >
> > > Is this an expected amount of memory for a table that is not overly huge
> > > (16 million rows, 6 columns of integers)? Even now, a 12GB heap seems to
> > > have filled up again.
> > >
> > >
> > >
> > > Thanks
> > >
> > > On Fri, May 13, 2016 at 9:20 AM Jason Altekruse <[email protected]>
> > > wrote:
> > >
> > > > I could not find anywhere this is mentioned in the docs, but it has
> > > > come up a few times on the list. While we made a number of efforts to
> > > > move our interactions with the Parquet library to off-heap memory
> > > > (which we use everywhere else in the engine during processing), the
> > > > version of the writer we are using still buffers a non-trivial amount
> > > > of data in heap memory when writing Parquet files. Try raising your JVM
> > > > heap memory in drill-env.sh on startup and see if that prevents the
> > > > out-of-memory issue.
> > > >
> > > > Jason Altekruse
> > > > Software Engineer at Dremio
> > > > Apache Drill Committer
> > > >
> > > > On Fri, May 13, 2016 at 9:07 AM, Stefan Sedich <[email protected]>
> > > > wrote:
> > > >
> > > > > Just trying to do a CTAS on a Postgres table. It is not huge and only
> > > > > has 16-odd million rows, but I end up with an out-of-memory error after
> > > > > a while.
> > > > >
> > > > > Unable to handle out of memory condition in FragmentExecutor.
> > > > >
> > > > > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > > > >
> > > > >
> > > > > Is there a way to avoid this without needing to do the CTAS on a subset
> > > > > of my table?
> > > > >
> > > >
> > >
> >
>



-- 

Abdelhakim Deneche

Software Engineer


