I am curious whether this is a bug in the JDBC plugin. Can you try changing
the output format to CSV? In that case we don't do any large buffering.
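
For example, a minimal sketch (store.format is Drill's session option for
the CTAS output format; the storage plugin and table names below are just
placeholders):

  ALTER SESSION SET `store.format` = 'csv';
  CREATE TABLE dfs.tmp.`my_table_csv` AS
  SELECT * FROM pg.public.`my_table`;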

Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer

On Fri, May 13, 2016 at 10:35 AM, Stefan Sedich <[email protected]> wrote:

> Seems like it just ran out of memory again rather than hanging. I tried
> appending a limit 100 to the select query and it still runs out of
> memory. I just ran the CTAS against some other smaller tables and it
> works fine.
>
> I will play around with this some more on the weekend; I can only assume
> I am messing something up here. I have created parquet files from large
> tables in the past without any issue, and will report back.
>
>
>
> Thanks
>
> On Fri, May 13, 2016 at 10:05 AM Abdel Hakim Deneche <[email protected]> wrote:
>
> > Stefan,
> >
> > Can you share the query profile for the query that seems to be running
> > forever? You won't find it on disk, but you can append .json to the
> > profile web URL and save the file.
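> >
> > For example (8047 is Drill's default web UI port; the query id below
> > is just a placeholder):
> >
> >   http://localhost:8047/profiles/<query_id>.json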
> >
> > Thanks
> >
> > On Fri, May 13, 2016 at 9:55 AM, Stefan Sedich <[email protected]> wrote:
> >
> > > Zelaine,
> > >
> > > It does, I forgot about those ones. I will do a test where I filter
> > > those out and see how I go; in my test with a 12GB heap size it
> > > seemed to just sit there forever and not finish.
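> > >
> > > Something along these lines, as a rough sketch (the column and table
> > > names are made up; the point is selecting only the integer columns):
> > >
> > >   CREATE TABLE dfs.tmp.`t_ints_only` AS
> > >   SELECT int_col1, int_col2, int_col3
> > >   FROM pg.public.`t`;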
> > >
> > >
> > > Thanks
> > >
> > > On Fri, May 13, 2016 at 9:50 AM Zelaine Fong <[email protected]> wrote:
> > >
> > > > Stefan,
> > > >
> > > > Does your source data contain varchar columns? We've seen instances
> > > > where Drill isn't as efficient as it could be when Parquet is
> > > > dealing with variable-length columns.
> > > >
> > > > -- Zelaine
> > > >
> > > > On Fri, May 13, 2016 at 9:26 AM, Stefan Sedich <[email protected]> wrote:
> > > >
> > > > > Thanks for getting back to me so fast!
> > > > >
> > > > > I was just playing with that now; I went up to 8GB and still ran
> > > > > into it. I am trying to go higher to see if I can find the sweet
> > > > > spot, but I only have 16GB total RAM on this laptop :)
> > > > >
> > > > > Is this an expected amount of memory for a table that is not
> > > > > overly huge (16 million rows, 6 columns of integers)? Even now a
> > > > > 12GB heap seems to have filled up again.
> > > > >
> > > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, May 13, 2016 at 9:20 AM Jason Altekruse <[email protected]> wrote:
> > > > >
> > > > > > I could not find this mentioned anywhere in the docs, but it
> > > > > > has come up a few times on the list. While we made a number of
> > > > > > efforts to move our interactions with the Parquet library to
> > > > > > off-heap memory (which we use everywhere else in the engine
> > > > > > during processing), the version of the writer we are using
> > > > > > still buffers a non-trivial amount of data in heap memory when
> > > > > > writing parquet files. Try raising your JVM heap memory in
> > > > > > drill-env.sh on startup and see if that prevents the out of
> > > > > > memory issue.
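> > > > > >
> > > > > > As a rough sketch, the relevant lines in conf/drill-env.sh look
> > > > > > something like this (the 8G values are only an illustration;
> > > > > > size them for your machine):
> > > > > >
> > > > > >   export DRILL_HEAP="8G"
> > > > > >   export DRILL_MAX_DIRECT_MEMORY="8G"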
> > > > > >
> > > > > > Jason Altekruse
> > > > > > Software Engineer at Dremio
> > > > > > Apache Drill Committer
> > > > > >
> > > > > > On Fri, May 13, 2016 at 9:07 AM, Stefan Sedich <[email protected]> wrote:
> > > > > >
> > > > > > > Just trying to do a CTAS on a postgres table; it is not huge
> > > > > > > and only has 16-odd million rows, but I end up with an out of
> > > > > > > memory error after a while:
> > > > > > >
> > > > > > > Unable to handle out of memory condition in FragmentExecutor.
> > > > > > >
> > > > > > > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > > > > > >
> > > > > > > Is there a way to avoid this without needing to do the CTAS
> > > > > > > on a subset of my table?
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
>
