Zelaine,

It does, I had forgotten about those. I will run a test where I filter those columns out and see how it goes; in my test with a 12GB heap it seemed to just sit there forever and never finish.
Thanks

On Fri, May 13, 2016 at 9:50 AM Zelaine Fong <[email protected]> wrote:

> Stefan,
>
> Does your source data contain varchar columns? We've seen instances where
> Drill isn't as efficient as it can be when Parquet is dealing with
> variable-length columns.
>
> -- Zelaine
>
> On Fri, May 13, 2016 at 9:26 AM, Stefan Sedich <[email protected]>
> wrote:
>
> > Thanks for getting back to me so fast!
> >
> > I was just playing with that now, went up to 8GB and still ran into it.
> > I am trying to go higher to see if I can find the sweet spot, but I only
> > have 16GB total RAM on this laptop :)
> >
> > Is this an expected amount of memory for a table that is not overly huge
> > (16 million rows, 6 columns of integers)? Even now a 12GB heap seems to
> > have filled up again.
> >
> > Thanks
> >
> > On Fri, May 13, 2016 at 9:20 AM Jason Altekruse <[email protected]>
> > wrote:
> >
> > > I could not find anywhere this is mentioned in the docs, but it has
> > > come up a few times on the list. While we made a number of efforts to
> > > move our interactions with the Parquet library to off-heap memory
> > > (which we use everywhere else in the engine during processing), the
> > > version of the writer we are using still buffers a non-trivial amount
> > > of data into heap memory when writing Parquet files. Try raising your
> > > JVM heap memory in drill-env.sh on startup and see if that prevents
> > > the out of memory issue.
> > >
> > > Jason Altekruse
> > > Software Engineer at Dremio
> > > Apache Drill Committer
> > >
> > > On Fri, May 13, 2016 at 9:07 AM, Stefan Sedich <[email protected]>
> > > wrote:
> > >
> > > > Just trying to do a CTAS on a postgres table. It is not huge, only
> > > > 16-odd million rows, but I end up with an out of memory error after
> > > > a while:
> > > >
> > > > Unable to handle out of memory condition in FragmentExecutor.
> > > >
> > > > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > > >
> > > > Is there a way to avoid this without needing to do the CTAS on a
> > > > subset of my table?
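
Editor's note: for readers landing on this thread later, Jason's suggestion is to raise the Drillbit's JVM heap in drill-env.sh. A minimal sketch of what that edit might look like is below; the variable names (DRILL_HEAP, DRILL_MAX_DIRECT_MEMORY) are assumed from the Drill 1.x-era conf/drill-env.sh and the sizes are illustrative only, so check your own copy of the file.

```sh
# conf/drill-env.sh -- hedged sketch, variable names assumed from Drill 1.x defaults.
# The Parquet writer still buffers a non-trivial amount of data on-heap, so a
# CTAS that writes Parquet may need more heap than the shipped default.
DRILL_HEAP="8G"

# Direct (off-heap) memory used by most other processing in the engine.
DRILL_MAX_DIRECT_MEMORY="8G"
```

The Drillbit (or embedded Drill session) has to be restarted for the new settings to take effect.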

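The other experiment discussed above is filtering the varchar columns out of the CTAS, or writing only a subset of rows. A rough sketch of both, with made-up table, column, and storage-plugin names (the postgres plugin name and the integer column names here are purely hypothetical):

```sql
-- Hypothetical names; the real table has ~16 million rows and 6 integer columns.
-- Select only the fixed-width integer columns so the Parquet writer never
-- touches the variable-length varchar data.
CREATE TABLE dfs.tmp.`my_table_parquet` AS
SELECT int_col1, int_col2, int_col3, int_col4, int_col5, int_col6
FROM postgres.public.my_table;

-- Or write a row subset first, to confirm the job completes within the current heap.
CREATE TABLE dfs.tmp.`my_table_sample` AS
SELECT *
FROM postgres.public.my_table
LIMIT 1000000;
```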