I could not find this mentioned anywhere in the docs, but it has come up a few times on the list. While we made a number of efforts to move our interactions with the Parquet library to off-heap memory (which we use everywhere else in the engine during processing), the version of the writer we are using still buffers a non-trivial amount of data in heap memory when writing parquet files. Try raising your JVM heap memory in drill-env.sh on startup and see if that prevents the out-of-memory issue.
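
As a rough sketch of what that change looks like (assuming a Drill 1.x conf/drill-env.sh where the heap and direct memory limits are set via DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY; the sizes below are examples only, adjust them to your machine):

    # conf/drill-env.sh -- example values, size these for your hardware
    # DRILL_HEAP sets -Xms/-Xmx for the JVM heap, which is where the
    # Parquet writer's on-heap buffers live; DRILL_MAX_DIRECT_MEMORY
    # caps the off-heap memory used for normal query processing.
    export DRILL_HEAP="8G"
    export DRILL_MAX_DIRECT_MEMORY="10G"

Then restart the drillbit (e.g. bin/drillbit.sh restart) so the new JVM options take effect.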
Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer

On Fri, May 13, 2016 at 9:07 AM, Stefan Sedich <[email protected]> wrote:

> Just trying to do a CTAS on a postgres table, it is not huge and only has
> 16 odd million rows, I end up with an out of memory after a while.
>
> Unable to handle out of memory condition in FragmentExecutor.
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Is there a way to avoid this without needing to do the CTAS on a subset of
> my table?
