Hi,

I am evaluating Apache Drill and have run into the following issue with CTAS on a large table from an RDBMS.

I am using Apache Drill 1.12.0 on an Ubuntu 16.04.3 VM (8 cores, 8 GB of RAM, fully patched) with Oracle Java 1.8.0_161-b12, and the PostgreSQL 9.4.1212 JDBC driver connecting to Greenplum 4.3.
I enabled the rdbms (JDBC) storage plugin to connect to Greenplum.
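
For reference, the plugin (registered as "bdl") is configured roughly like this; the host, port, database name, and credentials below are placeholders rather than the real values:

{
  "type": "jdbc",
  "driver": "org.postgresql.Driver",
  "url": "jdbc:postgresql://greenplum-host:5432/analytics_db",
  "username": "drill_user",
  "password": "********",
  "enabled": true
}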


I start Drill using drill-embedded, but it also fails when launched via sqlline. I have tried a few older versions of the JDBC driver and get the same error message.
The source view contains 25M rows, and the same statement works with smaller tables (a smaller variant that succeeds is shown after the failing statement below). Can Apache Drill chunk the data while processing?

The default storage format for the environment is set to Parquet.
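
Concretely, the output format option is Parquet, i.e. something along the lines of:

ALTER SESSION SET `store.format` = 'parquet';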

I ran the following SQL (example):

CREATE TABLE dfs.data.table1 (col1, col2, col3, col4) PARTITION BY (col4) AS
SELECT col1, col2, col3, col4
FROM bdl.schema.view_in_greenplum
ORDER BY col4, col3;
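
By contrast, the same kind of statement against a much smaller view in that schema finishes without any problem, roughly (the smaller view name here is just illustrative):

CREATE TABLE dfs.data.table1_small PARTITION BY (col4) AS
SELECT col1, col2, col3, col4
FROM bdl.schema.small_view_in_greenplum
ORDER BY col4, col3;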

The large CTAS above runs at 100% CPU until it fails with the following error:

2018-02-23 10:20:00,681 [256fd5fc-efb8-3504-d08a-0fdcb662f9d6:frag:0:0] ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, exiting. Information message: Unable to handle out of memory condition in FragmentExecutor.
java.lang.OutOfMemoryError: Java heap space
        at java.lang.String.toCharArray(String.java:2899) ~[na:1.8.0_161]
        at java.util.zip.ZipCoder.getBytes(ZipCoder.java:78) ~[na:1.8.0_161]
        at java.util.zip.ZipFile.getEntry(ZipFile.java:316) ~[na:1.8.0_161]
        at java.util.jar.JarFile.getEntry(JarFile.java:240) ~[na:1.8.0_161]
        at java.util.jar.JarFile.getJarEntry(JarFile.java:223) ~[na:1.8.0_161]
        at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1042) ~[na:1.8.0_161]
        at sun.misc.URLClassPath.getResource(URLClassPath.java:239) ~[na:1.8.0_161]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:365) ~[na:1.8.0_161]
        at java.net.URLClassLoader$1.run(URLClassLoader.java:362) ~[na:1.8.0_161]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_161]
        at java.net.URLClassLoader.findClass(URLClassLoader.java:361) ~[na:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_161]
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338) ~[na:1.8.0_161]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_161]
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2122) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:288) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:430) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:356) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:303) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:289) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:266) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:233) ~[postgresql-9.4.1212.jar:9.4.1212]
        at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) ~[commons-dbcp-1.4.jar:1.4]
        at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208) ~[commons-dbcp-1.4.jar:1.4]
        at org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup(JdbcRecordReader.java:177) ~[drill-jdbc-storage-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:242) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:166) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-1.12.0.jar:1.12.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) ~[drill-java-exec-1.12.0.jar:1.12.0]


This also fails when running on Windows Server 2012 R2 with Oracle JDK 1.8.0_152 and the same PostgreSQL driver.

Thank you for any assistance you can provide.

Edgardo Robles
Dell EMC | Big Data Operations
edgardo.robl...@emc.com

