Glad to see you figured out the cause of the problem! I thought the drillbit that ran out of spill disk space would capture the error stack trace in its drillbit.log. If the query is executed on multiple nodes, you probably need to concatenate the drillbit.log files from all the nodes and search through the combined logs to find the cause.
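For reference, here is a rough sketch of that collect-and-search step (untested; it assumes passwordless SSH to the nodes, and the node list and log path below are placeholders -- on a MapR package install the log usually sits under the Drill install's logs directory, but your EMR layout may differ):

    # Hypothetical node list and log path -- adjust for your cluster.
    NODES="ip-10-154-247-159.ec2.internal ip-10-154-247-160.ec2.internal"
    LOG=/opt/mapr/drill/drill-1.6.0/logs/drillbit.log

    for node in $NODES; do
      # Pull each node's drillbit.log locally, tagged with its hostname.
      scp "$node:$LOG" "drillbit-$node.log"
    done

    # Search the combined logs for the Error Id the failed query reported.
    grep -n "650c9f25-760e-498d-9f2c-3e37f6ac013d" drillbit-*.log

Grepping for the Error Id is usually the quickest way to see which drillbit actually hit the failure and to find the full stack trace around it.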
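On the spill directory itself (the openkb link quoted below): the gist is to give the external sort a spill location with enough space, since it defaults (if I recall correctly) to /tmp/drill/spill on the local disk, which fills up fast on a stock EMR root volume. A rough, untested sketch along those lines -- the volume name and path here are my own examples, not taken from the article:

    # Create a node-local MapR volume to hold spill files (example name/path).
    maprcli volume create -name mapr.$(hostname -f).local.spill \
      -path /var/mapr/local/$(hostname -f)/spill \
      -replication 1 -localvolumehost $(hostname -f)

    # Then, in conf/drill-override.conf on each node, point the external
    # sort at that location and restart the drillbits:
    #   drill.exec: {
    #     sort.external.spill.fs: "maprfs:///",
    #     sort.external.spill.directories: [ "/var/mapr/local/<hostname>/spill" ]
    #   }

Please double-check those option names against the conf/drill-override-example.conf shipped with your Drill version before relying on them.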
On Thu, Aug 18, 2016 at 10:52 AM, Stefan Sedich <[email protected]> wrote:
> Ok got it! Looks like it was the spill directory. I followed
> http://www.openkb.info/2016/04/how-to-use-mapr-local-volume-as-spill.html
> to create it and I am good to go!
>
> Is there any reason why the EMR template does not do this, considering the
> root volume on these EMR instances seems to be very small?
>
> On Thu, Aug 18, 2016 at 10:22 AM Stefan Sedich <[email protected]> wrote:
>
>> Spun up a fresh cluster and I can see a useful error now. Looks like it is
>> due to no space; I assume the root volume is almost full (this is a fresh
>> EMR install).
>>
>> When running in EMR, is there any Drill config I should be changing to
>> avoid this?
>>
>> [Error Id: 650c9f25-760e-498d-9f2c-3e37f6ac013d on ip-10-154-247-159.ec2.internal:31010]
>> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: RuntimeException: Error closing operators
>>
>> Fragment 0:0
>>
>> [Error Id: 650c9f25-760e-498d-9f2c-3e37f6ac013d on ip-10-154-247-159.ec2.internal:31010]
>>         at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) [drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) [drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) [drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.6.0.jar:1.6.0]
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
>> Caused by: java.lang.RuntimeException: Error closing operators
>>         at org.apache.drill.exec.physical.impl.BaseRootExec$1.get(BaseRootExec.java:144) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.BaseRootExec$1.get(BaseRootExec.java:141) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.common.DeferredException.addException(DeferredException.java:81) ~[drill-common-1.6.0.jar:1.6.0]
>>         at org.apache.drill.common.DeferredException.suppressingClose(DeferredException.java:161) ~[drill-common-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:149) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.close(ScreenCreator.java:141) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:336) [drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:180) [drill-java-exec-1.6.0.jar:1.6.0]
>>         ... 5 common frames omitted
>> 2016-08-18 17:12:44,146 [drill-executor-1] ERROR o.a.d.exec.server.BootStrapContext - org.apache.drill.exec.work.WorkManager$WorkerBee$1.run() leaked an exception.
>> org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
>>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:249) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[na:1.7.0_71]
>>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126) ~[na:1.7.0_71]
>>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at java.io.DataOutputStream.write(DataOutputStream.java:107) ~[na:1.7.0_71]
>>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.writeChunk(ChecksumFileSystem.java:419) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:206) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:135) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:110) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at java.io.DataOutputStream.write(DataOutputStream.java:107) ~[na:1.7.0_71]
>>         at java.io.FilterOutputStream.write(FilterOutputStream.java:97) ~[na:1.7.0_71]
>>         at io.netty.buffer.PooledUnsafeDirectByteBuf.getBytes(PooledUnsafeDirectByteBuf.java:184) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>>         at io.netty.buffer.WrappedByteBuf.getBytes(WrappedByteBuf.java:301) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
>>         at io.netty.buffer.UnsafeDirectLittleEndian.getBytes(UnsafeDirectLittleEndian.java:30) ~[drill-memory-base-1.6.0.jar:4.0.27.Final]
>>         at io.netty.buffer.DrillBuf.getBytes(DrillBuf.java:709) ~[drill-memory-base-1.6.0.jar:4.0.27.Final]
>>         at org.apache.drill.exec.cache.VectorAccessibleSerializable.writeToStream(VectorAccessibleSerializable.java:172) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.xsort.BatchGroup.addBatch(BatchGroup.java:96) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:562) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:394) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:91) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_71]
>>         at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_71]
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) ~[drill-java-exec-1.6.0.jar:1.6.0]
>>         at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) ~[drill-common-1.6.0.jar:1.6.0]
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_71]
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
>>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
>> Caused by: java.io.IOException: No space left on device
>>         at java.io.FileOutputStream.writeBytes(Native Method) ~[na:1.7.0_71]
>>         at java.io.FileOutputStream.write(FileOutputStream.java:345) ~[na:1.7.0_71]
>>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:247) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
>>         ... 59 common frames omitted
>>
>> Thanks
>>
>> On Wed, Aug 17, 2016 at 9:01 PM Jinfeng Ni <[email protected]> wrote:
>>
>>> The source code that raised this exception seems to be here [1]. Can you
>>> please check your drillbit.log and see if it has more information?
>>>
>>> [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/BaseRootExec.java#L144
>>>
>>> On Wed, Aug 17, 2016 at 8:37 PM, Stefan Sedich <[email protected]> wrote:
>>> > I have a gzipped JSON sample file (~2 GB) from which I can create a
>>> > parquet table perfectly fine on my laptop.
>>> >
>>> > I have spun up a new EMR cluster running MapR M5 and am using the
>>> > bootstrap script: https://www.mapr.com/blog/bootstrap-apache-drill-amazon-emr
>>> >
>>> > Running the same CTAS, I can see it gets through reading around 1/5th
>>> > of the rows from the JSON and then crashes with the following:
>>> >
>>> > Error: SYSTEM ERROR: RuntimeException: Error closing operators
>>> >
>>> > What is the best way to really diagnose what is going on? I wouldn't
>>> > have thought it would be a memory issue, as these EMR nodes have far
>>> > more memory than my laptop.
>>> >
>>> > Any advice is appreciated.
>>> >
>>> > Thanks
>>
