[ https://issues.apache.org/jira/browse/DRILL-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078790#comment-16078790 ]
Rahul Challapalli commented on DRILL-1162:
------------------------------------------

Eventually the drillbit crashed. There were many of these messages in the logs:
{code}
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: qa-node183.qa.lab/10.10.100.183:31012
    at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:47) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:225) [drill-rpc-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.rpc.BasicClient$ConnectionMultiListener$ConnectionHandler.operationComplete(BasicClient.java:212) [drill-rpc-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:268) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:284) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
Caused by: java.net.ConnectException: Connection refused: qa-node183.qa.lab/10.10.100.183:31012
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[na:1.8.0_92]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[na:1.8.0_92]
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:281) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    ... 6 common frames omitted
{code}

Just before the drillbit crashed, the log contained:
{code}
java.lang.InterruptedException: null
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) ~[na:1.8.0_92]
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) ~[na:1.8.0_92]
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) ~[na:1.8.0_92]
    at org.apache.drill.exec.ops.SendingAccountor.waitForSendComplete(SendingAccountor.java:48) ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.ops.FragmentContext.waitForSendComplete(FragmentContext.java:486) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.physical.impl.BaseRootExec.close(BaseRootExec.java:134) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:313) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
{code}

And drillbit.out contained the following exception:
{code}
java.lang.OutOfMemoryError: Java heap space
Catastrophic failure occurred. Exiting. Information follows: Unable to handle out of memory condition in FragmentExecutor.
java.lang.OutOfMemoryError: Java heap space
Jun 27, 2017 10:34:58 AM WARNING: org.apache.parquet.CorruptStatistics: Ignoring statistics because created_by could not be parsed (see PARQUET-251): parquet-mr
org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr using format: (.+) version ((.*) )?\(build ?(.*)\)
    at org.apache.parquet.VersionParser.parse(VersionParser.java:112)
    at org.apache.parquet.CorruptStatistics.shouldIgnoreStatistics(CorruptStatistics.java:66)
    at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetStatistics(ParquetMetadataConverter.java:264)
    at org.apache.parquet.format.converter.ParquetMetadataConverter.fromParquetMetadata(ParquetMetadataConverter.java:568)
    at org.apache.parquet.format.converter.ParquetMetadataConverter.readParquetMetadata(ParquetMetadataConverter.java:545)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:455)
    at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:412)
    at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:387)
    at org.apache.drill.exec.store.parquet.Metadata.access$100(Metadata.java:81)
    at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:326)
    at org.apache.drill.exec.store.parquet.Metadata$MetadataGatherer.runInner(Metadata.java:314)
    at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:56)
    at org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:122)
    at org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:288)
    at org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:198)
    at org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:184)
    at org.apache.drill.exec.store.parquet.Metadata.createMetaFilesRecursively(Metadata.java:184)
    at org.apache.drill.exec.store.parquet.Metadata.createMeta(Metadata.java:103)
    at org.apache.drill.exec.planner.sql.handlers.RefreshMetadataHandler.getPlan(RefreshMetadataHandler.java:116)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:131)
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:79)
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:1050)
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:280)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

> 25 way join ended up in 0 results which is not expected
> -------------------------------------------------------
>
> Key: DRILL-1162
> URL: https://issues.apache.org/jira/browse/DRILL-1162
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Chris Westin
> Priority: Critical
> Fix For: Future
>
> Attachments: error.log, oom_error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The query below returns 0 results:
> select count(*) from `lineitem1.parquet` a
> inner join `part.parquet` j on a.l_partkey = j.p_partkey
> inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey
> inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey
> inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey = m.ps_suppkey
> inner join `customer.parquet` n on k.o_custkey = n.c_custkey
> inner join `lineitem2.parquet` b on a.l_orderkey = b.l_orderkey
> inner join `lineitem2.parquet` c on a.l_partkey = c.l_partkey
> inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey
> inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice
> inner join `lineitem2.parquet` f on a.l_comment = f.l_comment
> inner join `lineitem2.parquet` g on a.l_shipdate = g.l_shipdate
> inner join `lineitem2.parquet` h on a.l_commitdate = h.l_commitdate
> inner join `lineitem2.parquet` i on a.l_receiptdate = i.l_receiptdate
> inner join `lineitem2.parquet` o on a.l_receiptdate = o.l_receiptdate
> inner join `lineitem2.parquet` p on a.l_receiptdate = p.l_receiptdate
> inner join `lineitem2.parquet` q on a.l_receiptdate = q.l_receiptdate
> inner join `lineitem2.parquet` r on a.l_receiptdate = r.l_receiptdate
> inner join `lineitem2.parquet` s on a.l_receiptdate = s.l_receiptdate
> inner join `lineitem2.parquet` t on a.l_receiptdate = t.l_receiptdate
> inner join `lineitem2.parquet` u on a.l_receiptdate = u.l_receiptdate
> inner join `lineitem2.parquet` v on a.l_receiptdate = v.l_receiptdate
> inner join `lineitem2.parquet` w on a.l_receiptdate = w.l_receiptdate
> inner join `lineitem2.parquet` x on a.l_receiptdate = x.l_receiptdate;
> However, when we remove the last 'inner join' and run the query, it returns
> '716372534'. Since the last inner join is similar to the ones before it, it
> should match some records and return the data appropriately.
> The logs indicated that it actually returned 0 results. Attached the log file.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)