[ https://issues.apache.org/jira/browse/DRILL-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kathiravelu Pradeeban resolved DRILL-4855.
------------------------------------------
    Resolution: Invalid

Column aliases cannot be referenced in the following clauses:
* WHERE
* GROUP BY
* HAVING

The following query, which filters on the full column path instead of the alias, works:

SELECT camic.provenance.image.case_id caseid
FROM mongo.users.`contacts2` camic
WHERE camic.provenance.image.case_id > 10;

> Querying MongoDB collection with nested data fails
> --------------------------------------------------
>
>                 Key: DRILL-4855
>                 URL: https://issues.apache.org/jira/browse/DRILL-4855
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - MongoDB
>    Affects Versions: 1.6.0, 1.7.0
>         Environment: Centos7 and Ubuntu 16.04
>            Reporter: Kathiravelu Pradeeban
>            Priority: Critical
>
> To reproduce:
> 1. Create a JSON file called small.json containing the line below:
> {"_id":{"$oid":"56a784b76952647b7b51c562"},"provenance":{"image":{"case_id":"TCGA-TS2","subject_id":"TCGA"}}}
> 2. Import small.json into a MongoDB database as below:
> mongoimport --db users --collection contacts --file small.json
> 3. Run a Mongo query on the nested data to confirm everything is fine:
> use users;
> db.contacts.find({ "provenance.image.case_id": "TCGA-TS2"});
> This returns:
> { "_id" : ObjectId("56a784b76952647b7b51c562"), "provenance" : { "image" : { "case_id" : "TCGA-TS2", "subject_id" : "TCGA" } } }
> 4. Run the equivalent Drill query:
> SELECT camic.provenance.image.case_id caseid
> FROM mongo.users.`contacts` camic
> WHERE caseid = 'TCGA-TS2';
> The above query fails with the below error message.
> Error: SYSTEM ERROR: NumberFormatException: TCGA-TS2
> Fragment 0:0
> [Error Id: 142d9f37-fe13-4757-8009-e713d55bc1d8 on llovizna:31010] (state=,code=0)
> "tail -f sqlline.log" indicates the below:
> 2016-08-18 16:33:32,097 [2849e462-ade5-621f-5e4b-59e93c07ff11:foreman] INFO o.a.drill.exec.work.foreman.Foreman - Query text for query id 2849e462-ade5-621f-5e4b-59e93c07ff11: SELECT camic.provenance.image.case_id caseid
> FROM mongo.users.`contacts` camic
> WHERE caseid = 'TCGA-TS2'
> 2016-08-18 16:33:33,369 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO o.a.d.e.s.m.MongoScanBatchCreator - Number of record readers initialized : 1
> 2016-08-18 16:33:33,371 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO o.a.d.e.w.fragment.FragmentExecutor - 2849e462-ade5-621f-5e4b-59e93c07ff11:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
> 2016-08-18 16:33:33,371 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO o.a.d.e.w.f.FragmentStatusReporter - 2849e462-ade5-621f-5e4b-59e93c07ff11:0:0: State to report: RUNNING
> 2016-08-18 16:33:33,371 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO o.a.d.e.s.mongo.MongoRecordReader - Filters Applied : Document{{}}
> 2016-08-18 16:33:33,371 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO o.a.d.e.s.mongo.MongoRecordReader - Fields Selected :Document{{_id=0, caseid=1, provenance=1}}
> 2016-08-18 16:33:33,372 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] WARN o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path `caseid`, returning null instance.
> 2016-08-18 16:33:33,375 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 2849e462-ade5-621f-5e4b-59e93c07ff11:0:0: State change requested RUNNING --> > FAILED > 2016-08-18 16:33:33,375 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 2849e462-ade5-621f-5e4b-59e93c07ff11:0:0: State change requested FAILED --> > FINISHED > 2016-08-18 16:33:33,376 [2849e462-ade5-621f-5e4b-59e93c07ff11:frag:0:0] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NumberFormatException: > TCGA-TS2 > Fragment 0:0 > [Error Id: efb49b38-0515-4b20-9d24-052944f04a73 on llovizna:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > NumberFormatException: TCGA-TS2 > Fragment 0:0 > [Error Id: efb49b38-0515-4b20-9d24-052944f04a73 on llovizna:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > ~[drill-common-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) > [drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) > [drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) > [drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.6.0.jar:1.6.0] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0] > at java.lang.Thread.run(Thread.java:744) [na:1.8.0] > Caused by: java.lang.NumberFormatException: TCGA-TS2 > at > org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:95) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > 
org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt(StringFunctionHelpers.java:120) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.test.generated.FiltererGen20.doSetup(FilterTemplate2.java:45) > ~[na:na] > at > org.apache.drill.exec.test.generated.FiltererGen20.setup(FilterTemplate2.java:54) > ~[na:na] > at > org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer(FilterRecordBatch.java:197) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema(FilterRecordBatch.java:109) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) > ~[drill-java-exec-1.6.0.jar:1.6.0] > at java.security.AccessController.doPrivileged(Native Method) > ~[na:1.8.0] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > ~[hadoop-common-2.7.1.jar:na] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) > [drill-java-exec-1.6.0.jar:1.6.0] > ... 4 common frames omitted > 2016-08-18 16:33:33,426 [CONTROL-rpc-event-queue] WARN > o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED > state as query is already at FAILED state (which is terminal). > 2016-08-18 16:33:33,426 [CONTROL-rpc-event-queue] WARN > o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. > 2849e462-ade5-621f-5e4b-59e93c07ff11:0:0 does not exist. 
> 2016-08-18 16:33:33,427 [USER-rpc-event-queue] INFO > o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#14] Query failed: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > NumberFormatException: TCGA-TS2 > Fragment 0:0 > [Error Id: efb49b38-0515-4b20-9d24-052944f04a73 on llovizna:31010] > at > org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119) > [drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113) > [drill-java-exec-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31) > [drill-rpc-1.6.0.jar:1.6.0] > at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67) > [drill-rpc-1.6.0.jar:1.6.0] > at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285) > [drill-rpc-1.6.0.jar:1.6.0] > at > org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257) > [drill-rpc-1.6.0.jar:1.6.0] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) > [netty-handler-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) > [netty-codec-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) > [netty-transport-4.0.27.Final.jar:4.0.27.Final] > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) > 
[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at java.lang.Thread.run(Thread.java:744) [na:1.8.0]
> 5. Please note that the query below returns the correct output:
> SELECT camic.provenance.image.case_id caseid
> FROM mongo.users.`contacts` camic;
> +-----------+
> |  caseid   |
> +-----------+
> | TCGA-TS2  |
> +-----------+
> 1 row selected (1,135 seconds)
> So the issue is with the WHERE clause:
> WHERE caseid = 'TCGA-TS2';

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
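
The resolution in this message says that a column alias cannot be referenced in WHERE. As a rough illustration of why, here is a small Python model of the logical clause order (WHERE is applied to the source rows before SELECT creates any alias). It is only a sketch and not Drill's planner: the helper names `get_path` and `run_query` and the shape of the `select` mapping are made up for this example. Note also that the model simply returns no rows for the alias filter, whereas Drill 1.6.0 in the report logged "Unable to find value vector of path `caseid`" and then failed with the NumberFormatException.

```python
# Illustrative model only: this is NOT how Drill is implemented. It mimics
# the logical clause order (FROM -> WHERE -> SELECT) that makes a SELECT
# alias invisible to the WHERE clause.

# The single document from the reproduction steps (without _id).
ROWS = [
    {"provenance": {"image": {"case_id": "TCGA-TS2", "subject_id": "TCGA"}}},
]


def get_path(row, dotted_path):
    """Follow a dotted path such as 'provenance.image.case_id'.

    Returns None when any segment is missing, loosely mirroring an
    unknown column name resolving to a null value.
    """
    value = row
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def run_query(rows, select, where_path, where_value):
    """Apply WHERE to the source rows, then project the SELECT aliases.

    `select` maps alias -> dotted source path (hypothetical helper shape).
    """
    # WHERE: only source paths exist here; no alias has been created yet.
    kept = [r for r in rows if get_path(r, where_path) == where_value]
    # SELECT: aliases come into existence only at this point.
    return [{alias: get_path(r, path) for alias, path in select.items()}
            for r in kept]


select = {"caseid": "provenance.image.case_id"}

# Filtering on the alias: there is no source field named "caseid".
by_alias = run_query(ROWS, select, "caseid", "TCGA-TS2")

# Filtering on the full path: matches the document.
by_path = run_query(ROWS, select, "provenance.image.case_id", "TCGA-TS2")

print(by_alias)  # []
print(by_path)   # [{'caseid': 'TCGA-TS2'}]
```

In Drill terms, the second call corresponds to writing WHERE camic.provenance.image.case_id = 'TCGA-TS2' in the query from step 4, which is the same pattern as the working query quoted in the resolution.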