Kathiravelu Pradeeban created DRILL-4855:
--------------------------------------------

             Summary: Querying MongoDB collection with nested data fails
                 Key: DRILL-4855
                 URL: https://issues.apache.org/jira/browse/DRILL-4855
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - MongoDB
    Affects Versions: 1.6.0, 1.7.0
         Environment: CentOS 7 and Ubuntu 16.04
            Reporter: Kathiravelu Pradeeban
            Priority: Critical


To reproduce:

1. Create a JSON file called small.json containing the following line:

{"_id":{"$oid":"56a784b76952647b7b51c562"},"provenance":{"image":{"case_id":"TCGA-TS2","subject_id":"TCGA"}}}


2. Import small.json into a MongoDB collection as follows:

mongoimport --db users --collection contacts --file small.json


3. Run a MongoDB query on the nested data to confirm that the import worked:

use users;
db.contacts.find({ "provenance.image.case_id": "TCGA-TS2"});

{ "_id" : ObjectId("56a784b76952647b7b51c562"), "provenance" : { "image" : { "case_id" : "TCGA-TS2", "subject_id" : "TCGA" } } }


4. Run the equivalent query in Drill:

SELECT camic.provenance.image.case_id caseid
FROM mongo.users.`contacts` camic
WHERE caseid = 'TCGA-TS2"';

The above query fails with the following error message:

Error: SYSTEM ERROR: NumberFormatException: TCGA-TS2"

Fragment 0:0

[Error Id: 1059e3ec-6241-4b4c-a2de-33c4a44c64fe on llovizna:31010] (state=,code=0)





"tail -f sqlline.log" indicates the below:


2016-08-18 16:10:05,684 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 2849e9e2-1bef-6389-2f2c-9f8d1af595fe: SELECT camic.provenance.image.case_id caseid
FROM mongo.users.`contacts` camic
WHERE caseid = 'TCGA-TS2"'
2016-08-18 16:10:06,810 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.s.m.MongoScanBatchCreator - Number of record readers initialized : 1
2016-08-18 16:10:06,810 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2849e9e2-1bef-6389-2f2c-9f8d1af595fe:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
2016-08-18 16:10:06,811 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 2849e9e2-1bef-6389-2f2c-9f8d1af595fe:0:0: State to report: RUNNING
2016-08-18 16:10:06,811 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.s.mongo.MongoRecordReader - Filters Applied : Document{{}}
2016-08-18 16:10:06,811 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.s.mongo.MongoRecordReader - Fields Selected :Document{{_id=0, caseid=1, provenance=1}}
2016-08-18 16:10:06,812 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] WARN  o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path `caseid`, returning null instance.
2016-08-18 16:10:06,815 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2849e9e2-1bef-6389-2f2c-9f8d1af595fe:0:0: State change requested RUNNING --> FAILED
2016-08-18 16:10:06,816 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 2849e9e2-1bef-6389-2f2c-9f8d1af595fe:0:0: State change requested FAILED --> FINISHED
2016-08-18 16:10:06,817 [2849e9e2-1bef-6389-2f2c-9f8d1af595fe:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NumberFormatException: TCGA-TS2"

Fragment 0:0

[Error Id: 1059e3ec-6241-4b4c-a2de-33c4a44c64fe on llovizna:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NumberFormatException: TCGA-TS2"

Fragment 0:0

[Error Id: 1059e3ec-6241-4b4c-a2de-33c4a44c64fe on llovizna:31010]
        at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) [drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) [drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) [drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.6.0.jar:1.6.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0]
        at java.lang.Thread.run(Thread.java:744) [na:1.8.0]
Caused by: java.lang.NumberFormatException: TCGA-TS2"
        at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:95) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt(StringFunctionHelpers.java:120) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.test.generated.FiltererGen13.doSetup(FilterTemplate2.java:45) ~[na:na]
        at org.apache.drill.exec.test.generated.FiltererGen13.setup(FilterTemplate2.java:54) ~[na:na]
        at org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer(FilterRecordBatch.java:197) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema(FilterRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0]
        at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:na]
        at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) [drill-java-exec-1.6.0.jar:1.6.0]
        ... 4 common frames omitted
2016-08-18 16:10:06,854 [CONTROL-rpc-event-queue] WARN  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
2016-08-18 16:10:06,854 [CONTROL-rpc-event-queue] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 2849e9e2-1bef-6389-2f2c-9f8d1af595fe:0:0 does not exist.
2016-08-18 16:10:06,855 [USER-rpc-event-queue] INFO  o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#9] Query failed:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NumberFormatException: TCGA-TS2"

Fragment 0:0

[Error Id: 1059e3ec-6241-4b4c-a2de-33c4a44c64fe on llovizna:31010]
        at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119) [drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113) [drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285) [drill-rpc-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257) [drill-rpc-1.6.0.jar:1.6.0]
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) [netty-handler-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
        at java.lang.Thread.run(Thread.java:744) [na:1.8.0]




5. Note that the same query without the WHERE clause returns the correct output:

SELECT camic.provenance.image.case_id caseid
FROM mongo.users.`contacts` camic;


+-----------+
|  caseid   |
+-----------+
| TCGA-TS2  |
+-----------+
1 row selected (1,135 seconds)


So the issue is with the WHERE clause, for which the log above reports that no value vector can be found for the path `caseid`:
WHERE caseid = 'TCGA-TS2"';
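
One possible reading of the log and stack trace above, offered as an assumption rather than a confirmed diagnosis: because `caseid` is not found as a field, the comparison may end up casting the string literal to an integer (the trace passes through StringFunctionHelpers.varTypesToInt), and that cast fails. The sketch below only illustrates this failure mode in plain Java. The class CastSketch and its methods are hypothetical and are not Drill code.

```java
public class CastSketch {

    // Hypothetical stand-in for an implicit string-to-int cast.
    // This is NOT Drill's implementation, only an illustration.
    static int toInt(String literal) {
        return Integer.parseInt(literal);
    }

    // Attempts the cast and reports the outcome as a string.
    static String tryCast(String literal) {
        try {
            toInt(literal);
            return "parsed";
        } catch (NumberFormatException e) {
            // Mirrors the shape of the error message in the report.
            return "NumberFormatException: " + literal;
        }
    }

    public static void main(String[] args) {
        // The literal from the WHERE clause, including the stray double quote.
        System.out.println(tryCast("TCGA-TS2\""));
        // A numeric literal parses without error.
        System.out.println(tryCast("42"));
    }
}
```

If this reading is right, any non-numeric literal compared against `caseid` would fail the same way, whether or not it contains the stray double quote.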



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
