Lucas Alvarez Argüero created DRILL-5411:
--------------------------------------------
Summary: Getting 0 rows when there are more than 100000 in the
mongoDB collection
Key: DRILL-5411
URL: https://issues.apache.org/jira/browse/DRILL-5411
Project: Apache Drill
Issue Type: Bug
Components: Storage - MongoDB
Affects Versions: 1.10.0
Environment: VM1("ubuntu/trusty64"): mongo1
• mongoS (mongo server)
• MongoD shard1 (primary, secondary, secondary)
• Mongo config server
• Drillbit
VM2("ubuntu/trusty64"): mongo2
• MongoD shard2 (primary, secondary, secondary)
• Mongo config server
• Drillbit
VM3("ubuntu/trusty64"): mongo3
• MongoD shard3 (primary, secondary, secondary)
• Mongo config server
• Drillbit
VM4("ubuntu/trusty64"): zk1
• Zookeeper in quorum
VM5("ubuntu/trusty64"): zk2
• Zookeeper in quorum
VM6("ubuntu/trusty64"): zk3
• Zookeeper in quorum
Reporter: Lucas Alvarez Argüero
Getting 0 rows when there are more than 100000 in the mongoDB collection
Drill works perfectly with MongoDB as storage when the (partitioned)
collection holds fewer than about 100000 documents. With more documents,
Drill returns zero rows, although it can still count all documents (it
cannot count them when a WHERE clause is used).
Less than 100000:
select v.measInfo_id,v.endTime from mongo.mandarinaTime3.MeasValue v limit
3;
+--------------+-------------+
| measInfo_id  | endTime     |
+--------------+-------------+
| [B@1a7d4b45  | 2016-09-19  |
| [B@17d8ac99  | 2016-09-19  |
| [B@122b7d0a  | 2016-09-19  |
+--------------+-------------+
3 rows selected (0.313 seconds)
More than 100000:
0: jdbc:drill:> select v.measInfo_id,v.endTime from
mongo.mandarinaTime3.MeasValue v limit 3;
+--------------+----------+
| measInfo_id  | endTime  |
+--------------+----------+
+--------------+----------+
No rows selected (0.341 seconds)
0: jdbc:drill:> select count(*) from mongo.mandarinaTime3.MeasValue v;
+---------+
| EXPR$0  |
+---------+
| 502068  |
+---------+
1 row selected (0.426 seconds)
0: jdbc:drill:> select count(*) from mongo.mandarinaTime3.MeasValue v where
endtime='2016-09-19';
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (0.98 seconds)
If the collection isn’t partitioned, Drill also works perfectly.
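The three checks above (limited select, unfiltered count, filtered count) could be scripted so that the reproduction is repeatable against collections of different sizes. The following is only a sketch, not part of the original report. It assumes that Drill's REST endpoint is reachable on the default web port 8047 of the Drillbit on mongo1, and it writes the counts as count(*). The helper names (diagnostic_queries, build_request, run_query) are hypothetical.

```python
import json
import urllib.request

# Assumption: the Drillbit on mongo1 exposes the REST API on the default port 8047.
DRILL_URL = "http://mongo1:8047/query.json"
TABLE = "mongo.mandarinaTime3.MeasValue"


def diagnostic_queries(table=TABLE, end_time="2016-09-19"):
    """Return the three queries from the report, in the order they were run."""
    return [
        f"select v.measInfo_id, v.endTime from {table} v limit 3",
        f"select count(*) from {table} v",
        f"select count(*) from {table} v where endtime = '{end_time}'",
    ]


def build_request(sql, url=DRILL_URL):
    """Build a POST request carrying one SQL statement as a JSON body."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL, timeout=30):
    """Send the query and return the result rows (needs a running Drillbit)."""
    with urllib.request.urlopen(build_request(sql, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8")).get("rows", [])
```

With a Drillbit running, calling run_query on each entry of diagnostic_queries() and printing the number of rows returned would show at a glance which of the three checks comes back empty.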
drill mongo plugin:
{
"type": "mongo",
"connection": "mongodb://mongo1:27017/",
"enabled": true
}
mongo sharded collection:
{ "_id" : "mandarinaTime3", "primary" : "b", "partitioned" : true }
mandarinaTime3.MeasCollecFile
    shard key: { "_id" : 1 }
    unique: false
    balancing: true
    chunks:
        b 1
    { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : b Timestamp(1, 0)
mandarinaTime3.MeasInfo
    shard key: { "_id" : 1 }
    unique: false
    balancing: true
    chunks:
        a 1
        b 1
        c 1
    { "_id" : { "$minKey" : 1 } } -->> { "_id" : ObjectId("58e364dddc7a033f5c08c7c6") } on : a Timestamp(2, 0)
    { "_id" : ObjectId("58e364dddc7a033f5c08c7c6") } -->> { "_id" : ObjectId("58e364e0dc7a033f5c08c8b0") } on : c Timestamp(3, 0)
    { "_id" : ObjectId("58e364e0dc7a033f5c08c8b0") } -->> { "_id" : { "$maxKey" : 1 } } on : b Timestamp(3, 1)
mandarinaTime3.MeasValue
    shard key: { "_id" : 1 }
    unique: false
    balancing: true
    chunks:
        a 7
        b 7
        c 7
    too many chunks to print, use verbose if you want to force print
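
Because the status output truncates the chunk list for MeasValue, it may help to tally chunks per shard directly when comparing collections. The following is a sketch only. It assumes chunk metadata documents that carry an "ns" (namespace) and a "shard" field, as in the config database of MongoDB 3.x, and the helper name chunks_per_shard is hypothetical. The sample documents mirror the MeasInfo and MeasCollecFile entries shown above.

```python
from collections import Counter


def chunks_per_shard(chunk_docs, namespace):
    """Count chunks per shard for one namespace, e.g. 'mandarinaTime3.MeasInfo'."""
    return Counter(
        doc["shard"] for doc in chunk_docs if doc.get("ns") == namespace
    )


# Sample documents shaped like chunk metadata (assumed fields: "ns", "shard").
# On a live cluster these would be read from the config database instead.
sample = [
    {"ns": "mandarinaTime3.MeasInfo", "shard": "a"},
    {"ns": "mandarinaTime3.MeasInfo", "shard": "c"},
    {"ns": "mandarinaTime3.MeasInfo", "shard": "b"},
    {"ns": "mandarinaTime3.MeasCollecFile", "shard": "b"},
]

print(sorted(chunks_per_shard(sample, "mandarinaTime3.MeasInfo").items()))
# prints [('a', 1), ('b', 1), ('c', 1)]
```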
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)