Thanks, Jinfeng & Neeraja, for looking into this. We will look into the issues you mentioned.

On Sat, Sep 27, 2014 at 8:28 AM, Neeraja Rentachintala <nrentachint...@maprtech.com> wrote:

> I have played with the plugin as well today and overall it's very good.
>
> I tried the queries from
> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/
> on the zip code dataset and all the aggregate queries worked.
>
> -----------
>
> select sum(pop) from zipcodes where city='SEATTLE';
>
> select state, city, sum(pop) from zipcodes group by state,city order by sum(pop) asc limit 1;
>
> select state,city,avg(pop) from zipcodes group by state, city;
>
> select city, sum(pop) from zipcodes group by city order by sum(pop) asc limit 1;
>
> select state,sum(pop) from zipcodes group by state having sum(pop) > 10000000;
>
> ----------
>
> However, I noticed issues when querying repeated elements (using the USDA
> nutrition dataset), especially ones nested more than one level deep, as well
> as with JOINs (example queries are below).
>
> ------------------
>
> 0: jdbc:drill:zk=local> SELECT t1.first_name FROM mongo.employee.`empinfo` t1 JOIN mongo.employee.`empinfo` t2 ON t1.`employee_id` = t2.`employee_id`;
>
> Query failed: Failure while setting up Foreman. Internal error: Error while applying rule DrillPushProjIntoScan, args [rel#12606:ProjectRel.NONE.ANY([]).[](child=rel#12598:Subset#0.ENUMERABLE.ANY([]).[],employee_id=$1,first_name=$2), rel#12594:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[mongo, employee, empinfo])] [08f4eedd-f5c9-4ebf-8d5b-d9249b79ca32]
>
> 0: jdbc:drill:zk=local> select t.nutrients from mongo.usda.nutrition t limit 1;
>
> Query failed: Screen received stop request sent. You tried to write a BigInt type when you are using a ValueWriter of type NullableFloat8WriterImpl. [dc44e277-1b1d-4f00-b60e-9f06b883e7c5]
>
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
>
> 0: jdbc:drill:zk=local> select t.nutrients[0].units from mongo.usda.nutrition t limit 1;
>
> Query failed: Screen received stop request sent. You tried to write a BigInt type when you are using a ValueWriter of type NullableFloat8WriterImpl. [a285c85e-4607-48fc-97af-41b5726459e2]
>
> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
>
>
> On Fri, Sep 26, 2014 at 6:07 PM, Jinfeng Ni <j...@maprtech.com> wrote:
>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/25996/#review54756
>> -----------------------------------------------------------
>>
>> Ship it!
>>
>> I did not do a detailed code review; I will leave that task to Steven. I
>> mainly played with this Mongo plugin. Overall it looks good.
>>
>> Basically, I started a MongoDB instance, imported the data, and ran several
>> single-table queries, and all of them worked perfectly.
>>
>> Some issues I saw when playing around:
>>
>> 1. The result of select * does not seem to be the expected answer: it
>> returns a single map containing all the columns:
>>
>> SELECT * FROM mongo.employee.`empinfo` limit 2;
>> +------------+
>> | * |
>> +------------+
>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} |
>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} |
>> +------------+
>> 2 rows selected (0.084 seconds)
>>
>> In contrast, here is the result when Drill queries a .json file:
>>
>> select * from cp.`employee.json` limit 2;
>>
>> +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
>> | employee_id | full_name       | first_name | last_name | position_id | position_title     | store_id | department_id | birth_date | hire_date             | salary  | supervisor_id | education_level | marital_status | gender | management_role   |
>> +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
>> | 1           | Sheri Nowmer    | Sheri      | Nowmer    | 1           | President          | 0        | 1             | 1961-08-26 | 1994-12-01 00:00:00.0 | 80000.0 | 0             | Graduate Degree | S              | F      | Senior Management |
>> | 2           | Derrick Whelply | Derrick    | Whelply   | 2           | VP Country Manager | 0        | 1             | 1915-07-03 | 1994-12-01 00:00:00.0 | 40000.0 | 1             | Graduate Degree | M              | M      | Senior Management |
>> +-------------+-----------------+------------+-----------+-------------+--------------------+----------+---------------+------------+-----------------------+---------+---------------+-----------------+----------------+--------+-------------------+
>> 2 rows selected (0.39 seconds)
>>
>>
>> 2. Joining two MongoDB tables fails.
>>
>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit 1;
>> Query failed: Failure while setting up Foreman. Internal error: while converting `t1`.`employee_id` = `t2`.`employee_id` [39eb6c88-fd21-4514-8903-48d99210b88d]
>>
>> 3. Joining a MongoDB table with a table from another storage engine fails
>> with CanNotPlanException:
>>
>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` limit 1;
>> Query failed: Failure while setting up Foreman. Internal error: while converting `t1`.`employee_id` = `t2`.`employee_id` [39eb6c88-fd21-4514-8903-48d99210b88d]
>>
>> Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0)
>> 0: jdbc:drill:zk=local> SELECT t1.first_name, t1.last_name FROM mongo.employee.`empinfo` as t1, cp.`employee.json` t2 where t1.employee_id = t2.employee_id limit 10;
>> Query failed: Failure while parsing sql. Node [rel#2496:Subset#5.LOGICAL.ANY([]).[]] could not be implemented; planner state:
>>
>> Root: rel#2496:Subset#5.LOGICAL.ANY([]).[]
>> Original rel:
>> ......
>>
>> 4. SELECT *, regular_column from MongoDB returns regular_column as null.
>>
>> 0: jdbc:drill:zk=local> SELECT first_name FROM mongo.employee.`empinfo` limit 2;
>> +------------+
>> | first_name |
>> +------------+
>> | Steve      |
>> | Mary       |
>> +------------+
>> 2 rows selected (0.084 seconds)
>> 0: jdbc:drill:zk=local> SELECT *, first_name FROM mongo.employee.`empinfo` limit 2;
>> +------------+------------+
>> | * | first_name |
>> +------------+------------+
>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} | null |
>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , "position" : "Store T" , "isFTE" : true} | null |
>> +------------+------------+
>>
>>
>> I think it would be fine to fix those issues in the next release.
>>
>> PS: could you please rebuild the patch after rebasing on the latest master
>> branch?
>>
>> - Jinfeng Ni
>>
>>
>> On Sept. 24, 2014, 11:06 a.m., Anil Kumar B wrote:
>> >
>> > -----------------------------------------------------------
>> > This is an automatically generated e-mail. To reply, visit:
>> > https://reviews.apache.org/r/25996/
>> > -----------------------------------------------------------
>> >
>> > (Updated Sept. 24, 2014, 11:06 a.m.)
>> >
>> >
>> > Review request for drill, Aditya Kishore, Jacques Nadeau, and Kamesh B.
>> >
>> >
>> > Repository: drill-git
>> >
>> >
>> > Description
>> > -------
>> >
>> > Mongo storage plugin support. The features implemented as part of this
>> > are as follows:
>> > 1) Support for sharded (chunk-wise), sharded-replicated (chunk-wise),
>> > replicated, and stand-alone deployments
>> > 2) Predicate pushdown
>> > 3) Mongo PStore
>> >
>> > MongoRecordReader uses JsonReaderWithState in the case of non-star
>> > queries.
>> >
>> >
>> > Diffs
>> > -----
>> >
>> >   contrib/pom.xml 728038a
>> >   contrib/storage-mongo/pom.xml PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/DrillMongoConstants.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCnxnManager.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCompareFunctionProcessor.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoFilterBuilder.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoGroupScan.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoPushDownFilterForScan.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanBatchCreator.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanSpec.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePluginConfig.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoSubScan.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoUtils.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/ChunkInfo.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/MongoCompareOp.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStore.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStoreProvider.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoDatabaseSchema.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java PRE-CREATION
>> >   contrib/storage-mongo/src/main/resources/bootstrap-storage-plugins.json PRE-CREATION
>> >   contrib/storage-mongo/src/main/resources/drill-module.conf PRE-CREATION
>> >   contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java PRE-CREATION
>> >   distribution/pom.xml cd5df0d
>> >   distribution/src/assemble/bin.xml 86e3802
>> >   exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 933bfbe
>> >   exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java 4fa61e1
>> >   exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java 4e12b8b
>> >   exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReaderWithState.java ef995f8
>> >
>> > Diff: https://reviews.apache.org/r/25996/diff/
>> >
>> >
>> > Testing
>> > -------
>> >
>> > 1) Tested various sets of queries in sharded, replicated, and stand-alone
>> > modes.
>> >
>> > 2) Test environment details: We created a Mongo cluster with 2 shards and
>> > a collection consisting of 35 chunks (18 chunks on one shard and the
>> > remaining chunks on the other shard). Below are a few of the queries that
>> > we tested in all the environments.
>> >
>> > a) SELECT * FROM mongo.employee.`empinfo` limit 10;
>> >
>> > b) SELECT first_name, last_name FROM mongo.employee.`empinfo` limit 10;
>> >
>> > c) SELECT first_name, last_name FROM mongo.employee.`empinfo` where employee_id = 1111;
>> >
>> > d) SELECT * FROM mongo.employee.`empinfo` where full_name = 'Phil Munoz';
>> >
>> > e) SELECT first_name, last_name, position_id FROM mongo.employee.`empinfo` where employee_id = 1111 OR position_id = 16;
>> >
>> > g) SELECT first_name, last_name FROM mongo.employee.`empinfo` where isFTE = true;
>> >
>> > h) SELECT first_name, last_name, position_id FROM mongo.employee.`empinfo` where employee_id = 1107 AND position_id = 17 AND last_name = 'Yonce';
>> >
>> >
>> > 3) PStore functionality not fully tested.
>> >
>> >
>> > Thanks,
>> >
>> > Anil Kumar B

-- Kamesh.