sounds good. let me file a JIRA. On Tue, Sep 30, 2014 at 5:26 PM, AnilKumar B <akumarb2...@gmail.com> wrote:
> Hi Neeraja, > > Since from last 1 week, there is no code changes. We will look into this > issue. If possible, can you please raise jira for this? > > As of now we are working on following issues. > 1) Joins on (mongo vs mongo) and (mongo vs other storages) > 2) select *, (some other fields) (Jinfeng's review comment) > > Next 10 days, we both are on vacation, so once we come back, we will > resolve these issues. > > > Thanks & Regards, > B Anil Kumar. > > On Wed, Oct 1, 2014 at 5:03 AM, Neeraja Rentachintala < > nrentachint...@maprtech.com> wrote: > >> Btw I got the dataset JSON from >> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ >> >> On Tue, Sep 30, 2014 at 4:30 PM, Neeraja Rentachintala < >> nrentachint...@maprtech.com> wrote: >> >>> Hi Anil, Kamesh >>> >>> I was trying MongoDB plugin with the following queries and all of them >>> succeeded. This is using the zip codes dataset. >>> However intermittently (now consistently) , these are failing. I built >>> Kamesh branch and also tried from the latest 0.6 master (which merged the >>> plugin). >>> Do you have any insights into this. I can file JIRA if you like. >>> >>> select state,sum(pop) from zipcodes group by state having sum(pop) > >>> 10000000 order by sum(pop) desc; >>> >>> select state,city,avg(pop) from zipcodes group by state, city; >>> >>> select city, sum(pop) from zipcodes where state is not null group by >>> city order by sum(pop) desc limit 1; >>> >>> select city, sum(pop) from zipcodes where state is not null group by >>> city order by sum(pop) asc limit 1; >>> >>> --------------- >>> >>> >>> Errors >>> >>> ------------------- >>> >>> select sum(pop) from zipcodes where city=‘CHICAGO’; >>> >>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes group >>> by state, city order by sum(pop) limit 5; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [1366240a-125d-4646-962a-1734f84b03b3] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes where >>> state is not null and city is not null group by state, city order by >>> sum(pop) limit 5; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [53a1c241-82f0-4413-aa09-39f851d7209a] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> 0: jdbc:drill:zk=local> select state, city, sum(pop) from zipcodes group >>> by state,city order by sum(pop) asc limit 1; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [911f48f7-019e-42f2-b0af-1e255416ce76] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> 0: jdbc:drill:zk=local> select state,city,avg(pop) from zipcodes group >>> by state, city; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [54c5e480-972a-49b9-860e-60d4a7429366] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> 0: jdbc:drill:zk=local> select city, sum(pop) from zipcodes group by >>> city order by sum(pop) asc limit 1; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [b5374f4a-3ee6-47ac-9dc1-fcebed555518] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> 0: jdbc:drill:zk=local> select state,sum(pop) from zipcodes group by >>> state having sum(pop) > 10000000; >>> >>> Query failed: Failure while running fragment. Invalid value for boolean: >>> 15338 [5875169d-dcb8-419f-991e-d38d41dddcbb] >>> >>> >>> Error: exception while executing query: Failure while trying to get next >>> result batch. (state=,code=0) >>> >>> On Sat, Sep 27, 2014 at 1:07 AM, Kamesh <kamesh.had...@gmail.com> wrote: >>> >>>> Thanks Jinfeng & Neeraja for looking into this. >>>> We will look into the above mentioned issues. >>>> >>>> >>>> >>>> On Sat, Sep 27, 2014 at 8:28 AM, Neeraja Rentachintala < >>>> nrentachint...@maprtech.com> wrote: >>>> >>>>> I have played with the plugin as well today and overall its very good. >>>>> >>>>> I tried the queries >>>>> http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ >>>>> on the zip code dataset and all the aggregate queries worked. >>>>> >>>>> >>>>> ----------- >>>>> >>>>> select sum(pop) from zipcodes where city='SEATTLE’; >>>>> >>>>> select state, city, sum(pop) from zipcodes group by state,city order >>>>> by sum(pop) asc limit 1; >>>>> >>>>> select state,city,avg(pop) from zipcodes group by state, city; >>>>> >>>>> select city, sum(pop) from zipcodes group by city order by sum(pop) >>>>> asc limit 1; >>>>> >>>>> select state,sum(pop) from zipcodes group by state having sum(pop) > >>>>> 10000000; >>>>> >>>>> >>>>> ---------- >>>>> >>>>> >>>>> I however noticed issues with querying repeating elements (used USDA >>>>> nutrition dataset), especially more than one level nested as well as JOINs >>>>> (example queries are below) >>>>> >>>>> ------------------ >>>>> >>>>> 0: jdbc:drill:zk=local> SELECT t1.first_name FROM >>>>> mongo.employee.`empinfo` t1 JOIN mongo.employee.`empinfo` t2 ON >>>>> t1.`employee_id` = t2.`employee_id`; >>>>> >>>>> Query failed: Failure while setting up Foreman. Internal error: Error >>>>> while applying rule DrillPushProjIntoScan, args >>>>> [rel#12606:ProjectRel.NONE.ANY([]).[](child=rel#12598:Subset#0.ENUMERABLE.ANY([]).[],employee_id=$1,first_name=$2), >>>>> rel#12594:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[mongo, >>>>> employee, empinfo])] [08f4eedd-f5c9-4ebf-8d5b-d9249b79ca32] >>>>> >>>>> >>>>> 0: jdbc:drill:zk=local> select t.nutrients from mongo.usda.nutrition t >>>>> limit 1; >>>>> >>>>> Query failed: Screen received stop request sent. You tried to write a >>>>> BigInt type when you are using a ValueWriter of type >>>>> NullableFloat8WriterImpl. [dc44e277-1b1d-4f00-b60e-9f06b883e7c5] >>>>> >>>>> >>>>> Error: exception while executing query: Failure while trying to get >>>>> next result batch. (state=,code=0) >>>>> >>>>> 0: jdbc:drill:zk=local> select t.nutrients[0].units from >>>>> mongo.usda.nutrition t limit 1; >>>>> >>>>> Query failed: Screen received stop request sent. You tried to write a >>>>> BigInt type when you are using a ValueWriter of type >>>>> NullableFloat8WriterImpl. [a285c85e-4607-48fc-97af-41b5726459e2] >>>>> >>>>> >>>>> Error: exception while executing query: Failure while trying to get >>>>> next result batch. (state=,code=0) >>>>> >>>>> >>>>> >>>>> On Fri, Sep 26, 2014 at 6:07 PM, Jinfeng Ni <j...@maprtech.com> wrote: >>>>> >>>>>> >>>>>> ----------------------------------------------------------- >>>>>> This is an automatically generated e-mail. To reply, visit: >>>>>> https://reviews.apache.org/r/25996/#review54756 >>>>>> ----------------------------------------------------------- >>>>>> >>>>>> Ship it! >>>>>> >>>>>> >>>>>> I did not do a detail code review; let that task to Steven. I mainly >>>>>> played with this Mongo plugin. Overall it looks good. >>>>>> >>>>>> Basically, I start a mongodb instance, import the data, and run >>>>>> several single table queryies, and all of them work perfectly. >>>>>> >>>>>> Some issues I saw when playing around : >>>>>> >>>>>> 1. The result of select * seems not the expect answer : it would >>>>>> return a map containing all the columns: >>>>>> >>>>>> SELECT * FROM mongo.employee.`empinfo` limit 2; >>>>>> +------------+ >>>>>> | * | >>>>>> +------------+ >>>>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , >>>>>> "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , >>>>>> "position" : "Store T" , "isFTE" : true} | >>>>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , >>>>>> "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , >>>>>> "position" : "Store T" , "isFTE" : true} | >>>>>> +------------+ >>>>>> 2 rows selected (0.084 seconds) >>>>>> >>>>>> In contrast, here is the result when Drill queries a .json file: >>>>>> >>>>>> select * from cp.`employee.json` limit 2; >>>>>> >>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>> | employee_id | full_name | first_name | last_name | position_id | >>>>>> position_title | store_id | department_id | birth_date | hire_date | >>>>>> salary | supervisor_id | education_level | marital_status | gender >>>>>> | >>>>>> management_role | >>>>>> >>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>> | 1 | Sheri Nowmer | Sheri | Nowmer | 1 >>>>>> | President | 0 | 1 | 1961-08-26 | 1994-12-01 >>>>>> 00:00:00.0 | 80000.0 | 0 | Graduate Degree | S >>>>>> | F | Senior Management | >>>>>> | 2 | Derrick Whelply | Derrick | Whelply | 2 >>>>>> | VP Country Manager | 0 | 1 | 1915-07-03 | >>>>>> 1994-12-01 00:00:00.0 | 40000.0 | 1 | Graduate Degree | M >>>>>> | M | Senior Management | >>>>>> >>>>>> +-------------+------------+------------+------------+-------------+----------------+------------+---------------+------------+------------+------------+---------------+-----------------+----------------+------------+-----------------+ >>>>>> 2 rows selected (0.39 seconds) >>>>>> >>>>>> >>>>>> 2. Join two mongodb tables would fail. >>>>>> >>>>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, >>>>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` >>>>>> limit >>>>>> 1; >>>>>> Query failed: Failure while setting up Foreman. Internal error: while >>>>>> converting `t1`.`employee_id` = `t2`.`employee_id` >>>>>> [39eb6c88-fd21-4514-8903-48d99210b88d] >>>>>> >>>>>> 3. Join a mongodb table with a table with other storage engine would >>>>>> fail with CanNotPlanException: >>>>>> >>>>>> SELECT t1.first_name, t2.last_name FROM mongo.employee.`empinfo` t1, >>>>>> mongo.employee.`empinfo` t2 where t1.`employee_id` = t2.`employee_id` >>>>>> limit >>>>>> 1; >>>>>> Query failed: Failure while setting up Foreman. Internal error: while >>>>>> converting `t1`.`employee_id` = `t2`.`employee_id` >>>>>> [39eb6c88-fd21-4514-8903-48d99210b88d] >>>>>> >>>>>> Error: exception while executing query: Failure while trying to get >>>>>> next result batch. (state=,code=0) >>>>>> 0: jdbc:drill:zk=local> SELECT t1.first_name, t1.last_name FROM >>>>>> mongo.employee.`empinfo` as t1, cp.`employee.json` t2 where >>>>>> t1.employee_id >>>>>> = t2.employee_id limit 10; >>>>>> Query failed: Failure while parsing sql. Node >>>>>> [rel#2496:Subset#5.LOGICAL.ANY([]).[]] could not be implemented; planner >>>>>> state: >>>>>> >>>>>> Root: rel#2496:Subset#5.LOGICAL.ANY([]).[] >>>>>> Original rel: >>>>>> ...... >>>>>> >>>>>> 4. Select *, regular_column from mongodb would return the >>>>>> regular_column as null. >>>>>> >>>>>> 0: jdbc:drill:zk=local> SELECT first_name FROM >>>>>> mongo.employee.`empinfo` limit 2; >>>>>> +------------+ >>>>>> | first_name | >>>>>> +------------+ >>>>>> | Steve | >>>>>> | Mary | >>>>>> +------------+ >>>>>> 2 rows selected (0.084 seconds) >>>>>> 0: jdbc:drill:zk=local> SELECT *, first_name FROM >>>>>> mongo.employee.`empinfo` limit 2; >>>>>> +------------+------------+ >>>>>> | * | first_name | >>>>>> +------------+------------+ >>>>>> | { "employee_id" : 1101 , "full_name" : "Steve Eurich" , >>>>>> "first_name" : "Steve" , "last_name" : "Eurich" , "position_id" : 16 , >>>>>> "position" : "Store T" , "isFTE" : true} | null | >>>>>> | { "employee_id" : 1102 , "full_name" : "Mary Pierson" , >>>>>> "first_name" : "Mary" , "last_name" : "Pierson" , "position_id" : 16 , >>>>>> "position" : "Store T" , "isFTE" : true} | null | >>>>>> +------------+------------+ >>>>>> >>>>>> >>>>>> >>>>>> I think it would be fine to fix those issues in the next release. >>>>>> >>>>>> >>>>>> PS: could you please re-build a patch after rebasing on the recent >>>>>> master branch? >>>>>> >>>>>> - Jinfeng Ni >>>>>> >>>>>> >>>>>> On Sept. 24, 2014, 11:06 a.m., Anil Kumar B wrote: >>>>>> > >>>>>> > ----------------------------------------------------------- >>>>>> > This is an automatically generated e-mail. To reply, visit: >>>>>> > https://reviews.apache.org/r/25996/ >>>>>> > ----------------------------------------------------------- >>>>>> > >>>>>> > (Updated Sept. 24, 2014, 11:06 a.m.) >>>>>> > >>>>>> > >>>>>> > Review request for drill, Aditya Kishore, Jacques Nadeau, and >>>>>> Kamesh B. >>>>>> > >>>>>> > >>>>>> > Repository: drill-git >>>>>> > >>>>>> > >>>>>> > Description >>>>>> > ------- >>>>>> > >>>>>> > Mongo storage plugin support: The features which we implemented as >>>>>> part of this is as follows. >>>>>> > 1) Support for sharded(chunk wise), shared-replicated(chunk wise), >>>>>> replicated, stand-alone >>>>>> > 2) Predicate pushdown >>>>>> > 3) Mongo PStore >>>>>> > >>>>>> > MongoRecordReader uses JsonReaderWithState in the case of non-star >>>>>> queries. >>>>>> > >>>>>> > >>>>>> > Diffs >>>>>> > ----- >>>>>> > >>>>>> > contrib/pom.xml 728038a >>>>>> > contrib/storage-mongo/pom.xml PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/DrillMongoConstants.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCnxnManager.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoCompareFunctionProcessor.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoFilterBuilder.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoGroupScan.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoPushDownFilterForScan.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanBatchCreator.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoScanSpec.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePlugin.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoStoragePluginConfig.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoSubScan.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoUtils.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/ChunkInfo.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/common/MongoCompareOp.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStore.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/config/MongoPStoreProvider.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoDatabaseSchema.java >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java >>>>>> PRE-CREATION >>>>>> > >>>>>> contrib/storage-mongo/src/main/resources/bootstrap-storage-plugins.json >>>>>> PRE-CREATION >>>>>> > contrib/storage-mongo/src/main/resources/drill-module.conf >>>>>> PRE-CREATION >>>>>> > >>>>>> >>>>>> contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java >>>>>> PRE-CREATION >>>>>> > distribution/pom.xml cd5df0d >>>>>> > distribution/src/assemble/bin.xml 86e3802 >>>>>> > >>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java >>>>>> 933bfbe >>>>>> > >>>>>> >>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java >>>>>> 4fa61e1 >>>>>> > >>>>>> >>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java >>>>>> 4e12b8b >>>>>> > >>>>>> >>>>>> exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReaderWithState.java >>>>>> ef995f8 >>>>>> > >>>>>> > Diff: https://reviews.apache.org/r/25996/diff/ >>>>>> > >>>>>> > >>>>>> > Testing >>>>>> > ------- >>>>>> > >>>>>> > 1) Tested various set of queries on sharded, replicated and >>>>>> stand-alone modes. >>>>>> > >>>>>> > 2) Test Environment details: We created mongo cluster with 2 shards >>>>>> with a collections consists of 35 chunks(18 chunks are one shard and >>>>>> remaining chunks on on other shard). Below are the few queries which we >>>>>> tested in all the environments. >>>>>> > >>>>>> > a) SELECT * FROM mongo.employee.`empinfo` limit 10; >>>>>> > >>>>>> > b) SELECT first_name, last_name FROM mongo.employee.`empinfo` >>>>>> limit 10; >>>>>> > >>>>>> > c) SELECT first_name, last_name FROM mongo.employee.`empinfo` >>>>>> where employee_id = 1111; >>>>>> > >>>>>> > d) SELECT * FROM mongo.employee.`empinfo` where full_name = >>>>>> 'Phil Munoz'; >>>>>> > >>>>>> > e) SELECT first_name, last_name, position_id FROM >>>>>> mongo.employee.`empinfo` where employee_id = 1111 OR position_id = 16; >>>>>> > >>>>>> > g) SELECT first_name, last_name FROM mongo.employee.`empinfo` >>>>>> where isFTE = true; >>>>>> > >>>>>> > h) SELECT first_name, last_name, position_id FROM >>>>>> mongo.employee.`empinfo` where employee_id = 1107 AND position_id = 17 >>>>>> AND >>>>>> last_name = 'Yonce'; >>>>>> > >>>>>> > >>>>>> > 3) PStore functionality not fully tested. >>>>>> > >>>>>> > >>>>>> > Thanks, >>>>>> > >>>>>> > Anil Kumar B >>>>>> > >>>>>> > >>>>>> >>>>>> >>>>> >>>> >>>> >>>> -- >>>> Kamesh. >>>> >>> >>> >> >