Here it is: https://maprdrill.atlassian.net/browse/MD-204?filter=-2
Thanks
Sudhakar Thota

On Apr 2, 2015, at 1:36 PM, Andries Engelbrecht <[email protected]> wrote:

> Cool, thx for testing.
>
> Best to file a JIRA.
>
> —Andries
>
> On Apr 2, 2015, at 1:27 PM, Sudhakar Thota <[email protected]> wrote:
>
>> Vince/Andries,
>>
>> Perhaps this could be a bug. I get the same results.
>>
>> But the plan is very different, the UnionExchange is set up immediately
>> after the scan operation in successful case( Case -1 ), where as
>> UnionExchange is happening after scan->project (Case -2).
>>
>> Case -1.Successful case.
>>
>> 0: jdbc:drill:> explain plan for select to_timestamp(t.t,
>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from
>> dfs.sthota_prq.`/tstamp_test/*.parquet` limit 13015351) t;
>> +------------+------------+
>> | text | json |
>> +------------+------------+
>> | 00-00 Screen
>> 00-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'),
>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''')])
>> 00-02 SelectionVectorRemover
>> 00-03 Limit(fetch=[13015351])
>> 00-04 UnionExchange
>> 01-01 Scan(groupscan=[ParquetGroupScan
>> [entries=[ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet],
>> ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet],
>> ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]],
>> selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test,
>> numFiles=3, columns=[`*`]]])
>> | {
>>   "head" : {
>>     "version" : 1,
>>     "generator" : {
>>       "type" : "ExplainHandler",
>>       "info" : ""
>>     },
>>     "type" : "APACHE_DRILL_PHYSICAL",
>>     "options" : [ ],
>>     "queue" : 0,
>>     "resultMode" : "EXEC"
>>   },
>>
>> Case -2. Unsuccessful case:
>>
>> 0: jdbc:drill:> explain plan for select to_timestamp(t.t,
>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from
>> dfs.sthota_prq.`/tstamp_test/*.parquet` ) t;
>> +------------+------------+
>> | text | json |
>> +------------+------------+
>> | 00-00 Screen
>> 00-01 UnionExchange
>> 01-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'),
>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''')])
>> 01-02 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet],
>> ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet],
>> ReadEntryWithPath
>> [path=maprfs:/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]],
>> selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test,
>> numFiles=3, columns=[`*`]]])
>> | {
>>   "head" : {
>>     "version" : 1,
>>     "generator" : {
>>       "type" : "ExplainHandler",
>>       "info" : ""
>>     },
>>     "type" : "APACHE_DRILL_PHYSICAL",
>>     "options" : [ ],
>>     "queue" : 0,
>>     "resultMode" : "EXEC"
>>   },
>>
>> Thanks
>> Sudhakar Thota
>>
>> On Apr 2, 2015, at 12:01 PM, Vince Gonzalez <[email protected]> wrote:
>>
>>> Ok, will do. Thanks.
>>>
>>> On Thu, Apr 2, 2015 at 2:49 PM, Andries Engelbrecht <
>>> [email protected]> wrote:
>>>
>>>> Compare the query plans and you probably want to look at the log file to
>>>> see what fails and post here.
>>>>
>>>> —Andries
>>>>
>>>> On Apr 1, 2015, at 12:54 PM, Vince Gonzalez <[email protected]>
>>>> wrote:
>>>>
>>>>> Is this a bug?
>>>>>
>>>>> Created a parquet table (using CTAS) with one column containing text
>>>>> timestamps.
>>>>>
>>>>> 0: jdbc:drill:zk=localhost:2181> select * from tstamp_test limit 1;
>>>>> +------------+
>>>>> | t |
>>>>> +------------+
>>>>> | 2015-01-27T13:43:53.000Z |
>>>>> +------------+
>>>>> 1 row selected (0.119 seconds)
>>>>>
>>>>> The below queries, identical apart from the limit clause, behave
>>>>> differently. The one with the limit clause works, the one without doesn't.
>>>>> The limit is larger than the total number of rows, so in both cases we
>>>>> should be processing all rows.
>>>>>
>>>>> No limit clause. It fails:
>>>>>
>>>>> ```
>>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t,
>>>>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test) as t;
>>>>> Query failed: RemoteRpcException: Failure while trying to start remote
>>>>> fragment, Expression has syntax error! line 1:30:mismatched input 'T'
>>>>> expecting CParen [ 7d30d753-0822-4820-afd0-b7e7fe5e639c on
>>>>> 192.168.99.1:31010 ]
>>>>> ```
>>>>>
>>>>> Limit clause in the subselect (larger than the number of rows in the table)
>>>>> succeeds.
>>>>>
>>>>> ```
>>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t,
>>>>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test limit
>>>>> 100000000) as t;
>>>>> ...
>>>>> | 2015-02-17 07:18:00.0 |
>>>>> +------------+
>>>>> 13,015,350 rows selected (105.257 seconds)
>>>>> ```
>>>>>
>>>>> Data can be downloaded here:
>>>>>
>>>>> https://s3.amazonaws.com/vgonzalez/data/tstamp_test.tar.gz
