Could you please file a public jira. That link is to an internal issue. On Thu, Apr 2, 2015 at 1:52 PM, Sudhakar Thota <[email protected]> wrote:
> Here it is: > > https://maprdrill.atlassian.net/browse/MD-204?filter=-2 > > Thanks > Sudhakar Thota > > > On Apr 2, 2015, at 1:36 PM, Andries Engelbrecht <[email protected]> > wrote: > > > Cool, thx for testing. > > > > Best to file a JIRA. > > > > —Andries > > > > On Apr 2, 2015, at 1:27 PM, Sudhakar Thota <[email protected]> wrote: > > > >> Vince/Andries, > >> > >> Perhaps this could be a bug. I get the same results. > >> > >> But the plan is very different, the UnionExchange is set up immediately > after the scan operation in successful case( Case -1 ), where as > UnionExchange is happening after scan->project (Case -2). > >> > >> Case -1.Successful case. > >> > >> 0: jdbc:drill:> explain plan for select to_timestamp(t.t, > 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from > dfs.sthota_prq.`/tstamp_test/*.parquet` limit 13015351) t; > >> +------------+------------+ > >> | text | json | > >> +------------+------------+ > >> | 00-00 Screen > >> 00-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'), > 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''')]) > >> 00-02 SelectionVectorRemover > >> 00-03 Limit(fetch=[13015351]) > >> 00-04 UnionExchange > >> 01-01 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet], > ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet], > ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]], > selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test, > numFiles=3, columns=[`*`]]]) > >> | { > >> "head" : { > >> "version" : 1, > >> "generator" : { > >> "type" : "ExplainHandler", > >> "info" : "" > >> }, > >> "type" : "APACHE_DRILL_PHYSICAL", > >> "options" : [ ], > >> "queue" : 0, > >> "resultMode" : "EXEC" > >> }, > >> > >> Case -2. Unsuccessful case: > >> > >> 0: jdbc:drill:> explain plan for select to_timestamp(t.t, > 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select * from > dfs.sthota_prq.`/tstamp_test/*.parquet` ) t; > >> +------------+------------+ > >> | text | json | > >> +------------+------------+ > >> | 00-00 Screen > >> 00-01 UnionExchange > >> 01-01 Project(EXPR$0=[TO_TIMESTAMP(ITEM($0, 't'), > 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''')]) > >> 01-02 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_2_0.parquet], > ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_1_0.parquet], > ReadEntryWithPath [path=maprfs:/mapr/ > demo.mapr.com/user/sthota/parquet/tstamp_test/1_0_0.parquet]], > selectionRoot=/mapr/demo.mapr.com/user/sthota/parquet/tstamp_test, > numFiles=3, columns=[`*`]]]) > >> | { > >> "head" : { > >> "version" : 1, > >> "generator" : { > >> "type" : "ExplainHandler", > >> "info" : "" > >> }, > >> "type" : "APACHE_DRILL_PHYSICAL", > >> "options" : [ ], > >> "queue" : 0, > >> "resultMode" : "EXEC" > >> }, > >> > >> Thanks > >> Sudhakar Thota > >> > >> > >> On Apr 2, 2015, at 12:01 PM, Vince Gonzalez <[email protected]> > wrote: > >> > >>> Ok, will do. Thanks. > >>> > >>> On Thu, Apr 2, 2015 at 2:49 PM, Andries Engelbrecht < > >>> [email protected]> wrote: > >>> > >>>> Compare the query plans and you probably want to look at the log file > to > >>>> see what fails and post here. > >>>> > >>>> > >>>> > >>>> —Andries > >>>> > >>>> > >>>> On Apr 1, 2015, at 12:54 PM, Vince Gonzalez <[email protected] > > > >>>> wrote: > >>>> > >>>>> Is this a bug? > >>>>> > >>>>> Created a parquet table (using CTAS) with one column containing text > >>>>> timestamps. > >>>>> > >>>>> 0: jdbc:drill:zk=localhost:2181> select * from tstamp_test limit 1; > >>>>> +------------+ > >>>>> | t | > >>>>> +------------+ > >>>>> | 2015-01-27T13:43:53.000Z | > >>>>> +------------+ > >>>>> 1 row selected (0.119 seconds) > >>>>> > >>>>> The below queries, identical apart from the limit clause, behave > >>>>> differently. The one with the limit clause works, the one without > >>>> doesn't. > >>>>> The limit is larger than the total number of rows, so in both cases > we > >>>>> should be processing all rows. > >>>>> > >>>>> No limit clause. It fails: > >>>>> > >>>>> ``` > >>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t, > >>>>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test) > as > >>>> t; > >>>>> Query failed: RemoteRpcException: Failure while trying to start > remote > >>>>> fragment, Expression has syntax error! line 1:30:mismatched input 'T' > >>>>> expecting CParen [ 7d30d753-0822-4820-afd0-b7e7fe5e639c on > >>>>> 192.168.99.1:31010 ] > >>>>> ``` > >>>>> > >>>>> Limit clause in the subselect (larger than the number of rows in the > >>>> table) > >>>>> succeeds. > >>>>> > >>>>> ``` > >>>>> 0: jdbc:drill:zk=localhost:2181> select to_timestamp(t.t, > >>>>> 'YYYY-MM-dd''T''HH:mm:ss.SSS''Z''') FROM (select t from tstamp_test > limit > >>>>> 100000000) as t; > >>>>> ... > >>>>> | 2015-02-17 07:18:00.0 | > >>>>> +------------+ > >>>>> 13,015,350 rows selected (105.257 seconds) > >>>>> ``` > >>>>> > >>>>> Data can be downloaded here: > >>>>> > >>>>> https://s3.amazonaws.com/vgonzalez/data/tstamp_test.tar.gz > >>>> > >>>> > >> > > > > -- Steven Phillips Software Engineer mapr.com
