[
https://issues.apache.org/jira/browse/HIVE-15633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on HIVE-15633 started by Jesus Camacho Rodriguez.
------------------------------------------------------
> Hive/Druid integration: Exception when time filter is not in datasource range
> -----------------------------------------------------------------------------
>
> Key: HIVE-15633
> URL: https://issues.apache.org/jira/browse/HIVE-15633
> Project: Hive
> Issue Type: Bug
> Components: Druid integration
> Affects Versions: 2.2.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
>
> When _metadataList.isEmpty()_ (L222 in DruidQueryBasedInputFormat) returns
> true, we throw an Exception. However, the list is legitimately empty when the
> query filters on a time range that does not overlap the datasource's timestamp
> range. Thus, we should only throw the Exception if _metadataList_ is null.
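> A rough sketch of the intended change inside _splitSelectQuery_ follows (the
> element type and the return type are assumptions for illustration; the actual
> surrounding code differs):
> {code:java}
> // Sketch only, not the final patch: keep the failure for a null response from
> // the broker, but treat an empty list as "no segments overlap the filter".
> if (metadataList == null) {
>   throw new IOException("Connected to Druid but could not retrieve datasource information");
> }
> if (metadataList.isEmpty()) {
>   // The time filter falls outside the datasource range; there is nothing to read.
>   return new HiveDruidSplit[0]; // assuming the method returns HiveDruidSplit[]
> }
> {code}
> With a change along these lines, the query below should simply return 0 instead
> of failing during split generation.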
> The issue can be reproduced with the following query if all timestamp values
> in the datasource are greater than or equal to '1999-11-01 00:00:00':
> {code:sql}
> SELECT COUNT(`__time`)
> FROM store_sales_sold_time_subset
> WHERE `__time` < '1999-11-01 00:00:00';
> {code}
> {noformat}
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1484282558103_0067_2_00, diagnostics=[Vertex vertex_1484282558103_0067_2_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: store_sales_sold_time_subset initializer failed, vertex=vertex_1484282558103_0067_2_00 [Map 1], java.io.IOException: Connected to Druid but could not retrieve datasource information
> at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.splitSelectQuery(DruidQueryBasedInputFormat.java:224)
> at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:140)
> at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:92)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:367)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:485)
> at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:196)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
> at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)