[jira] [Created] (HIVE-24421) DruidOutputFormat and DruidStorageHandler use different filesystems, causing issues in data loading

2020-11-24 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-24421:
---

 Summary: DruidOutputFormat and DruidStorageHandler use different 
filesystems, causing issues in data loading
 Key: HIVE-24421
 URL: https://issues.apache.org/jira/browse/HIVE-24421
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa








[jira] [Created] (HIVE-24420) Druid test failures

2020-11-24 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-24420:
---

 Summary: Druid test failures 
 Key: HIVE-24420
 URL: https://issues.apache.org/jira/browse/HIVE-24420
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Test Result (11 failures / ±0)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz2]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_dynamic_partition]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_extractTime]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_floorTime]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_semijoin_reduction_all_types]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_alter]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_ts]





[jira] [Created] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-06-28 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-23770:
---

 Summary: Druid filter translation unable to handle inverted between
 Key: HIVE-23770
 URL: https://issues.apache.org/jira/browse/HIVE-23770
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Druid filter translation happens in Calcite and does not use the HiveBetween 
inverted flag during translation; this misses the negation in the planned query.
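
A minimal illustration (hypothetical table and column names, reusing names from 
other issues in this thread) of a query whose negation would be lost:

{code}
-- The NOT here is carried on HiveBetween's inverted flag; if the
-- flag is ignored during Druid filter translation, the query is
-- planned as a plain BETWEEN.
SELECT COUNT(*) FROM druid_events WHERE added NOT BETWEEN 10 AND 20;
{code}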





[jira] [Created] (HIVE-23184) Upgrade druid to 0.17.1

2020-04-13 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-23184:
---

 Summary: Upgrade druid to 0.17.1
 Key: HIVE-23184
 URL: https://issues.apache.org/jira/browse/HIVE-23184
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Upgrade to the latest Druid release, 0.17.1.






[jira] [Created] (HIVE-22933) Allow kerberos-enabled Hive to connect to a non-kerberos druid cluster

2020-02-26 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-22933:
---

 Summary: Allow kerberos-enabled Hive to connect to a non-kerberos 
druid cluster
 Key: HIVE-22933
 URL: https://issues.apache.org/jira/browse/HIVE-22933
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Currently, if kerberos is enabled for Hive, it can only connect to external 
druid clusters that are also kerberos-enabled, since the Druid client used to 
connect to druid is always KerberosHTTPClient. This task is to allow a 
kerberos-enabled HiveServer2 to connect to a non-kerberized druid cluster. 





[jira] [Created] (HIVE-22395) Add ability to read Druid metastore password from jceks

2019-10-23 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-22395:
---

 Summary: Add ability to read Druid metastore password from jceks
 Key: HIVE-22395
 URL: https://issues.apache.org/jira/browse/HIVE-22395
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa








[jira] [Created] (HIVE-22394) Duplicate Jars in druid classpath causing issues

2019-10-23 Thread Nishant Bangarwa (Jira)
Nishant Bangarwa created HIVE-22394:
---

 Summary: Duplicate Jars in druid classpath causing issues
 Key: HIVE-22394
 URL: https://issues.apache.org/jira/browse/HIVE-22394
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


The hive-druid-handler jar contains shaded versions of the druid classes, while 
druid-hdfs-storage also ships the non-shaded classes. 

{code} 
[hive@hiveserver2-1 lib]$ ls |grep druid
calcite-druid-1.19.0.7.0.2.0-163.jar
druid-bloom-filter-0.15.1.7.0.2.0-163.jar
druid-hdfs-storage-0.15.1.7.0.2.0-163.jar
hive-druid-handler-3.1.2000.7.0.2.0-163.jar
hive-druid-handler.jar
{code}

Exception below - 
{code}
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
java.lang.RuntimeException: java.lang.NoClassDefFoundError: Could not 
initialize class org.apache.hadoop.fs.HadoopFsWrapper
  at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
  at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
  at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
  at 
org.apache.hadoop.hive.druid.io.DruidRecordWriter.pushSegments(DruidRecordWriter.java:177)
  ... 22 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: 
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.hadoop.fs.HadoopFsWrapper
  at 
org.apache.hive.druid.org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:765)
  at 
org.apache.hive.druid.org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:630)
  at 
org.apache.hive.druid.com.google.common.util.concurrent.Futures$1.apply(Futures.java:713)
  at 
org.apache.hive.druid.com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861)
  ... 3 more
Caused by: java.lang.RuntimeException: java.lang.NoClassDefFoundError: Could 
not initialize class org.apache.hadoop.fs.HadoopFsWrapper
  at 
org.apache.hive.druid.org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:96)
  at 
org.apache.hive.druid.org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:114)
  at 
org.apache.hive.druid.org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:104)
  at 
org.apache.hive.druid.org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:743)
  ... 6 more
Caused by: java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.hadoop.fs.HadoopFsWrapper
  at 
org.apache.hive.druid.org.apache.druid.storage.hdfs.HdfsDataSegmentPusher.copyFilesWithChecks(HdfsDataSegmentPusher.java:163)
  at 
org.apache.hive.druid.org.apache.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:145)
  at 
org.apache.hive.druid.org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:747)
  at 
org.apache.hive.druid.org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:86)
{code}





[jira] [Created] (HIVE-21628) Use druid-s3-extensions when using S3 as druid deep storage

2019-04-18 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-21628:
---

 Summary: Use druid-s3-extensions when using S3 as druid deep 
storage
 Key: HIVE-21628
 URL: https://issues.apache.org/jira/browse/HIVE-21628
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa


Currently DruidStorageHandler always uses druid-hdfs-extensions for S3 as well 
as HDFS.
The HDFS extension pushes the segment to an intermediate directory and then 
renames it to the final path. 
1) The rename causes an additional copy of the data, which the druid-s3 
extension avoids.
2) The rename may fail when the pushed file is not yet visible due to the 
eventually consistent model of S3. Refer to the exception below - 

{code} 
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
java.io.FileNotFoundException: No such file or directory: 
s3a://edws-nishant-test/druid/druid-1555443464-ggdf/data/workingDirectory/.staging-hive_20190417170114_a7fb3dcd-623b-46ca-bb87-9aac2fb50c6c/intermediateSegmentDir/default.cmv_basetable_d_7/11b3ceeb8d2843508336aac3347687cb/0_index.zip
at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
at 
org.apache.hive.druid.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at 
org.apache.hadoop.hive.druid.io.DruidRecordWriter.pushSegments(DruidRecordWriter.java:184)
... 22 more
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: No such 
file or directory: 
s3a://edws-nishant-test/druid/druid-1555443464-ggdf/data/workingDirectory/.staging-hive_20190417170114_a7fb3dcd-623b-46ca-bb87-9aac2fb50c6c/intermediateSegmentDir/default.cmv_basetable_d_7/11b3ceeb8d2843508336aac3347687cb/0_index.zip
at 
org.apache.hive.druid.com.google.common.base.Throwables.propagate(Throwables.java:160)
at 
org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:665)
at 
org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$0(AppenderatorImpl.java:528)
at 
org.apache.hive.druid.com.google.common.util.concurrent.Futures$1.apply(Futures.java:713)
at 
org.apache.hive.druid.com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861)
... 3 more
Caused by: java.io.FileNotFoundException: No such file or directory: 
s3a://edws-nishant-test/druid/druid-1555443464-ggdf/data/workingDirectory/.staging-hive_20190417170114_a7fb3dcd-623b-46ca-bb87-9aac2fb50c6c/intermediateSegmentDir/default.cmv_basetable_d_7/11b3ceeb8d2843508336aac3347687cb/0_index.zip
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321)
at 
org.apache.hadoop.fs.FileSystem.getFileLinkStatus(FileSystem.java:2727)
at org.apache.hadoop.fs.FileSystem.rename(FileSystem.java:1560)
at org.apache.hadoop.fs.HadoopFsWrapper.rename(HadoopFsWrapper.java:53)
at 
org.apache.hive.druid.io.druid.storage.hdfs.HdfsDataSegmentPusher.copyFilesWithChecks(HdfsDataSegmentPusher.java:168)
at 
org.apache.hive.druid.io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:149)
at 
org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$3(AppenderatorImpl.java:647)
at 
org.apache.hive.druid.io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:63)
at 
org.apache.hive.druid.io.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81)
at 
org.apache.hive.druid.io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:638)
... 6 more
{code}   

This task is to add the ability to switch to druid-s3-extensions when the S3A 
file scheme is used for the druid storage directory. 





[jira] [Created] (HIVE-21612) Upgrade druid to 0.14.0-incubating

2019-04-12 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-21612:
---

 Summary: Upgrade druid to 0.14.0-incubating
 Key: HIVE-21612
 URL: https://issues.apache.org/jira/browse/HIVE-21612
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Druid 0.14.0-incubating has been released. 
This task is to upgrade Hive to use the 0.14.0-incubating version of Druid. 





[jira] [Created] (HIVE-20709) ASF License issue in HiveJDBCImplementor

2018-10-08 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20709:
---

 Summary: ASF License issue in HiveJDBCImplementor
 Key: HIVE-20709
 URL: https://issues.apache.org/jira/browse/HIVE-20709
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Lines that start with ? in the ASF License report indicate files that do 
not have an Apache license header:
 !? 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-14277/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/jdbc/HiveJdbcImplementor.java





[jira] [Created] (HIVE-20700) Add config to disable rollup for druid

2018-10-05 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20700:
---

 Summary: Add config to disable rollup for druid
 Key: HIVE-20700
 URL: https://issues.apache.org/jira/browse/HIVE-20700
 Project: Hive
  Issue Type: New Feature
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Add a table property - 'druid.rollup' to allow disabling rollup for druid 
tables. 
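
A sketch of how this could be used (the "false" value and the CREATE TABLE 
shape are assumptions for illustration; only the property name comes from this 
issue):

{code}
-- Disable rollup so rows are ingested as-is instead of being
-- pre-aggregated by Druid.
CREATE TABLE druid_events (`__time` timestamp, page string, added int)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.rollup" = "false");
{code}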





[jira] [Created] (HIVE-20698) Better error instead of NPE when timestamp is null for any row when ingesting to druid

2018-10-05 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20698:
---

 Summary: Better error instead of NPE when timestamp is null for 
any row when ingesting to druid
 Key: HIVE-20698
 URL: https://issues.apache.org/jira/browse/HIVE-20698
 Project: Hive
  Issue Type: Improvement
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Currently, when ingesting data to druid, we get a weird NPE when the timestamp 
is null for any row. 
We should provide an error with a better message that helps the user understand 
what is actually wrong. 

{code} 
Caused by: java.lang.NullPointerException
  at 
org.apache.hadoop.hive.druid.serde.DruidSerDe.serialize(DruidSerDe.java:364)
  at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:957)
  at 
org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:111)
  at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
  at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
  at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:480)
{code}
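
A minimal repro sketch (hypothetical table; assumes the usual `__time` 
timestamp column of druid-backed tables):

{code}
-- Any row whose timestamp evaluates to NULL trips the NPE in
-- DruidSerDe.serialize during ingestion.
INSERT INTO TABLE druid_events
SELECT CAST(NULL AS timestamp), 'page1', 1;
{code}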





[jira] [Created] (HIVE-20687) Cancel Running Druid Query when a hive query is cancelled.

2018-10-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20687:
---

 Summary: Cancel Running Druid Query when a hive query is 
cancelled. 
 Key: HIVE-20687
 URL: https://issues.apache.org/jira/browse/HIVE-20687
 Project: Hive
  Issue Type: Improvement
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


https://issues.apache.org/jira/browse/HIVE-20686 ensures that the hive query id 
is passed to druid. 
Druid also supports query cancellation by query id. 
Queries can be cancelled explicitly using their queryId by sending a DELETE 
request to the following endpoint on the broker or router - 
{code} 
DELETE /druid/v2/{queryId}
{code}





[jira] [Created] (HIVE-20686) Sync QueryIDs across hive and druid

2018-10-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20686:
---

 Summary: Sync QueryIDs across hive and druid
 Key: HIVE-20686
 URL: https://issues.apache.org/jira/browse/HIVE-20686
 Project: Hive
  Issue Type: Improvement
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


For the queries that hive passes to druid, pass the hive queryId as additional 
query context. 
This will be useful for tracing query-level metrics across druid and hive.





[jira] [Created] (HIVE-20684) Analyze table compute stats fails for tables containing timestamp with local time zone column

2018-10-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20684:
---

 Summary: Analyze table compute stats fails for tables containing 
timestamp with local time zone column
 Key: HIVE-20684
 URL: https://issues.apache.org/jira/browse/HIVE-20684
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Analyze table druid_table compute statistics for columns;

Reference Exception - 
{code} 
org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: Only 
integer/long/timestamp/date/float/double/string/binary/boolean/decimal
type argument is accepted but timestamp with local time zone is passed.
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFComputeStats.getEvaluator(GenericUDAFComputeStats.java:105)
at 
org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver.getEvaluator(AbstractGenericUDAFResolver.java:48)
at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:1043)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:4817)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:5482)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:6496)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10617)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11557)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11427)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12229)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11802)
{code} 





[jira] [Created] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2018-10-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20683:
---

 Summary: Add the Ability to push Dynamic Between and Bloom filters 
to Druid
 Key: HIVE-20683
 URL: https://issues.apache.org/jira/browse/HIVE-20683
 Project: Hive
  Issue Type: New Feature
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


For optimizing joins, Hive generates a BETWEEN filter with min-max values and a 
BLOOM filter for filtering one side of a semi-join.
Druid 0.13.0 will have support for Bloom filters (added via 
https://github.com/apache/incubator-druid/pull/6222)

Implementation details - 
# Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
# DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
# During the execution phase, before sending the query to druid, 
DruidQueryBasedRecordReader will deserialize this filter, translate it into a 
DruidDimFilter, and add it to the existing DruidQuery. The Tez executor already 
ensures that all the dynamic values are initialized by the time we start 
reading results from the record reader. 
# Explaining a druid query also prints the query sent to druid as 
{{druid.json.query}}. We also need to make sure to update that druid query with 
the filters. During explain we do not have the actual values for the dynamic 
values, so instead of the values we will print the dynamic expression itself as 
part of the druid query. 

Note:- This work needs druid to be updated to version 0.13.0
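
A sketch of the kind of query that benefits (hypothetical tables, loosely 
following the TPC-DS names seen elsewhere in this thread; assumes dynamic 
semijoin reduction is enabled):

{code}
-- Hive derives a min/max BETWEEN filter and a BLOOM filter on
-- s.ss_item_sk from the filtered dimension side; with this change
-- both get pushed into the Druid query instead of being applied
-- after the scan.
SELECT COUNT(*)
FROM druid_store_sales s
JOIN item i ON s.ss_item_sk = i.i_item_sk
WHERE i.i_category = 'Books';
{code}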





[jira] [Created] (HIVE-20626) Log more details when druid metastore transaction fails in callback

2018-09-24 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20626:
---

 Summary: Log more details when druid metastore transaction fails 
in callback
 Key: HIVE-20626
 URL: https://issues.apache.org/jira/browse/HIVE-20626
 Project: Hive
  Issue Type: Task
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


The exception below does not give much detail about the actual cause of the 
error. 
We also need to log the callback exception when we get it. 
{code} 
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
MetaException(message:Transaction failed do to exception being thrown from 
within the callback. See cause for the original exception.)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:932) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:937) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at 
org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4954) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:428) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2668) 
~[hive-exec-3.1.0.3.0.0.0-1634.jar:3.1.0.3.0.0.0-1634]
{code} 





[jira] [Created] (HIVE-20546) Upgrade to Druid 0.13.0

2018-09-12 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20546:
---

 Summary: Upgrade to Druid 0.13.0
 Key: HIVE-20546
 URL: https://issues.apache.org/jira/browse/HIVE-20546
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


This task is to upgrade to druid 0.13.0 when it is released. Note that it will 
hopefully be the first apache release of Druid. 





[jira] [Created] (HIVE-20539) Remove dependency on com.metamx.java-util

2018-09-11 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20539:
---

 Summary: Remove dependency on com.metamx.java-util
 Key: HIVE-20539
 URL: https://issues.apache.org/jira/browse/HIVE-20539
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


java-util was moved from com.metamx to the druid code repository. 
Currently we are packaging both com.metamx.java-util and io.druid.java-util. 
This task is to remove the dependency on com.metamx.java-util.





[jira] [Created] (HIVE-20469) Do not rollup PK/FK columns when indexing to druid.

2018-08-27 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20469:
---

 Summary: Do not rollup PK/FK columns when indexing to druid. 
 Key: HIVE-20469
 URL: https://issues.apache.org/jira/browse/HIVE-20469
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


When indexing data to druid, if a numeric column has a PK/FK constraint, we 
need to make sure it is not indexed as a metric and rolled up when indexing to 
druid. 
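
For illustration (hypothetical schema, using Hive's informational constraint 
syntax):

{code}
-- userid is numeric but carries a PK constraint, so it should be
-- stored as a Druid dimension rather than aggregated as a metric
-- during rollup.
CREATE TABLE druid_users (
  `__time` timestamp,
  userid bigint,
  score double,
  PRIMARY KEY (userid) DISABLE NOVALIDATE)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler';
{code}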

Thanks [~t3rmin4t0r] for recommending this. 





[jira] [Created] (HIVE-20468) Add ability to skip creating druid bitmap indexes for specific string dimensions

2018-08-27 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20468:
---

 Summary: Add ability to skip creating druid bitmap indexes for 
specific string dimensions
 Key: HIVE-20468
 URL: https://issues.apache.org/jira/browse/HIVE-20468
 Project: Hive
  Issue Type: New Feature
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Currently we create a bitmap index for every druid dimension. 
For some columns (e.g. free-form text, or high-cardinality columns that are 
rarely filtered upon), it may be beneficial to skip creating the druid bitmap 
index and save disk space.  

In druid, https://github.com/apache/incubator-druid/pull/5402 added support for 
creating string dimension columns without bitmap indexes. 
This task is to add a similar option when indexing data from hive. 
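
A sketch of what the option could look like (the property name here is 
hypothetical; this issue does not define one):

{code}
-- Hypothetical table property: skip bitmap indexing for the
-- free-form "notes" dimension to save disk space.
CREATE TABLE druid_events (`__time` timestamp, page string, notes string)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.bitmap.index.exclude.columns" = "notes");
{code}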





[jira] [Created] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-23 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20449:
---

 Summary: DruidMiniTests - Move creation of druid table from 
allTypesOrc to test setup phase
 Key: HIVE-20449
 URL: https://issues.apache.org/jira/browse/HIVE-20449
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Multiple druid tests end up creating a Druid table from the allTypesOrc table. 
Moving this table creation to a pre-test setup phase would avoid redundant work 
in tests and possibly help reduce test runtimes. 

Thanks, [~jcamachorodriguez] for suggesting this improvement. 





[jira] [Created] (HIVE-20353) Follow redirects when hive connects to a passive druid overlord/coordinator

2018-08-09 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20353:
---

 Summary: Follow redirects when hive connects to a passive druid 
overlord/coordinator
 Key: HIVE-20353
 URL: https://issues.apache.org/jira/browse/HIVE-20353
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


When we have multiple druid coordinators/overlords and hive tries to connect to 
a passive one, it will get a redirect. Currently the http client in the druid 
storage handler does not follow redirects. We need to check whether there is a 
redirect and follow it for the druid overlord/coordinator.





[jira] [Created] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-09 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20349:
---

 Summary: Implement Retry Logic in HiveDruidSplit for Scan Queries
 Key: HIVE-20349
 URL: https://issues.apache.org/jira/browse/HIVE-20349
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


While distributing a druid scan query, we check where the segments are loaded, 
and then each HiveDruidSplit directly queries the historical node. 
There are a few cases where we need to retry and refetch the segments. 

# The segment is loaded on multiple historical nodes and one of them went down. 
In this case, when we do not get a response from one node, we query the next 
replica. 
# The segment was loaded onto a realtime task and was handed over; by the time 
we query, the realtime task has already finished. In this case there is no 
replica. The split needs to query the broker again for the location of the 
segment and then send the query to the correct historical node. 

This is also the root cause of the failure of the druidkafkamini_basic.q test, 
where the segment handover happens before the scan query is executed.

Note: This is not a problem when we are directly querying Druid brokers, as the 
broker handles the retry logic. 





[jira] [Created] (HIVE-20341) Druid Needs Explicit CASTs from Timestamp to STRING when the output of timestamp function is used as String

2018-08-08 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20341:
---

 Summary: Druid Needs Explicit CASTs from Timestamp to STRING when 
the output of timestamp function is used as String
 Key: HIVE-20341
 URL: https://issues.apache.org/jira/browse/HIVE-20341
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa


Druid timestamp expression functions return numeric values in the form of 
millis since epoch. 
Functions that use the output of these timestamp functions as a String 
therefore return different values for tables stored in Hive and Druid.
{code}
SELECT SUBSTRING(to_date(datetime0),4) FROM tableau_orc.calcs;
| 4-07-25  |

SELECT SUBSTRING(to_date(datetime0),4) FROM druid_tableau.calcs;
| 002240  |

SELECT CONCAT(to_date(datetime0),' 00:00:00') FROM tableau_orc.calcs;
| 2004-07-17 00:00:00  |

SELECT CONCAT(to_date(datetime0),' 00:00:00') FROM druid_tableau.calcs;
| 109045440 00:00:00  |
{code}

We need to add an explicit CAST to String before generating the Druid 
expressions.
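
A sketch of the intended effect, against the first query above:

{code}
-- With an explicit cast, the Druid expression renders the date as a
-- string first, so SUBSTRING sees '2004-07-25' rather than the
-- epoch-millis number.
SELECT SUBSTRING(CAST(TO_DATE(datetime0) AS STRING), 4)
FROM druid_tableau.calcs;
{code}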







[jira] [Created] (HIVE-20297) Column Level Stats for Druid Tables

2018-08-02 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20297:
---

 Summary: Column Level Stats for Druid Tables
 Key: HIVE-20297
 URL: https://issues.apache.org/jira/browse/HIVE-20297
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


This task is to have correct column-level stats for druid in the hive 
metastore. 
- Stats like min/max/cardinality can be gathered using a Druid Segment Metadata 
Query. 
- During Druid query planning we need to ensure that the filters/aggregations 
pushed inside the DruidQuery are accounted for.

Having correct stats would also help the optimizer ensure proper join orderings 
when doing federated complex joins between hive and druid. 






[jira] [Created] (HIVE-20279) HiveContextAwareRecordReader slows down Druid Scan queries.

2018-07-31 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20279:
---

 Summary: HiveContextAwareRecordReader slows down Druid Scan 
queries. 
 Key: HIVE-20279
 URL: https://issues.apache.org/jira/browse/HIVE-20279
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa
 Attachments: scan2.svg

HiveContextAwareRecordReader adds a lot of overhead for Druid scan queries. 
See the attached flame graph. 
It looks like the checks for the existence of the footer/header buffer take 
most of the time. For druid, and other storage handlers that do not have a 
footer buffer, we should at least skip this existence-checking logic. 






[jira] [Created] (HIVE-20278) Druid Scan Query avoid copying from List -> Map -> List

2018-07-31 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20278:
---

 Summary: Druid Scan Query avoid copying from List -> Map -> List
 Key: HIVE-20278
 URL: https://issues.apache.org/jira/browse/HIVE-20278
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


DruidScanQueryRecordReader gets a compacted List from druid. It then 
converts that list into a Map inside a DruidWritable, where the key is the 
column name. 
In the second stage, DruidSerde takes this DruidWritable and creates a List 
out of the map again. We can avoid the map-creation step by reading the list 
sent by druid directly in the DruidSerde.deserialize() method.





[jira] [Created] (HIVE-20035) write booleans as long when serializing to druid

2018-06-29 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20035:
---

 Summary: write booleans as long when serializing to druid
 Key: HIVE-20035
 URL: https://issues.apache.org/jira/browse/HIVE-20035
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Druid expressions do not support booleans yet. 
In druid expressions, booleans are treated as and parsed from longs; however, 
when we store booleans from hive they are serialized as the string values 
'true' and 'false'. 
We need to make serialization consistent with deserialization and write long 
values when sending data to druid. 





[jira] [Created] (HIVE-20014) Druid SECOND/HOUR/MINUTE does not return correct values when applied to String Columns

2018-06-27 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20014:
---

 Summary: Druid SECOND/HOUR/MINUTE does not return correct values 
when applied to String Columns
 Key: HIVE-20014
 URL: https://issues.apache.org/jira/browse/HIVE-20014
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa



The query SELECT MINUTE(`time1`) FROM calcs; returns null when the String 
column contains only a time of day and no date information. The Druid parser 
fails to parse the time string values and returns 
null. 

{code} 
1: jdbc:hive2://ctr-e138-1518143905142-379982> SELECT  MINUTE(`time1`) FROM 
calcs;
INFO  : Compiling 
command(queryId=hive_20180627145215_05147329-b8d8-491c-9bab-6fd5045542db): 
SELECT  MINUTE(`time1`) FROM calcs
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:vc, 
type:int, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20180627145215_05147329-b8d8-491c-9bab-6fd5045542db); Time 
taken: 0.134 seconds
INFO  : Executing 
command(queryId=hive_20180627145215_05147329-b8d8-491c-9bab-6fd5045542db): 
SELECT  MINUTE(`time1`) FROM calcs
INFO  : Completed executing 
command(queryId=hive_20180627145215_05147329-b8d8-491c-9bab-6fd5045542db); Time 
taken: 0.002 seconds
INFO  : OK
+---+
|  vc   |
+---+
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
| NULL  |
+---+
17 rows selected (0.266 seconds)
1: jdbc:hive2://ctr-e138-1518143905142-379982> SELECT time1 from calcs;
INFO  : Compiling 
command(queryId=hive_20180627145225_93b872de-a698-4859-9730-983eede6935d): 
SELECT time1 from calcs
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:time1, 
type:string, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20180627145225_93b872de-a698-4859-9730-983eede6935d); Time 
taken: 0.116 seconds
INFO  : Executing 
command(queryId=hive_20180627145225_93b872de-a698-4859-9730-983eede6935d): 
SELECT time1 from calcs
INFO  : Completed executing 
command(queryId=hive_20180627145225_93b872de-a698-4859-9730-983eede6935d); Time 
taken: 0.003 seconds
INFO  : OK
+---+
|   time1   |
+---+
| 22:20:14  |
| 22:50:16  |
| 19:36:22  |
| 19:48:23  |
| 00:05:57  |
| NULL  |
| 04:48:07  |
| NULL  |
| 19:57:33  |
| NULL  |
| 04:40:49  |
| 02:05:25  |
| NULL  |
| NULL  |
| 12:33:57  |
| 18:58:41  |
| 09:33:31  |
+---+
17 rows selected (0.202 seconds)
1: jdbc:hive2://ctr-e138-1518143905142-379982> EXPLAIN SELECT  MINUTE(`time1`) 
FROM calcs;
INFO  : Compiling 
command(queryId=hive_20180627145237_39e53a7e-35cb-4e17-8ccb-884c6f6358cd): 
EXPLAIN SELECT  MINUTE(`time1`) FROM calcs
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:Explain, 
type:string, comment:null)], properties:null)
INFO  : Completed compiling 
command(queryId=hive_20180627145237_39e53a7e-35cb-4e17-8ccb-884c6f6358cd); Time 
taken: 0.107 seconds
INFO  : Executing 
command(queryId=hive_20180627145237_39e53a7e-35cb-4e17-8ccb-884c6f6358cd): 
EXPLAIN SELECT  MINUTE(`time1`) FROM calcs
INFO  : Starting task [Stage-1:EXPLAIN] in serial mode
INFO  : Completed executing 
command(queryId=hive_20180627145237_39e53a7e-35cb-4e17-8ccb-884c6f6358cd); Time 
taken: 0.003 seconds
INFO  : OK
++
|  Explain   |
++
| Plan optimized by CBO. |
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Select Operator [SEL_1]|
|   Output:["_col0"] |
|   TableScan [TS_0] |
| 
Output:["vc"],properties:{"druid.fieldNames":"vc","druid.fieldTypes":"int","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"druid_tableau.calcs\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"timestamp_extract(timestamp_parse(\\\"time1\\\",null,'UTC'),'MINUTE','UTC')\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
 |
||
++
10 rows selected (0.136 seconds)
{code}





[jira] [Created] (HIVE-20013) Add an Implicit cast to date type for to_date function

2018-06-27 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-20013:
---

 Summary: Add an Implicit cast to date type for to_date function
 Key: HIVE-20013
 URL: https://issues.apache.org/jira/browse/HIVE-20013
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Issue - 
SELECT TO_DATE(date1), TO_DATE(datetime1) FROM druid_table_n1;

Running this query on Druid returns null values when date1 and datetime1 are of 
type String. 

{code} 
INFO  : Executing 
command(queryId=hive_20180627144822_d4395567-e3cb-4b20-b53b-4e5eba2d7dac): 
EXPLAIN SELECT TO_DATE(datetime0) ,TO_DATE(date0) FROM calcs
INFO  : Starting task [Stage-1:EXPLAIN] in serial mode
INFO  : Completed executing 
command(queryId=hive_20180627144822_d4395567-e3cb-4b20-b53b-4e5eba2d7dac); Time 
taken: 0.003 seconds
INFO  : OK
++
|  Explain   |
++
| Plan optimized by CBO. |
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Select Operator [SEL_1]|
|   Output:["_col0","_col1"] |
|   TableScan [TS_0] |
| 
Output:["vc","vc0"],properties:{"druid.fieldNames":"vc,vc0","druid.fieldTypes":"date,date","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"druid_tableau.calcs\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"timestamp_floor(\\\"datetime0\\\",'P1D','','UTC')\",\"outputType\":\"LONG\"},{\"type\":\"expression\",\"name\":\"vc0\",\"expression\":\"timestamp_floor(\\\"date0\\\",'P1D','','UTC')\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\",\"vc0\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
 |
||
++
10 rows selected (0.606 seconds)

{code}
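
A workaround sketch (the cast-based rewrite is an assumption, not something 
stated in the issue): casting the string column to timestamp first, so the 
Druid expression parses the value before flooring it to a date.

{code}
-- Hypothetical workaround pending the implicit cast this issue adds.
SELECT TO_DATE(CAST(date1 AS timestamp)), TO_DATE(CAST(datetime1 AS timestamp))
FROM druid_table_n1;
{code}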





[jira] [Created] (HIVE-19941) Row based Filters added via Hive Ranger policies are not pushed to druid

2018-06-18 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19941:
---

 Summary: Row based Filters added via Hive Ranger policies are not 
pushed to druid
 Key: HIVE-19941
 URL: https://issues.apache.org/jira/browse/HIVE-19941
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


The issue is that when applying a table mask we add virtual columns; however, 
non-native tables do not have virtual columns, so we need to skip adding 
virtual columns when generating the masking query. 

Stack Trace - 
{code} 
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:79 Invalid table 
alias or column reference 'BLOCK__OFFSET__INSIDE__FILE'
: (possible column names are: __time, yearmonth, year, month, dayofmonth, 
dayofweek, weekofyear, hour, minute, second, payment_type, fare_amount, 
surcharge, mta_tax, tip_amount, tolls_amount, total_amount, trip_time)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11830) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11778) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:3780) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4117) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4016) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4060) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1340) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1277) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113) ~[calcite-core-1.10.0.2.6.4.0-91.jar:1.10.0.2.6.4.0-91]
at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997) ~[calcite-core-1.10.0.2.6.4.0-91.jar:1.10.0.2.6.4.0-91]
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149) ~[calcite-core-1.10.0.2.6.4.0-91.jar:1.10.0.2.6.4.0-91]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106) ~[calcite-core-1.10.0.2.6.4.0-91.jar:1.10.0.2.6.4.0-91]
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1082) ~[hive-exec-2.1.0.2.6.4.0-91.jar:2.1.0.2.6.4.0-91]
{code} 





[jira] [Created] (HIVE-19885) Druid Kafka Ingestion - Allow user to set kafka consumer properties via table properties

2018-06-13 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19885:
---

 Summary: Druid Kafka Ingestion - Allow user to set kafka consumer 
properties via table properties
 Key: HIVE-19885
 URL: https://issues.apache.org/jira/browse/HIVE-19885
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Allow users to set kafka consumer properties via table properties. 
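
A sketch of the surface (the pass-through property names are an assumption for 
illustration; the issue does not fix a naming scheme), reusing the 
druid_kafka_test table from HIVE-18976:

{code}
-- Consumer settings handed through to the Druid Kafka supervisor.
ALTER TABLE druid_kafka_test SET TBLPROPERTIES (
  "kafka.max.poll.records" = "500",
  "kafka.fetch.min.bytes" = "1048576"
);
{code}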





[jira] [Created] (HIVE-19762) Druid Queries containing Joins gives wrong results.

2018-06-01 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19762:
---

 Summary: Druid Queries containing Joins gives wrong results. 
 Key: HIVE-19762
 URL: https://issues.apache.org/jira/browse/HIVE-19762
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Druid queries that have joins against the same table give wrong results. 
e.g. 
{code} 
SELECT
username AS `username`,
SUM(double1) AS `sum_double1`
FROM
druid_table_with_nulls `tbl1`
  JOIN (
SELECT
username AS `username`,
SUM(double1) AS `sum_double2`
FROM druid_table_with_nulls
GROUP BY `username`
ORDER BY `sum_double2`
DESC  LIMIT 10
  )
  `tbl2`
ON (`tbl1`.`username` = `tbl2`.`username`)
GROUP BY `tbl1`.`username`;
{code} 

In this case one of the queries is a druid scan query and the other is a 
groupBy query. 
During planning, the properties of these queries are set on the tableDesc and 
serdeInfo; while setting up the map work, we overwrite the properties with the 
properties present in serdeInfo. This causes the scan query results to be 
deserialized using the wrong column names and results in null values. 





[jira] [Created] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19604:
---

 Summary: Incorrect Handling of Boolean in DruidSerde
 Key: HIVE-19604
 URL: https://issues.apache.org/jira/browse/HIVE-19604
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Results of boolean expressions from Druid are expressed in the form of numeric 
1 or 0. 
When reading the results in DruidSerde, both 1 and 0 are translated to String 
and then we call Boolean.valueOf(stringForm); since Boolean.valueOf is true 
only for the string 'true', the boolean is always read as false.
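
A repro sketch (hypothetical predicate over the druid_kafka_test table from 
HIVE-18976):

{code}
-- Druid evaluates the comparison to numeric 1/0, and
-- Boolean.valueOf("1") is false, so every row reads false in Hive.
SELECT (added > 0) AS has_added FROM druid_kafka_test LIMIT 5;
{code}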





[jira] [Created] (HIVE-19452) Avoid Deserializing and Serializing Druid query in DruidRecordReaders

2018-05-07 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19452:
---

 Summary: Avoid Deserializing and Serializing Druid query in 
DruidRecordReaders
 Key: HIVE-19452
 URL: https://issues.apache.org/jira/browse/HIVE-19452
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


The Druid record reader deserializes and serializes the Druid query before 
sending it to druid. 
This can be avoided, and we can then stop packaging some of the druid 
dependencies, e.g. org.antlr, in the self-contained druid-handler jar. 






[jira] [Created] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-07 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19451:
---

 Summary: Druid Query Execution fails with ClassNotFoundException 
org.antlr.v4.runtime.CharStream
 Key: HIVE-19451
 URL: https://issues.apache.org/jira/browse/HIVE-19451
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Stack trace - 
{code}
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, 
vertexId=vertex_1524814504173_1344_45_00, diagnostics=[Task failed, 
taskId=task_1524814504173_1344_45_00_29, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Error while running task ( failure ) : 
attempt_1524814504173_1344_45_00_29_0:java.lang.RuntimeException: 
java.io.IOException: 
org.apache.hive.druid.com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
 Cannot construct instance of 
`org.apache.hive.druid.io.druid.segment.virtual.ExpressionVirtualColumn`, 
problem: org/antlr/v4/runtime/CharStream
 at [Source: 
(String)"{"queryType":"scan","dataSource":{"type":"table","name":"tpcds_real_bin_partitioned_orc_1000.tpcds_denormalized_druid_table_7mcd"},"intervals":{"type":"segments","segments":[{"itvl":"1998-11-30T00:00:00.000Z/1998-12-01T00:00:00.000Z","ver":"2018-05-03T11:35:22.230Z","part":0}]},"virtualColumns":[{"type":"expression","name":"vc","expression":"\"__time\"","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"filter":{"type":"bound","dimension":"i_brand"[truncated
 241 chars]; line: 1, column: 376] (through reference chain: 
org.apache.hive.druid.io.druid.query.scan.ScanQuery["virtualColumns"]->java.util.ArrayList[0])
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: 
org.apache.hive.druid.com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
 Cannot construct instance of 
`org.apache.hive.druid.io.druid.segment.virtual.ExpressionVirtualColumn`, 
problem: org/antlr/v4/runtime/CharStream
 at [Source: 
(String)"{"queryType":"scan","dataSource":{"type":"table","name":"tpcds_real_bin_partitioned_orc_1000.tpcds_denormalized_druid_table_7mcd"},"intervals":{"type":"segments","segments":[{"itvl":"1998-11-30T00:00:00.000Z/1998-12-01T00:00:00.000Z","ver":"2018-05-03T11:35:22.230Z","part":0}]},"virtualColumns":[{"type":"expression","name":"vc","expression":"\"__time\"","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"filter":{"type":"bound","dimension":"i_brand"[truncated
 241 chars]; line: 1, column: 376] (through reference chain: 
org.apache.hive.druid.io.druid.query.scan.ScanQuery["virtualColumns"]->java.util.ArrayList[0])
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:438)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
{code}

[jira] [Created] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-04-11 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19173:
---

 Summary: Add Storage Handler runtime information as part of 
DESCRIBE EXTENDED
 Key: HIVE-19173
 URL: https://issues.apache.org/jira/browse/HIVE-19173
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Follow up for https://issues.apache.org/jira/browse/HIVE-18976 
Kafka Indexing Service in Druid has a runtime state associated with it. 
Druid publishes this runtime state as a KafkaSupervisorReport, which has the 
latest offsets as reported by Kafka, the consumer lag per partition, as well as 
the aggregate lag of all partitions.

This information is quite useful for knowing whether a table backed by the 
kafka-indexing-service has the latest data or not. 

This task is to add this information to the output of the DESCRIBE EXTENDED 
statement.
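
The proposed surface, against a kafka-indexing-service backed table such as the 
one in HIVE-18976:

{code}
-- With this change the extended output would also carry the
-- KafkaSupervisorReport fields: latest Kafka offsets, per-partition
-- consumer lag, and the aggregate lag.
DESCRIBE EXTENDED druid_kafka_test;
{code}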





[jira] [Created] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19172:
---

 Summary: NPE due to null EnvironmentContext in DDLTask
 Key: HIVE-19172
 URL: https://issues.apache.org/jira/browse/HIVE-19172
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Stack Trace -
{code}
2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] 
exec.DDLTask: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
{code}





[jira] [Created] (HIVE-19107) Wait for druid kafka indexing tasks to start before returning from create table statement

2018-04-04 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19107:
---

 Summary: Wait for druid kafka indexing tasks to start before 
returning from create table statement
 Key: HIVE-19107
 URL: https://issues.apache.org/jira/browse/HIVE-19107
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Follow up for https://issues.apache.org/jira/browse/HIVE-18976 
The above PR adds support for setting up the druid kafka indexing service from 
hive. 
However, the create table command submits the kafka supervisor to druid and 
does not wait for the indexing tasks to start. This task is to add a wait by 
checking the supervisor status on the druid side. 






[jira] [Created] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-26 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19049:
---

 Summary: Add support for Alter table add columns for Druid
 Key: HIVE-19049
 URL: https://issues.apache.org/jira/browse/HIVE-19049
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Add support for ALTER TABLE ADD COLUMNS for Druid tables. 
Currently it is not supported and throws an exception. 
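
The statement this task would enable, against the sample table from HIVE-18976:

{code}
-- Today this fails for Druid-backed tables; after this change the
-- new column should become part of the table's Druid schema.
ALTER TABLE druid_kafka_test ADD COLUMNS (country string);
{code}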





[jira] [Created] (HIVE-19026) Configurable serde for druid kafka indexing

2018-03-22 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-19026:
---

 Summary: Configurable serde for druid kafka indexing 
 Key: HIVE-19026
 URL: https://issues.apache.org/jira/browse/HIVE-19026
 Project: Hive
  Issue Type: Task
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


https://issues.apache.org/jira/browse/HIVE-18976 introduces support for setting 
up the druid kafka-indexing service. 
Input serialization should be configurable. For now we can say we only support 
JSON, but there should be a mechanism to support other formats. Perhaps we can 
make use of Hive's serde library, e.g. LazySimpleSerDe.





[jira] [Created] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18976:
---

 Summary: Add ability to setup Druid Kafka Ingestion from Hive
 Key: HIVE-18976
 URL: https://issues.apache.org/jira/browse/HIVE-18976
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Add the ability to set up druid kafka ingestion using a Hive CREATE TABLE 
statement.

e.g. the query below can submit a kafka supervisor spec to the druid overlord, 
and druid can start ingesting events from kafka. 
{code:java}
 
CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string, 
`user` string, added int, deleted int, delta int)
STORED BY 
'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
TBLPROPERTIES (
"druid.segment.granularity" = "HOUR",
"druid.query.granularity" = "MINUTE",
"kafka.bootstrap.servers" = "localhost:9092",
"kafka.topic" = "test-topic",
"druid.kafka.ingest.useEarliestOffset" = "true"
);
{code}

Design - This can be done via a DruidKafkaStreamingStorageHandler that extends 
the existing DruidStorageHandler and adds the additional functionality for 
streaming. 

Testing - Add a DruidKafkaMiniCluster, which will consist of a DruidMiniCluster 
plus a single-node Kafka broker. The broker can be populated with a test topic 
that has some predefined data. 







[jira] [Created] (HIVE-18583) Enable DateRangeRules

2018-01-30 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18583:
---

 Summary: Enable DateRangeRules 
 Key: HIVE-18583
 URL: https://issues.apache.org/jira/browse/HIVE-18583
 Project: Hive
  Issue Type: Improvement
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Enable DateRangeRules to translate druid filters to date ranges. 
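
An example of the rewrite this rule family performs (Calcite's DateRangeRules 
turns EXTRACT-style predicates into range comparisons; the table name here is 
hypothetical):

{code}
-- Before: an EXTRACT-based filter that Druid cannot serve as a
-- query interval.
SELECT COUNT(*) FROM druid_events WHERE YEAR(`__time`) = 2018;
-- After the rule (conceptually): a range predicate that maps onto
-- the Druid interval [2018-01-01, 2019-01-01).
SELECT COUNT(*) FROM druid_events
WHERE `__time` >= '2018-01-01 00:00:00' AND `__time` < '2019-01-01 00:00:00';
{code}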





[jira] [Created] (HIVE-18569) Hive Druid indexing not dealing with decimals in correct way.

2018-01-29 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18569:
---

 Summary: Hive Druid indexing not dealing with decimals in correct 
way.
 Key: HIVE-18569
 URL: https://issues.apache.org/jira/browse/HIVE-18569
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Currently, a decimal column is indexed as a double in druid.
This should not happen silently; either the user has to add an explicit cast, 
or we can add a flag to enable the approximation.
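
A sketch of the explicit-cast option (hypothetical source table and CTAS):

{code}
-- The user opts into the double approximation explicitly, instead
-- of the decimal column being silently indexed as a double.
CREATE TABLE druid_sales
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
AS SELECT `__time`, CAST(price AS DOUBLE) AS price FROM sales;
{code}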





[jira] [Created] (HIVE-18518) Upgrade druid version to 0.11.0

2018-01-23 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18518:
---

 Summary: Upgrade druid version to 0.11.0
 Key: HIVE-18518
 URL: https://issues.apache.org/jira/browse/HIVE-18518
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


This task is to upgrade to Druid version 0.11.0.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18271) Druid Insert into fails with exception when committing files

2017-12-13 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-18271:
---

 Summary: Druid Insert into fails with exception when committing 
files
 Key: HIVE-18271
 URL: https://issues.apache.org/jira/browse/HIVE-18271
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Exception - 
{code}
03.hwx.site:8020/apps/hive/warehouse/_tmp.all100k_druid_initial_empty to: hdfs://ctr-e136-1513029738776-2163-01-03.hwx.site:8020/apps/hive/warehouse/_tmp.all100k_druid_initial_empty.moved)'
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move: hdfs://ctr-e136-1513029738776-2163-01-03.hwx.site:8020/apps/hive/warehouse/_tmp.all100k_druid_initial_empty to: hdfs://ctr-e136-1513029738776-2163-01-03.hwx.site:8020/apps/hive/warehouse/_tmp.all100k_druid_initial_empty.moved
at org.apache.hadoop.hive.ql.exec.Utilities.rename(Utilities.java:1129)
at org.apache.hadoop.hive.ql.exec.Utilities.mvFileToFinalPath(Utilities.java:1460)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1135)
at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:765)
at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:770)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:588)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:286)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:350)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-16752) Enable Unit test - TestDruidRecordWriter.testWrite

2017-05-24 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-16752:
---

 Summary: Enable Unit test - TestDruidRecordWriter.testWrite
 Key: HIVE-16752
 URL: https://issues.apache.org/jira/browse/HIVE-16752
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa


After the changes done in https://issues.apache.org/jira/browse/HIVE-16474 the 
test is failing because guava classes are loaded from the hive-exec jar. 
This happens because the hive-exec jar is a shaded jar that bundles all of its 
dependencies. 
For details see - https://github.com/apache/hive/blob/master/ql/pom.xml#L820

"The way shade was configured since 0.13, is to override the default jar for ql 
module with the shaded one but keep the same name."

So when mvn resolves the jar while running the unit test, it sees the shaded 
jar, which also contains guava. 
To resolve this, there are three ways I could find - 
1) Tweak the order of dependencies in druid. 
2) Somehow add a dependency in druid-handler on the non-shaded jar; but since 
the default jar has already been overridden, it is not clear how to do that. 
3) Use a different namespace for (i.e. relocate) the guava classes in the 
hive-exec jar.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16576) Fix encoding of intervals when fetching select query candidates from druid

2017-05-03 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-16576:
---

 Summary: Fix encoding of intervals when fetching select query 
candidates from druid
 Key: HIVE-16576
 URL: https://issues.apache.org/jira/browse/HIVE-16576
 Project: Hive
  Issue Type: Bug
  Components: Druid integration
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


Debug logs on HIVE side - 
{code}
2017-05-03T23:49:00,672 DEBUG [HttpClient-Netty-Worker-0] client.NettyHttpClient: [GET http://localhost:8082/druid/v2/datasources/cmv_basetable_druid/candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30] Got response: 500 Server Error
{code}

Druid exception stack trace - 
{code}
2017-05-03T18:56:58,928 WARN [qtp1651318806-158] org.eclipse.jetty.servlet.ServletHandler - /druid/v2/datasources/cmv_basetable_druid/candidates
java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 05:53:20"
at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:899) ~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.convert.StringConverter.setInto(StringConverter.java:212) ~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.base.BaseInterval.<init>(BaseInterval.java:200) ~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.<init>(Interval.java:193) ~[joda-time-2.8.2.jar:2.8.2]
at org.joda.time.Interval.parse(Interval.java:69) ~[joda-time-2.8.2.jar:2.8.2]
at io.druid.server.ClientInfoResource.getQueryTargets(ClientInfoResource.java:320) ~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_92]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_92]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
{code}

Note that the intervals sent as part of the HTTP request URL are not properly 
percent-encoded: the '+' in the timezone offset is decoded by the server as a 
space, producing the invalid "1900-01-01T00:00:00.000 05:53:20" value seen in 
the stack trace above. 
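For illustration, the same request with the interval endpoints percent-encoded; the encoded form below is a sketch of the expected fix, not output captured from a patched build:
{code}
# Broken: '+' in the query string is decoded server-side as a space.
.../candidates?intervals=1900-01-01T00:00:00.000+05:53:20/3000-01-01T00:00:00.000+05:30

# Fixed: '+' sent as %2B (and ':' as %3A) survives URL decoding intact.
.../candidates?intervals=1900-01-01T00%3A00%3A00.000%2B05%3A53%3A20/3000-01-01T00%3A00%3A00.000%2B05%3A30
{code}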



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16518) Insert override for druid does not replace all existing segments

2017-04-24 Thread Nishant Bangarwa (JIRA)
Nishant Bangarwa created HIVE-16518:
---

 Summary: Insert override for druid does not replace all existing 
segments
 Key: HIVE-16518
 URL: https://issues.apache.org/jira/browse/HIVE-16518
 Project: Hive
  Issue Type: Bug
Reporter: Nishant Bangarwa
Assignee: Nishant Bangarwa


INSERT OVERWRITE for Druid does not replace segments for all intervals; it only 
replaces segments for the intervals that are newly ingested. 
An INSERT OVERWRITE TABLE statement on a DruidStorageHandler table should 
replace all existing segments of the table, as sketched below. 
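A minimal sketch of the expected behavior, with illustrative table names:
{code:sql}
-- Expected: after this completes, the Druid datasource behind druid_table
-- contains only the newly written segments; pre-existing segments for
-- intervals not covered by this insert should be dropped as well.
INSERT OVERWRITE TABLE druid_table
SELECT `__time`, page, added FROM source_table;
{code}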



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)