from:"Nishant Bangarwa \(JIRA\)"

[jira] [Assigned] (HIVE-19291) Three underscores are in the CTAS example of the documentation

2018-04-24 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19291:
---

Assignee: Nishant Bangarwa

> Three underscores are in the CTAS example of the documentation 
> ---
>
> Key: HIVE-19291
> URL: https://issues.apache.org/jira/browse/HIVE-19291
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Wade Salazar
>Assignee: Nishant Bangarwa
>Priority: Trivial
>
> on the page https://cwiki.apache.org/confluence/display/Hive/Druid+Integration
>  
> {{The following example is provided}}
> {{"}}
> {{CREATE TABLE druid_table_1}}
> {{STORED BY }}{{'org.apache.hadoop.hive.druid.DruidStorageHandler'}}
> {{AS}}
> {{ `metric2`>;}}
> {{"}}
>  
> {{There are three underscores in front of the time dimension, but the code only 
> executes if two underscores are provided.}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-05-01 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19173:

Attachment: HIVE-19173.02.patch

> Add Storage Handler runtime information as part of DESCRIBE EXTENDED
> 
>
> Key: HIVE-19173
> URL: https://issues.apache.org/jira/browse/HIVE-19173
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19173.01.patch, HIVE-19173.02.patch
>
>
> Follow up for https://issues.apache.org/jira/browse/HIVE-18976 
> Kafka Indexing Service in Druid has a runtime state associated with it. 
> Druid publishes this runtime state as KafkaSupervisorReport which has latest 
> offsets as reported by Kafka, the consumer lag per partition, as well as the 
> aggregate lag of all partitions.
> This information is useful for telling whether a table backed by the 
> kafka-indexing-service has the latest data. 
> This task is to add this information to the output of the DESCRIBE 
> EXTENDED statement.
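
The lag figures mentioned above can be illustrated with a small sketch. It assumes lag is simply the latest offset reported by Kafka minus the offset consumed so far, per partition, and that the aggregate is the sum over partitions. The class and method names are hypothetical and are not taken from Druid's KafkaSupervisorReport or from the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Hive or Druid.
final class KafkaLagSketch {

    // Per-partition lag: latest offset reported by Kafka minus the offset
    // consumed so far. A partition missing from currentOffsets is treated
    // as not yet consumed (offset 0). Negative values are clamped to 0.
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> latestOffsets,
                                              Map<Integer, Long> currentOffsets) {
        Map<Integer, Long> lags = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> e : latestOffsets.entrySet()) {
            long consumed = currentOffsets.getOrDefault(e.getKey(), 0L);
            lags.put(e.getKey(), Math.max(0L, e.getValue() - consumed));
        }
        return lags;
    }

    // Aggregate lag: sum of the per-partition lags.
    static long aggregateLag(Map<Integer, Long> lags) {
        long total = 0L;
        for (long lag : lags.values()) {
            total += lag;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> latest = new LinkedHashMap<>();
        latest.put(0, 120L);
        latest.put(1, 95L);
        Map<Integer, Long> current = new LinkedHashMap<>();
        current.put(0, 100L);
        current.put(1, 95L);

        Map<Integer, Long> lags = lagPerPartition(latest, current);
        System.out.println("lag per partition: " + lags);
        System.out.println("aggregate lag: " + aggregateLag(lags));
    }
}
```

Under these assumptions, an aggregate lag of 0 would mean the table has caught up with everything Kafka has reported.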




[jira] [Assigned] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-07 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19451:
---


> Druid Query Execution fails with ClassNotFoundException 
> org.antlr.v4.runtime.CharStream
> ---
>
> Key: HIVE-19451
> URL: https://issues.apache.org/jira/browse/HIVE-19451
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Stack trace - 
> {code}
> ERROR : Status: Failed
> ERROR : Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1524814504173_1344_45_00, diagnostics=[Task failed, 
> taskId=task_1524814504173_1344_45_00_29, diagnostics=[TaskAttempt 0 
> failed, info=[Error: Error while running task ( failure ) : 
> attempt_1524814504173_1344_45_00_29_0:java.lang.RuntimeException: 
> java.io.IOException: 
> org.apache.hive.druid.com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
>  Cannot construct instance of 
> `org.apache.hive.druid.io.druid.segment.virtual.ExpressionVirtualColumn`, 
> problem: org/antlr/v4/runtime/CharStream
>  at [Source: 
> (String)"{"queryType":"scan","dataSource":{"type":"table","name":"tpcds_real_bin_partitioned_orc_1000.tpcds_denormalized_druid_table_7mcd"},"intervals":{"type":"segments","segments":[{"itvl":"1998-11-30T00:00:00.000Z/1998-12-01T00:00:00.000Z","ver":"2018-05-03T11:35:22.230Z","part":0}]},"virtualColumns":[{"type":"expression","name":"vc","expression":"\"__time\"","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"filter":{"type":"bound","dimension":"i_brand"[truncated
>  241 chars]; line: 1, column: 376] (through reference chain: 
> org.apache.hive.druid.io.druid.query.scan.ScanQuery["virtualColumns"]->java.util.ArrayList[0])
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: 
> org.apache.hive.druid.com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
>  Cannot construct instance of 
> `org.apache.hive.druid.io.druid.segment.virtual.ExpressionVirtualColumn`, 
> problem: org/antlr/v4/runtime/CharStream
>  at [Source: 
> (String)"{"queryType":"scan","dataSource":{"type":"table","name":"tpcds_real_bin_partitioned_orc_1000.tpcds_denormalized_druid_table_7mcd"},"intervals":{"type":"segments","segments":[{"itvl":"1998-11-30T00:00:00.000Z/1998-12-01T00:00:00.000Z","ver":"2018-05-03T11:35:22.230Z","part":0}]},"virtualColumns":[{"type":"expression","name":"vc","expression":"\"__time\"","outputType":"LONG"}],"resultFormat":"compactedList","batchSize":20480,"limit":9223372036854775807,"filter":{"type":"bound","dimension":"i_brand"[truncated
>  241 chars]; line: 1, column: 376] (through reference chain: 
> org.apache.hive.druid.io.druid.query.scan.ScanQuery["virtualColumns"]->java.util.ArrayList[0])
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>   at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:438)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(

[jira] [Updated] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-07 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19451:

Attachment: HIVE-19451.patch

> Druid Query Execution fails with ClassNotFoundException 
> org.antlr.v4.runtime.CharStream
> ---
>
> Key: HIVE-19451
> URL: https://issues.apache.org/jira/browse/HIVE-19451
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19451.patch
>
>

[jira] [Updated] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-07 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19451:

Status: Patch Available  (was: Open)

> Druid Query Execution fails with ClassNotFoundException 
> org.antlr.v4.runtime.CharStream
> ---
>
> Key: HIVE-19451
> URL: https://issues.apache.org/jira/browse/HIVE-19451
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19451.patch
>
>

[jira] [Assigned] (HIVE-19452) Avoid Deserializing and Serializing Druid query in DruidRecordReaders

2018-05-07 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19452:
---


> Avoid Deserializing and Serializing Druid query in DruidRecordReaders
> -
>
> Key: HIVE-19452
> URL: https://issues.apache.org/jira/browse/HIVE-19452
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> The Druid record reader deserializes and then re-serializes the Druid query 
> before sending it to Druid. 
> This round trip can be avoided, which would let us stop packaging some Druid 
> dependencies, e.g. org.antlr, in the self-contained druid-handler jar. 
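
One possible reading of this proposal, as a rough sketch: if the query already exists as a JSON string, the reader can forward those bytes unchanged instead of parsing them into query objects and writing them back out. The class and method names below are hypothetical, and how the record reader actually builds its request is not shown in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the DruidRecordReader implementation.
final class QueryPassThroughSketch {

    // Returns the request body for an already-serialized query. No JSON
    // parsing happens here, so no query or expression classes are needed
    // to build the body.
    static byte[] requestBody(String serializedQueryJson) {
        return serializedQueryJson.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String query = "{\"queryType\":\"scan\"}";
        byte[] body = requestBody(query);
        System.out.println("body length in bytes: " + body.length);
    }
}
```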




[jira] [Commented] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-07 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16466882#comment-16466882
 ] 

Nishant Bangarwa commented on HIVE-19451:
-

+cc [~ashutoshc] The attached patch fixes the problem by adding the org.antlr 
classes, which will unblock the failures. 
I have also created a follow-up JIRA to avoid the Druid query serde: 
https://issues.apache.org/jira/browse/HIVE-19452
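
A quick way to check whether a class such as org.antlr.v4.runtime.CharStream is visible on a given classpath is a small probe like the one below. This is a generic sketch with a hypothetical class name, not part of the attached patch. It assumes the class is looked up under its original, unrelocated name, which is how it appears in the stack trace.

```java
// Hypothetical diagnostic helper; not part of Hive.
final class ClasspathProbe {

    // Returns true if the named class can be located by this class's
    // loader. The class is not initialized (second argument is false).
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.antlr.v4.runtime.CharStream";
        System.out.println(name + " present: " + isClassPresent(name));
    }
}
```

Running it with the same classpath as the failing task would show whether the class is actually reachable there.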

> Druid Query Execution fails with ClassNotFoundException 
> org.antlr.v4.runtime.CharStream
> ---
>
> Key: HIVE-19451
> URL: https://issues.apache.org/jira/browse/HIVE-19451
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19451.patch
>
>

[jira] [Updated] (HIVE-19451) Druid Query Execution fails with ClassNotFoundException org.antlr.v4.runtime.CharStream

2018-05-09 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19451:

Fix Version/s: 3.0.0

> Druid Query Execution fails with ClassNotFoundException 
> org.antlr.v4.runtime.CharStream
> ---
>
> Key: HIVE-19451
> URL: https://issues.apache.org/jira/browse/HIVE-19451
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19451.patch
>
>

[jira] [Assigned] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19604:
---


> Incorrect Handling of Boolean in DruidSerde
> ---
>
> Key: HIVE-19604
> URL: https://issues.apache.org/jira/browse/HIVE-19604
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Results of boolean expressions from Druid are expressed as numeric 1 or 0. 
> When reading the results in DruidSerde, both 1 and 0 are converted to String 
> and passed to Boolean.valueOf(stringForm), so the boolean is always read as 
> false.
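
The behaviour described above can be reproduced in isolation. Boolean.valueOf(String) returns true only when the string equals "true" (ignoring case), so both "1" and "0" yield false. The sketch below shows the problem next to one possible numeric-aware alternative. The class and method names are hypothetical, and this is not necessarily how the attached patch fixes it.

```java
// Hypothetical illustration only; not the DruidSerde code.
final class DruidBooleanSketch {

    // Mirrors the reported behaviour: the value is converted to a String
    // and handed to Boolean.valueOf, which is false for "1" and "0" alike.
    static boolean stringBasedRead(Object druidValue) {
        return Boolean.valueOf(String.valueOf(druidValue));
    }

    // One possible alternative: treat numbers as booleans by comparing
    // with zero, and fall back to string parsing for anything else.
    static boolean numericAwareRead(Object druidValue) {
        if (druidValue instanceof Number) {
            return ((Number) druidValue).longValue() != 0L;
        }
        return Boolean.parseBoolean(String.valueOf(druidValue));
    }

    public static void main(String[] args) {
        System.out.println("string-based read of 1: " + stringBasedRead(1));
        System.out.println("string-based read of 0: " + stringBasedRead(0));
        System.out.println("numeric-aware read of 1: " + numericAwareRead(1));
        System.out.println("numeric-aware read of 0: " + numericAwareRead(0));
    }
}
```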




[jira] [Commented] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16480989#comment-16480989
 ] 

Nishant Bangarwa commented on HIVE-19604:
-

+cc [~ashutoshc] please review. 

> Incorrect Handling of Boolean in DruidSerde
> ---
>
> Key: HIVE-19604
> URL: https://issues.apache.org/jira/browse/HIVE-19604
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19604.patch
>
>




[jira] [Updated] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19604:

Attachment: HIVE-19604.patch

> Incorrect Handling of Boolean in DruidSerde
> ---
>
> Key: HIVE-19604
> URL: https://issues.apache.org/jira/browse/HIVE-19604
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19604.patch
>
>




[jira] [Updated] (HIVE-19604) Incorrect Handling of Boolean in DruidSerde

2018-05-18 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19604:

Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Incorrect Handling of Boolean in DruidSerde
> ---
>
> Key: HIVE-19604
> URL: https://issues.apache.org/jira/browse/HIVE-19604
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19604.patch
>
>




[jira] [Assigned] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-05-24 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19552:
---

Assignee: Nishant Bangarwa

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
>





[jira] [Updated] (HIVE-22395) Add ability to read Druid metastore password from jceks

2019-11-25 Thread Nishant Bangarwa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-22395:

Attachment: HIVE-22395.1.patch

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Updated] (HIVE-22395) Add ability to read Druid metastore password from jceks

2019-11-27 Thread Nishant Bangarwa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-22395:

Attachment: HIVE-22395.2.patch

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.2.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-22395) Add ability to read Druid metastore password from jceks

2019-12-04 Thread Nishant Bangarwa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-22395:

Attachment: HIVE-22395.2.patch

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.2.patch, 
> HIVE-22395.2.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-22395) Add ability to read Druid metastore password from jceks

2020-01-08 Thread Nishant Bangarwa (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-22395:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.2.patch, 
> HIVE-22395.2.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-22395) Add ability to read Druid metastore password from jceks

2020-01-08 Thread Nishant Bangarwa (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010682#comment-17010682
 ] 

Nishant Bangarwa commented on HIVE-22395:
-

committed via 
https://github.com/apache/hive/commit/948144a49753d3955505f428d427fb7b2fb9642a

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.2.patch, 
> HIVE-22395.2.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (HIVE-19026) Configurable serde for druid kafka indexing

2018-03-22 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19026:
---


> Configurable serde for druid kafka indexing 
> 
>
> Key: HIVE-19026
> URL: https://issues.apache.org/jira/browse/HIVE-19026
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> https://issues.apache.org/jira/browse/HIVE-18976 introduces support for 
> setting up the Druid Kafka indexing service. 
> Input serialization should be configurable. For now we can say that we only 
> support JSON, but there should be a mechanism to support other formats. 
> Perhaps we can make use of Hive's serde library, e.g. LazySimpleSerDe.




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-26 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-26 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Status: Patch Available  (was: Open)

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Assigned] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-26 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19049:
---


> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Commented] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-26 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16413970#comment-16413970
 ] 

Nishant Bangarwa commented on HIVE-19049:
-

+cc [~ashutoshc] This patch makes ALTER TABLE work by storing the field 
types for Druid in the metadata store. 
shouldStoreFieldsInMetastore in DruidSerde now returns true.  
However, I am still exploring whether there is an alternative way of making it 
work without needing to store the field type info in the metadata store.

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-27 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.01.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Commented] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-27 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416140#comment-16416140
 ] 

Nishant Bangarwa commented on HIVE-19049:
-

[~ashutoshc] Updated the patch to read the old columns from the storage 
descriptor instead of the metadata store. Also added a hook to HiveMetaHook 
that can be used by StorageHandlers. 
This hook is unused by the Druid storage handler as of now, since we do not 
need to make any changes on the Druid side to add columns. 

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Commented] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-27 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416357#comment-16416357
 ] 

Nishant Bangarwa commented on HIVE-19049:
-

[~ashutoshc] I have addressed the review comments, please check. 

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-27 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.02.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-28 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: (was: HIVE-19049.02.patch)

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-28 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.02.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Commented] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-28 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416964#comment-16416964
 ] 

Nishant Bangarwa commented on HIVE-19049:
-

Ah, my bad, rebased and updated the patch. 

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.addendum.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.addendum.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: (was: HIVE-19049.addendum.patch)

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Status: Patch Available  (was: Reopened)

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Reopened] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reopened HIVE-19049:
-

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Commented] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-29 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16420122#comment-16420122
 ] 

Nishant Bangarwa commented on HIVE-19049:
-

[~ashutoshc] Thanks for the review. It seems some of the test failures are 
related to this patch. I have attached an addendum patch to handle the case 
when envContext is null. 

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: (was: HIVE-19049.addendum.patch)

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-19049) Add support for Alter table add columns for Druid

2018-03-30 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19049:

Attachment: HIVE-19049.addendum.patch

> Add support for Alter table add columns for Druid
> -
>
> Key: HIVE-19049
> URL: https://issues.apache.org/jira/browse/HIVE-19049
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19049.01.patch, HIVE-19049.02.patch, 
> HIVE-19049.addendum.patch, HIVE-19049.patch
>
>
> Add support for Alter table add columns for Druid. 
> Currently it is not supported and throws an exception. 




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-02 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.03.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.03.patch, HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE 
> statement. For example, the query below submits a Kafka supervisor spec to 
> the Druid overlord, after which Druid can start ingesting events from Kafka. 
> {code:sql}
> CREATE TABLE druid_kafka_test(
>   `__time` timestamp, page string, language string, `user` string,
>   added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that 
> extends the existing DruidStorageHandler and adds the additional 
> functionality for streaming. 
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster 
> plus a single-node Kafka broker. The broker can be populated with a test 
> topic that has some predefined data. 




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-03 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.04.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.03.patch, HIVE-18976.04.patch, 
> HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE 
> statement. For example, the query below submits a Kafka supervisor spec to 
> the Druid overlord, after which Druid can start ingesting events from Kafka. 
> {code:sql}
> CREATE TABLE druid_kafka_test(
>   `__time` timestamp, page string, language string, `user` string,
>   added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that 
> extends the existing DruidStorageHandler and adds the additional 
> functionality for streaming. 
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster 
> plus a single-node Kafka broker. The broker can be populated with a test 
> topic that has some predefined data. 




[jira] [Assigned] (HIVE-19107) Wait for druid kafka indexing tasks to start before returning from create table statement

2018-04-04 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19107:
---


> Wait for druid kafka indexing tasks to start before returning from create 
> table statement
> -
>
> Key: HIVE-19107
> URL: https://issues.apache.org/jira/browse/HIVE-19107
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Follow-up for https://issues.apache.org/jira/browse/HIVE-18976 
> The above issue adds support for setting up the Druid Kafka indexing service 
> from Hive. However, the CREATE TABLE command submits the Kafka supervisor to 
> Druid and does not wait for the indexing tasks to start. This task is to add 
> a wait by checking the supervisor status on the Druid side. 
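
The wait described above could look roughly like the polling loop below. This is a generic sketch under stated assumptions, not the implementation from any Hive patch: the class name SupervisorWait, the method waitUntil and the BooleanSupplier that stands in for "ask Druid whether the indexing tasks have started" are all hypothetical. How the supervisor status is actually fetched and interpreted is deliberately left out.

```java
import java.util.function.BooleanSupplier;

public class SupervisorWait {

    // Hypothetical helper: polls the given check until it reports true or the
    // timeout elapses. Returns true if the check succeeded in time.
    public static boolean waitUntil(BooleanSupplier tasksStarted,
                                    long timeoutMillis,
                                    long pollIntervalMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (tasksStarted.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real status lookup: succeeds on the third poll.
        int[] calls = {0};
        boolean started = waitUntil(() -> ++calls[0] >= 3, 5_000, 10);
        System.out.println("started=" + started + " after " + calls[0] + " polls");
    }
}
```

A bounded timeout keeps CREATE TABLE from hanging indefinitely if the tasks never start; what the statement should do in that case (fail or just warn) is a separate decision that this sketch does not make.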



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-04-04 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.05.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.03.patch, HIVE-18976.04.patch, 
> HIVE-18976.05.patch, HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE 
> statement. For example, the query below submits a Kafka supervisor spec to 
> the Druid overlord, after which Druid can start ingesting events from Kafka. 
> {code:sql}
> CREATE TABLE druid_kafka_test(
>   `__time` timestamp, page string, language string, `user` string,
>   added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that 
> extends the existing DruidStorageHandler and adds the additional 
> functionality for streaming. 
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster 
> plus a single-node Kafka broker. The broker can be populated with a test 
> topic that has some predefined data. 




[jira] [Assigned] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19172:
---


> NPE due to null EnvironmentContext in DDLTask
> -
>
> Key: HIVE-19172
> URL: https://issues.apache.org/jira/browse/HIVE-19172
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Stack Trace -
> {code}
> 2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] exec.DDLTask: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> {code}
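
The issue title attributes the NPE to a null EnvironmentContext. A null guard of the following shape is one way to avoid that kind of failure. This is a sketch only: the stack trace does not show which expression at DDLTask.java:3539 was dereferenced, EnvContext below is a hypothetical stand-in for the metastore's EnvironmentContext, and getProperty is an illustrative helper rather than code from the attached patch.

```java
import java.util.Collections;
import java.util.Map;

public class EnvContextGuard {

    // Hypothetical stand-in for the metastore's EnvironmentContext.
    public static final class EnvContext {
        private final Map<String, String> properties;

        public EnvContext(Map<String, String> properties) {
            this.properties = properties;
        }

        public Map<String, String> getProperties() {
            return properties;
        }
    }

    // Returns the property value, or null when the context itself or its
    // property map is null, instead of throwing a NullPointerException.
    public static String getProperty(EnvContext ctx, String key) {
        if (ctx == null || ctx.getProperties() == null) {
            return null;
        }
        return ctx.getProperties().get(key);
    }

    public static void main(String[] args) {
        EnvContext ctx = new EnvContext(Collections.singletonMap("k", "v"));
        System.out.println(getProperty(ctx, "k"));  // v
        System.out.println(getProperty(null, "k")); // null
    }
}
```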




[jira] [Commented] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433910#comment-16433910
 ] 

Nishant Bangarwa commented on HIVE-19172:
-

Note - This was reported by [~abstractdog] 

> NPE due to null EnvironmentContext in DDLTask
> -
>
> Key: HIVE-19172
> URL: https://issues.apache.org/jira/browse/HIVE-19172
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Stack Trace -
> {code}
> 2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] exec.DDLTask: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> {code}




[jira] [Updated] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19172:

Status: Patch Available  (was: Open)

> NPE due to null EnvironmentContext in DDLTask
> -
>
> Key: HIVE-19172
> URL: https://issues.apache.org/jira/browse/HIVE-19172
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19172.patch
>
>
> Stack Trace -
> {code}
> 2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] exec.DDLTask: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> {code}




[jira] [Updated] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19172:

Attachment: HIVE-19172.patch

> NPE due to null EnvironmentContext in DDLTask
> -
>
> Key: HIVE-19172
> URL: https://issues.apache.org/jira/browse/HIVE-19172
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19172.patch
>
>
> Stack Trace -
> {code}
> 2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] exec.DDLTask: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> {code}




[jira] [Commented] (HIVE-19172) NPE due to null EnvironmentContext in DDLTask

2018-04-11 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433913#comment-16433913
 ] 

Nishant Bangarwa commented on HIVE-19172:
-

+cc [~ashutoshc] please review.

> NPE due to null EnvironmentContext in DDLTask
> -
>
> Key: HIVE-19172
> URL: https://issues.apache.org/jira/browse/HIVE-19172
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19172.patch
>
>
> Stack Trace -
> {code}
> 2018-04-11T02:52:51,386 ERROR [5f2e24bf-ac93-4977-84fe-aa2c5f674ea4 main] exec.DDLTask: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3539)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:392)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1987)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1667)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> {code}




[jira] [Assigned] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19170:
---

Assignee: Nishant Bangarwa

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}
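
The failure mode described in this issue can be illustrated with a small Java sketch. This is not Hive's test harness code: the class name, the select helper, and the qfile names other than druidkafkamini_basic.q are hypothetical, chosen only to show how a lookup that treats a missing property key as "no restriction" ends up selecting every qtest.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration only; not taken from the Hive code base.
final class QFileSelectionSketch {

    /** Returns the qfiles a driver would run for the given property key. */
    static List<String> select(Properties props, String key, List<String> allQFiles) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            // Missing key: nothing restricts the driver, so every qfile is selected.
            return allQFiles;
        }
        // Key present: run only the comma-separated files it lists.
        return Arrays.asList(value.trim().split("\\s*,\\s*"));
    }
}
```

With an empty Properties object the helper returns the full list; once druid.kafka.query.files is set to a single file name, only that file is returned.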




[jira] [Updated] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19170:

Attachment: HIVE-19170.patch

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}




[jira] [Updated] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19170:

Status: Patch Available  (was: Open)

[~kgyrtkirk] there is only 1 test for this driver at present. Attached a patch 
to add that to testconfiguration.

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}




[jira] [Assigned] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-04-11 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-19173:
---


> Add Storage Handler runtime information as part of DESCRIBE EXTENDED
> 
>
> Key: HIVE-19173
> URL: https://issues.apache.org/jira/browse/HIVE-19173
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Follow-up for https://issues.apache.org/jira/browse/HIVE-18976
> The Kafka Indexing Service in Druid has a runtime state associated with it.
> Druid publishes this runtime state as a KafkaSupervisorReport, which has the latest
> offsets as reported by Kafka, the consumer lag per partition, and the
> aggregate lag of all partitions.
> This information is useful for knowing whether a kafka-indexing-service
> backed table has the latest data.
> This task is to add this information to the output of the DESCRIBE
> EXTENDED statement.




[jira] [Updated] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-12 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19170:

Attachment: HIVE-19170.01.patch

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19170.01.patch, HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}




[jira] [Commented] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-12 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436763#comment-16436763
 ] 

Nishant Bangarwa commented on HIVE-19170:
-

removed druidkafkamini_basic from TestMiniDruidCliDriver. 

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-19170.01.patch, HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}




[jira] [Updated] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-04-13 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19173:

Attachment: HIVE-19173.01.patch

> Add Storage Handler runtime information as part of DESCRIBE EXTENDED
> 
>
> Key: HIVE-19173
> URL: https://issues.apache.org/jira/browse/HIVE-19173
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19173.01.patch
>
>
> Follow-up for https://issues.apache.org/jira/browse/HIVE-18976
> The Kafka Indexing Service in Druid has a runtime state associated with it.
> Druid publishes this runtime state as a KafkaSupervisorReport, which has the latest
> offsets as reported by Kafka, the consumer lag per partition, and the
> aggregate lag of all partitions.
> This information is useful for knowing whether a kafka-indexing-service
> backed table has the latest data.
> This task is to add this information to the output of the DESCRIBE
> EXTENDED statement.




[jira] [Updated] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-04-13 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19173:

Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Add Storage Handler runtime information as part of DESCRIBE EXTENDED
> 
>
> Key: HIVE-19173
> URL: https://issues.apache.org/jira/browse/HIVE-19173
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19173.01.patch
>
>
> Follow-up for https://issues.apache.org/jira/browse/HIVE-18976
> The Kafka Indexing Service in Druid has a runtime state associated with it.
> Druid publishes this runtime state as a KafkaSupervisorReport, which has the latest
> offsets as reported by Kafka, the consumer lag per partition, and the
> aggregate lag of all partitions.
> This information is useful for knowing whether a kafka-indexing-service
> backed table has the latest data.
> This task is to add this information to the output of the DESCRIBE
> EXTENDED statement.




[jira] [Commented] (HIVE-19173) Add Storage Handler runtime information as part of DESCRIBE EXTENDED

2018-04-13 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16437556#comment-16437556
 ] 

Nishant Bangarwa commented on HIVE-19173:
-

[~ashutoshc] please review. 

> Add Storage Handler runtime information as part of DESCRIBE EXTENDED
> 
>
> Key: HIVE-19173
> URL: https://issues.apache.org/jira/browse/HIVE-19173
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19173.01.patch
>
>
> Follow-up for https://issues.apache.org/jira/browse/HIVE-18976
> The Kafka Indexing Service in Druid has a runtime state associated with it.
> Druid publishes this runtime state as a KafkaSupervisorReport, which has the latest
> offsets as reported by Kafka, the consumer lag per partition, and the
> aggregate lag of all partitions.
> This information is useful for knowing whether a kafka-indexing-service
> backed table has the latest data.
> This task is to add this information to the output of the DESCRIBE
> EXTENDED statement.




[jira] [Updated] (HIVE-19170) Fix TestMiniDruidKafkaCliDriver

2018-04-13 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19170:

Attachment: HIVE-19170.addendum.patch

> Fix TestMiniDruidKafkaCliDriver
> ---
>
> Key: HIVE-19170
> URL: https://issues.apache.org/jira/browse/HIVE-19170
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19170.01.patch, HIVE-19170.addendum.patch, 
> HIVE-19170.patch
>
>
> Added in HIVE-18976.
> The property key {{druid.kafka.query.files}} doesn't exist in
> testconfiguration.properties.
> Because of this, TestMiniDruidKafkaCliDriver tries to run *all* qtests, which
> time out and produce:
> {code}
> TestMiniDruidKafkaCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=252)
> {code}




[jira] [Assigned] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-18976:
---


> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.
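
As a rough sketch of the idea in the description, the Java snippet below turns the Kafka-related table properties from the example into the ioConfig portion of a Kafka supervisor spec. How the storage handler actually performs this translation is not shown in this thread, so the class name, the toIoConfig helper, and the property-to-field mapping are assumptions; only the table property keys come from the example above, and topic, consumerProperties, and useEarliestOffset are field names from Druid's Kafka supervisor spec.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the DruidKafkaStreamingStorageHandler code.
final class KafkaIoConfigSketch {

    /** Builds an ioConfig-like map from Hive table properties (assumed mapping). */
    static Map<String, Object> toIoConfig(Map<String, String> tblProps) {
        String topic = tblProps.get("kafka.topic");
        String servers = tblProps.get("kafka.bootstrap.servers");
        if (topic == null || servers == null) {
            throw new IllegalArgumentException(
                "kafka.topic and kafka.bootstrap.servers are required");
        }
        Map<String, Object> ioConfig = new LinkedHashMap<>();
        ioConfig.put("topic", topic);
        // Druid expects the Kafka consumer settings as a nested map.
        ioConfig.put("consumerProperties",
            Collections.singletonMap("bootstrap.servers", servers));
        // Defaults to false when the table property is absent (an assumption).
        ioConfig.put("useEarliestOffset", Boolean.parseBoolean(
            tblProps.getOrDefault("druid.kafka.ingest.useEarliestOffset", "false")));
        return ioConfig;
    }
}
```

Keeping the mapping in one small function makes it easy to check in isolation which table properties are required before anything is sent to the overlord.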




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Status: Patch Available  (was: Open)

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.




[jira] [Commented] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402089#comment-16402089
 ] 

Nishant Bangarwa commented on HIVE-18976:
-

+cc [~ashutoshc] [~jcamachorodriguez] @b-slim Please review. 

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: (was: HIVE-18976.patch)

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.




[jira] [Updated] (HIVE-18976) Add ability to setup Druid Kafka Ingestion from Hive

2018-03-16 Thread Nishant Bangarwa (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-18976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18976:

Attachment: HIVE-18976.patch

> Add ability to setup Druid Kafka Ingestion from Hive
> 
>
> Key: HIVE-18976
> URL: https://issues.apache.org/jira/browse/HIVE-18976
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18976.patch
>
>
> Add the ability to set up Druid Kafka ingestion using a Hive CREATE TABLE statement.
> For example, the query below can submit a Kafka supervisor spec to the Druid overlord, and
> Druid can start ingesting events from Kafka.
> {code:java}
> CREATE TABLE druid_kafka_test(`__time` timestamp, page string, language string,
>   `user` string, added int, deleted int, delta int)
> STORED BY 'org.apache.hadoop.hive.druid.DruidKafkaStreamingStorageHandler'
> TBLPROPERTIES (
>   "druid.segment.granularity" = "HOUR",
>   "druid.query.granularity" = "MINUTE",
>   "kafka.bootstrap.servers" = "localhost:9092",
>   "kafka.topic" = "test-topic",
>   "druid.kafka.ingest.useEarliestOffset" = "true"
> );
> {code}
> Design - This can be done via a DruidKafkaStreamingStorageHandler that
> extends the existing DruidStorageHandler and adds the additional functionality
> for streaming.
> Testing - Add a DruidKafkaMiniCluster consisting of a DruidMiniCluster
> and a single-node Kafka broker. The broker can be populated with a test topic
> that has some predefined data.




[jira] [Updated] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-17 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20349:

Status: Patch Available  (was: Open)

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded,
> and then each HiveDruidSplit directly queries the historical node.
> There are a few cases in which we need to retry and refetch the segments.
> # The segment is loaded on multiple historical nodes and one of them went
> down. In this case, when we do not get a response from one replica, we query the
> next replica.
> # The segment was loaded onto a realtime task and was handed over, so by the time we
> query, the realtime task has already finished. In this case there is no
> replica. The split needs to query the broker again for the location of the
> segment and then send the query to the correct historical node.
> This is also the root cause of the failure of the druidkafkamini_basic.q test, where
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as
> the broker handles the retry logic.
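
The two retry cases in the description can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the code from the attached patch: the class name, the queryWithRetry helper, and the two functional parameters (one that queries a host, one that asks the broker for fresh segment locations) are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical illustration only; not the HiveDruidSplit implementation.
final class ScanRetrySketch {

    /**
     * Tries each known location of a segment in turn. If none answers, asks for
     * fresh locations once (for example from the broker) and tries the new ones.
     */
    static <R> R queryWithRetry(List<String> knownHosts,
                                Supplier<List<String>> locationLookup,
                                Function<String, R> queryHost) {
        RuntimeException last = null;
        // Case 1: the segment has replicas; on failure, move on to the next one.
        for (String host : knownHosts) {
            try {
                return queryHost.apply(host);
            } catch (RuntimeException e) {
                last = e;
            }
        }
        // Case 2: no known location answered (for example after a handover).
        // Refetch the locations and skip hosts that have already failed.
        List<String> refreshed = new ArrayList<>(locationLookup.get());
        refreshed.removeAll(knownHosts);
        for (String host : refreshed) {
            try {
                return queryHost.apply(host);
            } catch (RuntimeException e) {
                last = e;
            }
        }
        throw new IllegalStateException("No reachable location for segment", last);
    }
}
```

The lookup is invoked only after every known location has failed, so the common path costs no extra round trip to the broker.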




[jira] [Commented] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-17 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583816#comment-16583816
 ] 

Nishant Bangarwa commented on HIVE-20349:
-

+cc [~ashutoshc] please review. 

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded,
> and then each HiveDruidSplit directly queries the historical node.
> There are a few cases in which we need to retry and refetch the segments.
> # The segment is loaded on multiple historical nodes and one of them went
> down. In this case, when we do not get a response from one replica, we query the
> next replica.
> # The segment was loaded onto a realtime task and was handed over, so by the time we
> query, the realtime task has already finished. In this case there is no
> replica. The split needs to query the broker again for the location of the
> segment and then send the query to the correct historical node.
> This is also the root cause of the failure of the druidkafkamini_basic.q test, where
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as
> the broker handles the retry logic.




[jira] [Updated] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-17 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20349:

Attachment: HIVE-20349.patch

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded,
> and then each HiveDruidSplit directly queries the historical node.
> There are a few cases in which we need to retry and refetch the segments.
> # The segment is loaded on multiple historical nodes and one of them went
> down. In this case, when we do not get a response from one replica, we query the
> next replica.
> # The segment was loaded onto a realtime task and was handed over, so by the time we
> query, the realtime task has already finished. In this case there is no
> replica. The split needs to query the broker again for the location of the
> segment and then send the query to the correct historical node.
> This is also the root cause of the failure of the druidkafkamini_basic.q test, where
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as
> the broker handles the retry logic.




[jira] [Updated] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-20 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20349:

Attachment: HIVE-20349.1.patch

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.1.patch, HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded,
> and then each HiveDruidSplit directly queries the historical node.
> There are a few cases in which we need to retry and refetch the segments.
> # The segment is loaded on multiple historical nodes and one of them went
> down. In this case, when we do not get a response from one replica, we query the
> next replica.
> # The segment was loaded onto a realtime task and was handed over, so by the time we
> query, the realtime task has already finished. In this case there is no
> replica. The split needs to query the broker again for the location of the
> segment and then send the query to the correct historical node.
> This is also the root cause of the failure of the druidkafkamini_basic.q test, where
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as
> the broker handles the retry logic.




[jira] [Commented] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-20 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586129#comment-16586129
 ] 

Nishant Bangarwa commented on HIVE-20349:
-

added review link and updated patch. 

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.1.patch, HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded, 
> and then each HiveDruidSplit queries the historical node directly. 
> There are a few cases in which we need to retry and refetch the segments: 
> # The segment is loaded on multiple historical nodes and one of them goes 
> down. In this case, when we get no response from one replica, we query the 
> next replica. 
> # The segment was loaded onto a realtime task and has been handed over, so by 
> the time we query, the realtime task has already finished. In this case there 
> is no replica. The split needs to query the broker again for the location of 
> the segment and then send the query to the correct historical node. 
> This is also the root cause of the druidkafkamini_basic.q test failure, where 
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as 
> the broker handles the retry logic. 
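
The two retry cases in the description above can be sketched roughly as follows. This is a minimal illustration in plain Java and not the code in the attached patches: the class ScanRetrySketch, the method queryWithRetry, and the brokerLookup and queryHost parameters are hypothetical names, and a host that is down (or a task that has finished) is modelled simply as a call that throws.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the Hive code base.
final class ScanRetrySketch {

    // Tries every known replica in order (case 1). If none answers, asks the
    // broker for fresh segment locations and tries those once (case 2).
    static <R> R queryWithRetry(List<String> knownHosts,
                                Supplier<List<String>> brokerLookup,
                                Function<String, R> queryHost) {
        RuntimeException lastFailure = null;
        for (String host : knownHosts) {
            try {
                return queryHost.apply(host);
            } catch (RuntimeException e) {
                lastFailure = e; // host down or task finished: try the next one
            }
        }
        for (String host : brokerLookup.get()) {
            try {
                return queryHost.apply(host);
            } catch (RuntimeException e) {
                lastFailure = e;
            }
        }
        throw new IllegalStateException("no host could serve the segment", lastFailure);
    }

    public static void main(String[] args) {
        // A finished realtime task is simulated by a host name starting with "task".
        Function<String, String> query = host -> {
            if (host.startsWith("task")) {
                throw new RuntimeException(host + " is gone");
            }
            return "rows from " + host;
        };
        String result = queryWithRetry(
                Collections.singletonList("task-1"),
                () -> Arrays.asList("historical-1"),
                query);
        System.out.println(result);
    }
}
```

The broker lookup is deliberately a second, separate step: it is only needed when every location known at split time has failed, which is the handover case.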




[jira] [Updated] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-08-20 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20349:

Attachment: HIVE-20349.2.patch

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.1.patch, HIVE-20349.2.patch, HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded, 
> and then each HiveDruidSplit queries the historical node directly. 
> There are a few cases in which we need to retry and refetch the segments: 
> # The segment is loaded on multiple historical nodes and one of them goes 
> down. In this case, when we get no response from one replica, we query the 
> next replica. 
> # The segment was loaded onto a realtime task and has been handed over, so by 
> the time we query, the realtime task has already finished. In this case there 
> is no replica. The split needs to query the broker again for the location of 
> the segment and then send the query to the correct historical node. 
> This is also the root cause of the druidkafkamini_basic.q test failure, where 
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as 
> the broker handles the retry logic. 




[jira] [Commented] (HIVE-20013) Add an Implicit cast to date type for to_date function

2018-08-23 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590391#comment-16590391
 ] 

Nishant Bangarwa commented on HIVE-20013:
-

[~kgyrtkirk] Thanks for rebasing this. 
[~ashutoshc] this is passing CI, please merge if no additional comments. 

> Add an Implicit cast to date type for to_date function
> --
>
> Key: HIVE-20013
> URL: https://issues.apache.org/jira/browse/HIVE-20013
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20013.02.patch, HIVE-20013.patch, HIVE-20013.patch
>
>
> Issue - 
> SELECT TO_DATE(date1), TO_DATE(datetime1) FROM druid_table_n1;
> Running this query on Druid returns null values when date1 and datetime1 are 
> of type String. 
> {code} 
> INFO  : Executing 
> command(queryId=hive_20180627144822_d4395567-e3cb-4b20-b53b-4e5eba2d7dac): 
> EXPLAIN SELECT TO_DATE(datetime0) ,TO_DATE(date0) FROM calcs
> INFO  : Starting task [Stage-1:EXPLAIN] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20180627144822_d4395567-e3cb-4b20-b53b-4e5eba2d7dac); 
> Time taken: 0.003 seconds
> INFO  : OK
> ++
> |  Explain   |
> ++
> | Plan optimized by CBO. |
> ||
> | Stage-0|
> |   Fetch Operator   |
> | limit:-1   |
> | Select Operator [SEL_1]|
> |   Output:["_col0","_col1"] |
> |   TableScan [TS_0] |
> | 
> Output:["vc","vc0"],properties:{"druid.fieldNames":"vc,vc0","druid.fieldTypes":"date,date","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"druid_tableau.calcs\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"timestamp_floor(\\\"datetime0\\\",'P1D','','UTC')\",\"outputType\":\"LONG\"},{\"type\":\"expression\",\"name\":\"vc0\",\"expression\":\"timestamp_floor(\\\"date0\\\",'P1D','','UTC')\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\",\"vc0\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"}
>  |
> ||
> ++
> 10 rows selected (0.606 seconds)
> {code}
> Reported by [~dileep529]
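
To make the intent of the implicit cast concrete, here is a minimal sketch in plain Java. It assumes the string column holds values of the form yyyy-MM-dd HH:mm:ss (the issue does not state the format). The string is parsed to a timestamp first and only then floored to the start of the day in UTC, which is the step the timestamp_floor(..., 'P1D', '', 'UTC') expression in the plan above applies to the raw column. The class and method names are hypothetical, and this is not the Hive or Druid implementation.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of "cast the string, then floor to the day".
final class ToDateCastSketch {

    // Assumed input format; the issue does not say how the strings look.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Parses the string as a UTC timestamp and returns the epoch millis of
    // the start of that day in UTC.
    static long toDateMillis(String value) {
        long millis = LocalDateTime.parse(value, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
        return Math.floorDiv(millis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(toDateMillis("2018-06-27 14:48:22"));
    }
}
```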




[jira] [Assigned] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-23 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-20449:
---


> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Updated] (HIVE-18583) Enable DateRangeRules

2018-08-23 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18583:

Attachment: HIVE-18583.4.patch

> Enable DateRangeRules 
> --
>
> Key: HIVE-18583
> URL: https://issues.apache.org/jira/browse/HIVE-18583
> Project: Hive
>  Issue Type: Improvement
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-18583.2.patch, HIVE-18583.3.patch, 
> HIVE-18583.4.patch, HIVE-18583.patch
>
>
> Enable DateRangeRules to translate Druid filters to date ranges. 
> Calcite needs to be upgraded to 0.16.0 before this can be merged. 




[jira] [Updated] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20449:

Attachment: HIVE-20449.patch

> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20449.patch
>
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Updated] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20449:

Status: Patch Available  (was: Open)

> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20449.patch
>
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Commented] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-27 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593637#comment-16593637
 ] 

Nishant Bangarwa commented on HIVE-20449:
-

+cc [~ashutoshc] This patch reduces druid mini cluster test runtime from 25 
mins to ~15 mins on my machine. Please review. 

> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20449.patch
>
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Status: Patch Available  (was: Open)

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.patch
>
>





[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Attachment: HIVE-19552.patch

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.patch
>
>





[jira] [Commented] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-27 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593973#comment-16593973
 ] 

Nishant Bangarwa commented on HIVE-20449:
-

This dataset is initialized in QTestUtil: 
https://github.com/apache/hive/blob/master/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java#L1140
Any test that needs the standard Druid alltypesorc dataset must declare the 
dataset in its q file as follows: 
{code} 
--! qt:dataset:druid_table_alltypesorc
{code} 
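
As an illustration of how such a directive can be picked up from a q file, here is a minimal sketch in plain Java. It is not the QTestUtil implementation: the class and method names are hypothetical, and it assumes one dataset name per directive line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; QTestUtil has its own parsing, which is not shown here.
final class DatasetDirectiveSketch {

    private static final String PREFIX = "--! qt:dataset:";

    // Returns the dataset names declared in the q file text, in file order.
    static List<String> datasetsIn(String qFileContents) {
        List<String> names = new ArrayList<>();
        for (String line : qFileContents.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(PREFIX)) {
                String name = trimmed.substring(PREFIX.length()).trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String qFile = "--! qt:dataset:druid_table_alltypesorc\n"
                + "SELECT count(*) FROM druid_table_alltypesorc;\n";
        System.out.println(datasetsIn(qFile));
    }
}
```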


> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20449.patch
>
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Assigned] (HIVE-20468) Add ability to skip creating druid bitmap indexes for specific string dimensions

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-20468:
---


> Add ability to skip creating druid bitmap indexes for specific string 
> dimensions
> 
>
> Key: HIVE-20468
> URL: https://issues.apache.org/jira/browse/HIVE-20468
> Project: Hive
>  Issue Type: New Feature
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Currently we create a bitmap index for every Druid dimension. 
> For some columns (e.g. free-form text, or high-cardinality columns that are 
> rarely filtered on), it may be beneficial to skip creating the Druid bitmap 
> index and save disk space. 
> In Druid, https://github.com/apache/incubator-druid/pull/5402 added support 
> for creating string dimension columns without bitmap indexes. 
> This task is to add a similar option when indexing data from Hive. 
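
A minimal sketch of what such an option could translate to, in plain Java. It builds one dimension entry per string column and switches the bitmap index off for the columns the user listed. The createBitmapIndex field name is taken from Druid's string dimension schema as I understand it and should be checked against the Druid version in use; the class and method names, and the idea of passing the columns as a set, are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the Hive storage handler code.
final class BitmapIndexSketch {

    // Builds one dimension entry per column; columns named in
    // skipBitmapIndex get createBitmapIndex = false.
    static List<Map<String, Object>> stringDimensions(List<String> columns,
                                                      Set<String> skipBitmapIndex) {
        List<Map<String, Object>> specs = new ArrayList<>();
        for (String column : columns) {
            Map<String, Object> spec = new LinkedHashMap<>();
            spec.put("type", "string");
            spec.put("name", column);
            spec.put("createBitmapIndex", !skipBitmapIndex.contains(column));
            specs.add(spec);
        }
        return specs;
    }

    public static void main(String[] args) {
        System.out.println(stringDimensions(
                Arrays.asList("country", "comment"),
                Collections.singleton("comment")));
    }
}
```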




[jira] [Assigned] (HIVE-20469) Do not rollup PK/FK columns when indexing to druid.

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-20469:
---


> Do not rollup PK/FK columns when indexing to druid. 
> 
>
> Key: HIVE-20469
> URL: https://issues.apache.org/jira/browse/HIVE-20469
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> When indexing data to Druid, if a numeric column has a PK/FK constraint, 
> we need to make sure it is not indexed as a metric and rolled up. 
> Thanks, [~t3rmin4t0r], for recommending this. 
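
The rule can be sketched as a small column classifier in plain Java. This assumes that numeric columns without a key constraint are the ones indexed as metrics, and that non-numeric columns are dimensions; the class, enum, and parameter names are hypothetical and this is not the code in the Hive Druid handler.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch of the "do not roll up key columns" rule.
final class RollupColumnSketch {

    enum Role { DIMENSION, METRIC }

    // A numeric column becomes a metric only if it carries no PK/FK
    // constraint; key columns and non-numeric columns stay dimensions.
    static Role roleFor(String column, boolean numeric, Set<String> keyColumns) {
        if (!numeric || keyColumns.contains(column)) {
            return Role.DIMENSION;
        }
        return Role.METRIC;
    }

    public static void main(String[] args) {
        Set<String> keys = Collections.singleton("customer_id");
        System.out.println(roleFor("customer_id", true, keys));
        System.out.println(roleFor("amount", true, keys));
    }
}
```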




[jira] [Updated] (HIVE-20449) DruidMiniTests - Move creation of druid table from allTypesOrc to test setup phase

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20449:

Attachment: HIVE-20449.1.patch

> DruidMiniTests - Move creation of druid table from allTypesOrc to test setup 
> phase
> --
>
> Key: HIVE-20449
> URL: https://issues.apache.org/jira/browse/HIVE-20449
> Project: Hive
>  Issue Type: Improvement
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20449.1.patch, HIVE-20449.patch
>
>
> Multiple Druid tests end up creating a Druid table from the allTypesOrc table. 
> Moving this table creation to a pre-test setup phase would avoid redundant 
> work in tests and possibly help reduce test runtimes. 
> Thanks, [~jcamachorodriguez], for suggesting this improvement. 




[jira] [Updated] (HIVE-20468) Add ability to skip creating druid bitmap indexes for specific string dimensions

2018-08-27 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20468:

Component/s: Druid integration

> Add ability to skip creating druid bitmap indexes for specific string 
> dimensions
> 
>
> Key: HIVE-20468
> URL: https://issues.apache.org/jira/browse/HIVE-20468
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Currently we create a bitmap index for every Druid dimension. 
> For some columns (e.g. free-form text, or high-cardinality columns that are 
> rarely filtered on), it may be beneficial to skip creating the Druid bitmap 
> index and save disk space. 
> In Druid, https://github.com/apache/incubator-druid/pull/5402 added support 
> for creating string dimension columns without bitmap indexes. 
> This task is to add a similar option when indexing data from Hive. 




[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Description: 
The failure was caused by the following sequence of steps: 
# The test queries for the available hosts where a segment is located and gets 
the location of the Kafka task. 
# The Kafka task hands over the data and finishes. 
# The scan query is then sent to the Kafka task, but the task has already 
completed, so the query fails. 


> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps: 
> # The test queries for the available hosts where a segment is located and gets 
> the location of the Kafka task. 
> # The Kafka task hands over the data and finishes. 
> # The scan query is then sent to the Kafka task, but the task has already 
> completed, so the query fails. 




[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Description: 
The failure was caused by the following sequence of steps: 
# The test queries for the available hosts where a segment is located and gets 
the location of the Kafka task. 
# The Kafka task hands over the data and finishes. 
# The scan query is then sent to the Kafka task, but the task has already 
completed, so the query fails. 

https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
the broker again in this case. 


  was:
The failure was caused by the following sequence of steps: 
# The test queries for the available hosts where a segment is located and gets 
the location of the Kafka task. 
# The Kafka task hands over the data and finishes. 
# The scan query is then sent to the Kafka task, but the task has already 
completed, so the query fails. 



> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps: 
> # The test queries for the available hosts where a segment is located and gets 
> the location of the Kafka task. 
> # The Kafka task hands over the data and finishes. 
> # The scan query is then sent to the Kafka task, but the task has already 
> completed, so the query fails. 
> https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
> the broker again in this case. 




[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Description: 
The failure was caused by the following sequence of steps: 
# The test queries for the available hosts where a segment is located and gets 
the location of the Kafka task. 
# The Kafka task hands over the data and finishes. 
# The scan query is then sent to the Kafka task, but the task has already 
completed, so the query fails. 

https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
the broker again in this case. 

Another cause of failure was that latestOffsets and minimumLag are not 
reported when there is no task. 
This patch also masks those two values. Query results are verified to ensure 
that there is no lag. 


  was:
The failure was caused by the following sequence of steps: 
# The test queries for the available hosts where a segment is located and gets 
the location of the Kafka task. 
# The Kafka task hands over the data and finishes. 
# The scan query is then sent to the Kafka task, but the task has already 
completed, so the query fails. 

https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
the broker again in this case. 



> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps: 
> # The test queries for the available hosts where a segment is located and gets 
> the location of the Kafka task. 
> # The Kafka task hands over the data and finishes. 
> # The scan query is then sent to the Kafka task, but the task has already 
> completed, so the query fails. 
> https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
> the broker again in this case. 
> Another cause of failure was that latestOffsets and minimumLag are not 
> reported when there is no task. 
> This patch also masks those two values. Query results are verified to ensure 
> that there is no lag. 
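
Masking values that vary from run to run can be sketched as below, in plain Java. This is only an illustration: it assumes the runtime report is printed as JSON-like text with "latestOffsets" and "minimumLag" keys, the MASKED placeholder and all class and method names are hypothetical, and Hive's test framework has its own masking mechanism, which this does not reproduce.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; the real test output format is an assumption here.
final class MaskRuntimeFieldsSketch {

    // Matches "latestOffsets" or "minimumLag" followed by either a flat
    // JSON object ({...} without nesting) or a scalar value.
    private static final Pattern RUNTIME_FIELD = Pattern.compile(
            "\"(latestOffsets|minimumLag)\"\\s*:\\s*(\\{[^}]*\\}|[^,}\\s]+)");

    // Replaces the value of each matched field with a fixed placeholder so
    // that the output compares equal whether or not a task is running.
    static String mask(String text) {
        return RUNTIME_FIELD.matcher(text).replaceAll("\"$1\":\"MASKED\"");
    }

    public static void main(String[] args) {
        System.out.println(mask(
                "{\"latestOffsets\":{\"0\":10},\"minimumLag\":{\"0\":0},\"aggregateLag\":0}"));
    }
}
```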




[jira] [Updated] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-19552:

Attachment: HIVE-19552.1.patch

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.1.patch, HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps: 
> # The test queries for the available hosts where a segment is located and gets 
> the location of the Kafka task. 
> # The Kafka task hands over the data and finishes. 
> # The scan query is then sent to the Kafka task, but the task has already 
> completed, so the query fails. 
> https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
> the broker again in this case. 
> Another cause of failure was that latestOffsets and minimumLag are not 
> reported when there is no task. 
> This patch also masks those two values. Query results are verified to ensure 
> that there is no lag. 




[jira] [Commented] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-11 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611167#comment-16611167
 ] 

Nishant Bangarwa commented on HIVE-19552:
-

rebased and updated. 

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
>  Issue Type: Test
>  Components: Test
>Affects Versions: 3.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Nishant Bangarwa
>Priority: Critical
> Attachments: HIVE-19552.1.patch, HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps: 
> # The test queries for the available hosts where a segment is located and gets 
> the location of the Kafka task. 
> # The Kafka task hands over the data and finishes. 
> # The scan query is then sent to the Kafka task, but the task has already 
> completed, so the query fails. 
> https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying 
> the broker again in this case. 
> Another cause of failure was that latestOffsets and minimumLag are not 
> reported when there is no task. 
> This patch also masks those two values. Query results are verified to ensure 
> that there is no lag. 




[jira] [Resolved] (HIVE-20405) Run all druid tests in one batch

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa resolved HIVE-20405.
-
Resolution: Fixed

the batch size has been changed as part of 
https://issues.apache.org/jira/browse/HIVE-20481

> Run all druid tests in one batch
> 
>
> Key: HIVE-20405
> URL: https://issues.apache.org/jira/browse/HIVE-20405
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Reporter: Vineet Garg
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Running druid tests in parallel could cause issues so all of the tests should 
> be run in one batch.




[jira] [Assigned] (HIVE-20539) Remove dependency on com.metamx.java-util

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-20539:
---


> Remove dependency on com.metamx.java-util
> -
>
> Key: HIVE-20539
> URL: https://issues.apache.org/jira/browse/HIVE-20539
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> java-util was moved from com.metamx to the Druid code repository. 
> Currently we are packaging both com.metamx.java-util and io.druid.java-util. 
> This task is to remove the dependency on com.metamx.java-util.




[jira] [Updated] (HIVE-20539) Remove dependency on com.metamx.java-util

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20539:

Status: Patch Available  (was: Open)

> Remove dependency on com.metamx.java-util
> -
>
> Key: HIVE-20539
> URL: https://issues.apache.org/jira/browse/HIVE-20539
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20539.patch
>
>
> java-util was moved from com.metamx to the Druid code repository. 
> Currently we are packaging both com.metamx.java-util and io.druid.java-util. 
> This task is to remove the dependency on com.metamx.java-util.




[jira] [Commented] (HIVE-20539) Remove dependency on com.metamx.java-util

2018-09-11 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611255#comment-16611255
 ] 

Nishant Bangarwa commented on HIVE-20539:
-

+cc [~ashutoshc] please review. 

> Remove dependency on com.metamx.java-util
> -
>
> Key: HIVE-20539
> URL: https://issues.apache.org/jira/browse/HIVE-20539
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20539.patch
>
>
> java-util was moved from com.metamx to the Druid code repository. 
> Currently we are packaging both com.metamx.java-util and io.druid.java-util. 
> This task is to remove the dependency on com.metamx.java-util.




[jira] [Updated] (HIVE-20539) Remove dependency on com.metamx.java-util

2018-09-11 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20539:

Attachment: HIVE-20539.patch

> Remove dependency on com.metamx.java-util
> -
>
> Key: HIVE-20539
> URL: https://issues.apache.org/jira/browse/HIVE-20539
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20539.patch
>
>
> java-util was moved from com.metamx to the Druid code repository. 
> Currently we are packaging both com.metamx.java-util and io.druid.java-util. 
> This task is to remove the dependency on com.metamx.java-util.




[jira] [Updated] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-09-12 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20349:

Attachment: HIVE-20349.3.patch

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.1.patch, HIVE-20349.2.patch, 
> HIVE-20349.3.patch, HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded, 
> and then each HiveDruidSplit queries the historical node directly. 
> There are a few cases in which we need to retry and refetch the segments: 
> # The segment is loaded on multiple historical nodes and one of them goes 
> down. In this case, when we get no response from one replica, we query the 
> next replica. 
> # The segment was loaded onto a realtime task and has been handed over, so by 
> the time we query, the realtime task has already finished. In this case there 
> is no replica. The split needs to query the broker again for the location of 
> the segment and then send the query to the correct historical node. 
> This is also the root cause of the druidkafkamini_basic.q test failure, where 
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as 
> the broker handles the retry logic. 




[jira] [Commented] (HIVE-20349) Implement Retry Logic in HiveDruidSplit for Scan Queries

2018-09-12 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612314#comment-16612314
 ] 

Nishant Bangarwa commented on HIVE-20349:
-

updated patch. 

> Implement Retry Logic in HiveDruidSplit for Scan Queries
> 
>
> Key: HIVE-20349
> URL: https://issues.apache.org/jira/browse/HIVE-20349
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20349.1.patch, HIVE-20349.2.patch, 
> HIVE-20349.3.patch, HIVE-20349.patch
>
>
> While distributing a Druid scan query, we check where the segments are loaded, 
> and then each HiveDruidSplit queries the historical node directly. 
> There are a few cases in which we need to retry and refetch the segments: 
> # The segment is loaded on multiple historical nodes and one of them goes 
> down. In this case, when we get no response from one replica, we query the 
> next replica. 
> # The segment was loaded onto a realtime task and has been handed over, so by 
> the time we query, the realtime task has already finished. In this case there 
> is no replica. The split needs to query the broker again for the location of 
> the segment and then send the query to the correct historical node. 
> This is also the root cause of the druidkafkamini_basic.q test failure, where 
> the segment handover happens before the scan query is executed.
> Note: This is not a problem when we query Druid brokers directly, as 
> the broker handles the retry logic. 




[jira] [Commented] (HIVE-19552) Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q

2018-09-12 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612316#comment-16612316
 ] 

Nishant Bangarwa commented on HIVE-19552:
-

[~jcamachorodriguez] Please merge. This is ready to be merged now.

> Enable TestMiniDruidKafkaCliDriver#druidkafkamini_basic.q
> -
>
> Key: HIVE-19552
> URL: https://issues.apache.org/jira/browse/HIVE-19552
> Project: Hive
> Issue Type: Test
> Components: Test
> Affects Versions: 3.1.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Nishant Bangarwa
> Priority: Critical
> Attachments: HIVE-19552.1.patch, HIVE-19552.patch
>
>
> The failure was caused by the following sequence of steps:
> # The test queries for the available hosts where a segment is located and
> gets the location of the Kafka task.
> # The Kafka task hands over the data and finishes.
> # The scan query is then sent to the Kafka task, but the task has already
> completed, so the query fails.
> https://issues.apache.org/jira/browse/HIVE-20349 fixes this issue by querying
> the broker again in this case.
> One more cause of failure was that latestOffsets and minimumLag are not
> reported when there is no task.
> This patch masks those two values as well. Query results are verified to
> ensure that there is no lag.
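
As an illustration of the masking mentioned above, the sketch below replaces run-dependent values before output is compared. It is written in Python and assumes a hypothetical flat report dictionary; only the names latestOffsets and minimumLag come from the description, and the masking mechanism actually used by the test framework is not shown here.

```python
# Field names taken from the issue description; everything else is assumed.
MASKED_KEYS = {"latestOffsets", "minimumLag"}


def mask_report(report: dict, placeholder: str = "MASKED") -> dict:
    """Return a copy of the report with run-dependent values replaced."""
    return {
        key: (placeholder if key in MASKED_KEYS else value)
        for key, value in report.items()
    }
```

Masking by key rather than by value means the comparison stays stable whether the fields carry offsets or are absent because no task is running.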




[jira] [Updated] (HIVE-18583) Enable DateRangeRules

2018-09-12 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-18583:

Attachment: HIVE-18583.5.patch

> Enable DateRangeRules 
> --
>
> Key: HIVE-18583
> URL: https://issues.apache.org/jira/browse/HIVE-18583
> Project: Hive
> Issue Type: Improvement
> Components: Druid integration
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
> Attachments: HIVE-18583.2.patch, HIVE-18583.3.patch, 
> HIVE-18583.4.patch, HIVE-18583.5.patch, HIVE-18583.patch
>
>
> Enable DateRangeRules to translate Druid filters to date ranges.
> The Calcite version needs to be upgraded to 0.16.0 before merging this.
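
To illustrate the kind of rewrite involved: a predicate such as EXTRACT(YEAR FROM ts) = 2018 can be expressed as a half-open range on ts, which in turn maps onto an ISO-8601 interval of the form start/end. The Python sketch below computes that range; the function names are hypothetical and this is not Calcite's implementation.

```python
from datetime import datetime
from typing import Tuple


def year_to_range(year: int) -> Tuple[datetime, datetime]:
    """Half-open interval [start, end) covering every timestamp in `year`."""
    return datetime(year, 1, 1), datetime(year + 1, 1, 1)


def to_interval(start: datetime, end: datetime) -> str:
    """Render the range as an ISO-8601 interval string, start/end."""
    return f"{start.isoformat()}/{end.isoformat()}"


start, end = year_to_range(2018)
# EXTRACT(YEAR FROM ts) = 2018  is equivalent to  ts >= start AND ts < end
interval = to_interval(start, end)
```

Using a half-open range avoids having to decide what the last representable instant of the year is.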




[jira] [Commented] (HIVE-18583) Enable DateRangeRules

2018-09-12 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612490#comment-16612490
 ] 

Nishant Bangarwa commented on HIVE-18583:
-

Rebased and attached a new patch.

> Enable DateRangeRules 
> --
>
> Key: HIVE-18583
> URL: https://issues.apache.org/jira/browse/HIVE-18583
> Project: Hive
> Issue Type: Improvement
> Components: Druid integration
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
> Attachments: HIVE-18583.2.patch, HIVE-18583.3.patch, 
> HIVE-18583.4.patch, HIVE-18583.5.patch, HIVE-18583.patch
>
>
> Enable DateRangeRules to translate Druid filters to date ranges.
> The Calcite version needs to be upgraded to 0.16.0 before merging this.




[jira] [Assigned] (HIVE-20546) Upgrade to Druid 0.13.0

2018-09-12 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-20546:
---


> Upgrade to Druid 0.13.0
> ---
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
> Issue Type: Task
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it
> will hopefully be the first Apache release of Druid.




[jira] [Updated] (HIVE-20546) Upgrade to Druid 0.13.0

2018-09-12 Thread Nishant Bangarwa (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-20546:

Attachment: HIVE-20546.patch

> Upgrade to Druid 0.13.0
> ---
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
> Issue Type: Task
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
> Attachments: HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it
> will hopefully be the first Apache release of Druid.




[jira] [Commented] (HIVE-20546) Upgrade to Druid 0.13.0

2018-09-12 Thread Nishant Bangarwa (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612858#comment-16612858
 ] 

Nishant Bangarwa commented on HIVE-20546:
-

Attached a work-in-progress patch.
The main changes include:
# Upgrade of the Druid version
# Package renames from io.druid to org.apache.druid
# Some test results changed due to double precision.
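
The package rename in the second item is largely mechanical. As a rough sketch, the Python snippet below rewrites Java import lines, assuming a plain prefix substitution is enough; the function name is hypothetical and the actual patch may involve more than this (for example, fully qualified names outside import statements).

```python
import re

# Matches `import io.druid.` and `import static io.druid.` at line start.
_DRUID_IMPORT = re.compile(
    r"^(\s*import\s+(?:static\s+)?)io\.druid\.", re.MULTILINE
)


def rename_druid_imports(java_source: str) -> str:
    """Rewrite io.druid imports to org.apache.druid; leave other lines alone."""
    return _DRUID_IMPORT.sub(r"\g<1>org.apache.druid.", java_source)
```

Anchoring the pattern to import statements keeps the substitution from touching unrelated strings that merely contain "io.druid".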


> Upgrade to Druid 0.13.0
> ---
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
> Issue Type: Task
> Reporter: Nishant Bangarwa
> Assignee: Nishant Bangarwa
> Priority: Major
> Attachments: HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it
> will hopefully be the first Apache release of Druid.



