[jira] [Comment Edited] (HIVE-14217) Druid integration

2017-02-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15475910#comment-15475910
 ] 

Lefty Leverenz edited comment on HIVE-14217 at 2/23/17 7:34 AM:


Doc note:  In addition to the Druid Integration wikidoc, two new configuration 
parameters (*hive.druid.broker.address.default* & 
*hive.druid.select.threshold*) and the *druid.datasource* table property need 
to be documented in other wikidocs.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
* [DDL -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

There may be other things too -- I'll find them when I review the Druid 
Integration doc.  I've already added Druid Integration to the wiki's home page.

Added a TODOC2.2 label.

Edit (22/Feb/17):  HIVE-15928 revises the description of 
*hive.druid.select.threshold* in 2.2.0, so that's the description to document.


was (Author: le...@hortonworks.com):
Doc note:  In addition to the Druid Integration wikidoc, two new configuration 
parameters (*hive.druid.broker.address.default* & 
*hive.druid.select.threshold*) and the *druid.datasource* table property need 
to be documented in other wikidocs.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
* [DDL -- Table Properties | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]

There may be other things too -- I'll find them when I review the Druid 
Integration doc.  I've already added Druid Integration to the wiki's home page.

Added a TODOC2.2 label.

> Druid integration
> -
>
> Key: HIVE-14217
> URL: https://issues.apache.org/jira/browse/HIVE-14217
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14217.01.patch, HIVE-14217.02.patch, 
> HIVE-14217.03.patch, HIVE-14217.04.patch, HIVE-14217.05.patch, 
> HIVE-14217.06.patch
>
>
> Allow Hive to query data in Druid





[jira] [Updated] (HIVE-15830) Allow additional view ACLs for tez jobs

2017-02-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15830:
--
Attachment: HIVE-15830.07.patch

Updated to fix test failures.

> Allow additional view ACLs for tez jobs
> ---
>
> Key: HIVE-15830
> URL: https://issues.apache.org/jira/browse/HIVE-15830
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15830.01.patch, HIVE-15830.02.patch, 
> HIVE-15830.03.patch, HIVE-15830.05.patch, HIVE-15830.06.patch, 
> HIVE-15830.07.patch
>
>
> Allow users to grant view access to additional users when running tez jobs.





[jira] [Commented] (HIVE-15928) Parallelization of Select queries in Druid handler

2017-02-22 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880047#comment-15880047
 ] 

Lefty Leverenz commented on HIVE-15928:
---

Doc note:  This adds configuration parameter *hive.druid.select.distribute* and 
amends the description of *hive.druid.select.threshold*, which was created by 
HIVE-14217 (also in 2.2.0).  They need to be documented in the wiki.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
* [Druid Integration | 
https://cwiki.apache.org/confluence/display/Hive/Druid+Integration]

Added a TODOC2.2 label.

> Parallelization of Select queries in Druid handler
> --
>
> Key: HIVE-15928
> URL: https://issues.apache.org/jira/browse/HIVE-15928
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15928.01.patch, HIVE-15928.02.patch, 
> HIVE-15928.patch
>
>
> Even if we split a Select query along its time dimension, parallelization is 
> limited as all queries will hit the broker node. Instead, we can interrogate 
> the broker to get the Druid nodes that contain the data, and query those 
> nodes directly.





[jira] [Commented] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15880042#comment-15880042
 ] 

Hive QA commented on HIVE-16013:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854080/HIVE-16013.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3715/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3715/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3715/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854080 - PreCommit-HIVE-Build

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch
>
>
> When no locality information is provided, task requests can stack up on a 
> node because of consistent node selection. When locality information is not 
> provided, we should fall back to random selection for better work 
> distribution.





[jira] [Updated] (HIVE-15928) Parallelization of Select queries in Druid handler

2017-02-22 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15928:
--
Labels: TODOC2.2  (was: )

> Parallelization of Select queries in Druid handler
> --
>
> Key: HIVE-15928
> URL: https://issues.apache.org/jira/browse/HIVE-15928
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15928.01.patch, HIVE-15928.02.patch, 
> HIVE-15928.patch
>
>
> Even if we split a Select query along its time dimension, parallelization is 
> limited as all queries will hit the broker node. Instead, we can interrogate 
> the broker to get the Druid nodes that contain the data, and query those 
> nodes directly.





[jira] [Updated] (HIVE-16022) BloomFilter check not showing up in MERGE statement queries

2017-02-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-16022:
--
Attachment: HIVE-16022.1.patch

> BloomFilter check not showing up in MERGE statement queries
> ---
>
> Key: HIVE-16022
> URL: https://issues.apache.org/jira/browse/HIVE-16022
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16022.1.patch
>
>
> Running explain on a MERGE statement with runtime filtering enabled, I see 
> the min/max being applied on the large table, but not the bloom filter check:
> {noformat}
> explain merge into acidTbl as t using nonAcidOrcTbl s ON t.a = s.a
> WHEN MATCHED AND s.a > 8 THEN DELETE
> WHEN MATCHED THEN UPDATE SET b = 7
> WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
> ...
> Map 1
> Map Operator Tree:
> TableScan
>   alias: t
>   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
>   Filter Operator
> predicate: a BETWEEN DynamicValue(RS_3_s_a_min) AND 
> DynamicValue(RS_3_s_a_max) (type: boolean)
> Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
> {noformat}





[jira] [Commented] (HIVE-16022) BloomFilter check not showing up in MERGE statement queries

2017-02-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1587#comment-1587
 ] 

Jason Dere commented on HIVE-16022:
---

Noticed a couple of problems when running the semijoin optimization on a MERGE 
statement:
- DynamicPartitionPruningOptimization.generateSemiJoinOperator(): parentOfRS 
does not necessarily have to be a SelectOperator - in this case it is a TS. As 
a result we are missing some important checks on whether this table is 
appropriate for the semijoin optimization.
- grandParent.getChildren().add(bloomFilterNode) - this wrongly assumes 
grandParent is AND: in this case there was no previous filterExpr, so 
grandParent is BETWEEN. Adding the child here incorrectly adds a new parameter 
to BETWEEN, which is probably getting ignored. This is why in_bloom_filter() 
is not in the EXPLAIN.
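
A sketch of the safer attach logic implied above, using hypothetical 
expression-tree classes purely for illustration (not Hive's actual 
ExprNodeDesc/AST types): only append the bloom-filter predicate when the 
grandparent really is an AND, and otherwise wrap both predicates in a new AND 
node.
{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical expression node; op is e.g. "AND", "BETWEEN", "IN_BLOOM_FILTER".
class Expr {
  final String op;
  final List<Expr> children;
  Expr(String op, Expr... kids) {
    this.op = op;
    this.children = new ArrayList<>(Arrays.asList(kids));
  }
}

class SemiJoinAttachSketch {
  // Appending a child to a non-AND node (e.g. BETWEEN) just adds an extra,
  // ignored parameter; wrapping in a fresh AND keeps both predicates live.
  static Expr attach(Expr grandParent, Expr bloomFilter) {
    if ("AND".equals(grandParent.op)) {
      grandParent.children.add(bloomFilter);
      return grandParent;
    }
    return new Expr("AND", grandParent, bloomFilter);
  }
}
{code}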

> BloomFilter check not showing up in MERGE statement queries
> ---
>
> Key: HIVE-16022
> URL: https://issues.apache.org/jira/browse/HIVE-16022
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-16022.1.patch
>
>
> Running explain on a MERGE statement with runtime filtering enabled, I see 
> the min/max being applied on the large table, but not the bloom filter check:
> {noformat}
> explain merge into acidTbl as t using nonAcidOrcTbl s ON t.a = s.a
> WHEN MATCHED AND s.a > 8 THEN DELETE
> WHEN MATCHED THEN UPDATE SET b = 7
> WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
> ...
> Map 1
> Map Operator Tree:
> TableScan
>   alias: t
>   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
>   Filter Operator
> predicate: a BETWEEN DynamicValue(RS_3_s_a_min) AND 
> DynamicValue(RS_3_s_a_max) (type: boolean)
> Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
> {noformat}





[jira] [Commented] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879995#comment-15879995
 ] 

Rui Li commented on HIVE-15859:
---

Hi [~xuefuz], Netty's channel is thread-safe, so we can write to it 
concurrently from multiple threads. The problem is that we divide each message 
into a header and a payload and write them to the channel separately, so the 
two halves of different messages can interleave and the order gets messed up 
on the receiver side. If we combine them into one message, I think we don't 
need to force all the writes through the event loop. I suppose we can try this 
approach if the current one doesn't solve the issue.
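
A minimal sketch of the combine-before-write idea, assuming Netty 4's ByteBuf 
API (Unpooled.wrappedBuffer composes the buffers without copying); this is an 
illustration of the suggestion, not the actual hive-spark-client code:
{code}
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.Channel;

final class CombinedWrite {
  // Writing header and payload separately lets another thread's write land
  // between them; composing them into one ByteBuf makes the pair atomic with
  // respect to other writers.
  static void send(Channel channel, ByteBuf header, ByteBuf payload) {
    ByteBuf combined = Unpooled.wrappedBuffer(header, payload);
    channel.writeAndFlush(combined);
  }
}
{code}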

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> the application log shows the driver commanded a shutdown for some unknown 
> reason, but Hive's log shows the Driver could not get the RPC header 
> (Expected RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage 
> instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> 

[jira] [Assigned] (HIVE-16022) BloomFilter check not showing up in MERGE statement queries

2017-02-22 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-16022:
-


> BloomFilter check not showing up in MERGE statement queries
> ---
>
> Key: HIVE-16022
> URL: https://issues.apache.org/jira/browse/HIVE-16022
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Running explain on a MERGE statement with runtime filtering enabled, I see 
> the min/max being applied on the large table, but not the bloom filter check:
> {noformat}
> explain merge into acidTbl as t using nonAcidOrcTbl s ON t.a = s.a
> WHEN MATCHED AND s.a > 8 THEN DELETE
> WHEN MATCHED THEN UPDATE SET b = 7
> WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
> ...
> Map 1
> Map Operator Tree:
> TableScan
>   alias: t
>   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
>   Filter Operator
> predicate: a BETWEEN DynamicValue(RS_3_s_a_min) AND 
> DynamicValue(RS_3_s_a_max) (type: boolean)
> Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
> Column stats: NONE
> {noformat}





[jira] [Commented] (HIVE-15830) Allow additional view ACLs for tez jobs

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879961#comment-15879961
 ] 

Hive QA commented on HIVE-15830:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854109/HIVE-15830.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10255 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testBuildDag (batchId=263)
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testEmptyWork (batchId=263)
org.apache.hive.common.util.TestACLConfigurationParser.test (batchId=237)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3714/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3714/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3714/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854109 - PreCommit-HIVE-Build

> Allow additional view ACLs for tez jobs
> ---
>
> Key: HIVE-15830
> URL: https://issues.apache.org/jira/browse/HIVE-15830
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15830.01.patch, HIVE-15830.02.patch, 
> HIVE-15830.03.patch, HIVE-15830.05.patch, HIVE-15830.06.patch
>
>
> Allow users to grant view access to additional users when running tez jobs.





[jira] [Updated] (HIVE-15964) LLAP: Llap IO codepath not getting invoked due to file column id mismatch

2017-02-22 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-15964:

Attachment: HIVE-15964.3.patch

Addressing review comments from [~prasanth_j]. Added a negative test case as 
well.

> LLAP: Llap IO codepath not getting invoked due to file column id mismatch
> -
>
> Key: HIVE-15964
> URL: https://issues.apache.org/jira/browse/HIVE-15964
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-15964.1.patch, HIVE-15964.2.patch, 
> HIVE-15964.3.patch
>
>
> LLAP IO codepath is not getting invoked in certain cases when schema 
> evolution checks are done. Though "int --> long" (fileType to readerType) 
> conversions are allowed, the file type columns are not matched correctly when 
> such conversions need to happen. 





[jira] [Commented] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879954#comment-15879954
 ] 

Siddharth Seth commented on HIVE-16013:
---

{code}
+if (nodeInfo != null && nodeInfo.canAcceptTask()) {
{code}
This gets in the way of determining the next host to use. The requested host 
will have canAcceptTask = false (since we already tried and failed a local 
allocation). The requestedHostIdx will stay at -1, and we'll end up assigning 
the first available node in the reduced allNodes list. (The loop to determine 
the next host isn't serving any purpose after the patch, since the 
canAcceptTask check is removed.)
Should've been caught by a unit test :(
I'd say create separate lists for the random allocation vs the consistent 
locality-based allocation: the random list filtered by canAcceptTask, the 
locality-based list being the complete list.

Unrelated to this specific jira: what happens in cases where a node is not 
found in 'allNodes' during a consistent allocation (node went down between 
split generation and actual execution of the query)? requestedHostIdx will 
stay at -1. Is this handled?

{code}
// no locality-requested, iterate the available hosts in consistent order from 
the beginning
{code}
This comment needs to be fixed. We're not iterating the available hosts any 
longer.

StackServlet.java - unrelated change?

Nit: Unused import in the Test
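
A minimal sketch of the two-list suggestion, with a hypothetical NodeInfo 
stand-in (not the actual LLAP task-scheduler classes): random allocation draws 
only from nodes that can currently accept work, while consistent 
locality-based allocation keeps the complete list.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class NodeInfo {
  final String host;
  NodeInfo(String host) { this.host = host; }
  boolean canAcceptTask() { return true; } // placeholder capacity check
}

class CandidateLists {
  // Filter up front so a full node never absorbs no-locality requests, which
  // is how fragments stack up under consistent selection.
  static NodeInfo pickRandom(List<NodeInfo> allNodes, Random rng) {
    List<NodeInfo> accepting = new ArrayList<>();
    for (NodeInfo n : allNodes) {
      if (n.canAcceptTask()) accepting.add(n);
    }
    return accepting.isEmpty() ? null : accepting.get(rng.nextInt(accepting.size()));
  }
}
{code}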

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch
>
>
> When no locality information is provided, task requests can stack up on a 
> node because of consistent node selection. When locality information is not 
> provided, we should fall back to random selection for better work 
> distribution.





[jira] [Commented] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-22 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879949#comment-15879949
 ] 

Misha Dmitriev commented on HIVE-15882:
---

Hi [~lirui], this is a legitimate concern. Regarding the performance of 
String::intern, I suggest checking this article: 
http://java-performance.info/string-intern-in-java-6-7-8/ In summary, it looks 
like the default size of the internal string table (~60,000) is sufficient for 
any reasonable application (one that uses strings to represent some sensible, 
"human words" data, as opposed to, say, each separate integer in the 0..2^31 
range), and its performance looks very reasonable. So far I haven't measured 
its impact in the context of Hive. But based on what I see Hive (and Hadoop) 
doing internally - tons of serialization/deserialization, disk reads/writes 
and other time-consuming operations - my impression is that the CPU overhead 
of string interning should be negligible compared to that. Also, without 
string interning, queries may generate tons of garbage, and collecting it may 
slow down the application to near-zero speed, as happens in my own benchmark.

Still, I guess it would make sense to try my benchmark with a much bigger 
heap, where GC doesn't kill it, and see whether string interning affects 
performance negatively. I'll do that and publish the results here.
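
A minimal illustration of the deduplication effect under discussion - 
canonicalizing equal-but-distinct strings through the JVM intern table 
(default table size assumed; -XX:StringTableSize tunes it, as noted above):
{code}
public class InternDemo {
  public static void main(String[] args) {
    // Two distinct heap objects with equal contents, as produced by, e.g.,
    // deserializing the same partition property for two concurrent queries.
    String a = new String("columns.types=string:int");
    String b = new String("columns.types=string:int");
    System.out.println(a == b);                   // false: duplicate copies
    System.out.println(a.intern() == b.intern()); // true: one canonical copy
  }
}
{code}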

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be highly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we may likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.





[jira] [Commented] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879922#comment-15879922
 ] 

Siddharth Seth commented on HIVE-16015:
---

I think we should commit now. The logging will depend on the version of Tez 
used. If we're not committing, we should revert HIVE-15954.
Disabling logging from these classes is not a good option - it makes debugging 
any issue related to them difficult. I'd rather have extra logs instead of 
none. For reference, the same logs show up in regular Tez jobs as well. They 
haven't been noticed there since the logs are on a per-query basis, rather 
than giant log files.
The TezMerger definitely needs fixing. It's noisier when disk spills are 
involved, which typically indicates a large query. For shorter queries, there 
should not be that much noise.

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.01.patch, HIVE-16015.patch
>
>






[jira] [Commented] (HIVE-15882) HS2 generating high memory pressure with many partitions and concurrent queries

2017-02-22 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879917#comment-15879917
 ] 

Rui Li commented on HIVE-15882:
---

Hi [~mi...@cloudera.com], I guess one possible issue with String::intern is 
that it takes extra time to search the string pool, and I found some blogs 
suggesting a bigger value for -XX:StringTableSize to mitigate that. I'm 
wondering if there are any other drawbacks you can think of to the intensive 
use of String::intern? Have you done any benchmarks to measure performance in 
terms of query time?

> HS2 generating high memory pressure with many partitions and concurrent 
> queries
> ---
>
> Key: HIVE-15882
> URL: https://issues.apache.org/jira/browse/HIVE-15882
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HIVE-15882.01.patch, hs2-crash-2000p-500m-50q.txt
>
>
> I've created a Hive table with 2000 partitions, each backed by two files, 
> with one row in each file. When I execute some number of concurrent queries 
> against this table, e.g. as follows
> {code}
> for i in `seq 1 50`; do beeline -u jdbc:hive2://localhost:1 -n admin -p 
> admin -e "select count(i_f_1) from misha_table;" & done
> {code}
> it results in a big memory spike. With 20 queries I caused an OOM in a HS2 
> server with -Xmx200m and with 50 queries - in the one with -Xmx500m.
> I am attaching the results of jxray (www.jxray.com) analysis of a heap dump 
> that was generated in the 50queries/500m heap scenario. It suggests that 
> there are several opportunities to reduce memory pressure with not very 
> invasive changes to the code:
> 1. 24.5% of memory is wasted by duplicate strings (see section 6). With 
> String.intern() calls added in the ~10 relevant places in the code, this 
> overhead can be highly reduced.
> 2. Almost 20% of memory is wasted due to various suboptimally used 
> collections (see section 8). There are many maps and lists that are either 
> empty or have just 1 element. By modifying the code that creates and 
> populates these collections, we may likely save 5-10% of memory.
> 3. Almost 20% of memory is used by instances of java.util.Properties. It 
> looks like these objects are highly duplicated, since for each Partition each 
> concurrently running query creates its own copy of Partition, PartitionDesc and 
> Properties. Thus we have nearly 100,000 (50 queries * 2,000 partitions) 
> Properties in memory. By interning/deduplicating these objects we may be able 
> to save perhaps 15% of memory.
> So overall, I think there is a good chance to reduce HS2 memory consumption 
> in this scenario by ~40%.





[jira] [Commented] (HIVE-15255) LLAP: service_busy error should not be retried so fast

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879911#comment-15879911
 ] 

Siddharth Seth commented on HIVE-15255:
---

A node gets marked bad irrespective of the reason for a task completion (other 
than success).
Not sure what would have caused this - same node, and same task (different 
attempts). If the nodes were different, that would be understandable.

> LLAP: service_busy error should not be retried so fast
> --
>
> Key: HIVE-15255
> URL: https://issues.apache.org/jira/browse/HIVE-15255
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> {noformat}
> 2016-11-18 20:28:20,605 FINISHED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1328, timeTaken=5, 
> status=KILLED, errorEnum=SERVICE_BUSY, diagnostics=Service Busy, 
> nodeHttpAddress=(node3), counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:28:20,612 STARTED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1329, 
> containerId=container_1_2622_01_012504, nodeId=(node3):15001
> 2016-11-18 20:28:20,628 FINISHED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1329, timeTaken=16, 
> status=KILLED, errorEnum=SERVICE_BUSY, diagnostics=Service Busy, 
> nodeHttpAddress=(node3), counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:28:20,634 STARTED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1330, 
> containerId=container_1_2622_01_012511, nodeId=(node3):15001
> 2016-11-18 20:28:20,751 FINISHED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1330, timeTaken=117, 
> status=KILLED, errorEnum=SERVICE_BUSY, diagnostics=Service Busy, 
> nodeHttpAddress=(node3), counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:28:20,757 STARTED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1331, 
> containerId=container_1_2622_01_012522, nodeId=(node3):15001
> 2016-11-18 20:28:20,771 FINISHED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1331, timeTaken=14, 
> status=KILLED, errorEnum=SERVICE_BUSY, diagnostics=Service Busy, 
> nodeHttpAddress=(node3), counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> 2016-11-18 20:28:20,777 STARTED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1332, 
> containerId=container_1_2622_01_012529, nodeId=(node3):15001
> 2016-11-18 20:28:20,783 FINISHED]: vertexName=Map 1, 
> taskAttemptId=attempt_1478967587833_2622_1_06_000105_1332, timeTaken=6, 
> status=KILLED, errorEnum=SERVICE_BUSY, diagnostics=Service Busy, 
> nodeHttpAddress=(node3), counters=Counters: 1, 
> org.apache.tez.common.counters.DAGCounter, DATA_LOCAL_TASKS=1
> {noformat}
> As you can see by the attempt number, this has been going on for a while. In 
> fact I think other tasks could have been scheduled in the meantime (not 
> sure), but the thread just kept at it for this one task until it was finally 
> scheduled.
> There should be some fallback after initial failures; we should also make 
> sure such retries do not take over all scheduling (not sure if they do, need 
> to check).
> LLAP on the node was alive, just busy with other tasks. The task did 
> eventually get scheduled.
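
A minimal sketch of the fallback being asked for - capped exponential backoff 
between SERVICE_BUSY retries instead of re-submitting within milliseconds as 
in the log above (illustrative only, not the actual LLAP scheduler retry path; 
the constants are made up):
{code}
final class BusyRetryBackoff {
  private static final long BASE_DELAY_MS = 10;   // assumed starting delay
  private static final long MAX_DELAY_MS = 5_000; // assumed cap

  // Delay doubles with each consecutive SERVICE_BUSY rejection, up to the cap.
  static long delayMs(int consecutiveBusyRejections) {
    long delay = BASE_DELAY_MS << Math.min(consecutiveBusyRejections, 30);
    return Math.min(delay, MAX_DELAY_MS);
  }
}
{code}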





[jira] [Commented] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-22 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879905#comment-15879905
 ] 

Yongzhi Chen commented on HIVE-13864:
-

The failures are not related.  
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel passed 
on my local machine.



> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch
>
>
> Beeline ignores the next line/command that follows a command ending with a 
> semicolon and a comment.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed; the second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, the first and third commands are executed; the second command 
> "select * from table2" is not executed.





[jira] [Commented] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879895#comment-15879895
 ] 

Siddharth Seth commented on HIVE-15958:
---

I suspect the following is the reason for connections not being re-used for 
taskKilled.
For regular heartbeats, only one session will ever run for an AM - and this is 
controlled via the QueueCallable / HeartbeatCallable. When taskKilled comes 
into play, it is possible for a taskKilled to get a handle on the umbilical, 
and have one of the queued threads close the umbilical right after that, 
resulting in an error.

We have that situation again, more prominent now, since queryComplete causes 
fragments to be killed (which should probably not be done - HIVE-16021), which 
in turn results in a heartbeat. The queryComplete closes the umbilical while 
taskKilled requests get scheduled.

Also, iterating over the knownAppMasters is very avoidable. We can store 
information about the AM in the queryTracker and retrieve it on queryComplete, 
or alternatively send the AM information on the queryComplete call.
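
One way to avoid the close-while-in-use race described above is a 
reference-counted handle; a minimal sketch under that assumption (hypothetical 
wrapper, not the actual umbilical classes):
{code}
import java.util.concurrent.atomic.AtomicInteger;

// The umbilical is only really closed once every in-flight user (heartbeat,
// taskKilled, queryComplete) has released it.
final class RefCountedUmbilical {
  private final AtomicInteger refs = new AtomicInteger(1); // owner's reference

  /** Returns true if the caller got a usable handle. */
  boolean acquire() {
    int r;
    do {
      r = refs.get();
      if (r == 0) return false;            // already fully closed
    } while (!refs.compareAndSet(r, r + 1));
    return true;
  }

  void release() {
    if (refs.decrementAndGet() == 0) {
      // close the underlying IPC proxy here
    }
  }
}
{code}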

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch
>
>
> During concurrency testing, observed 1000s of IPC thread creations. Ideally, 
> the connections to the same hosts should be reused.





[jira] [Commented] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879889#comment-15879889
 ] 

Hive QA commented on HIVE-14735:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854079/HIVE-14735.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3713/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3713/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3713/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-02-23 05:27:14.653
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-3713/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-02-23 05:27:14.655
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 759766e HIVE-15955: make explain formatted to include opId and 
etc (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 759766e HIVE-15955: make explain formatted to include opId and 
etc (Pengcheng Xiong, reviewed by Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-02-23 05:27:15.852
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: itests/thirdparty/.gitignore: already exists in working directory
error: itests/thirdparty/pom.xml: already exists in working directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854079 - PreCommit-HIVE-Build

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14735.1.patch, HIVE-14735.1.patch, 
> HIVE-14735.1.patch, HIVE-14735.1.patch, HIVE-14735.2.patch, 
> HIVE-14735.3.patch, HIVE-14735.4.patch
>
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}





[jira] [Commented] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879886#comment-15879886
 ] 

Hive QA commented on HIVE-16012:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854085/HIVE-16012.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3711/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3711/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3711/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854085 - PreCommit-HIVE-Build

> BytesBytes hash table - better capacity exhaustion handling
> ---
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.01.patch, HIVE-16012.patch
>
>






[jira] [Updated] (HIVE-16019) Query fails when group by/order by on same column with uppercase name

2017-02-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16019:
---
Status: Patch Available  (was: Open)

> Query fails when group by/order by on same column with uppercase name
> -
>
> Key: HIVE-16019
> URL: https://issues.apache.org/jira/browse/HIVE-16019
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16019.patch
>
>
> Query with group by/order by on same column KEY failed:
> {code}
> SELECT T1.KEY AS MYKEY FROM SRC T1 GROUP BY T1.KEY ORDER BY T1.KEY LIMIT 3;
> {code}





[jira] [Updated] (HIVE-16019) Query fails when group by/order by on same column with uppercase name

2017-02-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-16019:
---
Attachment: HIVE-16019.patch

It is related to HIVE-12590. I am not quite sure why that patch removed the 
"col_alias = col_alias.toLowerCase();" from the RowResolver.addMappingOnly & 
get methods. Is it because col_alias could be a case-sensitive constant value 
instead of a column identifier?
To follow the change in HIVE-12590, I converted the column names to lowercase 
when processing the TOK_TABLE_OR_COL token in group by.
[~ashutoshc], [~pxiong], could you review the patch?
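
A minimal illustration of the normalization in question - folding case only 
for identifiers, never for constants, before resolver lookup (hypothetical 
helper, not the actual RowResolver code):
{code}
final class AliasNormalizerSketch {
  // Hive identifiers are case-insensitive, but string constants are not; only
  // fold the former so T1.KEY and t1.key resolve to the same column.
  static String normalize(String colAlias, boolean isIdentifier) {
    return isIdentifier ? colAlias.toLowerCase() : colAlias;
  }
}
{code}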

> Query fails when group by/order by on same column with uppercase name
> -
>
> Key: HIVE-16019
> URL: https://issues.apache.org/jira/browse/HIVE-16019
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-16019.patch
>
>
> Query with group by/order by on same column KEY failed:
> {code}
> SELECT T1.KEY AS MYKEY FROM SRC T1 GROUP BY T1.KEY ORDER BY T1.KEY LIMIT 3;
> {code}





[jira] [Updated] (HIVE-16020) LLAP : Reduce IPC connection misses

2017-02-22 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-16020:

Attachment: HIVE-16020.1.patch

Changes:
1. A common SocketFactory is passed to ContainerRunnerImpl, AMReporter, and 
TaskRunnerCallable.
2. UserGroupInformation is passed at the query level (this is part of 
[~sseth]'s patch). The initial patch provides it at the query level; a 
subsequent jira can cache it at the app level.
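
A minimal sketch of the reuse pattern, assuming Hadoop's 
org.apache.hadoop.ipc.RPC and NetUtils APIs: the same SocketFactory instance 
has to be handed to every getProxy call, since the IPC client keys its cached 
connections partly by factory.
{code}
import java.io.IOException;
import java.net.InetSocketAddress;
import javax.net.SocketFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.security.UserGroupInformation;

final class SharedFactoryProxies {
  private final Configuration conf;
  // One shared factory; creating a new one per getProxy call defeats reuse.
  private final SocketFactory socketFactory;

  SharedFactoryProxies(Configuration conf) {
    this.conf = conf;
    this.socketFactory = NetUtils.getDefaultSocketFactory(conf);
  }

  <T> T proxyFor(Class<T> protocol, long version, InetSocketAddress addr,
                 UserGroupInformation ugi) throws IOException {
    return RPC.getProxy(protocol, version, addr, ugi, conf, socketFactory);
  }
}
{code}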

> LLAP : Reduce IPC connection misses
> ---
>
> Key: HIVE-16020
> URL: https://issues.apache.org/jira/browse/HIVE-16020
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
> Attachments: HIVE-16020.1.patch
>
>
> {{RPC.getProxy}} created in {{TaskRunnerCallable}} does not pass a 
> SocketFactory. This causes a new SocketFactory to be created every time, 
> which causes connection reuse issues.  Also, UserGroupInformation can be 
> reused at the query level.





[jira] [Updated] (HIVE-16020) LLAP : Reduce IPC connection misses

2017-02-22 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-16020:

Status: Patch Available  (was: Open)

> LLAP : Reduce IPC connection misses
> ---
>
> Key: HIVE-16020
> URL: https://issues.apache.org/jira/browse/HIVE-16020
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
> Attachments: HIVE-16020.1.patch
>
>
> {{RPC.getProxy}} created in {{TaskRunnerCallable}} does not pass a 
> SocketFactory. This causes a new SocketFactory to be created every time, 
> which causes connection reuse issues.  Also, UserGroupInformation can be 
> reused at the query level.





[jira] [Assigned] (HIVE-16019) Query fails when group by/order by on same column with uppercase name

2017-02-22 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang reassigned HIVE-16019:
--


> Query fails when group by/order by on same column with uppercase name
> -
>
> Key: HIVE-16019
> URL: https://issues.apache.org/jira/browse/HIVE-16019
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> Query with group by/order by on same column KEY failed:
> {code}
> SELECT T1.KEY AS MYKEY FROM SRC T1 GROUP BY T1.KEY ORDER BY T1.KEY LIMIT 3;
> {code}





[jira] [Commented] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879828#comment-15879828
 ] 

Hive QA commented on HIVE-13864:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854071/HIVE-13864.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10255 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3710/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3710/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3710/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854071 - PreCommit-HIVE-Build

> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch
>
>
> Beeline ignores the next line/command that follows a command ending with a 
> semicolon and a comment.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed; the second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, the first and third commands are executed; the second command 
> "select * from table2" is not executed.





[jira] [Assigned] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization

2017-02-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong reassigned HIVE-16018:
--


> Add more information for DynamicPartitionPruningOptimization
> 
>
> Key: HIVE-16018
> URL: https://issues.apache.org/jira/browse/HIVE-16018
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>






[jira] [Commented] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys

2017-02-22 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879816#comment-15879816
 ] 

Sahil Takiar commented on HIVE-15991:
-

[~pxiong] yeah, I had some trouble with this test. Try running the test on 
Linux. I was never able to get it to run on OS X, but the test runs on Ubuntu. 
When running on OS X there is some masked exception that pops up (I'm planning 
to file a follow-up JIRA to make the logging a bit better).

> Flaky Test: TestEncryptedHDFSCliDriver 
> encryption_join_with_different_encryption_keys
> -
>
> Key: HIVE-15991
> URL: https://issues.apache.org/jira/browse/HIVE-15991
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15991.txt
>
>
> I ran a git-bisect and it seems HIVE-15703 started causing this failure. Not 
> entirely sure why, but I updated the .out file and the diff is pretty 
> straightforward, so I think it's safe to just update it.





[jira] [Commented] (HIVE-15991) Flaky Test: TestEncryptedHDFSCliDriver encryption_join_with_different_encryption_keys

2017-02-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879815#comment-15879815
 ] 

Pengcheng Xiong commented on HIVE-15991:


[~stakiar], I tried to update this golden file before but could not get it to 
succeed. Could you share the trick to run this test? Thanks.

> Flaky Test: TestEncryptedHDFSCliDriver 
> encryption_join_with_different_encryption_keys
> -
>
> Key: HIVE-15991
> URL: https://issues.apache.org/jira/browse/HIVE-15991
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 2.2.0
>
> Attachments: HIVE-15991.txt
>
>
> I ran a git-bisect and it seems HIVE-15703 started causing this failure. Not 
> entirely sure why, but I updated the .out file and the diff is pretty 
> straightforward, so I think it's safe to just update it.





[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Patch Available  (was: Open)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, HIVE-1555.8.patch, JDBCStorageHandler 
> Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc. against tables in other systems.





[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: HIVE-1555.8.patch

Patch .8 is rebased against the latest master.

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, HIVE-1555.8.patch, JDBCStorageHandler 
> Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc. against tables in other systems.





[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Open  (was: Patch Available)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc. against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879798#comment-15879798
 ] 

Sergey Shelukhin commented on HIVE-14990:
-

Many tests output duplicate results, incl. with non-MM tables. I filed a bug 
for bucketed union, but I also see it e.g. for the skewjoin test. That did not 
happen before; it must be a bad merge after the hiatus, or an interaction with 
some of the merged changes.

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, 
> HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, 
> HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, 
> HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, 
> HIVE-14990.09.patch, HIVE-14990.10.patch, HIVE-14990.10.patch, 
> HIVE-14990.10.patch, HIVE-14990.12.patch, HIVE-14990.13.patch, 
> HIVE-14990.patch
>
>
> Expected failures 
> 1) All HCat tests (cannot write MM tables via the HCat writer)
> 2) Almost all merge tests (alter .. concat is not supported).
> 3) Tests that run dfs commands with specific paths (path changes).
> 4) Truncate column (not supported).
> 5) Describe formatted will have the new table fields in the output (before 
> merging MM with ACID).
> 6) Many tests w/explain extended - diff in partition "base file name" (path 
> changes).
> 7) TestTxnCommands - all the conversion tests, as they check for bucket count 
> using file lists (path changes).
> 8) HBase metastore tests, because methods are not implemented.
> 9) Some load and ExIm tests that export a table and then rely on specific 
> path for load (path changes).
> 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due 
> to how it accounts for buckets
> 11) rand - different results due to different sequence of processing.
> 12) many (not all, i.e. not the ones with just one insert) tests that have 
> stats output, such as file count, for obvious reasons.
> 13) materialized views, not handled by design - the test check erroneously 
> makes them "mm"; there is no easy way to tell them apart, and I don't want to 
> plumb more stuff through just for this test.
> I'm filing jiras for some test failures that are not obvious and need an 
> investigation later



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16010) incorrect set in TezSessionPoolManager

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879789#comment-15879789
 ] 

Hive QA commented on HIVE-16010:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854053/HIVE-16010.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3708/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3708/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3708/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.io.IOException: Could not create 
/data/hiveptest/logs/PreCommit-HIVE-Build-3708/succeeded/202_TestSetUGIOnOnlyServer
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854053 - PreCommit-HIVE-Build

> incorrect set in TezSessionPoolManager
> --
>
> Key: HIVE-16010
> URL: https://issues.apache.org/jira/browse/HIVE-16010
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15854) MM tables - autoColumnStats_9.q different stats

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15854:

Summary: MM tables - autoColumnStats_9.q different stats  (was: failing 
test on MM - autoColumnStats_9.q)

> MM tables - autoColumnStats_9.q different stats
> ---
>
> Key: HIVE-15854
> URL: https://issues.apache.org/jira/browse/HIVE-15854
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> There's difference in column stats for unclear reason (null count). The 
> results of the queries are the same (if added to the file). Logs are huge and 
> it's unclear how to investigate, probably the stats UDF needs to be logged in 
> the debugger with all rows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879753#comment-15879753
 ] 

Mohit Sabharwal commented on HIVE-16016:


Thanks for the review [~sershe]!  The separate RS in DbNotificationListener is 
used by the CleanerThread (which gets created by DbNotificationListener) 
outside the TThreadPoolServer/hmshandler threadpool.

Thanks, [~vgumashta]. Looks like your patch already had a successful test run. 
Please commit your patch. I'll move the test portion of this patch over to 
HIVE-15305 (where it really belongs).  
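
To make the intent concrete for anyone following along, here is a minimal 
sketch, assuming 'ms' is the handler thread's RawStore (the thread-local 
ObjectStore) - the names are illustrative, not the actual patch:

{code:java}
import org.apache.hadoop.hive.metastore.RawStore;
import org.apache.hadoop.hive.metastore.api.NotificationEvent;
import org.apache.hadoop.hive.metastore.api.Table;

public class SameTxnSketch {
  // Both writes go through the same thread-local RawStore, so they share one
  // PersistenceManager and hence one JDO transaction.
  static void createTableWithNotification(RawStore ms, Table tbl,
      NotificationEvent ev) throws Exception {
    boolean committed = false;
    ms.openTransaction();
    try {
      ms.createTable(tbl);          // the metadata event
      ms.addNotificationEvent(ev);  // the notification, in the same transaction
      committed = ms.commitTransaction();
    } finally {
      if (!committed) {
        ms.rollbackTransaction();
      }
    }
  }
}
{code}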

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15766) DBNotificationlistener leaks JDOPersistenceManager

2017-02-22 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879748#comment-15879748
 ] 

Vaibhav Gumashta commented on HIVE-15766:
-

Test failures unrelated. [~thejas] your +1 still holds?

cc [~mohitsabharwal]

> DBNotificationlistener leaks JDOPersistenceManager
> --
>
> Key: HIVE-15766
> URL: https://issues.apache.org/jira/browse/HIVE-15766
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15766.1.patch, HIVE-15766.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879745#comment-15879745
 ] 

Vaibhav Gumashta commented on HIVE-16016:
-

[~mohitsabharwal] Thanks for taking this up. The patch for HIVE-15766 also has 
the fix.

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-12923) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879744#comment-15879744
 ] 

Hive QA commented on HIVE-12923:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12786107/HIVE-12923.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3707/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3707/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3707/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Could not create 
/data/hiveptest/logs/PreCommit-HIVE-Build-3707/succeeded/2-TestCliDriver-ppd_constant_where.q-drop_index_removes_partition_dirs.q-cbo_input26.q-and-27-more
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12786107 - PreCommit-HIVE-Build

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_grouping_sets4.q failure
> 
>
> Key: HIVE-12923
> URL: https://issues.apache.org/jira/browse/HIVE-12923
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA

[jira] [Commented] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879733#comment-15879733
 ] 

Hive QA commented on HIVE-1555:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854037/HIVE-1555.7.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10274 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[jdbc_handler] 
(batchId=52)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=224)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3705/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3705/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3705/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854037 - PreCommit-HIVE-Build

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc. against tables in other systems.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16002:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 2.2.0
>
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879714#comment-15879714
 ] 

Sergey Shelukhin commented on HIVE-16016:
-

+1 pending tests. Is the separate RS stored for notifications even needed?

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-16016:
---
Attachment: HIVE-16016.patch

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-16016:
---
Status: Patch Available  (was: Open)

> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-16016.patch
>
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode

2017-02-22 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879712#comment-15879712
 ] 

Ferdinand Xu commented on HIVE-16004:
-

+1

> OutOfMemory in SparkReduceRecordHandler with vectorization mode
> ---
>
> Key: HIVE-16004
> URL: https://issues.apache.org/jira/browse/HIVE-16004
> Project: Hive
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch
>
>
> For query 28 of TPCx-BB with 1T data, the executor memory is set to 30G. We 
> get the following exception:
> java.lang.OutOfMemoryError
>   at 
> java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:85)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> I think the DataOutputBuffer isn't cleared in time, which causes this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-02-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Status: Open  (was: Patch Available)

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16016) Use same PersistenceManager for metadata and notification

2017-02-22 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal reassigned HIVE-16016:
--


> Use same PersistenceManager for metadata and notification
> -
>
> Key: HIVE-16016
> URL: https://issues.apache.org/jira/browse/HIVE-16016
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
>
> HIVE-13966 added support for persisting the notification in the same JDO 
> transaction as the metadata event. However, the notification is currently 
> added using a different ObjectStore object from the one used to persist the 
> metadata event.  
> The notification is added using the ObjectStore constructed in 
> DbNotificationListener, whereas the metadata event is added via the thread 
> local ObjectStore (i.e. threadLocalMS in HiveMetaStore.HMSHandler).
> As a result, different PersistenceManagers (different transactions) are used 
> to persist notification and metadata events.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-02-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Status: Patch Available  (was: Open)

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15903) Compute table stats when user computes column stats

2017-02-22 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15903:
---
Attachment: HIVE-15903.02.patch

> Compute table stats when user computes column stats
> ---
>
> Key: HIVE-15903
> URL: https://issues.apache.org/jira/browse/HIVE-15903
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15903.01.patch, HIVE-15903.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling

2017-02-22 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-16012:
-
Summary: BytesBytes hash table - better capacity exhaustion handling  (was: 
BytesBytes hash table - better capacity exhaustion handling I)

> BytesBytes hash table - better capacity exhaustion handling
> ---
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.01.patch, HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling I

2017-02-22 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879705#comment-15879705
 ] 

Wei Zheng commented on HIVE-16012:
--

+1 for failing early
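
For readers of this thread, a hedged sketch of what failing early could look 
like - the constant and names are assumptions for illustration, not the actual 
patch:

{code:java}
public class CapacityCheckSketch {
  // Assumed limit: the largest power-of-two capacity an int-indexed array holds.
  private static final int MAX_CAPACITY = 1 << 30;

  static int validateCapacity(long requested) {
    if (requested <= 0 || requested > MAX_CAPACITY) {
      // Fail fast instead of degrading or overflowing during a later resize.
      throw new IllegalStateException("Requested hash table capacity " + requested
          + " cannot be satisfied (max " + MAX_CAPACITY + ")");
    }
    int cap = Integer.highestOneBit((int) requested);
    return cap == requested ? cap : cap << 1; // round up to a power of two
  }
}
{code}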

> BytesBytes hash table - better capacity exhaustion handling I
> -
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.01.patch, HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode

2017-02-22 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879704#comment-15879704
 ] 

Colin Ma commented on HIVE-16004:
-

[~xuefuz], thanks for the review. I checked the logs of the failed test cases, 
and the failures aren't caused by this update.
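
For context, a hedged sketch of the fix's shape as I understand it (assumed 
names, not the actual patch): the scratch buffer has to be reset before each 
value is serialized, otherwise the underlying byte array keeps every earlier 
row's bytes and only grows, which matches the ByteArrayOutputStream frames in 
the stack trace in the description.

{code:java}
import java.io.IOException;
import org.apache.hadoop.io.DataOutputBuffer;

public class BufferResetSketch {
  private final DataOutputBuffer buffer = new DataOutputBuffer();

  byte[] serialize(byte[] valueBytes) throws IOException {
    buffer.reset();  // without this, the buffer grows without bound across rows
    buffer.write(valueBytes);
    byte[] out = new byte[buffer.getLength()];
    System.arraycopy(buffer.getData(), 0, out, 0, buffer.getLength());
    return out;
  }
}
{code}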

> OutOfMemory in SparkReduceRecordHandler with vectorization mode
> ---
>
> Key: HIVE-16004
> URL: https://issues.apache.org/jira/browse/HIVE-16004
> Project: Hive
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch
>
>
> For query 28 of TPCx-BB with 1T data, the executor memory is set to 30G. We 
> get the following exception:
> java.lang.OutOfMemoryError
>   at 
> java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:85)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> I think the DataOutputBuffer isn't cleared in time, which causes this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16015:

Attachment: HIVE-16015.01.patch

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.01.patch, HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879699#comment-15879699
 ] 

Sergey Shelukhin edited comment on HIVE-16015 at 2/23/17 1:55 AM:
--

If we commit it now, the older versions of Tez will still produce log spam... 
Also TezMerger was 11.5% of the entire log on some cluster where that was 
measured (from 3 different statements). Perhaps it also needs a custom logger.


was (Author: sershe):
If we commit it now, the older versions of Tez will still produce log spam... 

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879699#comment-15879699
 ] 

Sergey Shelukhin commented on HIVE-16015:
-

If we commit it now, the older versions of Tez will still produce log spam... 

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879683#comment-15879683
 ] 

Siddharth Seth commented on HIVE-16015:
---

ShuffleScheduler should be ShuffleScheduler.fetch.
TezMerger - I think we should leave this at INFO. We can add another change once 
a custom, less noisy logger is set up in Tez, or Tez starts logging less (the 
second one is more likely in this case).

I think this can be committed. It will have an effect depending on the deployed 
version of Tez. There are no warnings / compile-time problems that will show up 
if it is committed.

Tried on a real cluster?
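
Purely as an illustration of the kind of per-logger override being discussed (a 
hedged sketch; the logger name is my assumption about where the fetch spam 
comes from, and this is not the actual patch):

{code:java}
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.config.Configurator;

public class QuietTezLoggerSketch {
  public static void main(String[] args) {
    // Quiet one noisy sub-logger without touching the rest of Tez logging; this
    // only takes effect on Tez versions that actually log under this name.
    Configurator.setLevel(
        "org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.fetch",
        Level.WARN);
  }
}
{code}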

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16010) incorrect set in TezSessionPoolManager

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879663#comment-15879663
 ] 

Siddharth Seth commented on HIVE-16010:
---

+1

> incorrect set in TezSessionPoolManager
> --
>
> Key: HIVE-16010
> URL: https://issues.apache.org/jira/browse/HIVE-16010
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16015:

Attachment: HIVE-16015.patch

The patch. It cannot be committed before Tez 0.9.0.
[~sseth] does this make sense? The Tez patch example sets the entire 
ShuffleScheduler to WARN and has no changes to TezMerger, so I'm keeping them 
as is.

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-16015.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15549) Better naming of Tez edges

2017-02-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879655#comment-15879655
 ] 

Ashutosh Chauhan commented on HIVE-15549:
-

Hoping this one gets in soon. Spent *another* hour ruminating on how 
CUSTOM_SIMPLE_EDGE is different from CUSTOM_EDGE. : )

> Better naming of Tez edges
> --
>
> Key: HIVE-15549
> URL: https://issues.apache.org/jira/browse/HIVE-15549
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-15549.1.patch, HIVE-15549.2.patch, 
> HIVE-15549.3.patch, HIVE-15549.4.patch
>
>
> Do the following renames:
> CUSTOM_EDGE -> CO_PARTITION_EDGE
> CUSTOM_SIMPLE_EDGE -> PARTITION_EDGE
> SIMPLE_EDGE -> SORT_PARTITION_EDGE
> Because that's what those edges actually do.
> Also rename Map/Reduce <N> to just Vertex <N>. These vertices haven't mapped 
> or reduced in a long time. The names are leftovers from MR.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879651#comment-15879651
 ] 

Sergey Shelukhin commented on HIVE-16015:
-

Requires a Tez upgrade.

> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15830) Allow additional view ACLs for tez jobs

2017-02-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15830:
--
Attachment: HIVE-15830.06.patch

Updated to use userFromAuthenticator. Also leaving the ugi.getShortUserName 
change in place. The long name normally includes realm etc. information.
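
A hedged sketch of the naming point, for context (the helper and the 
comma-joined format are illustrative assumptions, not the patch):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class ViewAclSketch {
  // getShortUserName() strips the host/realm parts that getUserName() can
  // include, e.g. "bob" instead of "bob/host@REALM".
  static String buildViewAcls(String additionalUsersCsv) throws IOException {
    String owner = UserGroupInformation.getCurrentUser().getShortUserName();
    if (additionalUsersCsv == null || additionalUsersCsv.isEmpty()) {
      return owner;
    }
    return owner + "," + additionalUsersCsv;
  }
}
{code}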

> Allow additional view ACLs for tez jobs
> ---
>
> Key: HIVE-15830
> URL: https://issues.apache.org/jira/browse/HIVE-15830
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15830.01.patch, HIVE-15830.02.patch, 
> HIVE-15830.03.patch, HIVE-15830.05.patch, HIVE-15830.06.patch
>
>
> Allow users to grant view access to additional users when running tez jobs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16015) LLAP: some Tez INFO logs are too noisy II

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-16015:
---


> LLAP: some Tez INFO logs are too noisy II
> -
>
> Key: HIVE-16015
> URL: https://issues.apache.org/jira/browse/HIVE-16015
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879646#comment-15879646
 ] 

Hive QA commented on HIVE-15955:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854017/HIVE-15955.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10253 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3704/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3704/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3704/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854017 - PreCommit-HIVE-Build

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879571#comment-15879571
 ] 

Ashutosh Chauhan commented on HIVE-16002:
-

+1

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879555#comment-15879555
 ] 

Hive QA commented on HIVE-13780:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854015/HIVE-13780.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3703/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3703/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3703/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854015 - PreCommit-HIVE-Build

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch, 
> HIVE-13780.2.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if the user tries to run the command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no effect on the table.
> It would be good if we could allow the user to ALTER the table (add/change 
> column, update comment, etc.) even though the schema is defined through a 
> schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-16005:
--
Attachment: HIVE-16005.02.patch

Updated patch. Using queryId-dagIndex (not dagId). The dagId is already 
available in the HistoryEvents directly, and in the jmx ExecutorStatus output 
via the task_attempt info.

dagId=queryId:counter.getAndIncrement - the counter has no relation to the 
actual dagIndex. (The dagIndex here is the actual index used within the Tez 
app.)

Also updated constructShortString to work properly.
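
In other words, a trivial hedged sketch of the identifier scheme (names assumed):

{code:java}
public class DagIdentifierSketch {
  // Combine the Hive query id with the dag index Tez actually uses, rather
  // than a local counter that has no relation to it.
  static String dagIdentifier(String hiveQueryId, int tezDagIndex) {
    return hiveQueryId + "-" + tezDagIndex;
  }
}
{code}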

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch, HIVE-16005.02.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JXM ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size

2017-02-22 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-15166:

Attachment: HIVE-15166.2.patch

Re-attaching the patch.
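
As a hedged sketch of the proposed knob (the property name and history file 
path below are assumptions for illustration, not the patch), the idea is simply 
to cap the jline history so oversized queries cannot flood the file:

{code:java}
import java.io.File;
import jline.console.ConsoleReader;
import jline.console.history.FileHistory;

public class HistoryMaxSizeSketch {
  public static void main(String[] args) throws Exception {
    int maxSize = Integer.getInteger("beeline.history.maxsize", 500);
    FileHistory history = new FileHistory(
        new File(System.getProperty("user.home"), ".history_sketch"));
    history.setMaxSize(maxSize);  // inherited from jline's MemoryHistory
    ConsoleReader reader = new ConsoleReader();
    reader.setHistory(history);
    // ... run the REPL; history.flush() on exit persists at most maxSize entries.
  }
}
{code}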

> Provide beeline option to set the jline history max size
> 
>
> Key: HIVE-15166
> URL: https://issues.apache.org/jira/browse/HIVE-15166
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.1.0
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-15166.2.patch, HIVE-15166.patch
>
>
> Currently Beeline does not provide an option to limit the max size of the 
> Beeline history file; if each query is very big, it will flood the history 
> file and slow down Beeline on startup and shutdown.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15166) Provide beeline option to set the jline history max size

2017-02-22 Thread Eric Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Lin updated HIVE-15166:

Attachment: (was: HIVE-15166.2.patch)

> Provide beeline option to set the jline history max size
> 
>
> Key: HIVE-15166
> URL: https://issues.apache.org/jira/browse/HIVE-15166
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.1.0
>Reporter: Eric Lin
>Assignee: Eric Lin
>Priority: Minor
> Attachments: HIVE-15166.patch
>
>
> Currently Beeline does not provide an option to limit the max size of the 
> Beeline history file; if each query is very big, it will flood the history 
> file and slow down Beeline on startup and shutdown.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-22 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879467#comment-15879467
 ] 

Xuefu Zhang commented on HIVE-15859:


It seems to me that option #2 can be built on top of option #1, because we may 
want to let all messages go through the event loop. If that's the case, we can 
implement option #2 as a followup. Thoughts?

In the meantime, I'm very eager to know if this has addressed [~KaiXu]'s 
problem.
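
To make option #1 concrete, a hedged sketch of my reading (assumed names, not a 
patch): every outbound message is handed to the channel's event loop, so 
channel-state checks and writes happen on a single thread.

{code:java}
import io.netty.channel.Channel;

public class EventLoopWriteSketch {
  static void send(Channel channel, Object msg) {
    channel.eventLoop().execute(() -> {
      // Writes and state checks now all happen on the event loop thread.
      if (channel.isActive()) {
        channel.writeAndFlush(msg);
      }
      // else: the channel is already closed; fail or drop the message here.
    });
  }
}
{code}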

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> The application log shows the driver commanded a shutdown for some unknown 
> reason, but Hive's log shows the driver could not get the RPC header (expected 
> RPC header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-e17931ad-9e8a-45da-86f8-9a0fdca0fad1
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> 

[jira] [Commented] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879466#comment-15879466
 ] 

Hive QA commented on HIVE-16006:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854011/HIVE-16006.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=130)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3702/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3702/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3702/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854011 - PreCommit-HIVE-Build

> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-22 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879459#comment-15879459
 ] 

Yongzhi Chen commented on HIVE-15881:
-

PATCH 3 looks good.  +1
Please make sure the test failures are not related.

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch, 
> HIVE-15881.3.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive related, but the variable name does not look like it is specific to 
> Hive.
> Also, the above variable is not in HiveConf nor used anywhere else. I just 
> found a reference in the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and the use of a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen in Hive 3.x.
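
A minimal sketch of the proposed lookup, assuming only the two property names 
given in the description (the actual patch may wire this through HiveConf):

{code}
// Sketch only: prefer the proposed Hive-owned property, falling back to the
// deprecated MR1-era name; both property names come from the description.
int numThreads = conf.getInt("hive.get.input.listing.num.threads",
    conf.getInt("mapred.dfsclient.parallelism.max", 0));
{code}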



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879455#comment-15879455
 ] 

Prasanth Jayachandran commented on HIVE-16005:
--

Can we please have queryId + "-" + dagId, just to be consistent with the log 
file and log URL?

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879419#comment-15879419
 ] 

Jason Dere edited comment on HIVE-15955 at 2/22/17 11:19 PM:
-

DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan().
The mapping from the RS to the TS that the min/max/bloomfilter will be sent to 
is in ParseContext.getRsOpToTsOpMap().


was (Author: jdere):
DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan()

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15857) Vectorization: Add string conversion case for UDFToInteger, etc

2017-02-22 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-15857:

Attachment: HIVE-15857.02.patch

> Vectorization: Add string conversion case for UDFToInteger, etc
> ---
>
> Key: HIVE-15857
> URL: https://issues.apache.org/jira/browse/HIVE-15857
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-15857.01.patch, HIVE-15857.02.patch
>
>
> Otherwise, VectorUDFAdaptor is used to convert a column from String to Int, 
> etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879419#comment-15879419
 ] 

Jason Dere commented on HIVE-15955:
---

DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan()

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15830) Allow additional view ACLs for tez jobs

2017-02-22 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879412#comment-15879412
 ] 

Jason Dere commented on HIVE-15830:
---

A bit confusing, but in ATSHook, {{user}} is the HS2 process user (hive), while 
{{requestuser}} is the user logged into HS2 and running the query.
I think the real fix for this is in Driver.execute():
{code}
  SessionState ss = SessionState.get();
  hookContext = new HookContext(plan, queryState, ctx.getPathToCS(), 
ss.getUserName(),
  ss.getUserIpAddress(), InetAddress.getLocalHost().getHostAddress(), 
operationId,
  ss.getSessionId(), Thread.currentThread().getName(), 
ss.isHiveServerQuery(), perfLogger);
{code}

ss.getUserName() should be changed to ss.getUserFromAuthenticator().
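
A minimal sketch of that change, keeping the call site above intact and only 
swapping the user argument:

{code}
  SessionState ss = SessionState.get();
  // Proposed change sketched here: pass the authenticator-resolved user so
  // downstream hooks such as ATSHook see the end user, not the HS2 process user.
  hookContext = new HookContext(plan, queryState, ctx.getPathToCS(),
      ss.getUserFromAuthenticator(),
      ss.getUserIpAddress(), InetAddress.getLocalHost().getHostAddress(),
      operationId, ss.getSessionId(), Thread.currentThread().getName(),
      ss.isHiveServerQuery(), perfLogger);
{code}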

> Allow additional view ACLs for tez jobs
> ---
>
> Key: HIVE-15830
> URL: https://issues.apache.org/jira/browse/HIVE-15830
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15830.01.patch, HIVE-15830.02.patch, 
> HIVE-15830.03.patch, HIVE-15830.05.patch
>
>
> Allow users to grant view access to additional users when running tez jobs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling I

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16012:

Status: Patch Available  (was: Open)

A better patch

> BytesBytes hash table - better capacity exhaustion handling I
> -
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.01.patch, HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879415#comment-15879415
 ] 

Pengcheng Xiong commented on HIVE-15955:


[~jdere], could u point me to the place where u added SEL-GBY-RS-GBY-RS branch? 
Thanks. 

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16014) HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size

2017-02-22 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-16014:
--


> HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of 
> hive.mv.files.thread for pool size
> --
>
> Key: HIVE-16014
> URL: https://issues.apache.org/jira/browse/HIVE-16014
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> HiveMetastoreChecker uses hive.mv.files.thread configuration value for 
> determining the pool size as below :
> {noformat}
> private void checkPartitionDirs(Path basePath, Set<Path> allDirs, int 
> maxDepth) throws IOException, HiveException {
> ConcurrentLinkedQueue<Path> basePaths = new ConcurrentLinkedQueue<>();
> basePaths.add(basePath);
> Set<Path> dirSet = Collections.newSetFromMap(new ConcurrentHashMap<Path, 
> Boolean>());
> // Here we just reuse the THREAD_COUNT configuration for
> // HIVE_MOVE_FILES_THREAD_COUNT
> int poolSize = conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 
> 15);
> // Check if too low config is provided for move files. 2x CPU is 
> reasonable max count.
> poolSize = poolSize == 0 ? poolSize : Math.max(poolSize,
> Runtime.getRuntime().availableProcessors() * 2);
> {noformat}
> msck is commonly used to add the missing partitions for the table from the 
> filesystem. In such a case different pool sizes for HMSHandler and 
> HiveMetastoreChecker can affect the performance. E.g. if 
> {{hive.metastore.fshandler.threads}} is set to a lower value like 15 and 
> {{hive.mv.files.thread}} is much higher like 100, or vice versa, the smaller 
> pool will become the bottleneck. It would be good to use 
> {{hive.metastore.fshandler.threads}} to size the pool for 
> HiveMetastoreChecker, since the number of missing partitions and the number 
> of partitions to be added will most likely be the same. In such a case the 
> performance of the query will be optimum when both pool sizes are the same.
> Since it is possible to tune both configs individually, it is very likely 
> that they will differ. But since there is a strong correlation between the 
> amount of work done by HiveMetastoreChecker and the 
> HiveMetastore.add_partitions call, it might be a good idea to use 
> {{hive.metastore.fshandler.threads}} for the pool size instead of 
> {{hive.mv.files.thread}}.
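
A hedged sketch of the change described above; only the property name comes 
from the description, and the matching ConfVars constant is assumed rather 
than known:

{code}
// Sketch only: size the checker pool from the FS-handler property instead of
// the file-move property, keeping the existing 2x-CPU floor. A plain string
// lookup is used because the exact ConfVars constant is not given above.
int poolSize = conf.getInt("hive.metastore.fshandler.threads", 15);
poolSize = poolSize == 0 ? poolSize : Math.max(poolSize,
    Runtime.getRuntime().availableProcessors() * 2);
{code}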



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16014) HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of hive.mv.files.thread for pool size

2017-02-22 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879413#comment-15879413
 ] 

Vihang Karajgaonkar commented on HIVE-16014:


Hi [~rajesh.balamohan], can you please comment on whether this would be 
reasonable to change?

> HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of 
> hive.mv.files.thread for pool size
> --
>
> Key: HIVE-16014
> URL: https://issues.apache.org/jira/browse/HIVE-16014
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> HiveMetastoreChecker uses hive.mv.files.thread configuration value for 
> determining the pool size as below :
> {noformat}
> private void checkPartitionDirs(Path basePath, Set<Path> allDirs, int 
> maxDepth) throws IOException, HiveException {
> ConcurrentLinkedQueue<Path> basePaths = new ConcurrentLinkedQueue<>();
> basePaths.add(basePath);
> Set<Path> dirSet = Collections.newSetFromMap(new ConcurrentHashMap<Path, 
> Boolean>());
> // Here we just reuse the THREAD_COUNT configuration for
> // HIVE_MOVE_FILES_THREAD_COUNT
> int poolSize = conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 
> 15);
> // Check if too low config is provided for move files. 2x CPU is 
> reasonable max count.
> poolSize = poolSize == 0 ? poolSize : Math.max(poolSize,
> Runtime.getRuntime().availableProcessors() * 2);
> {noformat}
> msck is commonly used to add the missing partitions for the table from the 
> filesystem. In such a case different pool sizes for HMSHandler and 
> HiveMetastoreChecker can affect the performance. E.g. if 
> {{hive.metastore.fshandler.threads}} is set to a lower value like 15 and 
> {{hive.mv.files.thread}} is much higher like 100, or vice versa, the smaller 
> pool will become the bottleneck. It would be good to use 
> {{hive.metastore.fshandler.threads}} to size the pool for 
> HiveMetastoreChecker, since the number of missing partitions and the number 
> of partitions to be added will most likely be the same. In such a case the 
> performance of the query will be optimum when both pool sizes are the same.
> Since it is possible to tune both configs individually, it is very likely 
> that they will differ. But since there is a strong correlation between the 
> amount of work done by HiveMetastoreChecker and the 
> HiveMetastore.add_partitions call, it might be a good idea to use 
> {{hive.metastore.fshandler.threads}} for the pool size instead of 
> {{hive.mv.files.thread}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879410#comment-15879410
 ] 

Ashutosh Chauhan commented on HIVE-15951:
-

I checked: commons-io is not packaged with hive-exec, so it won't be available 
on task nodes. All callers of FileUtils in hive-common are in the front end, so 
it won't be an issue there. But if we add it here, we will need the commons-io 
jar on task nodes.

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will lead to the failure of the job until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling I

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16012:

Attachment: HIVE-16012.01.patch

> BytesBytes hash table - better capacity exhaustion handling I
> -
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.01.patch, HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879399#comment-15879399
 ] 

Sergey Shelukhin edited comment on HIVE-16005 at 2/22/17 10:45 PM:
---

DagID is generated in the TezWork ctor:
{noformat}
public TezWork(String queryId, Configuration conf) {
  this.dagId = queryId + ":" + counter.getAndIncrement();
{noformat}
and retrieved via getDagId. Doesn't really matter to me which one is used (or 
if they match), just a nit.

+1 can be modified on commit if needed.


was (Author: sershe):
DagID is generated in the TezWork ctor:
{noformat}
public TezWork(String queryId, Configuration conf) {
  this.dagId = queryId + ":" + counter.getAndIncrement();
{noformat}
and retrieved via getDagId. Doesn't really matter to me which one is used (or 
if they matched), just a nit.

+1 can be modified on commit if needed.

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879399#comment-15879399
 ] 

Sergey Shelukhin commented on HIVE-16005:
-

DagID is generated in the TezWork ctor:
{noformat}
public TezWork(String queryId, Configuration conf) {
  this.dagId = queryId + ":" + counter.getAndIncrement();
{noformat}
and retrieved via getDagId. Doesn't really matter to me which one is used (or 
if they matched), just a nit.

+1 can be modified on commit if needed.

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-15830) Allow additional view ACLs for tez jobs

2017-02-22 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15830:
--
Attachment: HIVE-15830.05.patch

Updated patch:
Moved to Utilities under ql/exec
Also added a minor fix in the ATSHook to use ugi.getShortName instead of 
ugi.getUser. cc [~jdere]

> Allow additional view ACLs for tez jobs
> ---
>
> Key: HIVE-15830
> URL: https://issues.apache.org/jira/browse/HIVE-15830
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-15830.01.patch, HIVE-15830.02.patch, 
> HIVE-15830.03.patch, HIVE-15830.05.patch
>
>
> Allow users to grant view access to additional users when running tez jobs.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16013:
-
Attachment: HIVE-16013.1.patch

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch
>
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided 
> we should fall back to random selection for better work distribution.
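
A minimal sketch of the fallback described above, with all names assumed for 
illustration rather than taken from the patch:

{code}
// Illustrative only (requestedHosts and allNodes are assumed names): fall
// back to a uniformly random node when the fragment has no locality hints.
if (requestedHosts == null || requestedHosts.isEmpty()) {
  int idx = ThreadLocalRandom.current().nextInt(allNodes.size());
  return allNodes.get(idx);
}
{code}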



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16013:
-
Status: Patch Available  (was: Open)

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch
>
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided 
> we should fall back to random selection for better work distribution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879389#comment-15879389
 ] 

Prasanth Jayachandran commented on HIVE-16013:
--

[~sseth] can you please take a look?

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-16013.1.patch
>
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided 
> we should fall back to random selection for better work distribution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-16013:
-
Reporter: Siddharth Seth  (was: Prasanth Jayachandran)

> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided 
> we should fall back to random selection for better work distribution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HIVE-16013) Fragments without locality can stack up on nodes

2017-02-22 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-16013:



> Fragments without locality can stack up on nodes
> 
>
> Key: HIVE-16013
> URL: https://issues.apache.org/jira/browse/HIVE-16013
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> When no locality information is provided, task requests can stack up on a node 
> because of consistent node selection. When locality information is not provided 
> we should fall back to random selection for better work distribution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879384#comment-15879384
 ] 

Hive QA commented on HIVE-14901:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854009/HIVE-14901.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3701/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3701/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3701/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854009 - PreCommit-HIVE-Build

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user-supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} as an upper bound on it, so 
> that we don't go OOM in tasks and HS2.
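
A minimal sketch of that capping logic; the variable names are assumed, and 
only the property name comes from the description:

{code}
// Sketch only: honor the client's requested fetch size, but cap it at the
// server-side maximum so tasks and HS2 cannot be pushed toward OOM.
long serverMax = conf.getLong("hive.server2.thrift.resultset.max.fetch.size",
    requestedFetchSize);  // default value assumed for illustration
long effectiveFetchSize = Math.min(requestedFetchSize, serverMax);
{code}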



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-14735) Build Infra: Spark artifacts download takes a long time

2017-02-22 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14735:

Attachment: HIVE-14735.4.patch

[~spena], I've changed the logic which builds the repository into a shell script.

I was wondering: would it be possible (and acceptable) to upload this 
'org.apache.hive.aux:spark-without-hive' artifact to repository.apache.org? 

> Build Infra: Spark artifacts download takes a long time
> ---
>
> Key: HIVE-14735
> URL: https://issues.apache.org/jira/browse/HIVE-14735
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Vaibhav Gumashta
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14735.1.patch, HIVE-14735.1.patch, 
> HIVE-14735.1.patch, HIVE-14735.1.patch, HIVE-14735.2.patch, 
> HIVE-14735.3.patch, HIVE-14735.4.patch
>
>
> In particular this command:
> {{curl -Sso ./../thirdparty/spark-1.6.0-bin-hadoop2-without-hive.tgz 
> http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.6.0-bin-hadoop2-without-hive.tgz}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879381#comment-15879381
 ] 

Siddharth Seth commented on HIVE-16005:
---

Is there a utility method to construct queryName + ":" + dagIndex? I can use 
that, or the one used in the log files. The end goal is not to have 
free-flowing text, which is what dagName was.

bq. Appending suffix to the thread name, is it primarily to get some context 
from jstack output? For stacktraces that gets logged will already have these 
info via NDC.
This is to have the thread name in the trace.

bq. Also in constructThreadNameSuffix, why does it have to do the dance with 
all the IDs, aren't all of them appended if you just do attemptId toString?
Cutting down the length, since it will be logged on each line.
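
For illustration, the kind of helper being asked about might look like this 
(purely hypothetical, not an existing utility):

{code}
// Hypothetical helper: one place to build the compact queryId + ":" + dagIndex
// identifier, instead of free-flowing dag names scattered across call sites.
static String dagIdentifier(String queryId, int dagIndex) {
  return queryId + ":" + dagIndex;
}
{code}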

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JMX ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879353#comment-15879353
 ] 

Vineet Garg commented on HIVE-16002:


[~ashutoshc] done

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879355#comment-15879355
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] valid point, but hive-common is using that method as well, so I 
think it is OK to use it.
https://github.com/b-slim/hive/blob/38ad77929980dc155dcc4a5d009a9a855eb5b017/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L755-L755


> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or be shared 
> between reducers in the same physical VM.
> That will lead to the failure of the job until the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Attachment: HIVE-16002.3.patch

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Status: Open  (was: Patch Available)

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Status: Patch Available  (was: Open)

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch, 
> HIVE-16002.3.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling I

2017-02-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879351#comment-15879351
 ] 

Sergey Shelukhin commented on HIVE-16012:
-

Actually, never mind: not resizing will not merely make it fail, it will make 
it extremely slow as the load factor approaches 1. It's better to fail.
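
For intuition, the classic linear-probing estimates show how quickly lookups 
degrade near full load (illustrative numbers only, not Hive code):

{code}
// Expected probes for linear probing (Knuth): hits ~ (1 + 1/(1-a))/2,
// misses ~ (1 + 1/(1-a)^2)/2; both blow up as the load factor a -> 1.
for (double a : new double[] {0.50, 0.75, 0.90, 0.99}) {
  double hit = 0.5 * (1 + 1.0 / (1 - a));
  double miss = 0.5 * (1 + 1.0 / ((1 - a) * (1 - a)));
  System.out.printf("load=%.2f  hit=%.1f  miss=%.1f%n", a, hit, miss);
}
{code}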

> BytesBytes hash table - better capacity exhaustion handling I
> -
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HIVE-16012) BytesBytes hash table - better capacity exhaustion handling I

2017-02-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-16012:

Status: Open  (was: Patch Available)

> BytesBytes hash table - better capacity exhaustion handling I
> -
>
> Key: HIVE-16012
> URL: https://issues.apache.org/jira/browse/HIVE-16012
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-16012.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

