Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-28 Thread Lefty Leverenz


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1523
> > 
> >
> > Please restore "(in seconds)" to description and specify other time 
> > units that can be used, if any.

Not an issue -- my mistake.


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1529
> > 
> >
> > Please restore "(in seconds)" to description and specify other time 
> > units that can be used, if any.

Not an issue -- my mistake.


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1601
> > 
> >
> > Please add time unit information:  "Accepts time units like 
> > d/h/m/s/ms/us/ns."

Not an issue -- my mistake.


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1604
> > 
> >
> > Please add time unit information:  "Accepts time units like 
> > d/h/m/s/ms/us/ns."

Not an issue -- my mistake.


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1607
> > 
> >
> > Please add time unit information:  "Accepts time units like 
> > d/h/m/s/ms/us/ns."

Not an issue -- my mistake.


- Lefty


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51760
---


On Aug. 28, 2014, 2:31 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15449/
> ---
> 
> (Updated Aug. 28, 2014, 2:31 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-5799
> https://issues.apache.org/jira/browse/HIVE-5799
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Need some timeout facility for preventing resource leakage from unstable or 
> bad clients.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/ant/GenHiveTemplate.java 4293b7c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
>  PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
>   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 0d6436e 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 2867301 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 270e4a6 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 84e1c7e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 4e5f595 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  39d2184 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> 17c1c7b 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
> 
> Diff: https://reviews.apache.org/r/15449/diff/
> 
> 
> Testing
> ---
> 
> Confirmed in the local environment.
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-28 Thread Lefty Leverenz


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> >
> 
> Navis Ryu wrote:
> Addressing previous comments, I've revised the validator to describe itself 
> in the description. For the StringSet validator, the description of the conf 
> will start with something like "Expects one of [textfile, sequencefile, 
> rcfile, orc]." and for TimeValidator it's "Expects a numeric value with 
> timeunit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec)", etc. 
> That's why some parts of the descriptions were removed. Could you generate 
> the template and see the result? (cd common; mvn clean package -Phadoop-2 
> -Pdist -DskipTests). If you don't like this, I'll revert it.

Navis, that is cool to the nth degree!  I applied patch 15, generated a 
template file, and checked each parameter changed by the patch.  All the 
"Expects" phrases look great.

However, non-numeric values are lowercase.  For example, 
hive.exec.orc.encoding.strategy used to say the values are SPEED and 
COMPRESSION, but now it's "Expects one of [speed, compression]."  Are all 
parameter values case-insensitive?  If so, the Configuration Properties & 
Configuration docs should mention it.
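
[Editorial aside: a minimal sketch of the self-describing validator idea Navis 
describes, including the lowercasing that would explain the case-insensitive 
values. The names here (Validator, StringSet, toDescription) are illustrative, 
not Hive's exact API.]

{code:java}
import java.util.LinkedHashSet;
import java.util.Set;

// A validator that can both check a value and describe what it expects,
// so an "Expects ..." phrase can be prepended to the config description.
interface Validator {
  String validate(String value);   // null if valid, otherwise an error message
  String toDescription();          // e.g. "Expects one of [speed, compression]."
}

class StringSet implements Validator {
  private final Set<String> expected = new LinkedHashSet<String>();

  StringSet(String... values) {
    for (String v : values) {
      expected.add(v.toLowerCase());   // stored lowercase => case-insensitive match
    }
  }

  public String validate(String value) {
    if (value == null || !expected.contains(value.toLowerCase())) {
      return "Invalid value: expects one of " + expected;
    }
    return null;
  }

  public String toDescription() {
    return "Expects one of " + expected + ".";
  }
}
{code}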

Two parameters still give units in their descriptions, although that seems to 
be deliberate:

  - hive.server2.long.polling.timeout:  "Time in milliseconds that HiveServer2 
will wait, ..." (has a non-zero default value, in milliseconds)
  - hive.support.quoted.identifiers:  "Whether to use quoted identifier. 'none' 
or 'column' can be used." (goes on to explain what 'none' and 'column' mean)


- Lefty


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51760
---


On Aug. 28, 2014, 2:31 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15449/
> ---
> 
> (Updated Aug. 28, 2014, 2:31 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-5799
> https://issues.apache.org/jira/browse/HIVE-5799
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Need some timeout facility for preventing resource leakage from unstable or 
> bad clients.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/ant/GenHiveTemplate.java 4293b7c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
>  PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
>   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 0d6436e 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 2867301 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 270e4a6 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 84e1c7e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 4e5f595 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  39d2184 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> 17c1c7b 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
> 
> Diff: https://reviews.apache.org/r/15449/diff/
> 
> 
> Testing
> ---
> 
> Confirmed in the local environment.
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



Re: Review Request 25176: HIVE-7870: Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25176/
---

(Updated Aug. 29, 2014, 6:44 a.m.)


Review request for hive, Brock Noland, Szehon Ho, and Xuefu Zhang.


Changes
---

1. Add .q.out files for the TestCliDriver test for all new Spark .q tests
2. Update existing .q.out files because of the plan change


Bugs: HIVE-7870
https://issues.apache.org/jira/browse/HIVE-7870


Repository: hive-git


Description
---

HIVE-7870: Insert overwrite table query does not generate correct task plan 
[Spark Branch]

The cause of this problem is that during Spark/Tez task generation, the union 
file sink operator is cloned into two new FileSink operators. The 
linkedFileSinkDesc info for those new FileSink operators is missing. In 
addition, the two new FileSink operators also need to be linked together.
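
[Editorial aside: a rough illustration of the fix described above, showing the 
two cloned FileSink operators being given shared linked-file-sink metadata and 
pointed at each other. The class and field names are simplified stand-ins, not 
Hive's actual FileSinkDesc API.]

{code:java}
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the descriptor carried by a FileSink operator.
class FileSinkDescSketch {
  boolean linkedFileSink;                       // marks the sink as one of a linked group
  List<FileSinkDescSketch> linkedFileSinkDesc;  // descs that share one final directory
}

class LinkClonedSinks {
  // Restore the linkage that is lost when the union file sink is cloned.
  static void link(FileSinkDescSketch clone1, FileSinkDescSketch clone2) {
    List<FileSinkDescSketch> group = Arrays.asList(clone1, clone2);
    clone1.linkedFileSink = true;
    clone2.linkedFileSink = true;
    clone1.linkedFileSinkDesc = group;   // both clones see the whole group,
    clone2.linkedFileSinkDesc = group;   // so merge/move work treats them as one output
  }
}
{code}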


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 6393671 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 9c808d4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
5ddc16d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 379a39c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 76fc290 
  ql/src/test/queries/clientpositive/union_remove_spark_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_10.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_11.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_15.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_16.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_17.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_18.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_19.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_20.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_21.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_24.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_25.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_7.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_8.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_9.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/sample8.q.out c7e333b 
  ql/src/test/results/clientpositive/spark/union10.q.out 20c681e 
  ql/src/test/results/clientpositive/spark/union18.q.out 3f37a0a 
  ql/src/test/results/clientpositive/spark/union19.q.out 6922fcd 
  ql/src/test/results/clientpositive/spark/union28.q.out 8bd5218 
  ql/src/test/results/clientpositive/spark/union29.q.out b9546ef 
  ql/src/test/results/clientpositive/spark/union3.q.out 3ae6536 
  ql/src/test/results/clientpositive/spark/union30.q.out 12717a1 
  ql/src/test/results/clientpositive/spark/union33.q.out b89757f 
  ql/src/test/results/clientpositive/spark/union4.q.out 6341cd9 
  ql/src/test/results/clientpositive/spark/union6.q.out 263d9f4 
  ql/src/test/results/clientpositive/spark/union_remove_spark_1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_10.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_11.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_15.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_16.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_17.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_18.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_19.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_20.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_21.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_24.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_25.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_4.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_5.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/s


[jira] [Commented] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114933#comment-14114933
 ] 

Gopal V commented on HIVE-7907:
---

Looks like this has to wait till the 0.5.0-SNAPSHOT gets updated on the Apache 
snapshots repository.

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7870) Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7870:
--

Attachment: HIVE-7870.3-spark.patch

> Insert overwrite table query does not generate correct task plan [Spark 
> Branch]
> ---
>
> Key: HIVE-7870
> URL: https://issues.apache.org/jira/browse/HIVE-7870
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Na Yang
>Assignee: Na Yang
>  Labels: Spark-M1
> Attachments: HIVE-7870.1-spark.patch, HIVE-7870.2-spark.patch, 
> HIVE-7870.3-spark.patch
>
>
> Insert overwrite table query does not generate correct task plan when 
> hive.optimize.union.remove and hive.merge.sparkfiles properties are ON. 
> {noformat}
> set hive.optimize.union.remove=true
> set hive.merge.sparkfiles=true
> insert overwrite table outputTbl1
> SELECT * FROM
> (
> select key, 1 as values from inputTbl1
> union all
> select * FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, 2 as values from inputTbl1
> ) a
> )b;
> select * from outputTbl1 order by key, values;
> {noformat}
> query result
> {noformat}
> 1 1
> 1 2
> 2 1
> 2 2
> 3 1
> 3 2
> 7 1
> 7 2
> 8 2
> 8 2
> 8 2
> {noformat}
> expected result:
> {noformat}
> 1 1
> 1 1
> 1 2
> 2 1
> 2 1
> 2 2
> 3 1
> 3 1
> 3 2
> 7 1
> 7 1
> 7 2
> 8 1
> 8 1
> 8 2
> 8 2
> 8 2
> {noformat}
> The move work is not functioning properly, and some data goes missing 
> during the move.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114925#comment-14114925
 ] 

Gunther Hagleitner commented on HIVE-7907:
--

+1

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24688: parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-28 Thread Szehon Ho


> On Aug. 28, 2014, 6:05 a.m., Szehon Ho wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1040
> > 
> >
> > Yep, thats what I meant.
> 
> Navis Ryu wrote:
> I think this option is not that useful. Any number of reducers bigger than 
> one (the default for order-by) will give better performance, so why don't 
> we try with that?

You mean get rid of the error check?  I was just trying to make this option 
easier to use; if we aren't going to expose it, I'm OK with that.


- Szehon


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24688/#review51747
---


On Aug. 27, 2014, 2:18 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24688/
> ---
> 
> (Updated Aug. 27, 2014, 2:18 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-7669
> https://issues.apache.org/jira/browse/HIVE-7669
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The source table has 600 million rows and a String column "l_shipinstruct" 
> that has 4 unique values (i.e., these 4 values are repeated across the 600 
> million rows).
> 
> We are sorting it based on this string column "l_shipinstruct", as shown in 
> the HiveQL below with the following parameters.
> {code:sql}
> set hive.optimize.sampling.orderby=true;
> set hive.optimize.sampling.orderby.number=1000;
> set hive.optimize.sampling.orderby.percent=0.1f;
> 
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_shipinstruct;
> {code}
> Stack Trace
> Diagnostic Messages for this Task:
> {noformat}
> Error: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
> at 
> org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
> ... 15 more
> Caused by: java.io.IOException: Split points are out of order
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
> ... 17 more
> {noformat}
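
[Editorial aside: a hedged sketch of the failure mode behind "Split points are 
out of order". With only 4 distinct values in the sampled column, the sampled 
keys repeat, and TotalOrderPartitioner requires strictly increasing split 
points. The helper below is illustrative, not Hive's actual 
PartitionKeySampler code.]

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class SplitPointSketch {
  // Naive sampling: pick numReducers - 1 evenly spaced keys from the sorted
  // samples. With few distinct values, adjacent picks collide, e.g.
  // ["a", "a", "b"], which TotalOrderPartitioner rejects.
  static List<String> naiveSplits(List<String> sortedSamples, int numReducers) {
    List<String> splits = new ArrayList<String>();
    for (int i = 1; i < numReducers; i++) {
      splits.add(sortedSamples.get(i * sortedSamples.size() / numReducers));
    }
    return splits;
  }

  // One possible remedy: deduplicate before writing the partition file, at
  // the cost of fewer effective reducers than requested.
  static List<String> dedupedSplits(List<String> sortedSamples, int numReducers) {
    return new ArrayList<String>(
        new TreeSet<String>(naiveSplits(sortedSamples, numReducers)));
  }
}
{code}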
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java 
> 6c22362 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java 166461a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java ef72039 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestPartitionKeySampler.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24688/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Commented] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114919#comment-14114919
 ] 

Szehon Ho commented on HIVE-7775:
-

Hi Chengxiang, sorry, do you mind opening a new JIRA, as this one is already 
resolved?  It's one JIRA per commit.

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch, 
> HIVE-7775.3-spark.additional.patch
>
>
> sample8.q contains a join query; this qtest should be enabled after Hive on 
> Spark supports join operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7908) CBO: Handle Windowing functions part of expressions

2014-08-28 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7908:
-

Status: Patch Available  (was: Open)

> CBO: Handle Windowing functions part of expressions
> ---
>
> Key: HIVE-7908
> URL: https://issues.apache.org/jira/browse/HIVE-7908
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7908.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7908) CBO: Handle Windowing functions part of expressions

2014-08-28 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-7908:
-

Attachment: HIVE-7908.patch

> CBO: Handle Windowing functions part of expressions
> ---
>
> Key: HIVE-7908
> URL: https://issues.apache.org/jira/browse/HIVE-7908
> Project: Hive
>  Issue Type: Bug
>Reporter: Laljo John Pullokkaran
>Assignee: Laljo John Pullokkaran
> Attachments: HIVE-7908.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7649) Support column stats with temporary tables

2014-08-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7649:
-

Attachment: HIVE-7649.4.patch

Patch v4, with changes per review comments from Prasanth.

> Support column stats with temporary tables
> --
>
> Key: HIVE-7649
> URL: https://issues.apache.org/jira/browse/HIVE-7649
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-7649.1.patch, HIVE-7649.2.patch, HIVE-7649.3.patch, 
> HIVE-7649.4.patch
>
>
> Column stats are currently not supported with temp tables; see if they can 
> be added.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24472: HIVE-7649: Support column stats with temporary tables

2014-08-28 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24472/
---

(Updated Aug. 29, 2014, 6:13 a.m.)


Review request for hive and Prasanth_J.


Changes
---

Addressing review feedback from Prasanth


Bugs: HIVE-7649
https://issues.apache.org/jira/browse/HIVE-7649


Repository: hive-git


Description
---

Update SessionHiveMetastoreClient to get column stats to work for temp tables.
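
[Editorial aside: a hedged sketch of the approach named above. Temp tables 
live only in the session, so their column stats can be kept in a 
session-local map instead of being sent to the metastore. Everything here is 
an illustrative stand-in, not the actual SessionHiveMetaStoreClient code.]

{code:java}
import java.util.HashMap;
import java.util.Map;

class SessionColumnStatsSketch {
  // key: "dbName.tableName", value: per-column stats object
  private final Map<String, Object> tempTableColStats = new HashMap<String, Object>();

  boolean updateTableColumnStatistics(String db, String table, Object stats) {
    String key = db + "." + table;
    if (isTempTable(key)) {
      tempTableColStats.put(key, stats);   // never touches the real metastore
      return true;
    }
    return delegateToMetastore(stats);     // normal table: usual RPC path
  }

  // Placeholder checks; the real client would consult its session table registry.
  private boolean isTempTable(String key) { return key.startsWith("default.tmp_"); }
  private boolean delegateToMetastore(Object stats) { return true; }
}
{code}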


Diffs (updated)
-

  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
51c3f2c 
  
ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 
4cf98d8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 
3f8648b 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 9798cf3 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java 7cb7c5e 
  ql/src/test/queries/clientnegative/temp_table_column_stats.q 9b7aa4a 
  ql/src/test/queries/clientpositive/temp_table_display_colstats_tbllvl.q 
PRE-CREATION 
  ql/src/test/results/clientnegative/temp_table_column_stats.q.out 486597a 
  ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/24472/diff/


Testing
---


Thanks,

Jason Dere



[jira] [Commented] (HIVE-7497) Fix some default values in HiveConf

2014-08-28 Thread Dong Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114902#comment-14114902
 ] 

Dong Chen commented on HIVE-7497:
-

[~vgumashta] Thanks for taking care of it. I'm ok with it and please go ahead. 
Thanks :)

> Fix some default values in HiveConf
> ---
>
> Key: HIVE-7497
> URL: https://issues.apache.org/jira/browse/HIVE-7497
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Dong Chen
>  Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-7497.1.patch, HIVE-7497.patch
>
>
> HIVE-5160 resolves an env variable at runtime via calling System.getenv(). 
> As long as the variable is not defined when you run the build, null is 
> returned and the path is not placed in hive-default.template. However, if 
> it is defined, it will populate hive-default.template with a path that 
> differs based on the user running the build. We should use 
> $\{system:HIVE_CONF_DIR\} instead.
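
[Editorial aside: a tiny illustrative sketch of the difference, assuming the 
template generator runs as part of the build; the class name is hypothetical. 
A System.getenv() call bakes the build machine's environment into the 
generated template, while a ${system:...} placeholder defers resolution until 
the config is loaded.]

{code:java}
public class TemplateDefaultSketch {
  public static void main(String[] args) {
    // Resolved while the build runs: null, or a path specific to the builder.
    String bakedIn = System.getenv("HIVE_CONF_DIR");
    // Left as a placeholder: substituted at runtime on each user's machine.
    String deferred = "${system:HIVE_CONF_DIR}";
    System.out.println(bakedIn + " vs " + deferred);
  }
}
{code}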



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7857) Hive query fails after Tez session times out

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114901#comment-14114901
 ] 

Hive QA commented on HIVE-7857:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665141/HIVE-7857.2.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6127 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_stats_counter
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/552/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/552/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-552/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665141

> Hive query fails after Tez session times out
> 
>
> Key: HIVE-7857
> URL: https://issues.apache.org/jira/browse/HIVE-7857
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7857.1.patch, HIVE-7857.2.patch
>
>
> Originally reported by [~deepesh]
> Steps to reproduce:
> # Open the Hive CLI and ensure that HIVE_AUX_JARS_PATH has hcatalog-core.jar 
> in the path.
> # Keep it idle for more than 5 minutes (the default Tez session timeout), so 
> that the Tez session times out.
> # Run a Hive on Tez query; the query fails. Here is a sample CLI session:
> {noformat}
> hive> select from_unixtime(unix_timestamp(), "dd-MMM-") from 
> vectortab10korc limit 1;
> Query ID = hrt_qa_20140626002525_6e964079-4031-406b-85ed-cda9c65dca22
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (application id: application_1403688364015_1930)
> Map 1: -/-
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1403688364015_1930_1_00, 
> diagnostics=[Task failed, taskId=task_1403688364015_1930_1_00_00, 
> diagnostics=[AttemptID:attempt_1403688364015_1930_1_00_00_0 
> Info:Container container_1403688364015_1930_01_02 COMPLETED with 
> diagnostics set to [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_1 Info:Container 
> container_1403688364015_1930_01_03 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_2 Info:Container 
> container_1403688364015_1930_01_04 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_3 Info:Container 
> container_1403688364015_1930_01_05 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ]], Vertex failed as one or more tasks failed. failedTasks:1]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from 
> org.apach

Re: Review Request 24688: parallel order by clause on a string column fails with IOException: Split points are out of order

2014-08-28 Thread Navis Ryu


> On Aug. 28, 2014, 6:05 a.m., Szehon Ho wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 1040
> > 
> >
> > Yep, thats what I meant.

I think this option is not that useful. Any number of reducers bigger than 
one (the default for order-by) will give better performance, so why don't we 
try with that?


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24688/#review51747
---


On Aug. 27, 2014, 2:18 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24688/
> ---
> 
> (Updated Aug. 27, 2014, 2:18 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-7669
> https://issues.apache.org/jira/browse/HIVE-7669
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The source table has 600 million rows and a String column "l_shipinstruct" 
> that has 4 unique values (i.e., these 4 values are repeated across the 600 
> million rows).
> 
> We are sorting it based on this string column "l_shipinstruct", as shown in 
> the HiveQL below with the following parameters.
> {code:sql}
> set hive.optimize.sampling.orderby=true;
> set hive.optimize.sampling.orderby.number=1000;
> set hive.optimize.sampling.orderby.percent=0.1f;
> 
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_shipinstruct;
> {code}
> Stack Trace
> Diagnostic Messages for this Task:
> {noformat}
> Error: java.lang.RuntimeException: Error in configuring object
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
> at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
> at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
> at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.<init>(MapTask.java:569)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
> ... 10 more
> Caused by: java.lang.IllegalArgumentException: Can't read partitions file
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:116)
> at 
> org.apache.hadoop.mapred.lib.TotalOrderPartitioner.configure(TotalOrderPartitioner.java:42)
> at 
> org.apache.hadoop.hive.ql.exec.HiveTotalOrderPartitioner.configure(HiveTotalOrderPartitioner.java:37)
> ... 15 more
> Caused by: java.io.IOException: Split points are out of order
> at 
> org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:96)
> ... 17 more
> {noformat}
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7f4afd9 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java 
> 6c22362 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionKeySampler.java 166461a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java ef72039 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestPartitionKeySampler.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24688/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Created] (HIVE-7908) CBO: Handle Windowing functions part of expressions

2014-08-28 Thread Laljo John Pullokkaran (JIRA)
Laljo John Pullokkaran created HIVE-7908:


 Summary: CBO: Handle Windowing functions part of expressions
 Key: HIVE-7908
 URL: https://issues.apache.org/jira/browse/HIVE-7908
 Project: Hive
  Issue Type: Bug
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7568) Hive Cli cannot execute query from files when file has BOM character

2014-08-28 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7568:


Attachment: HIVE-7568.3.patch.txt

> Hive Cli cannot execute query from files when file has BOM character
> 
>
> Key: HIVE-7568
> URL: https://issues.apache.org/jira/browse/HIVE-7568
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Yoonseok Woo
>Assignee: Yoonseok Woo
>Priority: Minor
> Attachments: HIVE-7568.2.patch, HIVE-7568.3.patch.txt, HIVE-7568.patch
>
>
> # query file with BOM
> {code:sql|title=test.sql|borderStyle=solid}
> select 1
> {code}
> # execute
> {code}
> $ bin/hive -f ./test.sql
> FAILED: ParseException line 1:0 character 'Ô' not supported here
> line 1:1 character 'ª' not supported here
> line 1:2 character 'Ø' not supported here
> {code}
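
[Editorial aside: a minimal sketch of one way to handle this, hedged; it is 
not necessarily the approach taken in the attached patches. The idea is to 
skip a UTF-8 byte order mark before handing the script to the parser, since 
the parser otherwise sees the BOM bytes as unsupported characters.]

{code:java}
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class BomAwareScriptReader {
  private static final char UTF8_BOM = '\uFEFF';

  public static BufferedReader open(String path) throws IOException {
    BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(path), "UTF-8"));
    reader.mark(1);
    if (reader.read() != UTF8_BOM) {
      reader.reset();   // no BOM: rewind to the first real character
    }
    return reader;
  }
}
{code}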



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114837#comment-14114837
 ] 

Hive QA commented on HIVE-4329:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665137/HIVE-4329.3.patch

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 6153 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.testPigPopulation
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[4]
org.apache.hive.hcatalog.mapreduce.TestHCatDynamicPartitioned.testHCatDynamicPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[4]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatExternalDynamicCustomLocation[4]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalNonPartitioned.testHCatNonPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatExternalPartitioned.testHCatPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask[4]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableDynamicPartitioned.testHCatDynamicPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatMutableNonPartitioned.testHCatNonPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatMutablePartitioned.testHCatPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatNonPartitioned.testHCatNonPartitionedTable[4]
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[4]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/550/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/550/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-550/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665137

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
> HIVE-4329.3.patch
>
>
> Attempting to write to an HCatalog-defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutp
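
[Editorial aside: a hedged sketch of the mismatch described above, using the 
real Hadoop Writable types but simplified stand-ins for the HCatalog and Avro 
classes.]

{code:java}
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;

class WritableMismatchDemo {
  // Stands in for the output format's writer, which assumes LongWritable keys.
  static void write(Writable key) {
    LongWritable k = (LongWritable) key;   // ClassCastException when key is a NullWritable
  }

  public static void main(String[] args) {
    write(NullWritable.get());   // what the record writer container effectively passes
  }
}
{code}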

[jira] [Commented] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-08-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114803#comment-14114803
 ] 

Lefty Leverenz commented on HIVE-7482:
--

+1 for doc issues fixed in patch 3

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7907:
--

Status: Patch Available  (was: Open)

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114795#comment-14114795
 ] 

Gopal V commented on HIVE-7907:
---

[~hagleitn]/[~vikram.dixit]: can you take a look at this compat patch?

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7907:
--

Attachment: HIVE-7907.1-tez.patch

> Bring up tez branch to changes in TEZ-1038, TEZ-1500
> 
>
> Key: HIVE-7907
> URL: https://issues.apache.org/jira/browse/HIVE-7907
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-7907.1-tez.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7907) Bring up tez branch to changes in TEZ-1038, TEZ-1500

2014-08-28 Thread Gopal V (JIRA)
Gopal V created HIVE-7907:
-

 Summary: Bring up tez branch to changes in TEZ-1038, TEZ-1500
 Key: HIVE-7907
 URL: https://issues.apache.org/jira/browse/HIVE-7907
 Project: Hive
  Issue Type: Sub-task
  Components: Tez
Affects Versions: tez-branch
Reporter: Gopal V
Assignee: Gopal V






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7902) Cleanup hbase-handler/pom.xml dependency list

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114786#comment-14114786
 ] 

Hive QA commented on HIVE-7902:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665078/HIVE-7902.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6127 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/549/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/549/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-549/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665078

> Cleanup hbase-handler/pom.xml dependency list
> -
>
> Key: HIVE-7902
> URL: https://issues.apache.org/jira/browse/HIVE-7902
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7902.1.patch
>
>
> Noticed an extra dependency, {{hive-service}}, when changing the dependency 
> version of {{hive-hbase-handler}} from 0.12.0 to 0.13.0 in a third-party 
> application. Tracing the log of the hbase-handler/pom.xml file, it was added 
> as part of the Ant-to-Maven migration and not because of any specific 
> functionality requirement. The {{hive-service}} dependency is not needed in 
> {{hive-hbase-handler}} and can be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-28 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7775:


Status: Patch Available  (was: Reopened)

Added an additional patch. Some stats changed in the explain part; it's 
consistent with the sample8.q output in MR mode now.

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch, 
> HIVE-7775.3-spark.additional.patch
>
>
> sample8.q contains a join query; this qtest should be enabled after Hive on 
> Spark supports join operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-28 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-7775:


Attachment: HIVE-7775.3-spark.additional.patch

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch, 
> HIVE-7775.3-spark.additional.patch
>
>
> sample8.q contains a join query; this qtest should be enabled after Hive on 
> Spark supports join operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-28 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li reopened HIVE-7775:
-


Reopening this issue since sample8.q failed in the current automated test run.

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch, 
> HIVE-7775.3-spark.additional.patch
>
>
> sample8.q contains a join query; this qtest should be enabled after Hive on 
> Spark supports join operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7846) authorization api should support group, not assume case insensitive role names

2014-08-28 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114780#comment-14114780
 ] 

Jason Dere commented on HIVE-7846:
--

+1

> authorization api should support group, not assume case insensitive role names
> --
>
> Key: HIVE-7846
> URL: https://issues.apache.org/jira/browse/HIVE-7846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7846.1.patch
>
>
> The case-insensitive behavior of roles should be specific to SQL standard 
> authorization.
> The group principal type should also be disabled at the SQL standard 
> authorization layer, instead of being disallowed at the API level.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7906) Missing Index on Hive metastore query

2014-08-28 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HIVE-7906:
---

Attachment: HIVE-456.patch.txt

> Missing Index on Hive metastore query
> -
>
> Key: HIVE-7906
> URL: https://issues.apache.org/jira/browse/HIVE-7906
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Chu Tong
> Attachments: HIVE-456.patch.txt
>
>
> When a SELECT statement runs on a table with a large number of partitions, 
> the query in the Word document attached below causes major performance 
> degradation. Adding this missing index turns the index scan into a seek.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7906) Missing Index on Hive metastore query

2014-08-28 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HIVE-7906:
---

Description: When a SELECT statement runs on a table with a large number of 
partitions on Windows Azure DB, the query in the Word document attached below 
causes major performance degradation. Adding this missing index turns the 
index scan into a seek.  (was: When it comes to SELECT statement on a table 
with large number of partitions, the query in the word document below causes 
major performance degradation. Adding this missing index to turn index scan 
into seek.)

> Missing Index on Hive metastore query
> -
>
> Key: HIVE-7906
> URL: https://issues.apache.org/jira/browse/HIVE-7906
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Chu Tong
> Attachments: HIVE-456.patch.txt
>
>
> For SELECT statements on a table with a large number of partitions on 
> Windows Azure DB, the query in the attached Word document causes major 
> performance degradation. Adding this missing index turns the index scan into 
> a seek.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7906) Missing Index on Hive metastore query

2014-08-28 Thread Chu Tong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HIVE-7906:
---

Status: Patch Available  (was: Open)

> Missing Index on Hive metastore query
> -
>
> Key: HIVE-7906
> URL: https://issues.apache.org/jira/browse/HIVE-7906
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.13.1
>Reporter: Chu Tong
> Attachments: HIVE-456.patch.txt
>
>
> For SELECT statements on a table with a large number of partitions on 
> Windows Azure DB, the query in the attached Word document causes major 
> performance degradation. Adding this missing index turns the index scan into 
> a seek.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7906) Missing Index on Hive metastore query

2014-08-28 Thread Chu Tong (JIRA)
Chu Tong created HIVE-7906:
--

 Summary: Missing Index on Hive metastore query
 Key: HIVE-7906
 URL: https://issues.apache.org/jira/browse/HIVE-7906
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.13.1
Reporter: Chu Tong


For SELECT statements on a table with a large number of partitions, the query 
in the attached Word document causes major performance degradation. Adding 
this missing index turns the index scan into a seek.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7775) enable sample8.q.[Spark Branch]

2014-08-28 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114758#comment-14114758
 ] 

Chengxiang Li commented on HIVE-7775:
-

Thanks, Szehon. I will take a look at this today.

> enable sample8.q.[Spark Branch]
> ---
>
> Key: HIVE-7775
> URL: https://issues.apache.org/jira/browse/HIVE-7775
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
> Fix For: spark-branch
>
> Attachments: HIVE-7775.1-spark.patch, HIVE-7775.2-spark.patch
>
>
> sample8.q contains a join query; this qtest should be enabled once Hive on 
> Spark supports join operations.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7897) ObjectStore not using getPassword() for JDO connection string

2014-08-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-7897:
-

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

> ObjectStore not using getPassword() for JDO connection string
> -
>
> Key: HIVE-7897
> URL: https://issues.apache.org/jira/browse/HIVE-7897
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Security
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 0.14.0
>
> Attachments: HIVE-7897.1.patch
>
>
> HIVE-7634 was supposed to give users the ability to avoid specifying the 
> metastore password in the Hive conf, by configuring a credential provider and 
> using the new getPassword() API. It looks like that jira missed the most 
> important change, which was to have ObjectStore use getPassword() when 
> populating the properties that are passed to JDO. 
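
The intended pattern is roughly as follows (a hedged sketch assuming Hadoop's 
Configuration.getPassword() API; the actual change is in the attached patch):

{code}
// Prefer a configured credential provider for the JDO connection password,
// falling back to the inline config value when no provider entry exists.
char[] pw = conf.getPassword("javax.jdo.option.ConnectionPassword");
if (pw != null) {
  prop.setProperty("javax.jdo.option.ConnectionPassword", new String(pw));
}
{code}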



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7811) Compactions need to update table/partition stats

2014-08-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7811:
-

Attachment: HIVE-7811.3.patch

> Compactions need to update table/partition stats
> 
>
> Key: HIVE-7811
> URL: https://issues.apache.org/jira/browse/HIVE-7811
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7811.3.patch
>
>
> Compactions should trigger stats recalculation for columns that already 
> have stats.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7811) Compactions need to update table/partition stats

2014-08-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7811:
-

Status: Patch Available  (was: Open)

> Compactions need to update table/partition stats
> 
>
> Key: HIVE-7811
> URL: https://issues.apache.org/jira/browse/HIVE-7811
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 0.13.1
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-7811.3.patch
>
>
> Compactions should trigger stats recalculation for columns that already 
> have stats.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7895) Storage based authorization should consider sticky bit for drop actions

2014-08-28 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114748#comment-14114748
 ] 

Jason Dere commented on HIVE-7895:
--

+1

> Storage based authorization should consider sticky bit for drop actions
> ---
>
> Key: HIVE-7895
> URL: https://issues.apache.org/jira/browse/HIVE-7895
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7895.1.patch, HIVE-7895.2.patch
>
>
> Storage-based authorization provides access control for metadata by giving 
> users permissions on metadata that are equivalent to the permissions the user 
> has on the corresponding data.
> However, when checking the permissions to drop a metadata object such as a 
> database, table or partition, it does not check whether the sticky bit is set 
> on the parent dir of the object's corresponding dir in HDFS.
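
For context, a minimal sketch of the kind of check being described (an assumed 
shape using the Hadoop FileSystem API, not the actual FileUtils change in the 
patch):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StickyBitCheck {
  // With the sticky bit set on the parent directory, only the owner of the
  // child (or of the parent) should be allowed to delete the child, even if
  // the user has write permission on the parent.
  public static boolean allowedToDelete(FileSystem fs, Path child, String user)
      throws IOException {
    FileStatus parent = fs.getFileStatus(child.getParent());
    if (!parent.getPermission().getStickyBit()) {
      return true; // no sticky bit: the normal write-permission check applies
    }
    FileStatus childStatus = fs.getFileStatus(child);
    return user.equals(childStatus.getOwner()) || user.equals(parent.getOwner());
  }
}
{code}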



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 25179: HIVE-7905: CBO: more cost model changes

2014-08-28 Thread Harish Butani

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25179/
---

Review request for hive, Gunther Hagleitner and John Pullokkaran.


Repository: hive-git


Description
---

CBO: more cost model changes


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveDefaultRelMetadataProvider.java
 2c08772 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdRowCount.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdSelectivity.java
 df70de2 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/stats/HiveRelMdUniqueKeys.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 6c293c8 

Diff: https://reviews.apache.org/r/25179/diff/


Testing
---

existing tests.


Thanks,

Harish Butani



Re: Review Request 25125: HIVE-7895 : Storage based authorization should consider sticky bit for drop actions

2014-08-28 Thread Jason Dere

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25125/#review51862
---

Ship it!


Ship It!

- Jason Dere


On Aug. 28, 2014, 8:16 p.m., Thejas Nair wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25125/
> ---
> 
> (Updated Aug. 28, 2014, 8:16 p.m.)
> 
> 
> Review request for hive, Jason Dere and Sushanth Sowmyan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-7895
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java f71bc3c 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestStorageBasedMetastoreAuthorizationDrops.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/security/TestStorageBasedMetastoreAuthorizationProvider.java
>  b447204 
>   
> ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java
>  ddbe30c 
> 
> Diff: https://reviews.apache.org/r/25125/diff/
> 
> 
> Testing
> ---
> 
> New tests included.
> 
> 
> Thanks,
> 
> Thejas Nair
> 
>



[jira] [Commented] (HIVE-7870) Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114739#comment-14114739
 ] 

Hive QA commented on HIVE-7870:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665158/HIVE-7870.2-spark.patch

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 6304 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_spark_9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union29
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union30
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union33
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/102/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/102/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-102/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665158

> Insert overwrite table query does not generate correct task plan [Spark 
> Branch]
> ---
>
> Key: HIVE-7870
> URL: https://issues.apache.org/jira/browse/HIVE-7870
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Na Yang
>Assignee: Na Yang
>  Labels: Spark-M1
> Attachments: HIVE-7870.1-spark.patch, HIVE-7870.2-spark.patch
>
>
> An insert overwrite table query does not generate a correct task plan when the 
> hive.optimize.union.remove and hive.merge.sparkfiles properties are ON. 
> {noformat}
> set hive.optimize.union.remove=true
> set hive.merge.sparkfiles=true
> insert overwrite table outputTbl1
> SELECT * FROM
> (
> select key, 1 as values from inputTbl1
> union all
> select * FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, 2 as values from inputTbl1
> ) a
> )b;
> select * from outputTbl1 order by key, values;
> {noformat}
> query result
> {noformat}
> 1 1
> 1 2
> 2 1
> 2 2
> 3 1
> 3 2
> 7 1
> 7 2
> 8 2
> 8 2
> 8 2
> {noformat}
> expected result:
> {noformat}
> 1 1
> 1 1
> 1 2
> 2 1
> 2 1
> 2 2
> 3 1
> 3 1
> 3 2
> 7 1
> 7 1
> 7 2
> 8 1
> 8 1
> 8 2
> 8 2
> 8 2
> {noformat}
> The move work is not functioning properly, and some data is lost during the 
> move.

[jira] [Updated] (HIVE-7905) CBO: more cost model changes

2014-08-28 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-7905:


Attachment: exp-backoff-vs-log-smoothing

> CBO: more cost model changes
> 
>
> Key: HIVE-7905
> URL: https://issues.apache.org/jira/browse/HIVE-7905
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Butani
>Assignee: Harish Butani
> Attachments: exp-backoff-vs-log-smoothing
>
>
> 1. For composite predicates, smooth the selectivity calculation using 
> +exponential backoff+. Thanks to [~mmokhtar] for this formula.
> {quote}
> Can you change the algorithm to use exponential back-off:
> ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8)
> As opposed to:
> ndv(pe0) * log(ndv(pe1)) * log(ndv(pe2))
> If we assume a selectivity of 0.7 for each store_sales join, then the join 
> selectivity can end up being 6.24285E-05, which is too low and eventually 
> results in a suboptimal plan.
> {quote}
> See attached picture.
> 2. In the case of Fact-Dim joins on the Dim primary key, we infer the join 
> cardinality as a filter on the Fact table:
> {code}
> join card = rowCount(Fact table) * selectivity(dim table)
> {code}
> Whether a Column is a Key is inferred based on either:
> * table rowCount = column ndv
> * (tbd shortly) table rowCount = (maxVal - minVal)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7905) CBO: more cost model changes

2014-08-28 Thread Harish Butani (JIRA)
Harish Butani created HIVE-7905:
---

 Summary: CBO: more cost model changes
 Key: HIVE-7905
 URL: https://issues.apache.org/jira/browse/HIVE-7905
 Project: Hive
  Issue Type: Sub-task
Reporter: Harish Butani
Assignee: Harish Butani


1. For composite predicates, smooth the selectivity calculation using 
+exponential backoff+. Thanks to [~mmokhtar] for this formula (a small sketch 
follows below).

{quote}
Can you change the algorithm to use exponential back-off:

ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8)

As opposed to:

ndv(pe0) * log(ndv(pe1)) * log(ndv(pe2))

If we assume a selectivity of 0.7 for each store_sales join, then the join 
selectivity can end up being 6.24285E-05, which is too low and eventually 
results in a suboptimal plan.
{quote}

See attached picture.

2. In the case of Fact-Dim joins on the Dim primary key, we infer the join 
cardinality as a filter on the Fact table:
{code}
join card = rowCount(Fact table) * selectivity(dim table)
{code}

Whether a Column is a Key is inferred based on either:
* table rowCount = column ndv
* (tbd shortly) table rowCount = (maxVal - minVal)
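
For illustration, a hypothetical helper implementing the backoff above (not 
the actual HiveRelMdSelectivity code; the ordering of the NDVs is an 
assumption here):

{code}
// Combine per-predicate NDVs with exponential backoff:
// ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8) ...
public static double combinedNdv(double[] ndvs) {
  double result = 1.0;
  double exponent = 1.0;
  for (double ndv : ndvs) {
    result *= Math.pow(ndv, exponent);
    exponent /= 2.0; // 1, 1/2, 1/4, 1/8, ...
  }
  return result;
}
{code}

For example, NDVs of 100, 50 and 10 combine to 100 * 50^(1/2) * 10^(1/4) ≈ 
1257, far less aggressive than the full product of 50,000, so the derived 
selectivity does not collapse toward zero the way straight multiplication does.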



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7853) Make OrcNewInputFormat return row number as a key

2014-08-28 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114731#comment-14114731
 ] 

Navis commented on HIVE-7853:
-

Those are not related to this.

> Make OrcNewInputFormat return row number as a key
> -
>
> Key: HIVE-7853
> URL: https://issues.apache.org/jira/browse/HIVE-7853
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
> Environment: all
>Reporter: john
>Assignee: Navis
>  Labels: Orc
> Attachments: HIVE-7853.1.patch.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The key is null in map() when OrcNewInputFormat is used as the input format 
> class.
> When using OrcNewInputFormat as the input format class for my MapReduce job, I 
> find the key is always null in my map method, which gives me no way to get the 
> row number there. By comparison, with RCFileInputFormat (for RC files) the 
> key in the map method returns the row number, so I know which row I am 
> processing. Is there any workaround for me to get the row number in my map 
> method? Of course, I can count the rows myself, but that has two problems: #1 
> I have to assume the rows arrive in order; #2 I will get duplicated 
> (and wrong) row numbers if a big input file causes multiple file splits 
> (which will trigger my map method multiple times on different data nodes). 
> At this point, I am really seeking a better way to get the row number for each 
> processed row in the map method.
> Here is what I have in my map logs:
>   [2014-08-06 09:39:25 DEBUG com..hadoop.orcfile.OrcFileMap]: Mapper 
> Input Key: (null)
>   [2014-08-06 09:39:25 DEBUG com..hadoop.orcfile.OrcFileMap]: Mapper 
> Input Value: {Q8151, T9976, 69976, 8156756, 966798161, 
> 97898989898, Laura, laura...@gmail.com}
> My map method is:
>   protected void map(Object key, Writable value, Context context)
>   throws IOException, InterruptedException {
>   logger.debug("Mapper Input Key: " + key);
>   logger.debug("Mapper Input Value: " + value.toString());
>   .
>   }
> The fix should be: add the following statement in the nextKeyValue() method 
> and pass the result all the way up to the map() method as its key:
>   reader.getRowNumber(); 
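
A hedged sketch of that suggestion (a fragment rather than the committed 
patch; {{rows}} stands for the wrapped ORC RecordReader and {{key}}/{{value}} 
for the reader's reusable fields):

{code}
// Fragment of a nextKeyValue() implementation that surfaces the row number.
// ORC's RecordReader.getRowNumber() reports the row that the next call to
// next() will return, so it is captured before advancing.
@Override
public boolean nextKeyValue() throws IOException, InterruptedException {
  long rowNumber = rows.getRowNumber();
  if (!rows.hasNext()) {
    return false;
  }
  value = (OrcStruct) rows.next(value);
  key.set(rowNumber); // key is a reusable org.apache.hadoop.io.LongWritable
  return true;
}
{code}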



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7895) Storage based authorization should consider sticky bit for drop actions

2014-08-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114732#comment-14114732
 ] 

Thejas M Nair commented on HIVE-7895:
-

Ran TestHadoop20SAuthBridge locally and it passed. The other two failed tests 
are known regulars.


> Storage based authorization should consider sticky bit for drop actions
> ---
>
> Key: HIVE-7895
> URL: https://issues.apache.org/jira/browse/HIVE-7895
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7895.1.patch, HIVE-7895.2.patch
>
>
> Storage-based authorization provides access control for metadata by giving 
> users permissions on metadata that are equivalent to the permissions the user 
> has on the corresponding data.
> However, when checking the permissions to drop a metadata object such as a 
> database, table or partition, it does not check whether the sticky bit is set 
> on the parent dir of the object's corresponding dir in HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-3026) List Bucketing in Hive

2014-08-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114697#comment-14114697
 ] 

Lefty Leverenz commented on HIVE-3026:
--

This needs a fix version.

> List Bucketing in Hive
> --
>
> Key: HIVE-3026
> URL: https://issues.apache.org/jira/browse/HIVE-3026
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Details are at:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> Please comment



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-28 Thread Suhas Satish (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114696#comment-14114696
 ] 

Suhas Satish commented on HIVE-7869:


Test failures are not related to the patch. 

> Long running tests (1) [Spark Branch]
> -
>
> Key: HIVE-7869
> URL: https://issues.apache.org/jira/browse/HIVE-7869
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
> Attachments: HIVE-7869-spark.patch
>
>
> I have noticed when running the full test suite locally that the test JVM 
> eventually crashes. We should do some testing (not part of the unit tests) 
> which starts up an HS2 and runs queries on it continuously for 24 hours or so.
> In this JIRA let's create a standalone Java program which connects to an HS2 
> over JDBC, creates a bunch of tables (say 100) and then runs queries until 
> the JDBC client is killed. This will allow us to run long-running tests.
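
A minimal sketch of such a client, assuming an HS2 endpoint at localhost:10000 
(the URL, credentials, table count, and query here are placeholders, not part 
of the attached patch):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HS2SoakTest {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://localhost:10000/default"; // assumed endpoint
    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement()) {
      // Create a bunch of tables (say 100).
      for (int i = 0; i < 100; i++) {
        stmt.execute("CREATE TABLE IF NOT EXISTS soak_t" + i
            + " (key INT, value STRING)");
      }
      // Run queries until the client is killed.
      while (true) {
        for (int i = 0; i < 100; i++) {
          try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM soak_t" + i)) {
            rs.next();
          }
        }
      }
    }
  }
}
{code}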



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114690#comment-14114690
 ] 

Hive QA commented on HIVE-7869:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665168/HIVE-7869-spark.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6266 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample8
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/101/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/101/console
Test logs: 
http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-101/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665168

> Long running tests (1) [Spark Branch]
> -
>
> Key: HIVE-7869
> URL: https://issues.apache.org/jira/browse/HIVE-7869
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
> Attachments: HIVE-7869-spark.patch
>
>
> I have noticed when running the full test suite locally that the test JVM 
> eventually crashes. We should do some testing (not part of the unit tests) 
> which starts up an HS2 and runs queries on it continuously for 24 hours or so.
> In this JIRA let's create a standalone Java program which connects to an HS2 
> over JDBC, creates a bunch of tables (say 100) and then runs queries until 
> the JDBC client is killed. This will allow us to run long-running tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 15449: session/operation timeout for hiveserver2

2014-08-28 Thread Navis Ryu


> On Aug. 28, 2014, 7:56 a.m., Lefty Leverenz wrote:
> >

Addressing previous comments, I've revised the validators to describe 
themselves in the description. For a StringSet validator, the description of 
the conf will start with something like "Expects one of [textfile, 
sequencefile, rcfile, orc].", and for TimeValidator it's "Expects a numeric 
value with timeunit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec)", 
etc. That is why part of the description was removed. Could you generate the 
template and see the result? (cd common; mvn clean package -Phadoop-2 -Pdist 
-DskipTests). If you don't like this, I'll revert it.
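
For illustration only, a small sketch of the idea (the class and method names 
here are assumptions, not the actual Validator API in 
common/src/java/org/apache/hadoop/hive/conf/Validator.java):

{code}
import java.util.Arrays;

public class SelfDescribingValidators {

  // e.g. "Expects one of [textfile, sequencefile, rcfile, orc]."
  static String stringSetDescription(String... allowed) {
    return "Expects one of " + Arrays.toString(allowed) + ".";
  }

  static String timeDescription() {
    return "Expects a numeric value with timeunit "
        + "(d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec).";
  }

  public static void main(String[] args) {
    // The validator's self-description is prefixed to the config's own text,
    // which is why fragments like "(in seconds)" could be dropped from HiveConf.
    System.out.println(stringSetDescription("textfile", "sequencefile", "rcfile", "orc")
        + " Default file format for CREATE TABLE statements.");
    System.out.println(timeDescription() + " Session timeout for HiveServer2.");
  }
}
{code}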


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15449/#review51760
---


On Aug. 28, 2014, 2:31 a.m., Navis Ryu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15449/
> ---
> 
> (Updated Aug. 28, 2014, 2:31 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-5799
> https://issues.apache.org/jira/browse/HIVE-5799
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Need some timeout facility for preventing resource leakages from instable or 
> bad clients.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/ant/GenHiveTemplate.java 4293b7c 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 74bb863 
>   common/src/java/org/apache/hadoop/hive/conf/Validator.java cea9c41 
>   
> itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHiveServer2SessionTimeout.java
>  PRE-CREATION 
>   service/src/java/org/apache/hive/service/cli/CLIService.java ff5de4a 
>   service/src/java/org/apache/hive/service/cli/OperationState.java 3e15f0c 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
> 0d6436e 
>   
> service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 
> 2867301 
>   service/src/java/org/apache/hive/service/cli/session/HiveSession.java 
> 270e4a6 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 
> 84e1c7e 
>   service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
> 4e5f595 
>   
> service/src/java/org/apache/hive/service/cli/session/HiveSessionImplwithUGI.java
>  39d2184 
>   service/src/java/org/apache/hive/service/cli/session/SessionManager.java 
> 17c1c7b 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java d01e819 
> 
> Diff: https://reviews.apache.org/r/15449/diff/
> 
> 
> Testing
> ---
> 
> Confirmed in the local environment.
> 
> 
> Thanks,
> 
> Navis Ryu
> 
>



[jira] [Commented] (HIVE-7895) Storage based authorization should consider sticky bit for drop actions

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114679#comment-14114679
 ] 

Hive QA commented on HIVE-7895:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12665055/HIVE-7895.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6129 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/548/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/548/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-548/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12665055

> Storage based authorization should consider sticky bit for drop actions
> ---
>
> Key: HIVE-7895
> URL: https://issues.apache.org/jira/browse/HIVE-7895
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7895.1.patch, HIVE-7895.2.patch
>
>
> Storage-based authorization provides access control for metadata by giving 
> users permissions on metadata that are equivalent to the permissions the user 
> has on the corresponding data.
> However, when checking the permissions to drop a metadata object such as a 
> database, table or partition, it does not check whether the sticky bit is set 
> on the parent dir of the object's corresponding dir in HDFS.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-08-28 Thread david serafini (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114672#comment-14114672
 ] 

david serafini commented on HIVE-7100:
--

https://reviews.apache.org/r/25178/

> Users of hive should be able to specify skipTrash when dropping tables.
> ---
>
> Key: HIVE-7100
> URL: https://issues.apache.org/jira/browse/HIVE-7100
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.13.0
>Reporter: Ravi Prakash
>Assignee: Jayesh
> Attachments: HIVE-7100.1.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, 
> HIVE-7100.4.patch, HIVE-7100.patch
>
>
> Users of our clusters are often running up against their quota limits because 
> of Hive tables. When they drop tables, they then have to manually delete the 
> files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
> should enable users to skipTrash directly when dropping tables.
> We should also be able to provide this functionality without polluting SQL 
> syntax.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-28 Thread Suhas Satish (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114665#comment-14114665
 ] 

Suhas Satish commented on HIVE-7869:


https://reviews.apache.org/r/25177/

> Long running tests (1) [Spark Branch]
> -
>
> Key: HIVE-7869
> URL: https://issues.apache.org/jira/browse/HIVE-7869
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
> Attachments: HIVE-7869-spark.patch
>
>
> I have noticed when running the full test suite locally that the test JVM 
> eventually crashes. We should do some testing (not part of the unit tests) 
> which starts up an HS2 and runs queries on it continuously for 24 hours or so.
> In this JIRA let's create a standalone Java program which connects to an HS2 
> over JDBC, creates a bunch of tables (say 100) and then runs queries until 
> the JDBC client is killed. This will allow us to run long-running tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)

2014-08-28 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-7803:
---

Status: Patch Available  (was: In Progress)

> Enable Hadoop speculative execution may cause corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch
>
>
> One of our users reports intermittent failures due to attempt 
> directories in the input paths. We found that with speculative execution 
> turned on, two mappers tried to commit the task at the same time using the 
> same committed-task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, it was pointed out that some HCat code 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
>     org.apache.hadoop.mapred.TaskAttemptContext currContext =
>         dynamicContexts.get(entry.getKey());
>     OutputCommitter baseOutputCommitter = entry.getValue();
>     if (baseOutputCommitter.needsTaskCommit(currContext)) {
>         baseOutputCommitter.commitTask(currContext);
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work stopped] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)

2014-08-28 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-7803 stopped by Selina Zhang.

> Enable Hadoop speculative execution may cause corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch
>
>
> One of our users reports intermittent failures due to attempt 
> directories in the input paths. We found that with speculative execution 
> turned on, two mappers tried to commit the task at the same time using the 
> same committed-task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, it was pointed out that some HCat code 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
>     org.apache.hadoop.mapred.TaskAttemptContext currContext =
>         dynamicContexts.get(entry.getKey());
>     OutputCommitter baseOutputCommitter = entry.getValue();
>     if (baseOutputCommitter.needsTaskCommit(currContext)) {
>         baseOutputCommitter.commitTask(currContext);
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Work started] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)

2014-08-28 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-7803 started by Selina Zhang.

> Enable Hadoop speculative execution may cause corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch
>
>
> One of our users reports intermittent failures due to attempt 
> directories in the input paths. We found that with speculative execution 
> turned on, two mappers tried to commit the task at the same time using the 
> same committed-task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, it was pointed out that some HCat code 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
>     org.apache.hadoop.mapred.TaskAttemptContext currContext =
>         dynamicContexts.get(entry.getKey());
>     OutputCommitter baseOutputCommitter = entry.getValue();
>     if (baseOutputCommitter.needsTaskCommit(currContext)) {
>         baseOutputCommitter.commitTask(currContext);
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)

2014-08-28 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-7803:
---

Attachment: HIVE-7803.2.patch

You are right, I forgot to set it to true for 
FileOutputCommitterContainer.needsTaskCommit(). I have updated the patch, and 
in this patch I also removed some lines we no longer need. 

Thanks for reviewing this!
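
For readers following the thread, a hedged illustration of what that one-line 
change amounts to (a fragment under the standard OutputCommitter contract, not 
the actual patch):

{code}
// Returning true defers the task commit to the MapReduce framework, which
// asks the ApplicationMaster for permission before commitTask() is invoked.
// Only one of two speculative attempts is granted the commit, so they can no
// longer race on the same committed-task path.
@Override
public boolean needsTaskCommit(TaskAttemptContext context) throws IOException {
  return true;
}
{code}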

> Enable Hadoop speculative execution may cause corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch, HIVE-7803.2.patch
>
>
> One of our users reports intermittent failures due to attempt 
> directories in the input paths. We found that with speculative execution 
> turned on, two mappers tried to commit the task at the same time using the 
> same committed-task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, it was pointed out that some HCat code 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
>     org.apache.hadoop.mapred.TaskAttemptContext currContext =
>         dynamicContexts.get(entry.getKey());
>     OutputCommitter baseOutputCommitter = entry.getValue();
>     if (baseOutputCommitter.needsTaskCommit(currContext)) {
>         baseOutputCommitter.commitTask(currContext);
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-28 Thread Suhas Satish (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suhas Satish updated HIVE-7869:
---

Attachment: HIVE-7869-spark.patch

> Long running tests (1) [Spark Branch]
> -
>
> Key: HIVE-7869
> URL: https://issues.apache.org/jira/browse/HIVE-7869
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
> Attachments: HIVE-7869-spark.patch
>
>
> I have noticed when running the full test suite locally that the test JVM 
> eventually crashes. We should do some testing (not part of the unit tests) 
> which starts up an HS2 and runs queries on it continuously for 24 hours or so.
> In this JIRA let's create a standalone Java program which connects to an HS2 
> over JDBC, creates a bunch of tables (say 100) and then runs queries until 
> the JDBC client is killed. This will allow us to run long-running tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7869) Long running tests (1) [Spark Branch]

2014-08-28 Thread Suhas Satish (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suhas Satish updated HIVE-7869:
---

Status: Patch Available  (was: Open)

> Long running tests (1) [Spark Branch]
> -
>
> Key: HIVE-7869
> URL: https://issues.apache.org/jira/browse/HIVE-7869
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Brock Noland
>Assignee: Suhas Satish
> Attachments: HIVE-7869-spark.patch
>
>
> I have noticed when running the full test suite locally that the test JVM 
> eventually crashes. We should do some testing (not part of the unit tests) 
> which starts up an HS2 and runs queries on it continuously for 24 hours or so.
> In this JIRA let's create a standalone Java program which connects to an HS2 
> over JDBC, creates a bunch of tables (say 100) and then runs queries until 
> the JDBC client is killed. This will allow us to run long-running tests.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Attachment: HIVE-7405.93.patch

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.
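
To illustrate why the single-key batch helps, a hypothetical fragment (names 
are illustrative only, not VectorGroupByOperator's actual internals):

{code}
// Every row in the batch shares the same group-by key, so aggregation is a
// tight loop over the column vector with no per-row key hashing or comparison.
static long sumForSingleKeyBatch(long[] columnVector, int batchSize) {
  long sum = 0;
  for (int i = 0; i < batchSize; i++) {
    sum += columnVector[i];
  }
  return sum;
}
{code}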



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Status: Open  (was: Patch Available)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch, 
> HIVE-7405.93.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7870) Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114608#comment-14114608
 ] 

Na Yang commented on HIVE-7870:
---

review board link: https://reviews.apache.org/r/25176/

A set of new .q tests is added to test the hive.merge.sparkfiles configuration 
property. 

> Insert overwrite table query does not generate correct task plan [Spark 
> Branch]
> ---
>
> Key: HIVE-7870
> URL: https://issues.apache.org/jira/browse/HIVE-7870
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Na Yang
>Assignee: Na Yang
>  Labels: Spark-M1
> Attachments: HIVE-7870.1-spark.patch, HIVE-7870.2-spark.patch
>
>
> An insert overwrite table query does not generate a correct task plan when the 
> hive.optimize.union.remove and hive.merge.sparkfiles properties are ON. 
> {noformat}
> set hive.optimize.union.remove=true
> set hive.merge.sparkfiles=true
> insert overwrite table outputTbl1
> SELECT * FROM
> (
> select key, 1 as values from inputTbl1
> union all
> select * FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, 2 as values from inputTbl1
> ) a
> )b;
> select * from outputTbl1 order by key, values;
> {noformat}
> query result
> {noformat}
> 1 1
> 1 2
> 2 1
> 2 2
> 3 1
> 3 2
> 7 1
> 7 2
> 8 2
> 8 2
> 8 2
> {noformat}
> expected result:
> {noformat}
> 1 1
> 1 1
> 1 2
> 2 1
> 2 1
> 2 2
> 3 1
> 3 1
> 3 2
> 7 1
> 7 1
> 7 2
> 8 1
> 8 1
> 8 2
> 8 2
> 8 2
> {noformat}
> The move work is not functioning properly, and some data is lost during the 
> move.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 25176: HIVE-7870: Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25176/
---

(Updated Aug. 28, 2014, 11:42 p.m.)


Review request for hive, Brock Noland, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-7870
https://issues.apache.org/jira/browse/HIVE-7870


Repository: hive-git


Description
---

HIVE-7870: Insert overwrite table query does not generate correct task plan 
[Spark Branch]

The cause of this problem is that during Spark/Tez task generation, the union 
FileSinkOperator is cloned into two new FileSinkOperators. The 
linkedFileSinkDesc info for those new FileSinkOperators is missing. In 
addition, the two new FileSinkOperators also need to be linked together.
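
A hedged sketch of the repair (method names such as setLinkedFileSink and 
setLinkedFileSinkDesc are modeled on FileSinkDesc's style but are assumptions 
here, not quotes from the patch):

{code}
// After cloning the union FileSinkOperator into two new FileSinkOperators,
// restore the linked-file-sink info and cross-link the two clones so that the
// later merge/move work sees both output directories.
FileSinkDesc desc1 = fileSinkClone1.getConf();
FileSinkDesc desc2 = fileSinkClone2.getConf();
List<FileSinkDesc> linked = new ArrayList<FileSinkDesc>();
linked.add(desc1);
linked.add(desc2);
desc1.setLinkedFileSink(true);
desc2.setLinkedFileSink(true);
desc1.setLinkedFileSinkDesc(linked);
desc2.setLinkedFileSinkDesc(linked);
{code}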


Diffs
-

  itests/src/test/resources/testconfiguration.properties 6393671 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 9c808d4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
5ddc16d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 379a39c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 76fc290 
  ql/src/test/queries/clientpositive/union_remove_spark_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_10.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_15.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_16.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_17.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_18.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_19.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_20.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_21.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_24.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_25.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_7.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_8.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_9.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_10.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_11.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_15.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_16.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_17.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_18.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_19.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_20.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_21.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_24.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_25.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_4.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_5.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_6.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_7.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_8.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_9.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/25176/diff/


Testing
---


Thanks,

Na Yang



Review Request 25176: HIVE-7870: Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25176/
---

Review request for hive, Brock Noland, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-7870
https://issues.apache.org/jira/browse/HIVE-7870


Repository: hive-git


Description
---

HIVE-7870: Insert overwrite table query does not generate correct task plan 
[Spark Branch]

The cause of this problem is that during Spark/Tez task generation, the union 
FileSinkOperator is cloned into two new FileSinkOperators. The linkedFileSinkDesc 
info for those new FileSinkOperators is missing. In addition, the two new 
FileSinkOperators also need to be linked together.


Diffs
-

  itests/src/test/resources/testconfiguration.properties 6393671 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 9c808d4 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
5ddc16d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 379a39c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 76fc290 
  ql/src/test/queries/clientpositive/union_remove_spark_1.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_10.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_15.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_16.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_17.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_18.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_19.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_2.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_20.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_21.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_24.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_25.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_7.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_8.q PRE-CREATION 
  ql/src/test/queries/clientpositive/union_remove_spark_9.q PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_1.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_10.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_11.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_15.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_16.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_17.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_18.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_19.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_2.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_20.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_21.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_24.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_25.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_3.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_4.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_5.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_6.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_7.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_8.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/spark/union_remove_spark_9.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/25176/diff/


Testing
---


Thanks,

Na Yang



Re: Hive Contributor request

2014-08-28 Thread Thejas Nair
Done. Looking forward to your contributions Suma!


On Thu, Aug 28, 2014 at 11:03 AM, Suma Shivaprasad
 wrote:
> Hi,
>
> Please add me to Hive contributor list
>
> Jira User name : suma.shivaprasad
>
> Thanks
> Suma



[jira] [Updated] (HIVE-7870) Insert overwrite table query does not generate correct task plan [Spark Branch]

2014-08-28 Thread Na Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Na Yang updated HIVE-7870:
--

Attachment: HIVE-7870.2-spark.patch

> Insert overwrite table query does not generate correct task plan [Spark 
> Branch]
> ---
>
> Key: HIVE-7870
> URL: https://issues.apache.org/jira/browse/HIVE-7870
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Na Yang
>Assignee: Na Yang
>  Labels: Spark-M1
> Attachments: HIVE-7870.1-spark.patch, HIVE-7870.2-spark.patch
>
>
> Insert overwrite table query does not generate correct task plan when 
> hive.optimize.union.remove and hive.merge.sparkfiles properties are ON. 
> {noformat}
> set hive.optimize.union.remove=true;
> set hive.merge.sparkfiles=true;
> insert overwrite table outputTbl1
> SELECT * FROM
> (
> select key, 1 as values from inputTbl1
> union all
> select * FROM (
>   SELECT key, count(1) as values from inputTbl1 group by key
>   UNION ALL
>   SELECT key, 2 as values from inputTbl1
> ) a
> )b;
> select * from outputTbl1 order by key, values;
> {noformat}
> query result
> {noformat}
> 1 1
> 1 2
> 2 1
> 2 2
> 3 1
> 3 2
> 7 1
> 7 2
> 8 2
> 8 2
> 8 2
> {noformat}
> expected result:
> {noformat}
> 1 1
> 1 1
> 1 2
> 2 1
> 2 1
> 2 2
> 3 1
> 3 1
> 3 2
> 7 1
> 7 1
> 7 2
> 8 1
> 8 1
> 8 2
> 8 2
> 8 2
> {noformat}
> The move work is not functioning properly, and some data is lost during the move.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6245) HS2 creates DBs/Tables with wrong ownership when HMS setugi is true

2014-08-28 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-6245:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Thank you so much Venki! I have committed this to trunk!

> HS2 creates DBs/Tables with wrong ownership when HMS setugi is true
> ---
>
> Key: HIVE-6245
> URL: https://issues.apache.org/jira/browse/HIVE-6245
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 0.12.0, 0.13.0
>Reporter: Chaoyu Tang
>Assignee: Venki Korukanti
> Fix For: 0.14.0
>
> Attachments: HIVE-6245.2.patch.txt, HIVE-6245.3.patch.txt, 
> HIVE-6245.4.patch, HIVE-6245.5.patch, HIVE-6245.patch
>
>
> The case with following settings is valid but does not work correctly in 
> current HS2:
> ==
> hive.server2.authentication=NONE (or LDAP)
> hive.server2.enable.doAs= true
> hive.metastore.sasl.enabled=false
> hive.metastore.execute.setugi=true
> ==
> Ideally, HS2 should be able to impersonate the logged-in user (from Beeline or a 
> JDBC application) and create DBs/Tables with that user's ownership.
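
As a quick illustration, here is the same combination of settings expressed 
programmatically on a HiveConf (equivalent to hive-site.xml entries). The 
property names are taken verbatim from the description above; the surrounding 
class is purely illustrative.

{code}
import org.apache.hadoop.hive.conf.HiveConf;

public class Hs2ImpersonationConfSketch {
  public static HiveConf build() {
    HiveConf conf = new HiveConf();
    conf.set("hive.server2.authentication", "NONE");   // or LDAP
    conf.set("hive.server2.enable.doAs", "true");      // impersonate the client
    conf.set("hive.metastore.sasl.enabled", "false");  // unsecured HMS transport
    conf.set("hive.metastore.execute.setugi", "true"); // HMS acts with client ugi
    return conf;
  }
}
{code}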



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7846) authorization api should support group, not assume case insensitive role names

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114576#comment-14114576
 ] 

Hive QA commented on HIVE-7846:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664210/HIVE-7846.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6128 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/547/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/547/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-547/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664210

> authorization api should support group, not assume case insensitive role names
> --
>
> Key: HIVE-7846
> URL: https://issues.apache.org/jira/browse/HIVE-7846
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-7846.1.patch
>
>
> The case-insensitive behavior of role names should be specific to SQL standard 
> authorization.
> The Group principal type should also be disabled at the SQL std authorization 
> layer, instead of being disallowed at the API level.
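
A tiny sketch of the distinction being drawn, with a hypothetical helper (this 
is not Hive code):

{code}
final class RoleNameMatchSketch {
  // Only the SQL standard authorization plugin treats "ADMIN" and "admin" as
  // the same role; a generic plugin sees role names as case-sensitive strings.
  static boolean sameRole(String a, String b, boolean sqlStdAuth) {
    return sqlStdAuth ? a.equalsIgnoreCase(b) : a.equals(b);
  }
}
{code}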



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7857) Hive query fails after Tez session times out

2014-08-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7857:
-

Attachment: HIVE-7857.2.patch

Fixes the failing test.

> Hive query fails after Tez session times out
> 
>
> Key: HIVE-7857
> URL: https://issues.apache.org/jira/browse/HIVE-7857
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.14.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7857.1.patch, HIVE-7857.2.patch
>
>
> Originally reported by [~deepesh]
> Steps to reproduce:
> Open the Hive CLI, ensure that HIVE_AUX_JARS_PATH has hcatalog-core.jar 
> in the path.
> Keep it idle for more than 5 minutes (this is the default tez session 
> timeout). Essentially Tez session should time out.
> Run a Hive on Tez query, the query fails. Here is a sample CLI session:
> {noformat}
> hive> select from_unixtime(unix_timestamp(), "dd-MMM-") from 
> vectortab10korc limit 1;
> Query ID = hrt_qa_20140626002525_6e964079-4031-406b-85ed-cda9c65dca22
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (application id: application_1403688364015_1930)
> Map 1: -/-
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Map 1: 0/1
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1403688364015_1930_1_00, 
> diagnostics=[Task failed, taskId=task_1403688364015_1930_1_00_00, 
> diagnostics=[AttemptID:attempt_1403688364015_1930_1_00_00_0 
> Info:Container container_1403688364015_1930_01_02 COMPLETED with 
> diagnostics set to [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_1 Info:Container 
> container_1403688364015_1930_01_03 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_2 Info:Container 
> container_1403688364015_1930_01_04 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ], AttemptID:attempt_1403688364015_1930_1_00_00_3 Info:Container 
> container_1403688364015_1930_01_05 COMPLETED with diagnostics set to 
> [Resource 
> hdfs://ambari-sec-1403670773-others-2-1.cs1cloud.internal:8020/tmp/hive-hrt_qa/_tez_session_dir/3d3ef758-90f3-4bb3-86cb-902aeb3b8830/hive-hcatalog-core-0.13.0.2.1.3.0-554.jar
>  changed on src filesystem (expected 1403741969169, was 1403742347351
> ]], Vertex failed as one or more tasks failed. failedTasks:1]
> DAG failed due to vertex failure. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Status: Patch Available  (was: In Progress)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.
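
For readers following along, a simplified sketch of that 4th mode's fast path. 
The types below are stand-ins for VectorizedRowBatch/LongColumnVector, and the 
sum aggregate is just an example; none of this is the operator's actual code.

{code}
final class OneKeyBatch {
  long key;       // the single grouping key carried by the whole batch
  long[] values;  // dense values column (stand-in for LongColumnVector)
  int size;       // number of rows in the batch
}

final class ReduceSideSumSketch {
  private long currentKey;
  private long sum;
  private boolean open;

  // Since every batch holds rows for exactly one key, aggregation is a tight
  // loop over the column array -- no per-row hash table lookups are needed.
  void processBatch(OneKeyBatch batch) {
    if (!open || batch.key != currentKey) {
      flush();
      currentKey = batch.key;
      sum = 0;
      open = true;
    }
    for (int i = 0; i < batch.size; i++) {
      sum += batch.values[i];
    }
  }

  void flush() {
    if (open) {
      // forward (currentKey, sum) to the child operator
    }
  }
}
{code}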



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Attachment: HIVE-7405.92.patch

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

2014-08-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-7405:
---

Status: In Progress  (was: Patch Available)

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> --
>
> Key: HIVE-7405
> URL: https://issues.apache.org/jira/browse/HIVE-7405
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, 
> HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch, 
> HIVE-7405.8.patch, HIVE-7405.9.patch, HIVE-7405.91.patch, HIVE-7405.92.patch
>
>
> Vectorize the basic case that does not have any count distinct aggregation.
> Add a 4th processing mode in VectorGroupByOperator for reduce where each 
> input VectorizedRowBatch has only values for one key at a time.  Thus, the 
> values in the batch can be aggregated quickly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-08-28 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-4329:
-

Attachment: HIVE-4329.3.patch

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, 
> HIVE-4329.3.patch
>
>
> Attempting to write to a HCatalog defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutputFormat to mandate a WritableComparable.
> It looks like accepting WritableComparable is what's done in the other Hive 
> OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
> be changed, since it's ignoring the key. That way fixing things so 
> FileRecordWriterContainer can always use NullWritable could get spun into a 
> different issue?
> The underlying cause for failure to write to AvroSerde tables is that 
> AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
> fixing the above will just push the failure into the placeholder RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.

2014-08-28 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24136/
---

(Updated Aug. 28, 2014, 10:51 p.m.)


Review request for hive.


Changes
---

Use Table.getOutputFormatClass to retrieve the OutputFormat class.


Bugs: HIVE-4329
https://issues.apache.org/jira/browse/HIVE-4329


Repository: hive-git


Description
---

HIVE-4329: HCatalog should use getHiveRecordWriter.


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
93a03adeab7ba3c3c91344955d303e4252005239 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java
 3a07b0ca7c1956d45e611005cbc5ba2464596471 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java
 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java
 4df912a935221e527c106c754ff233d212df9246 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1a7595fd6dd0a5ffbe529bc24015c482068233bf 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java
 2a883d6517bfe732b6a6dffa647d9d44e4145b38 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657cd1b16aec664aab3e22b430b304a3698d 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java
 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java
 b651cb323771843da43667016a7dd2c9d9a1ddac 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
 9b979395e47e54aac87487cb990824e3c3a2ee19 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java
 d83b003f9c16e78a39b3cc7ce810ff19f70848c2 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java
 5905b46178b510b3a43311739fea2b95f47b4ed7 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java
 b3ea76e6a79f94e09972bc060c06105f60087b71 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
 ee57f3fd126af2e36039f84686a4169ef6267593 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java
 a386415fb406bb0cda18f7913650874d6a236e21 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 7c9003e86c61dc9e4f10e05b0c29e40ded73c793 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 3b9bf433175309bac15fbe857b964aac857e06d8 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 
69545b046db06fd56f35a0da09d3d6960832484d 

Diff: https://reviews.apache.org/r/24136/diff/


Testing
---

Run unit tests.


Thanks,

David Chen



[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-08-28 Thread David Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114530#comment-14114530
 ] 

David Chen commented on HIVE-4329:
--

Hi Sushanth,

I really appreciate you taking your time to look at this patch and for your 
tips. However, I am still a bit unclear about some of the concerns you 
mentioned.

bq. Unfortunately, this will not work, because that simply fetches a substitute 
HiveOutputFormat from a map of substitutes, which contain substitutes for only 
IgnoreKeyTextOutputFormat and SequenceFileOutputFormat.

From my understanding, {{HivePassThroughOutputFormat}} was introduced in order 
to support generic OutputFormats and not just {{HiveOutputFormat}}. According 
to {{[HiveFileFormatUtils.getOutputFormatSubstitute|https://github.com/apache/hive/blob/b8250ac2f30539f6b23ce80a20a9e338d3d31458/ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java]}}, 
{{HivePassThroughOutputFormat}} is returned if the {{OutputFormat}} does not 
exist in the map, but only if it is called with {{storageHandlerFlag = true}}. 
From [searching the codebase|https://github.com/apache/hive/search?utf8=%E2%9C%93&q=getOutputFormatSubstitute&type=Code], 
the only place where {{getOutputFormatSubstitute}} could be called with 
{{storageHandlerFlag}} set to true is {{Table.getOutputFormatClass}}, and only 
if the {{storage_handler}} property is set.

As a result, I changed my patch to retrieve the {{OutputFormat}} class using 
{{Table.getOutputFormatClass}} so that HCatalog would follow the same codepath 
as Hive proper for getting the {{OutputFormat}}. Does this address your concern?

bq. If your patch were so that it fetches an underlying HiveOutputFormat, and 
if it were a HiveOutputFormat, using getHiveRecordWriter, and if it were not, 
using getRecordWriter, that solution would not break runtime backward 
compatibility, and would be acceptable

I tried this approach, but I think that it is cleaner to change 
{{OutputFormatContainer}} and {{RecordWriterContainer}} to wrap the Hive 
implementations ({{HiveOutputFormat}} and {{FileSinkOperator.RecordWriter}}) 
rather than introduce yet another set of wrappers. After all, Hive already has 
a mechanism for supporting both Hive OFs and MR OFs by wrapping MR OFs with 
{{HivePassThroughOutputFormat}}, and I think that HCatalog should evolve to 
share more common infrastructure with Hive.

I have attached a new revision of my patch that now fixes the original reason 
why this ticket is opened; writing to an Avro table via HCatalog now works. 
There are still a few remaining issues though:

 * The way that tables with static partitioning are handled is not completely 
correct. I have opened HIVE-7855 to address that issue.
 * Writing to a Parquet table does not work but more investigation is needed to 
determine whether this is caused by a bug in HCatalog or in the Parquet SerDe.
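
For context, a minimal sketch of the dispatch Sushanth suggested (use 
getHiveRecordWriter when the format is a HiveOutputFormat, fall back to the MR 
getRecordWriter otherwise). The helper is hypothetical and elides real plumbing 
(value class, compression, FileSystem); it is not the code in the patch.

{code}
import java.io.IOException;
import java.util.Properties;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputFormat;
import org.apache.hadoop.util.Progressable;

final class RecordWriterDispatchSketch {
  static Object openWriter(OutputFormat<?, ?> of, JobConf jc, Path out,
      Properties tableProps, Progressable progress) throws IOException {
    if (of instanceof HiveOutputFormat) {
      // Hive-native path: row-oriented writer, no key/value pair.
      // (valueClass and isCompressed are elided with placeholder arguments.)
      return ((HiveOutputFormat<?, ?>) of).getHiveRecordWriter(
          jc, out, null, false, tableProps, progress);
    }
    // Generic MR path: key/value-based writer (the FileSystem arg is ignored).
    return of.getRecordWriter(null, jc, out.getName(), progress);
  }
}
{code}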

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch
>
>
> Attempting to write to a HCatalog defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable.

[jira] [Commented] (HIVE-7557) When reduce is vectorized, dynpart_sort_opt_vectorization.q under Tez fails

2014-08-28 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114524#comment-14114524
 ] 

Matt McCline commented on HIVE-7557:


Also vector_non_string_partition fails with same problem.

> When reduce is vectorized, dynpart_sort_opt_vectorization.q under Tez fails
> ---
>
> Key: HIVE-7557
> URL: https://issues.apache.org/jira/browse/HIVE-7557
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-7557.1.patch
>
>
> Turned off dynpart_sort_opt_vectorization.q (Tez) since it fails when reduce 
> is vectorized to get HIVE-7029 checked in.
> Stack trace:
> {code}
> Container released by application, 
> AttemptID:attempt_1406747677386_0003_2_00_00_2 Info:Error: 
> java.lang.RuntimeException: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:168)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:394)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:188)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:394)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing vector batch (tag=0) [Error getting row data with exception 
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DoubleColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector
>   at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterLong.writeValue(VectorExpressionWriterFactory.java:168)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processVectors(ReduceRecordProcessor.java:481)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:371)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:165)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:394)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
>  ]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.processRows(ReduceRecordProcessor.java:382)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:291)

[jira] [Commented] (HIVE-7904) Missing null check causes NPE when updating join column stats in statistics annotation

2014-08-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114520#comment-14114520
 ] 

Gunther Hagleitner commented on HIVE-7904:
--

+1

> Missing null check causes NPE when updating join column stats in statistics 
> annotation
> -
>
> Key: HIVE-7904
> URL: https://issues.apache.org/jira/browse/HIVE-7904
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
>Priority: Trivial
> Fix For: 0.13.0
>
> Attachments: HIVE-7904.1.patch
>
>
> Updating column stats in the join stats rule annotation can cause an NPE if 
> column stats are missing from one relation. 
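
For illustration, the shape of the fix is just a null guard before the merge; 
the names below are hypothetical stand-ins, not the actual patch:

{code}
final class JoinColStatsSketch {
  static final class ColStats { long ndv; } // stand-in for Hive's ColStatistics

  static void updateJoinColumnStats(ColStats left, ColStats right) {
    if (left == null || right == null) {
      return; // stats missing from one relation: skip the merge instead of NPE
    }
    // ... merge ndv/min/max as the annotation rule normally would ...
  }
}
{code}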



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7904) Missing null check causes NPE when updating join column stats in statistics annotation

2014-08-28 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7904:
-

Attachment: HIVE-7904.1.patch

> Missing null check causes NPE when updating join column stats in statistics 
> annotation
> -
>
> Key: HIVE-7904
> URL: https://issues.apache.org/jira/browse/HIVE-7904
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
>Priority: Trivial
> Fix For: 0.13.0
>
> Attachments: HIVE-7904.1.patch
>
>
> Updating column stats in the join stats rule annotation can cause an NPE if 
> column stats are missing from one relation. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7904) Missing null check causes NPE when updating join column stats in statistics annotation

2014-08-28 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7904:
-

Status: Patch Available  (was: Open)

> Missing null check causes NPE when updating join column stats in statistics 
> annotation
> -
>
> Key: HIVE-7904
> URL: https://issues.apache.org/jira/browse/HIVE-7904
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor, Statistics
>Affects Versions: 0.14.0
>Reporter: Prasanth J
>Assignee: Prasanth J
>Priority: Trivial
> Fix For: 0.13.0
>
> Attachments: HIVE-7904.1.patch
>
>
> Updating column stats in the join stats rule annotation can cause an NPE if 
> column stats are missing from one relation. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7853) Make OrcNewInputFormat return row number as a key

2014-08-28 Thread john (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114501#comment-14114501
 ] 

john commented on HIVE-7853:


Navis: what do you think about the failed test case results?

> Make OrcNewInputFormat return row number as a key
> -
>
> Key: HIVE-7853
> URL: https://issues.apache.org/jira/browse/HIVE-7853
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.1
> Environment: all
>Reporter: john
>Assignee: Navis
>  Labels: Orc
> Attachments: HIVE-7853.1.patch.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The key is null in the map method when OrcNewInputFormat is used as the input 
> format class.
> When using OrcNewInputFormat as the input format class for my MapReduce job, I 
> find its key is always null in my map method. This gives me no way to get the 
> row number in my map method.  By comparison, RCFileInputFormat (for RC files) 
> returns the row number as the key in the map method, so I know which row I am 
> processing. Is there any workaround for me to get the row number from my map 
> method?  Of course, I can count the row number myself.  But that has two 
> problems: #1 I have to assume the rows are coming in order; #2 I will get 
> duplicated (and wrong) row numbers if a big input file causes multiple file 
> splits (which will trigger my map method multiple times on different data 
> nodes).  At this point, I am really seeking a better way to get the row number 
> for each processed row in the map method.
> Here is what I have in my map logs:
>   [2014-08-06 09:39:25 DEBUG com..hadoop.orcfile.OrcFileMap]: Mapper 
> Input Key: (null)
>   [2014-08-06 09:39:25 DEBUG com..hadoop.orcfile.OrcFileMap]: Mapper 
> Input Value: {Q8151, T9976, 69976, 8156756, 966798161, 
> 97898989898, Laura, laura...@gmail.com}
> My map method is:
>   protected void map(Object key, Writable value, Context context)
>   throws IOException, InterruptedException {
>   logger.debug("Mapper Input Key: " + key);
>   logger.debug("Mapper Input Value: " + value.toString());
>   .
>   }
> The fix should be: add the following statement in the nextKeyValue() method and 
> pass the result all the way up to the map() method as its key:
>   reader.getRowNumber(); 
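
A sketch of what that could look like in a record reader, assuming the 
underlying ORC reader exposes getRowNumber() (it does on Hive's ORC 
RecordReader). This is a shape sketch of the suggestion, not an actual patch:

{code}
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.RecordReader;

abstract class RowNumberKeyedOrcReaderSketch<V>
    extends RecordReader<LongWritable, V> {
  private final LongWritable key = new LongWritable();

  // Supplied by the concrete reader: advance one row / report the ORC row number.
  protected abstract boolean advance() throws IOException;
  protected abstract long currentRowNumber();

  @Override
  public boolean nextKeyValue() throws IOException, InterruptedException {
    if (!advance()) {
      return false;
    }
    key.set(currentRowNumber()); // reader.getRowNumber() in the real reader
    return true;
  }

  @Override
  public LongWritable getCurrentKey() {
    return key; // surfaced to map() as the key instead of null
  }
}
{code}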



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7803) Enabling Hadoop speculative execution may cause a corrupt output directory (dynamic partition)

2014-08-28 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114461#comment-14114461
 ] 

Mithun Radhakrishnan commented on HIVE-7803:


Thanks for reviewing this, Sush. Selina has a similar fix for this on 0.12. 
We'll be running some tests with speculative-execution enabled. We'll report 
back if anything turns up.

(I wish there were a deterministic way to test this. :/)

> Enabling Hadoop speculative execution may cause a corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch
>
>
> One of our users reports seeing intermittent failures due to attempt 
> directories in the input paths. We found that with speculative execution turned 
> on, two mappers tried to commit the task at the same time using the same 
> committed-task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, it was pointed out that some HCat code 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
> org.apache.hadoop.mapred.TaskAttemptContext currContext = 
> dynamicContexts.get(entry.getKey());
> OutputCommitter baseOutputCommitter = entry.getValue();
> if (baseOutputCommitter.needsTaskCommit(currContext)) {
> baseOutputCommitter.commitTask(currContext);
> }
> }
> {code}
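
By contrast, a participating implementation would defer the commit to an 
OutputCommitter that the framework drives, so only the winning attempt commits. 
A rough sketch under that assumption (field wiring and names are hypothetical, 
not the HCat code):

{code}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapred.TaskAttemptContext;

abstract class TwoPhaseDynamicCommitterSketch extends OutputCommitter {
  // One base committer/context per dynamic partition, as in the snippet above.
  protected Map<String, OutputCommitter> baseDynamicCommitters;
  protected Map<String, TaskAttemptContext> dynamicContexts;

  @Override
  public void commitTask(TaskAttemptContext context) throws IOException {
    // Framework-driven commit: reached by at most one (winning) task attempt,
    // unlike RecordWriter.close(), which every speculative attempt executes.
    for (Map.Entry<String, OutputCommitter> e : baseDynamicCommitters.entrySet()) {
      TaskAttemptContext dynContext = dynamicContexts.get(e.getKey());
      OutputCommitter base = e.getValue();
      if (base.needsTaskCommit(dynContext)) {
        base.commitTask(dynContext);
      }
    }
  }
}
{code}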



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7904) Missing null check causes NPE when updating join column stats in statistics annotation

2014-08-28 Thread Prasanth J (JIRA)
Prasanth J created HIVE-7904:


 Summary: Missing null check causes NPE when updating join column 
stats in statistics annotation
 Key: HIVE-7904
 URL: https://issues.apache.org/jira/browse/HIVE-7904
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Trivial


Updating column stats in the join stats rule annotation can cause an NPE if 
column stats are missing from one relation. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7352) Queries without tables fail under Tez

2014-08-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114434#comment-14114434
 ] 

Hive QA commented on HIVE-7352:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664817/HIVE-7352.2.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6127 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.ql.parse.TestGenTezWork.testCreateMap
org.apache.hadoop.hive.ql.parse.TestGenTezWork.testCreateReduce
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/546/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/546/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-546/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664817

> Queries without tables fail under Tez
> -
>
> Key: HIVE-7352
> URL: https://issues.apache.org/jira/browse/HIVE-7352
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Craig Condit
>Assignee: Gunther Hagleitner
> Attachments: HIVE-7352.1.patch.txt, HIVE-7352.2.patch
>
>
> Hive 0.13.0 added support for queries that do not reference tables (such as 
> 'SELECT 1'). These queries fail under Tez:
> {noformat}
> Vertex failed as one or more tasks failed. failedTasks:1]
> 14/07/07 09:54:42 ERROR tez.TezJobMonitor: Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1404652697071_4487_1_00, diagnostics=[Task failed, 
> taskId=task_1404652697071_4487_1_00_00, 
> diagnostics=[AttemptID:attempt_1404652697071_4487_1_00_00_0 Info:Error: 
> java.lang.RuntimeException: java.lang.IllegalArgumentException: Can not 
> create a Path from an empty string
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:174)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.(TezGroupedSplitsInputFormat.java:113)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:79)
>   at 
> org.apache.tez.mapreduce.input.MRInput.setupOldRecordReader(MRInput.java:205)
>   at 
> org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:362)
>   at 
> org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:341)
>   at 
> org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:99)
>   at 
> org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:141)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:307)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild$5.run(YarnTezDagChild.java:562)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at 
> org.apache.hadoop.mapred.YarnTezDagChild.main(YarnTezDagChild.java:551)
> Caused by: java.lang.IllegalArgumentException: Can not create a Path from an 
> empty string
>   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:127)
>   at org.apache.hadoop.fs.Path.(Path.java:135)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat$HiveInputSplit.getPath(HiveInputFormat.java:110)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:228)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:171)
>   ... 14 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-08-28 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-4329:
-

Attachment: HIVE-4329.2.patch

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch
>
>
> Attempting to write to a HCatalog defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that the AvroContainerOutputFormat's 
> signature mandates a LongWritable key and HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutputFormat to mandate a WritableComparable.
> It looks like accepting WritableComparable is what's done in the other Hive 
> OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
> be changed, since it's ignoring the key. That way fixing things so 
> FileRecordWriterContainer can always use NullWritable could get spun into a 
> different issue?
> The underlying cause for failure to write to AvroSerde tables is that 
> AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
> fixing the above will just push the failure into the placeholder RecordWriter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.

2014-08-28 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24136/
---

(Updated Aug. 28, 2014, 9:56 p.m.)


Review request for hive.


Changes
---

Remove debug prints.


Bugs: HIVE-4329
https://issues.apache.org/jira/browse/HIVE-4329


Repository: hive-git


Description
---

HIVE-4329: HCatalog should use getHiveRecordWriter.


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
93a03adeab7ba3c3c91344955d303e4252005239 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java
 3a07b0ca7c1956d45e611005cbc5ba2464596471 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java
 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java
 4df912a935221e527c106c754ff233d212df9246 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1a7595fd6dd0a5ffbe529bc24015c482068233bf 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java
 2a883d6517bfe732b6a6dffa647d9d44e4145b38 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657cd1b16aec664aab3e22b430b304a3698d 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java
 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java
 b651cb323771843da43667016a7dd2c9d9a1ddac 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
 9b979395e47e54aac87487cb990824e3c3a2ee19 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java
 d83b003f9c16e78a39b3cc7ce810ff19f70848c2 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java
 5905b46178b510b3a43311739fea2b95f47b4ed7 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java
 b3ea76e6a79f94e09972bc060c06105f60087b71 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
 ee57f3fd126af2e36039f84686a4169ef6267593 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java
 a386415fb406bb0cda18f7913650874d6a236e21 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 7c9003e86c61dc9e4f10e05b0c29e40ded73c793 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 3b9bf433175309bac15fbe857b964aac857e06d8 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 
69545b046db06fd56f35a0da09d3d6960832484d 

Diff: https://reviews.apache.org/r/24136/diff/


Testing
---

Run unit tests.


Thanks,

David Chen



[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter

2014-08-28 Thread David Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Chen updated HIVE-4329:
-

Attachment: HIVE-4329.1.patch

> HCatalog should use getHiveRecordWriter rather than getRecordWriter
> ---
>
> Key: HIVE-4329
> URL: https://issues.apache.org/jira/browse/HIVE-4329
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Serializers/Deserializers
>Affects Versions: 0.14.0
> Environment: discovered in Pig, but it looks like the root cause 
> impacts all non-Hive users
>Reporter: Sean Busbey
>Assignee: David Chen
> Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch
>
>
> Attempting to write to a HCatalog defined table backed by the AvroSerde fails 
> with the following stacktrace:
> {code}
> java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be 
> cast to org.apache.hadoop.io.LongWritable
>   at 
> org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253)
>   at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53)
>   at 
> org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242)
>   at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559)
>   at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85)
> {code}
> The proximal cause of this failure is that AvroContainerOutputFormat's 
> signature mandates a LongWritable key while HCat's FileRecordWriterContainer 
> forces a NullWritable. I'm not sure of a general fix, other than redefining 
> HiveOutputFormat to mandate a WritableComparable.
> It looks like accepting WritableComparable is what's done in the other Hive 
> OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also 
> be changed, since it ignores the key. That way, fixing things so 
> FileRecordWriterContainer can always use NullWritable could be spun off into 
> a separate issue?
> The underlying cause for failure to write to AvroSerde tables is that 
> AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so 
> fixing the above will just push the failure into the placeholder RecordWriter.
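
For illustration, here is a minimal, hypothetical Java sketch of the key-type 
mismatch described above (the class and method names are invented for the demo; 
the cast mirrors the one in AvroContainerOutputFormat's writer):

{code}
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;

public class KeyMismatchDemo {
  // Stands in for a writer that, like AvroContainerOutputFormat's, assumes
  // every key it receives is a LongWritable.
  static void avroStyleWrite(Writable key, Writable value) {
    LongWritable k = (LongWritable) key; // ClassCastException for NullWritable
  }

  public static void main(String[] args) {
    // FileRecordWriterContainer hands the underlying writer a NullWritable key,
    // reproducing the ClassCastException in the stack trace above.
    avroStyleWrite(NullWritable.get(), NullWritable.get());
  }
}
{code}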



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.

2014-08-28 Thread David Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24136/
---

(Updated Aug. 28, 2014, 9:53 p.m.)


Review request for hive.


Changes
---

Writing for Avro now works.


Bugs: HIVE-4329
https://issues.apache.org/jira/browse/HIVE-4329


Repository: hive-git


Description
---

HIVE-4329: HCatalog should use getHiveRecordWriter.


Diffs (updated)
-

  hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
93a03adeab7ba3c3c91344955d303e4252005239 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/data/schema/HCatSchema.java
 c0209dbf90f29808e21897db79c7e155b90445df 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java
 3a07b0ca7c1956d45e611005cbc5ba2464596471 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java
 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java
 4df912a935221e527c106c754ff233d212df9246 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java
 1a7595fd6dd0a5ffbe529bc24015c482068233bf 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java
 2a883d6517bfe732b6a6dffa647d9d44e4145b38 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
 bfa8657cd1b16aec664aab3e22b430b304a3698d 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java
 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java
 b651cb323771843da43667016a7dd2c9d9a1ddac 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java
 694739821a202780818924d54d10edb707cfbcfa 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
 1980ef50af42499e0fed8863b6ff7a45f926d9fc 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java
 9b979395e47e54aac87487cb990824e3c3a2ee19 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java
 d83b003f9c16e78a39b3cc7ce810ff19f70848c2 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java
 5905b46178b510b3a43311739fea2b95f47b4ed7 
  
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java
 b3ea76e6a79f94e09972bc060c06105f60087b71 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
 ee57f3fd126af2e36039f84686a4169ef6267593 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 
  
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java
 a386415fb406bb0cda18f7913650874d6a236e21 
  
hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java
 7c9003e86c61dc9e4f10e05b0c29e40ded73c793 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java
 3b9bf433175309bac15fbe857b964aac857e06d8 
  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 
69545b046db06fd56f35a0da09d3d6960832484d 
  
serde/src/java/org/apache/hadoop/hive/serde2/columnar/LazyBinaryColumnarSerDe.java
 86d23f10a66f9a33023e2153202b970c386bae20 

Diff: https://reviews.apache.org/r/24136/diff/


Testing (updated)
---

Run unit tests.


Thanks,

David Chen



[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-08-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-

Attachment: HIVE-7482.3.patch

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7803) Enable Hadoop speculative execution may cause corrupt output directory (dynamic partition)

2014-08-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114404#comment-14114404
 ] 

Sushanth Sowmyan commented on HIVE-7803:


Hi,

I like this change - it's simple and it streamlines the flow to the 
outputCommitter nicely. (It will need testing to verify that it works, and 
probably the full suite of e2e tests as well to confirm that it still works 
from HCatStorer, but it's a good change.)

That said, I think this patch will not work in its current form without one 
more change to FileOutputCommitterContainer. Simply put, 
FileOutputCommitterContainer.needsTaskCommit() currently returns false if it 
detects that dynamic partitioning has been used (since it assumes that the 
record writer already committed). With your change, it will need to be updated 
to return true. Apart from that, this looks good to me. If you update your 
patch and set it to patch-available, we can have the tests run on it.
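
For concreteness, a minimal sketch of the needsTaskCommit() change described 
above (shape only; getBaseOutputCommitter() and 
HCatMapRedUtil.createTaskAttemptContext() are assumed from the surrounding 
HCatalog code, and the actual patch may differ):

{code}
@Override
public boolean needsTaskCommit(TaskAttemptContext context) throws IOException {
  // Old behavior: return false when dynamic partitioning was used, on the
  // assumption that the record writer had already committed the task.
  // With the record writer no longer committing, defer to the wrapped committer:
  return getBaseOutputCommitter().needsTaskCommit(
      HCatMapRedUtil.createTaskAttemptContext(context));
}
{code}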

> Enable Hadoop speculative execution may cause corrupt output directory 
> (dynamic partition)
> --
>
> Key: HIVE-7803
> URL: https://issues.apache.org/jira/browse/HIVE-7803
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.13.1
> Environment: 
>Reporter: Selina Zhang
>Assignee: Selina Zhang
>Priority: Critical
> Attachments: HIVE-7803.1.patch
>
>
> One of our users reports intermittent failures due to attempt directories 
> appearing in the input paths. We found that, with speculative execution turned 
> on, two mappers tried to commit their task at the same time using the same 
> committed task path, which corrupted the output directory. 
> The original Pig script:
> {code}
> STORE AdvertiserDataParsedClean INTO '$DB_NAME.$ADVERTISER_META_TABLE_NAME'
> USING org.apache.hcatalog.pig.HCatStorer();
> {code}
> Two mappers
> attempt_1405021984947_5394024_m_000523_0: KILLED
> attempt_1405021984947_5394024_m_000523_1: SUCCEEDED
> attempt_1405021984947_5394024_m_000523_0 was killed right after the commit.
> As a result, it created corrupt directory as 
>   
> /projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523/
> containing 
>part-m-00523 (from attempt_1405021984947_5394024_m_000523_0)
> and 
>attempt_1405021984947_5394024_m_000523_1/part-m-00523
> Namenode Audit log
> ==
> 1. 2014-08-05 05:04:36,811 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 2. 2014-08-05 05:04:53,112 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=create 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1/part-m-00523
>  dst=null  perm=user:group:rw-r-
> 3. 2014-08-05 05:05:13,001 INFO FSNamesystem.audit: ugi=* ip=ipaddress1 
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_0
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> 4. 2014-08-05 05:05:13,004 INFO FSNamesystem.audit: ugi=* ip=ipaddress2  
> cmd=rename 
> src=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/_temporary/attempt_1405021984947_5394024_m_000523_1
> dst=/projects/.../tablename/_DYN0.7192688458252056/load_time=20140805/type=complete/_temporary/1/task_1405021984947_5394024_m_000523
> perm=user:group:rwxr-x---
> After consulting our Hadoop core team, we were pointed to some HCat code that 
> does not participate in the two-phase commit protocol, for example in 
> FileRecordWriterContainer.close():
> {code}
> for (Map.Entry<String, OutputCommitter> entry : baseDynamicCommitters.entrySet()) {
>   org.apache.hadoop.mapred.TaskAttemptContext currContext =
>       dynamicContexts.get(entry.getKey());
>   OutputCommitter baseOutputCommitter = entry.getValue();
>   if (baseOutputCommitter.needsTaskCommit(currContext)) {
>     baseOutputCommitter.commitTask(currContext);
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7903) [Documentation] Remove hive.metastore.warehouse.dir from Client Configuration Parameters list in Remote Metastore section

2014-08-28 Thread Mariano Dominguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mariano Dominguez updated HIVE-7903:


Description: 
Source: 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastore

In Remote Metastore deployment mode, neither the Hive CLI nor Beeline can 
change the value of the ‘hive.metastore.warehouse.dir’ property because it is a 
“server-side” property.

Changing the value can be accomplished, however, by running in Local Metastore 
mode (that is, bypassing the Hive Metastore Server and directly accessing the 
Metastore database):

1) At runtime
$ hive --hiveconf hive.metastore.warehouse.dir=<dir> -e "<query>"
$ beeline --hiveconf hive.metastore.warehouse.dir=<dir> -n <user> -p 
<password> -u <jdbc-url> -e "<query>"

2) In the shell
hive> SET hive.metastore.warehouse.dir=<dir>;
beeline> SET hive.metastore.warehouse.dir=<dir>;

3) From configuration file: hive-site.xml

This property gets cached once a table/database is created; therefore, 
subsequent value changes will not take effect. You will need to start a new 
session to re-set the property.


  was:
Source: 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastore

In Remote Metastore deployment mode, neither the Hive CLI nor Beeline can 
change the value of the ‘hive.metastore.warehouse.dir’ property because it is a 
“server-side” property.

Changing the value can be accomplished, however, by running in Local Metastore 
mode (that is, bypassing the Hive Metastore Server and directly accessing the 
Metastore database):

1) At runtime
$ hive --hiveconf hive.metastore.warehouse.dir=<dir> -e "<query>"
$ beeline --hiveconf hive.metastore.warehouse.dir=<dir> -n <user> -p 
<password> -u <jdbc-url> -e <query>

2) In the shell
hive> SET hive.metastore.warehouse.dir=<dir>;
beeline> SET hive.metastore.warehouse.dir=<dir>;

3) From configuration file: hive-site.xml

This property gets cached once a table/database is created; therefore, 
subsequent value changes will not take effect. You will need to start a new 
session to re-set the property.



> [Documentation] Remove hive.metastore.warehouse.dir from Client Configuration 
> Parameters list in Remote Metastore section
> -
>
> Key: HIVE-7903
> URL: https://issues.apache.org/jira/browse/HIVE-7903
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.12.0
>Reporter: Mariano Dominguez
>
> Source: 
> https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastore
> In Remote Metastore deployment mode, neither the Hive CLI nor Beeline can 
> change the value of the ‘hive.metastore.warehouse.dir’ property because it is 
> a “server-side” property.
> Changing the value can be accomplished, however, by running in Local 
> Metastore mode (that is, bypassing the Hive Metastore Server and directly 
> accessing the Metastore database):
> 1) At runtime
> $ hive --hiveconf hive.metastore.warehouse.dir=<dir> -e "<query>"
> $ beeline --hiveconf hive.metastore.warehouse.dir=<dir> -n <user> -p 
> <password> -u <jdbc-url> -e "<query>"
> 2) In the shell
> hive> SET hive.metastore.warehouse.dir=<dir>;
> beeline> SET hive.metastore.warehouse.dir=<dir>;
> 3) From configuration file: hive-site.xml
> This property gets cached once a table/database is created; therefore, 
> subsequent value changes will not take effect. You will need to start a new 
> session to re-set the property.
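
For completeness, option 3 above amounts to a hive-site.xml entry like the 
following (a minimal sketch; the warehouse path is a placeholder):

{code}
<property>
  <name>hive.metastore.warehouse.dir</name>
  <!-- placeholder path; any HDFS directory the Hive user can write to -->
  <value>/user/hive/warehouse</value>
</property>
{code}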



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HIVE-7903) [Documentation] Remove hive.metastore.warehouse.dir from Client Configuration Parameters list in Remote Metastore section

2014-08-28 Thread Mariano Dominguez (JIRA)
Mariano Dominguez created HIVE-7903:
---

 Summary: [Documentation] Remove hive.metastore.warehouse.dir from 
Client Configuration Parameters list in Remote Metastore section
 Key: HIVE-7903
 URL: https://issues.apache.org/jira/browse/HIVE-7903
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.12.0
Reporter: Mariano Dominguez


Source: 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastore

In Remote Metastore deployment mode, neither the Hive CLI nor Beeline can 
change the value of the ‘hive.metastore.warehouse.dir’ property because it is a 
“server-side” property.

Changing the value can be accomplished, however, by running in Local Metastore 
mode (that is, bypassing the Hive Metastore Server and directly accessing the 
Metastore database):

1) At runtime
$ hive --hiveconf hive.metastore.warehouse.dir=<dir> -e "<query>"
$ beeline --hiveconf hive.metastore.warehouse.dir=<dir> -n <user> -p 
<password> -u <jdbc-url> -e <query>

2) In the shell
hive> SET hive.metastore.warehouse.dir=<dir>;
beeline> SET hive.metastore.warehouse.dir=<dir>;

3) From configuration file: hive-site.xml

This property gets cached once a table/database is created; therefore, 
subsequent value changes will not take effect. You will need to start a new 
session to re-set the property.




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-08-28 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114369#comment-14114369
 ] 

Vikram Dixit K commented on HIVE-7482:
--

I haven't addressed your comments yet. I will address them shortly and put up a 
.3.

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-6633) pig -useHCatalog with embedded metastore fails to pass command line args to metastore

2014-08-28 Thread Eric Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114367#comment-14114367
 ] 

Eric Hanson commented on HIVE-6633:
---

Thanks Sushanth for tracking down the problem. I'll regenerate the patch and 
track that on HIVE-7901.

> pig -useHCatalog with embedded metastore fails to pass command line args to 
> metastore
> -
>
> Key: HIVE-6633
> URL: https://issues.apache.org/jira/browse/HIVE-6633
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0
>Reporter: Eric Hanson
>Assignee: Eric Hanson
> Fix For: 0.13.0
>
> Attachments: HIVE-6633.01.patch
>
>
> This fails because the embedded metastore can't connect to the database: the 
> command-line -D arguments passed to Pig are not passed on to the metastore 
> when the embedded metastore is created. Setting hive.metastore.uris to the 
> empty string causes creation of an embedded metastore.
> pig -useHCatalog "-Dhive.metastore.uris=" 
> "-Djavax.jdo.option.ConnectionPassword=AzureSQLDBXYZ"
> The goal is to allow a Pig job submitted via WebHCat to specify a metastore 
> to use via job arguments. That is not working because it is not possible to 
> pass -Djavax.jdo.option.ConnectionPassword and other necessary arguments to 
> the embedded metastore.
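
As a hedged illustration of the embedded-metastore trigger mentioned above (a 
simplified sketch, not the actual Hive code path):

{code}
import org.apache.hadoop.hive.conf.HiveConf;

public class EmbeddedMetastoreCheck {
  public static void main(String[] args) {
    HiveConf conf = new HiveConf();
    // An empty hive.metastore.uris means no remote metastore is configured,
    // so the client instantiates the metastore in-process (embedded).
    String uris = conf.getVar(HiveConf.ConfVars.METASTOREURIS);
    boolean embedded = (uris == null || uris.trim().isEmpty());
    System.out.println("embedded metastore: " + embedded);
  }
}
{code}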



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24472: HIVE-7649: Support column stats with temporary tables

2014-08-28 Thread Jason Dere


> On Aug. 28, 2014, 7:56 a.m., Prasanth_J wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java,
> >  line 396
> > 
> >
> > Is there any reason why you are not using FieldSchema's equals() here?
> 
> Jason Dere wrote:
> FieldSchema.equals() also compares the column comment, which can be 
> changed during ALTER TABLE. If only the column comment changed, the columns 
> should still be treated as the same.
> 
> Prasanth_J wrote:
> Can you add equalsIgnoreComment() to FieldSchema then?

Unfortunately, FieldSchema is a generated class based on the metastore Thrift 
definition file, so we can't really do that. I can add it as a utility method 
somewhere, but that's it.
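
A minimal sketch of such a utility method (the method name and its placement in 
a helper class are assumptions; FieldSchema itself is generated from the Thrift 
definition and cannot be modified):

{code}
import java.util.Objects;
import org.apache.hadoop.hive.metastore.api.FieldSchema;

public final class FieldSchemaCompat {
  // Equal by column name and type only; the comment, which ALTER TABLE may
  // change, is deliberately ignored.
  public static boolean equalsIgnoreComment(FieldSchema a, FieldSchema b) {
    return Objects.equals(a.getName(), b.getName())
        && Objects.equals(a.getType(), b.getType());
  }
}
{code}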


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24472/#review51754
---


On Aug. 26, 2014, 6:37 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24472/
> ---
> 
> (Updated Aug. 26, 2014, 6:37 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Bugs: HIVE-7649
> https://issues.apache.org/jira/browse/HIVE-7649
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Update SessionHiveMetastoreClient to get column stats to work for temp tables.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 5a56ced 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  37b1669 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 
> 24f3710 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java fcfcf42 
>   ql/src/test/queries/clientnegative/temp_table_column_stats.q 9b7aa4a 
>   ql/src/test/queries/clientpositive/temp_table_display_colstats_tbllvl.q 
> PRE-CREATION 
>   ql/src/test/results/clientnegative/temp_table_column_stats.q.out 4b0c0bc 
>   ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24472/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Dere
> 
>



[jira] [Commented] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-08-28 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114364#comment-14114364
 ] 

Lefty Leverenz commented on HIVE-7482:
--

I put some comments for HIVE-7482.1.patch on the RB (just after you posted 
patch 2).

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24472: HIVE-7649: Support column stats with temporary tables

2014-08-28 Thread Jason Dere


> On Aug. 28, 2014, 8:02 a.m., Prasanth_J wrote:
> > ql/src/test/queries/clientpositive/temp_table_display_colstats_tbllvl.q, 
> > line 1
> > 
> >
> > Can you also add a testcase for partitioned table? similar to 
> > columnstats_partlvl.q
> 
> Jason Dere wrote:
> Not currently supporting partitioned temp tables.
> 
> Prasanth_J wrote:
> Will it throw an exception in that case? If so, can you add a 
> NegativeCliDriver test just to make sure it throws an exception when used 
> with partitioned tables?

Yep, there is already clientnegative/temp_table_partitions.q
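
For readers unfamiliar with clientnegative tests, a hedged sketch of what such 
a check looks like (the actual contents of temp_table_partitions.q may differ):

{code}
-- expected to fail: temporary tables do not support partitioning
CREATE TEMPORARY TABLE tmp_part (c1 int) PARTITIONED BY (ds string);
{code}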


- Jason


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24472/#review51763
---


On Aug. 26, 2014, 6:37 p.m., Jason Dere wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24472/
> ---
> 
> (Updated Aug. 26, 2014, 6:37 p.m.)
> 
> 
> Review request for hive and Prasanth_J.
> 
> 
> Bugs: HIVE-7649
> https://issues.apache.org/jira/browse/HIVE-7649
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Update SessionHiveMetastoreClient to get column stats to work for temp tables.
> 
> 
> Diffs
> -
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
> 5a56ced 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  37b1669 
>   
> ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 
> 24f3710 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java fcfcf42 
>   ql/src/test/queries/clientnegative/temp_table_column_stats.q 9b7aa4a 
>   ql/src/test/queries/clientpositive/temp_table_display_colstats_tbllvl.q 
> PRE-CREATION 
>   ql/src/test/results/clientnegative/temp_table_column_stats.q.out 4b0c0bc 
>   ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24472/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jason Dere
> 
>



[jira] [Commented] (HIVE-7902) Cleanup hbase-handler/pom.xml dependency list

2014-08-28 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114359#comment-14114359
 ] 

Brock Noland commented on HIVE-7902:


Likely my fault during the mavenization project... +1 pending tests.

> Cleanup hbase-handler/pom.xml dependency list
> -
>
> Key: HIVE-7902
> URL: https://issues.apache.org/jira/browse/HIVE-7902
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7902.1.patch
>
>
> Noticed an extra dependency, {{hive-service}}, when changing the dependency 
> version of {{hive-hbase-handler}} from 0.12.0 to 0.13.0 in a third-party 
> application. Tracing the history of the hbase-handler/pom.xml file shows that 
> it was added as part of the ant-to-maven migration, not because of any 
> specific functionality requirement. The {{hive-service}} dependency is not 
> needed in {{hive-hbase-handler}} and can be removed.
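
For reference, a sketch of the dependency block that would be dropped from 
hbase-handler/pom.xml (the version element shown is illustrative):

{code}
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-service</artifactId>
  <version>${project.version}</version>
</dependency>
{code}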



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7902) Cleanup hbase-handler/pom.xml dependency list

2014-08-28 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-7902:
--

Status: Patch Available  (was: Open)

> Cleanup hbase-handler/pom.xml dependency list
> -
>
> Key: HIVE-7902
> URL: https://issues.apache.org/jira/browse/HIVE-7902
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.13.1, 0.13.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Minor
> Fix For: 0.14.0
>
> Attachments: HIVE-7902.1.patch
>
>
> Noticed an extra dependency, {{hive-service}}, when changing the dependency 
> version of {{hive-hbase-handler}} from 0.12.0 to 0.13.0 in a third-party 
> application. Tracing the history of the hbase-handler/pom.xml file shows that 
> it was added as part of the ant-to-maven migration, not because of any 
> specific functionality requirement. The {{hive-service}} dependency is not 
> needed in {{hive-hbase-handler}} and can be removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7881) enable Qtest scriptfile1.q [Spark Branch]

2014-08-28 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7881:


   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

This test fails in other runs as well, so it is probably unrelated.

Committed to trunk.  Thanks Chengxiang for the contribution!

> enable Qtest scriptfile1.q [Spark Branch]
> -
>
> Key: HIVE-7881
> URL: https://issues.apache.org/jira/browse/HIVE-7881
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M1
> Fix For: spark-branch
>
> Attachments: HIVE-7881.1-spark.patch
>
>
> scriptfile1.q failed because the script file was not found; we should verify 
> whether the script file is added to the SparkContext.
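
A hedged sketch of the direction the description suggests 
(JavaSparkContext.addFile and SparkFiles.get are real Spark APIs; the class and 
paths here are illustrative):

{code}
import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;

public class ScriptShipDemo {
  // Ship a local script with the job so executors can resolve it at runtime.
  static void addScript(JavaSparkContext sc, String localPath, String fileName) {
    sc.addFile(localPath);                        // distribute with the job
    String onExecutor = SparkFiles.get(fileName); // absolute path on the executor
    System.out.println("script available at: " + onExecutor);
  }
}
{code}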



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7881) enable Qtest scriptfile1.q [Spark Branch]

2014-08-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14114351#comment-14114351
 ] 

Szehon Ho commented on HIVE-7881:
-

Typo, committed to spark.

> enable Qtest scriptfile1.q [Spark Branch]
> -
>
> Key: HIVE-7881
> URL: https://issues.apache.org/jira/browse/HIVE-7881
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Chengxiang Li
>Assignee: Chengxiang Li
>  Labels: Spark-M1
> Fix For: spark-branch
>
> Attachments: HIVE-7881.1-spark.patch
>
>
> scriptfile1.q failed because the script file was not found; we should verify 
> whether the script file is added to the SparkContext.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-08-28 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-

Attachment: HIVE-7482.2.patch

> The execution side changes for SMB join in hive-tez
> ---
>
> Key: HIVE-7482
> URL: https://issues.apache.org/jira/browse/HIVE-7482
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: tez-branch
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, 
> HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
> HIVE-7482.WIP.patch
>
>
> A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

