[jira] [Commented] (HIVE-8395) CBO: enable by default

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215870#comment-14215870
 ] 

Hive QA commented on HIVE-8395:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682045/HIVE-8395.24.patch

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 6647 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_gby_star
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_type_in_plan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_gby_star
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_gby_star2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1831/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1831/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1831/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682045 - PreCommit-HIVE-TRUNK-Build

 CBO: enable by default
 --

 Key: HIVE-8395
 URL: https://issues.apache.org/jira/browse/HIVE-8395
 Project: Hive
  Issue Type: Improvement
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.15.0

 Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, 
 HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, 
 HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, 
 HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, 
 HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, 
 HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, 
 HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, 
 HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, 
 HIVE-8395.21.patch, HIVE-8395.22.patch, HIVE-8395.23.patch, 
 HIVE-8395.23.withon.patch, HIVE-8395.24.patch, HIVE-8395.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8903) downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]

2014-11-18 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8903:

Status: Open  (was: Patch Available)

 downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]
 -

 Key: HIVE-8903
 URL: https://issues.apache.org/jira/browse/HIVE-8903
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8903.1-spark.patch


 Hive trunk depends on guava 11.0.2, the same as Hadoop and Tez. Spark depends on 
 guava 14.0.1, which Spark shades in its assembly jar to avoid conflicts for 
 Hive on Spark (HIVE-7387). The guava version was upgraded to 14.0.1 in the Hive 
 spark branch, which should be unnecessary and leads to guava conflicts (HIVE-8854). 
 We should downgrade the guava dependency from 14.0.1 to 11.0.2 to stay consistent 
 with Hive trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8903) downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]

2014-11-18 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215880#comment-14215880
 ] 

Chengxiang Li commented on HIVE-8903:
-

Hi, [~szehon], I think Marcelo means that although the spark assembly jar includes 
shaded guava 14, spark-core, as an independent artifact, would still depend on 
guava 14. Our qtest depends on spark-core, so I want to check whether the qtests 
succeed in local mode with guava 11 here. 

 downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]
 -

 Key: HIVE-8903
 URL: https://issues.apache.org/jira/browse/HIVE-8903
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8903.1-spark.patch


 Hive trunk depends on guava 11.0.2, the same as Hadoop and Tez. Spark depends on 
 guava 14.0.1, which Spark shades in its assembly jar to avoid conflicts for 
 Hive on Spark (HIVE-7387). The guava version was upgraded to 14.0.1 in the Hive 
 spark branch, which should be unnecessary and leads to guava conflicts (HIVE-8854). 
 We should downgrade the guava dependency from 14.0.1 to 11.0.2 to stay consistent 
 with Hive trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5775) Introduce Cost Based Optimizer to Hive

2014-11-18 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-5775:
-
Labels:   (was: TODOC14)

 Introduce Cost Based Optimizer to Hive
 --

 Key: HIVE-5775
 URL: https://issues.apache.org/jira/browse/HIVE-5775
 Project: Hive
  Issue Type: New Feature
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.14.0

 Attachments: CBO-2.pdf, HIVE-5775.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5775) Introduce Cost Based Optimizer to Hive

2014-11-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215900#comment-14215900
 ] 

Lefty Leverenz commented on HIVE-5775:
--

Thanks [~jpullokkaran], I removed the TODOC14 label on the assumption that no 
updates are needed at this time.

 Introduce Cost Based Optimizer to Hive
 --

 Key: HIVE-5775
 URL: https://issues.apache.org/jira/browse/HIVE-5775
 Project: Hive
  Issue Type: New Feature
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 0.14.0

 Attachments: CBO-2.pdf, HIVE-5775.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode

2014-11-18 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-8893:
--
Attachment: HIVE-8893.5.patch

 Implement whitelist for builtin UDFs to avoid untrusted code execution in 
 multiuser mode
 ---

 Key: HIVE-8893
 URL: https://issues.apache.org/jira/browse/HIVE-8893
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HiveServer2, SQL
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch


 UDFs like reflect() or java_method() enable executing a Java method as a 
 UDF. While this offers a lot of flexibility in standalone mode, it can 
 become a security loophole in a secure multiuser environment. For example, in 
 HiveServer2 one can execute any available Java code with user hive's 
 credentials.
 We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2.
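
For illustration: a builtin like reflect() lets any query call arbitrary Java, e.g. reflect("java.lang.System", "getProperty", "user.name"). Below is a minimal sketch of the kind of whitelist/blacklist gate described above; the class and method names are hypothetical, not the actual HIVE-8893 patch.
{code}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the HIVE-8893 implementation: before resolving a
// builtin UDF for a HiveServer2 query, check its name against a configured
// whitelist and blacklist.
public class BuiltinUdfGate {
  private final Set<String> whitelist; // empty means "allow all builtins"
  private final Set<String> blacklist;

  public BuiltinUdfGate(Set<String> whitelist, Set<String> blacklist) {
    this.whitelist = whitelist;
    this.blacklist = blacklist;
  }

  // A UDF is usable only if the blacklist does not name it and the
  // whitelist is either empty or names it explicitly.
  public boolean isAllowed(String functionName) {
    String name = functionName.toLowerCase();
    if (blacklist.contains(name)) {
      return false;
    }
    return whitelist.isEmpty() || whitelist.contains(name);
  }

  public static void main(String[] args) {
    BuiltinUdfGate gate = new BuiltinUdfGate(
        new HashSet<String>(),
        new HashSet<String>(Arrays.asList("reflect", "java_method")));
    System.out.println(gate.isAllowed("upper"));   // true
    System.out.println(gate.isAllowed("reflect")); // false
  }
}
{code}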



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode

2014-11-18 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-8893:
--
Attachment: (was: HIVE-8893.2.patch)

 Implement whitelist for builtin UDFs to avoid untrusted code execution in 
 multiuser mode
 ---

 Key: HIVE-8893
 URL: https://issues.apache.org/jira/browse/HIVE-8893
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HiveServer2, SQL
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch


 UDFs like reflect() or java_method() enable executing a Java method as a 
 UDF. While this offers a lot of flexibility in standalone mode, it can 
 become a security loophole in a secure multiuser environment. For example, in 
 HiveServer2 one can execute any available Java code with user hive's 
 credentials.
 We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215915#comment-14215915
 ] 

Hive QA commented on HIVE-8835:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682104/HIVE-8835.1-spark.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7180 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/390/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/390/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-390/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682104 - PreCommit-HIVE-SPARK-Build

 identify dependency scope for Remote Spark Context.[Spark Branch]
 -

 Key: HIVE-8835
 URL: https://issues.apache.org/jira/browse/HIVE-8835
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8835.1-spark.patch


 When submitting a job through the Remote Spark Context, spark RDD graph generation and 
 job submission are executed on the remote side, so we have to add the hive-related 
 dependencies to its classpath with spark.driver.extraClassPath. Instead of 
 adding all hive/hadoop dependencies, we should narrow the scope and identify which 
 dependencies the remote spark context actually requires. 
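
As a reference for the mechanics (not the dependency list this issue is about), spark.driver.extraClassPath is an ordinary Spark configuration entry; a minimal sketch, with placeholder jar paths:
{code}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class RemoteDriverClasspath {
  public static void main(String[] args) {
    // spark.driver.extraClassPath only takes effect when the cluster launches
    // the driver (e.g. via spark-submit); it is shown here just to illustrate
    // the setting. The jar paths are placeholders; the minimal set of jars
    // is exactly what this JIRA sets out to identify.
    SparkConf conf = new SparkConf()
        .setAppName("hive-on-spark")
        .setMaster("local")
        .set("spark.driver.extraClassPath",
            "/opt/hive/lib/hive-exec.jar:/opt/hive/lib/hive-common.jar");
    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.stop();
  }
}
{code}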



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Ferdinand Xu (JIRA)
Ferdinand Xu created HIVE-8904:
--

 Summary: Hive should support multiple Key provider modes
 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Bug
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu


In the Hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider 
are both supported. Although in a production environment KMS is preferable, 
we should enable both of them on the Hive side.
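
For reference, Hadoop resolves both through the same KeyProviderFactory API; which provider you get is determined by the URI scheme, so Hive code written against KeyProvider can support both modes. A minimal sketch (the URIs are placeholders):
{code}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

public class KeyProviderModes {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // jceks:// selects JavaKeyStoreProvider, kms:// selects KMSClientProvider;
    // both URIs below are placeholders.
    conf.set("hadoop.security.key.provider.path",
        "jceks://file/tmp/test.jceks");   // or "kms://http@kms-host:16000/kms"
    List<KeyProvider> providers = KeyProviderFactory.getProviders(conf);
    for (KeyProvider provider : providers) {
      System.out.println(provider);
    }
  }
}
{code}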



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8904:
---
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-8065

 Hive should support multiple Key provider modes
 ---

 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu

 In the Hadoop cryptographic filesystem, JavaKeyStoreProvider and 
 KMSClientProvider are both supported. Although in a production environment KMS 
 is preferable, we should enable both of them on the Hive side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Propose to put JIRA traffic on separate hive list

2014-11-18 Thread Lefty Leverenz
+1

Would it be possible to send commits to the dev list, as well as creates?
Or maybe all changes to the Resolution or Status?

-- Lefty

On Mon, Nov 17, 2014 at 2:27 PM, Alan Gates ga...@hortonworks.com wrote:

 The hive dev list generates a lot of traffic.  The average for October was
 192 messages per day.  As a result no one sends hive dev directly to their
 inbox.  They either unsubscribe or they build filters that ship most or all
 of it to a folder.  Chasing people off the dev list is obviously not what
 we want.  Sending messages to folders means missing messages or not seeing
 them until you get unbusy enough to go read back mail in folders.

 The vast majority of this traffic is comments on JIRA tickets.  The way
 I've seen other very active Apache projects manage this is JIRA creates go
 to the dev list, but all other JIRA operations go to a separate list.  Then
 everyone can see new tickets, and if they are interested they can watch
 that JIRA.  If not, they are not burdened with the email from it.

 I propose we do this same thing in Hive.

 Alan.
 --
 Sent with Postbox http://www.getpostbox.com




[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8904:
---
Status: Patch Available  (was: Open)

 Hive should support multiple Key provider modes
 ---

 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-8904.patch


 In the Hadoop cryptographic filesystem, JavaKeyStoreProvider and 
 KMSClientProvider are both supported. Although in a production environment KMS 
 is preferable, we should enable both of them on the Hive side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-8904:
---
Attachment: HIVE-8904.patch

 Hive should support multiple Key provider modes
 ---

 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-8904.patch


 In the Hadoop cryptographic filesystem, JavaKeyStoreProvider and 
 KMSClientProvider are both supported. Although in a production environment KMS 
 is preferable, we should enable both of them on the Hive side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7073) Implement Binary in ParquetSerDe

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215974#comment-14215974
 ] 

Hive QA commented on HIVE-7073:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682063/HIVE-7073.patch

{color:green}SUCCESS:{color} +1 6647 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1832/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1832/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1832/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682063 - PreCommit-HIVE-TRUNK-Build

 Implement Binary in ParquetSerDe
 

 Key: HIVE-7073
 URL: https://issues.apache.org/jira/browse/HIVE-7073
 Project: Hive
  Issue Type: Sub-task
Reporter: David Chen
Assignee: Ferdinand Xu
 Attachments: HIVE-7073.patch


 The ParquetSerDe currently does not support the BINARY data type. This ticket 
 is to implement the BINARY data type.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread Mickael Lacour (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215985#comment-14215985
 ] 

Mickael Lacour commented on HIVE-8359:
--

[~brocknoland], normally I picked the patch that [~rdblue] told me about (the 
review on the Review Board), but maybe not the latest version.

[~rdblue] wanted me to update this patch to handle HIVE-6994 instead of 
having two patches with the same behavior/code. And I like the way 
[~spena] wrote the solution (better than mine, in my opinion).

[~spena], basically I modified the ArrayWritableGroupConverter to clear the 'current 
value'. If you don't do that, you will never get a null value inside an array, 
only the previous one.
{code}
diff --git ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
index 582a5df..052b36d 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
@@ -54,6 +54,7 @@ public void start() {
     if (isMap) {
      mapPairContainer = new Writable[2];
     }
+    currentValue = null;
   }
 
   @Override
{code}

And the second part was to emit null values from the ParquetHiveSerDe (values 
that were being skipped before for no valid reason).

{code}
diff --git ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
index b689336..4b36767 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
@@ -202,13 +202,11 @@ private ArrayWritable createArray(final Object obj, final ListObjectInspector in
     if (sourceArray != null) {
       for (final Object curObj : sourceArray) {
-        final Writable newObj = createObject(curObj, subInspector);
-        if (newObj != null) {
-          array.add(newObj);
-        }
+        array.add(createObject(curObj, subInspector));
       }
     }
     if (array.size() > 0) {
-      final ArrayWritable subArray = new ArrayWritable(array.get(0).getClass(),
+      final ArrayWritable subArray = new ArrayWritable(Writable.class,
          array.toArray(new Writable[array.size()]));
       return new ArrayWritable(Writable.class, new Writable[] {subArray});
     } else {
{code}
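
The switch to ArrayWritable(Writable.class, ...) in the second hunk matters because once null elements are kept, the element class can no longer be derived from the first element. A small standalone illustration of that point (hadoop-common only; not part of the patch):
{code}
import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

public class NullElementArray {
  public static void main(String[] args) {
    Writable[] values = new Writable[] { null, new Text("x") };
    // values[0].getClass() would throw a NullPointerException here, which is
    // why the value class is passed as Writable.class instead of being
    // derived from the first element.
    ArrayWritable array = new ArrayWritable(Writable.class, values);
    System.out.println(array.get().length); // 2; the null element is preserved
  }
}
{code}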

And the qtest just makes sure we handle an empty array, a null array, an array 
with nulls, and the same for maps.

{code}
+++ data/files/parquet_array_null_element.txt
@@ -0,0 +1,3 @@
+1|,7|CARRELAGE,MOQUETTE|key11:value11,key12:value12,key13:value13
+2|,|CAILLEBOTIS,|
+3|,42,||key11:value11,key12:,key13:
{code}


If you want to integrate them into your patch, feel free to do it, else I might 
want to duplicate your patch (:p) and add this fix.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]

2014-11-18 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8868:

Status: Patch Available  (was: Open)

 SparkSession and SparkClient mapping[Spark Branch]
 --

 Key: HIVE-8868
 URL: https://issues.apache.org/jira/browse/HIVE-8868
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3, TODOC-SPARK
 Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch


 There should be a separate spark context for each user session. Currently we 
 share a singleton local spark context across all user sessions with local spark, 
 and create a remote spark context for each spark job with a spark cluster.
 To bind one spark context to each user session, we may construct the spark 
 client on session open. One thing to verify is whether SparkSession::conf is 
 consistent with Context::getConf. 
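
A minimal sketch of the one-client-per-session mapping proposed here (all names are hypothetical, not the actual SparkSessionManagerImpl code):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: each user session id owns exactly one client, created
// lazily on session open, instead of every session sharing one singleton
// local spark context.
public class SessionClientRegistry<C> {
  public interface ClientFactory<C> {
    C create(String sessionId);
  }

  private final Map<String, C> clients = new ConcurrentHashMap<String, C>();
  private final ClientFactory<C> factory;

  public SessionClientRegistry(ClientFactory<C> factory) {
    this.factory = factory;
  }

  // Returns the session's client, creating it on first use (session open).
  public synchronized C getOrOpen(String sessionId) {
    C client = clients.get(sessionId);
    if (client == null) {
      client = factory.create(sessionId);
      clients.put(sessionId, client);
    }
    return client;
  }

  // Drops the mapping on session close so the client can be shut down.
  public C close(String sessionId) {
    return clients.remove(sessionId);
  }
}
{code}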



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8750) Commit initial encryption work

2014-11-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215992#comment-14215992
 ] 

Lefty Leverenz commented on HIVE-8750:
--

Doc note:  This adds configuration parameters *hive.exec.stagingdir* and 
*hive.exec.copyfile.maxsize* to HiveConf.java in encryption-branch.  (When the 
branch gets merged into trunk, the parameters will need to be documented in the 
wiki.)
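
For reference, once these reach trunk they would be read like any other Configuration entry; a minimal sketch (parameter names from the note above; the defaults shown are placeholders, not the actual HiveConf defaults):
{code}
import org.apache.hadoop.conf.Configuration;

public class EncryptionConfDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Names taken from the doc note above; the fallback values here are
    // placeholders, not the defaults defined in HiveConf.java.
    String stagingDir = conf.get("hive.exec.stagingdir", ".hive-staging");
    long copyfileMax = conf.getLong("hive.exec.copyfile.maxsize", 32L * 1024 * 1024);
    System.out.println(stagingDir + ", " + copyfileMax);
  }
}
{code}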

 Commit initial encryption work
 --

 Key: HIVE-8750
 URL: https://issues.apache.org/jira/browse/HIVE-8750
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Sergio Peña
 Fix For: encryption-branch

 Attachments: HIVE-8750.1.patch


 I believe Sergio has some work done for encryption. In this item we'll commit 
 it to branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6977) Delete Hiveserver1

2014-11-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216018#comment-14216018
 ] 

Lefty Leverenz commented on HIVE-6977:
--

Added the same warning at the beginning of the ODBC doc:

* [Hive ODBC Driver | https://cwiki.apache.org/confluence/display/Hive/HiveODBC]

 Delete Hiveserver1
 --

 Key: HIVE-6977
 URL: https://issues.apache.org/jira/browse/HIVE-6977
 Project: Hive
  Issue Type: Task
  Components: JDBC, Server Infrastructure
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-6977.1.patch, HIVE-6977.patch


 See mailing list discussion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Propose to put JIRA traffic on separate hive list

2014-11-18 Thread Lars Francke
+1

That's a great idea Alan.

On Tue, Nov 18, 2014 at 9:49 AM, Lefty Leverenz leftylever...@gmail.com
wrote:

 +1

 Would it be possible to send commits to the dev list, as well as creates?
 Or maybe all changes to the Resolution or Status?

 -- Lefty

 On Mon, Nov 17, 2014 at 2:27 PM, Alan Gates ga...@hortonworks.com wrote:

  The hive dev list generates a lot of traffic.  The average for October was
  192 messages per day.  As a result no one sends hive dev directly to their
  inbox.  They either unsubscribe or they build filters that ship most or all
  of it to a folder.  Chasing people off the dev list is obviously not what
  we want.  Sending messages to folders means missing messages or not seeing
  them until you get unbusy enough to go read back mail in folders.

  The vast majority of this traffic is comments on JIRA tickets.  The way
  I've seen other very active Apache projects manage this is JIRA creates go
  to the dev list, but all other JIRA operations go to a separate list.  Then
  everyone can see new tickets, and if they are interested they can watch
  that JIRA.  If not, they are not burdened with the email from it.

  I propose we do this same thing in Hive.

  Alan.
  --
  Sent with Postbox http://www.getpostbox.com
 



Re: Mail bounces from ebuddy.com

2014-11-18 Thread Lars Francke
This is still happening. If one of the admins could give it another go
that'd be great. I can also file an issue with INFRA.

On Mon, Sep 1, 2014 at 4:01 PM, Damien Carol dca...@blitzbs.com wrote:

 There are still these annoying ebuddy.com addresses:

 nelshe...@ebuddy.com
 bsc...@ebuddy.com

 Is there an administrator who can send the 2 emails to
 dev-unsubscribe-user=email address@hive.apache.org ???

 Thanks in advance.

 Regards,

 Damien CAROL

 - tél : +33 (0)4 74 96 88 14
 - fax : +33 (0)4 74 96 31 88
 - email : dca...@blitzbs.com

 BLITZ BUSINESS SERVICE

 Le 23/08/2014 01:22, Lars Francke a écrit :

  Likewise. From Alan's linked documentation it seems like the correct E-Mail
  address to use is:

  dev-unsubscribe-user=email address@hive.apache.org

  If you could try again maybe?


  On Wed, Aug 20, 2014 at 9:31 PM, Nick Dimiduk ndimi...@gmail.com wrote:

   Not quite taken care of. I'm still getting spam about these addresses.


  On Mon, Aug 18, 2014 at 9:18 AM, Lars Francke lars.fran...@gmail.com wrote:

   Thanks Alan and Ashutosh for taking care of this!


  On Mon, Aug 18, 2014 at 5:45 PM, Ashutosh Chauhan hashut...@apache.org wrote:

   Thanks, Alan for the hint. I just unsubscribed those two email addresses
   from ebuddy.


  On Mon, Aug 18, 2014 at 8:23 AM, Alan Gates ga...@hortonworks.com wrote:

   Anyone who is an admin on the list (I don't know who the admins are) can
   do this by doing user-unsubscribe-USERNAME=ebuddy@hive.apache.org where
   USERNAME is the name of the bouncing user (see
   http://untroubled.org/ezmlm/ezman/ezman1.html )

   Alan.


   Thejas Nair the...@hortonworks.com
   August 17, 2014 at 17:02
   I don't know how to do this.

   Carl, Ashutosh,
   Do you guys know how to remove these two invalid emails from the mailing
   list?


   Lars Francke lars.fran...@gmail.com
   August 17, 2014 at 15:41
   Hmm great, I see others mentioning this as well. I'm happy to contact
   INFRA but I'm not sure if they are even needed or if someone from the
   Hive team can do this?


   Lefty Leverenz leftylever...@gmail.com
   August 7, 2014 at 18:43
   (Excuse the spam.) Actually I'm getting two bounces per message, but
   gmail concatenates them so I didn't notice the second one.

   -- Lefty


   Lefty Leverenz leftylever...@gmail.com
   August 7, 2014 at 18:36
   Curious, I've only been getting one bounce per message. Anyway thanks for
   bringing this up.

   -- Lefty


   Lars Francke lars.fran...@gmail.com
   August 7, 2014 at 4:38
   Hi,

   every time I send a mail to dev@ I get two bounce mails from two people
   at ebuddy.com. I don't want to post the E-Mail addresses publicly but I
   can send them on if needed (and it can be triggered easily by just
   replying to this mail I guess).

   Could we maybe remove them from the list?

   Cheers,
   Lars


 --
 Sent with Postbox http://www.getpostbox.com







[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216052#comment-14216052
 ] 

Hive QA commented on HIVE-8893:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682121/HIVE-8893.5.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6650 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1833/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1833/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1833/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682121 - PreCommit-HIVE-TRUNK-Build

 Implement whitelist for builtin UDFs to avoid untrusted code execution in 
 multiuser mode
 ---

 Key: HIVE-8893
 URL: https://issues.apache.org/jira/browse/HIVE-8893
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HiveServer2, SQL
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch


 UDFs like reflect() or java_method() enable executing a Java method as a 
 UDF. While this offers a lot of flexibility in standalone mode, it can 
 become a security loophole in a secure multiuser environment. For example, in 
 HiveServer2 one can execute any available Java code with user hive's 
 credentials.
 We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216055#comment-14216055
 ] 

Hive QA commented on HIVE-8904:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682124/HIVE-8904.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1834/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1834/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java'
Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java'
Reverted 'service/src/java/org/apache/hive/service/cli/CLIService.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/metastore/TestMetastoreExpr.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExpressionEvaluator.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target shims/scheduler/target 
packaging/target hbase-handler/target testutils/target jdbc/target 
metastore/target itests/target itests/hcatalog-unit/target 
itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target 
itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/core/target 
hcatalog/streaming/target hcatalog/server-extensions/target 
hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target 
common/target common/src/gen contrib/target service/target serde/target 
beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1640306.

At revision 1640306.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682124 - PreCommit-HIVE-TRUNK-Build

 Hive should 

[jira] [Commented] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216065#comment-14216065
 ] 

Hive QA commented on HIVE-8868:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682113/HIVE-8868.2-spark.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7180 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/391/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/391/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-391/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682113 - PreCommit-HIVE-SPARK-Build

 SparkSession and SparkClient mapping[Spark Branch]
 --

 Key: HIVE-8868
 URL: https://issues.apache.org/jira/browse/HIVE-8868
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3, TODOC-SPARK
 Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch


 There should be a separate spark context for each user session. Currently we 
 share a singleton local spark context across all user sessions with local spark, 
 and create a remote spark context for each spark job with a spark cluster.
 To bind one spark context to each user session, we may construct the spark 
 client on session open. One thing to verify is whether SparkSession::conf is 
 consistent with Context::getConf. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8905) Servlet classes signer information does not match[Spark branch]

2014-11-18 Thread Chengxiang Li (JIRA)
Chengxiang Li created HIVE-8905:
---

 Summary: Servlet classes signer information does not match[Spark 
branch] 
 Key: HIVE-8905
 URL: https://issues.apache.org/jira/browse/HIVE-8905
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li


{noformat}
2014-11-18 02:36:04,168 DEBUG spark.HttpFileServer (Logging.scala:logDebug(63)) - HTTP file server started at: http://10.203.137.143:46436
2014-11-18 02:36:04,172 ERROR session.TestSparkSessionManagerImpl (TestSparkSessionManagerImpl.java:run(127)) - Error executing 'Session thread 5'
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
    at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
    at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:122)
    at org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl$SessionThread.run(TestSparkSessionManagerImpl.java:112)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.SecurityException: class javax.servlet.FilterRegistration's signer information does not match signer information of other classes in the same package
    at java.lang.ClassLoader.checkCerts(ClassLoader.java:952)
    at java.lang.ClassLoader.preDefineClass(ClassLoader.java:666)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:794)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:136)
    at org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:129)
    at org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:98)
    at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:96)
    at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:87)
    at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:67)
    at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60)
    at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:60)
    at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:49)
    at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:60)
    at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:150)
    at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:105)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:237)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.<init>(LocalHiveSparkClient.java:107)
    at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.getInstance(LocalHiveSparkClient.java:69)
    at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:52)
    at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:53)
    ... 3 more
{noformat}
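
This SecurityException is the usual symptom of javax.servlet classes being loaded from two different jars, one signed and one unsigned, on the same classpath. A small diagnostic sketch (not part of any patch) for finding which jar actually supplied the class:
{code}
import java.net.URL;
import java.security.CodeSource;

public class WhichJar {
  public static void main(String[] args) throws Exception {
    // Prints the jar that supplied the class, which makes the conflicting
    // signed/unsigned servlet-api jars easy to spot.
    Class<?> clazz = Class.forName("javax.servlet.FilterRegistration");
    CodeSource source = clazz.getProtectionDomain().getCodeSource();
    URL location = (source == null) ? null : source.getLocation();
    System.out.println(clazz.getName() + " loaded from " + location);
  }
}
{code}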



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]

2014-11-18 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216145#comment-14216145
 ] 

Chengxiang Li commented on HIVE-8868:
-

From the hive log, the session-related tests failed due to a servlet class 
loading exception; HIVE-8905 has been created to track it.

 SparkSession and SparkClient mapping[Spark Branch]
 --

 Key: HIVE-8868
 URL: https://issues.apache.org/jira/browse/HIVE-8868
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3, TODOC-SPARK
 Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch


 There should be a separate spark context for each user session. Currently we 
 share a singleton local spark context across all user sessions with local spark, 
 and create a remote spark context for each spark job with a spark cluster.
 To bind one spark context to each user session, we may construct the spark 
 client on session open. One thing to verify is whether SparkSession::conf is 
 consistent with Context::getConf. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]

2014-11-18 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-8868:

Status: Open  (was: Patch Available)

 SparkSession and SparkClient mapping[Spark Branch]
 --

 Key: HIVE-8868
 URL: https://issues.apache.org/jira/browse/HIVE-8868
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3, TODOC-SPARK
 Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch


 There should be a separate spark context for each user session. Currently we 
 share a singleton local spark context across all user sessions with local spark, 
 and create a remote spark context for each spark job with a spark cluster.
 To bind one spark context to each user session, we may construct the spark 
 client on session open. One thing to verify is whether SparkSession::conf is 
 consistent with Context::getConf. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216336#comment-14216336
 ] 

Sergio Peña commented on HIVE-8359:
---

Thanks [~mickaellcr].

Sorry for the confusion. I did not see that you had uploaded another patch here. 
I just added two extra lines to the patch you uploaded. I will integrate your 
fixes there and upload the patch again.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Open  (was: Patch Available)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Status: Patch Available  (was: Open)

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-8359:
--
Attachment: HIVE-8359.5.patch

Attached a new patch that integrates Mickael Lacour's HIVE-6994 fix.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {key1:null,key2:val2}
 {key3:val3,key4:null}
 {key3:val3,key4:null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {key3:val3}
 {key4:val3}
 {key3:val2}
 {key4:val3}
 {key1:val3}
 {code}
 I've not been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] Apache Hive 0.14.0 Released

2014-11-18 Thread Damien Carol
Congrats!
It's definitely the most interesting release of HIVE.
Keep up the good work.

I know that Santa will add some boys to the good list this year.

Regards,

Damien CAROL

   - tél : +33 (0)4 74 96 88 14
   - email : dca...@blitzbs.com

BLITZ BUSINESS SERVICE

2014-11-18 0:33 GMT+01:00 Suhas Gogate vgog...@pivotal.io:

 Congrats! This is a big step for Hive!

 --Suhas

 On Mon, Nov 17, 2014 at 3:05 PM, Thejas Nair the...@hortonworks.com wrote:

  The link to the download page is now -
  https://hive.apache.org/downloads.html
  (I have also corrected the email template in the how-to-release wiki with
  the new url.)


  On Mon, Nov 17, 2014 at 1:59 PM, Roshan Naik ros...@hortonworks.com wrote:

   1) fyi.. this link is broken:

   http://hive.apache.org/releases.html

   2) Java docs were not published for 0.14.0

   https://hive.apache.org/javadoc.html


   On Sun, Nov 16, 2014 at 7:04 PM, Clark Yang (杨卓荦) yangzhuo...@gmail.com wrote:

    Great job! Congrats!

    Thanks,
    Zhuoluo (Clark) Yang

    2014-11-13 8:55 GMT+08:00 Gunther Hagleitner gunt...@apache.org:

     The Apache Hive team is proud to announce the release of Apache
     Hive version 0.14.0.

     The Apache Hive (TM) data warehouse software facilitates querying and
     managing large datasets residing in distributed storage. Built on top
     of Apache Hadoop (TM), it provides:

     * Tools to enable easy data extract/transform/load (ETL).

     * A mechanism to impose structure on a variety of data formats.

     * Access to files stored either directly in Apache HDFS (TM) or in
     other data storage systems such as Apache HBase (TM) or Apache
     Accumulo (TM).

     * Query execution via Apache Hadoop MapReduce and Apache Tez
     frameworks.

     * Cost-based query planning via Apache Calcite


     For Hive release details and downloads, please visit:
     http://hive.apache.org/releases.html

     Hive 0.14.0 Release Notes are available here:
     https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12326450&styleName=Text&projectId=12310843

     We would like to thank the many contributors who made this release
     possible.

     Regards,

     The Apache Hive Team
 



Hive-0.14 - Build # 734 - Still Failing

2014-11-18 Thread Apache Jenkins Server
Changes for Build #696
[rohini] PIG-4186: Fix e2e run against new build of pig and some enhancements 
(rohini)


Changes for Build #697

Changes for Build #698

Changes for Build #699

Changes for Build #700

Changes for Build #701

Changes for Build #702

Changes for Build #703
[daijy] HIVE-8484: HCatalog throws an exception if Pig job is of type 'fetch' 
(Lorand Bendig via Daniel Dai)


Changes for Build #704
[gunther] HIVE-8781: Nullsafe joins are busted on Tez (Gunther Hagleitner, 
reviewed by Prasanth J)


Changes for Build #705
[gunther] HIVE-8760: Pass a copy of HiveConf to hooks (Gunther Hagleitner, 
reviewed by Gopal V)


Changes for Build #706
[thejas] HIVE-8772 : zookeeper info logs are always printed from beeline with 
service discovery mode (Thejas Nair, reviewed by Vaibhav Gumashta)


Changes for Build #707
[gunther] HIVE-8782: HBase handler doesn't compile with hadoop-1 (Jimmy Xiang, 
reviewed by Xuefu and Sergey)


Changes for Build #708

Changes for Build #709
[thejas] HIVE-8785 : HiveServer2 LogDivertAppender should be more selective for 
beeline getLogs (Thejas Nair, reviewed by Gopal V)


Changes for Build #710
[vgumashta] HIVE-8764: Windows: HiveServer2 TCP SSL cannot recognize localhost 
(Vaibhav Gumashta reviewed by Thejas Nair)


Changes for Build #711
[gunther] HIVE-8768: CBO: Fix filter selectivity for 'in clause' & '<>' (Laljo 
John Pullokkaran via Gunther Hagleitner)


Changes for Build #712
[gunther] HIVE-8794: Hive on Tez leaks AMs when killed before first dag is run 
(Gunther Hagleitner, reviewed by Gopal V)


Changes for Build #713
[gunther] HIVE-8798: Some Oracle deadlocks not being caught in TxnHandler (Alan 
Gates via Gunther Hagleitner)


Changes for Build #714
[gunther] HIVE-8800: Update release notes and notice for hive .14 (Gunther 
Hagleitner, reviewed by Prasanth J)

[gunther] HIVE-8799: boatload of missing apache headers (Gunther Hagleitner, 
reviewed by Thejas M Nair)


Changes for Build #715
[gunther] Preparing for release 0.14.0


Changes for Build #716
[gunther] Preparing for release 0.14.0

[gunther] Preparing for release 0.14.0


Changes for Build #717

Changes for Build #718

Changes for Build #719

Changes for Build #720
[gunther] HIVE-8811: Dynamic partition pruning can result in NPE during query 
compilation (Gunther Hagleitner, reviewed by Gopal V)


Changes for Build #721
[gunther] HIVE-8805: CBO skipped due to SemanticException: Line 0:-1 Both left 
and right aliases encountered in JOIN 'avg_cs_ext_discount_amt' (Laljo John 
Pullokkaran via Gunther Hagleitner)

[sershe] HIVE-8715 : Hive 14 upgrade scripts can fail for statistics if 
database was created using auto-create
 ADDENDUM (Sergey Shelukhin, reviewed by Ashutosh Chauhan and Gunther 
Hagleitner)


Changes for Build #722

Changes for Build #723

Changes for Build #724
[gunther] HIVE-8845: Switch to Tez 0.5.2 (Gunther Hagleitner, reviewed by Gopal 
V)


Changes for Build #725
[sershe] HIVE-8295 : Add batch retrieve partition objects for metastore direct 
sql (Selina Zhang and Sergey Shelukhin, reviewed by Ashutosh Chauhan)


Changes for Build #726

Changes for Build #727
[gunther] HIVE-8873: Switch to calcite 0.9.2 (Gunther Hagleitner, reviewed by 
Gopal V)


Changes for Build #728
[thejas] HIVE-8830 : hcatalog process don't exit because of non daemon thread 
(Thejas Nair, reviewed by Eugene Koifman, Sushanth Sowmyan)


Changes for Build #729

Changes for Build #730

Changes for Build #731

Changes for Build #732

Changes for Build #733

Changes for Build #734



No tests ran.

The Apache Jenkins build system has built Hive-0.14 (build #734)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-0.14/734/ to view 
the results.

[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode

2014-11-18 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216472#comment-14216472
 ] 

Prasad Mujumdar commented on HIVE-8893:
---

The failed test optimize_nullscan passes in my setup.

 Implement whitelist for builtin UDFs to avoid untrusted code execution in 
 multiuser mode
 ---

 Key: HIVE-8893
 URL: https://issues.apache.org/jira/browse/HIVE-8893
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HiveServer2, SQL
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch


 UDFs like reflect() or java_method() enable executing a Java method as a 
 UDF. While this offers a lot of flexibility in standalone mode, it can 
 become a security loophole in a secure multiuser environment. For example, in 
 HiveServer2 one can execute any available Java code with the hive user's 
 credentials.
 We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2.
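 For illustration, a restriction of this kind would be driven by HiveServer2 
 configuration. A minimal sketch, assuming the property names this patch 
 introduces (treat them as assumptions until the patch lands):
 {code}
 <!-- hive-site.xml sketch; property names assumed from this patch -->
 <property>
   <name>hive.server2.builtin.udf.whitelist</name>
   <!-- empty means all builtin UDFs are allowed; otherwise only these may run -->
   <value>upper,lower,concat</value>
 </property>
 <property>
   <name>hive.server2.builtin.udf.blacklist</name>
   <!-- a blacklisted UDF is blocked even if it also appears in the whitelist -->
   <value>reflect,java_method</value>
 </property>
 {code}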



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216474#comment-14216474
 ] 

Chao commented on HIVE-8887:


This error happens when SparkMapJoinOptimizer cannot find a big table 
candidate, and hence the big table position is -1.
In this case, instead of continuing into the map join processing, we should 
fall back to the common join. The code is there, but commented out.
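A minimal sketch of that fallback, with illustrative names (this is not the 
actual patch):
{code}
// Sketch, inside SparkMapJoinOptimizer.process(): when no join input
// qualifies as the big table, skip the conversion so the plan keeps the
// common (shuffle) join instead of failing later in MapJoinProcessor.
int bigTablePosition = findBigTableCandidate(joinOp);  // hypothetical helper
if (bigTablePosition < 0) {
  return null;  // leave the JoinOperator untouched; common join is used
}
{code}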

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao

 These tests all failed with the same error; see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216485#comment-14216485
 ] 

Brock Noland commented on HIVE-8835:


+1

 identify dependency scope for Remote Spark Context.[Spark Branch]
 -

 Key: HIVE-8835
 URL: https://issues.apache.org/jira/browse/HIVE-8835
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Attachments: HIVE-8835.1-spark.patch


 When submitting a job through the Remote Spark Context, the Spark RDD graph 
 generation and job submission are executed on the remote side, so we have to 
 add the Hive-related dependencies to its classpath via 
 spark.driver.extraClassPath. Instead of adding all Hive/Hadoop dependencies, 
 we should narrow the scope and identify which dependencies the Remote Spark 
 Context actually requires. 
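 For context, a sketch of how that property is set in spark-defaults.conf (the 
 jar paths are placeholders; identifying the real minimal list is exactly what 
 this task is about):
 {code}
 # spark-defaults.conf sketch; placeholder paths
 spark.driver.extraClassPath /opt/hive/lib/hive-exec.jar:/opt/hive/lib/hive-common.jar
 {code}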



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28147: HIVE-7073:Implement Binary in ParquetSerDe

2014-11-18 Thread Mohit Sabharwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28147/#review61946
---


LGTM. Mickaël Lacour's suggestion of adding a null value test is a great one.

- Mohit Sabharwal


On Nov. 18, 2014, 1:58 a.m., cheng xu wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/28147/
 ---
 
 (Updated Nov. 18, 2014, 1:58 a.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 This patch includes:
 1. binary support for ParquetHiveSerde
 2. related test cases both in unit and ql test
 
 
 Diffs
 -
 
   data/files/parquet_types.txt d342062 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java
  472de8f 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java
  d5aae3b 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 c57dd99 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 8ac7864 
   ql/src/test/queries/clientpositive/parquet_types.q 22585c3 
   ql/src/test/results/clientpositive/parquet_types.q.out 275897c 
 
 Diff: https://reviews.apache.org/r/28147/diff/
 
 
 Testing
 ---
 
 related UT and QL tests passed
 
 
 Thanks,
 
 cheng xu
 




[jira] [Updated] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8835:
---
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Thank you! I have committed this to spark.

 identify dependency scope for Remote Spark Context.[Spark Branch]
 -

 Key: HIVE-8835
 URL: https://issues.apache.org/jira/browse/HIVE-8835
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Fix For: spark-branch

 Attachments: HIVE-8835.1-spark.patch


 When submitting a job through the Remote Spark Context, the Spark RDD graph 
 generation and job submission are executed on the remote side, so we have to 
 add the Hive-related dependencies to its classpath via 
 spark.driver.extraClassPath. Instead of adding all Hive/Hadoop dependencies, 
 we should narrow the scope and identify which dependencies the Remote Spark 
 Context actually requires. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8887:
---
Attachment: HIVE-8887.1-spark.patch

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Attachments: HIVE-8887.1-spark.patch


 These tests all failed with the same error; see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Chao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8887:
---
Status: Patch Available  (was: Open)

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Attachments: HIVE-8887.1-spark.patch


 These tests all failed with the same error; see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216562#comment-14216562
 ] 

Hive QA commented on HIVE-8359:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682178/HIVE-8359.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6659 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1835/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1835/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1835/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682178 - PreCommit-HIVE-TRUNK-Build

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {"key1":null,"key2":"val2"}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {"key3":"val3"}
 {"key4":"val3"}
 {"key3":"val2"}
 {"key4":"val3"}
 {"key1":"val3"}
 {code}
 I haven't been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8905) Servlet classes signer information does not match [Spark branch]

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8905:
---
Summary: Servlet classes signer information does not match [Spark branch]   
(was: Servlet classes signer information does not match[Spark branch] )

 Servlet classes signer information does not match [Spark branch] 
 -

 Key: HIVE-8905
 URL: https://issues.apache.org/jira/browse/HIVE-8905
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
  Labels: Spark-M3

 {noformat}
 2014-11-18 02:36:04,168 DEBUG spark.HttpFileServer 
 (Logging.scala:logDebug(63)) - HTTP file server started at: 
 http://10.203.137.143:46436
 2014-11-18 02:36:04,172 ERROR session.TestSparkSessionManagerImpl 
 (TestSparkSessionManagerImpl.java:run(127)) - Error executing 'Session thread 
 5'
 org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark 
 client.
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55)
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:122)
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl$SessionThread.run(TestSparkSessionManagerImpl.java:112)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.SecurityException: class 
 javax.servlet.FilterRegistration's signer information does not match signer 
 information of other classes in the same package
   at java.lang.ClassLoader.checkCerts(ClassLoader.java:952)
   at java.lang.ClassLoader.preDefineClass(ClassLoader.java:666)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:794)
   at 
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:136)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:129)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:98)
   at 
 org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:96)
   at 
 org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:87)
   at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:67)
   at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60)
   at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60)
   at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
   at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:60)
   at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:49)
   at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:60)
   at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:150)
   at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:105)
   at org.apache.spark.SparkContext.<init>(SparkContext.scala:237)
   at 
 org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
   at 
 org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.<init>(LocalHiveSparkClient.java:107)
   at 
 org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.getInstance(LocalHiveSparkClient.java:69)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:52)
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:53)
   ... 3 more
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8609) Move beeline to jline2

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8609:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you very much [~Ferd]! I have committed this to trunk!

 Move beeline to jline2
 --

 Key: HIVE-8609
 URL: https://issues.apache.org/jira/browse/HIVE-8609
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Ferdinand Xu
Priority: Blocker
 Fix For: 0.15.0

 Attachments: HIVE-8609.1.patch, HIVE-8609.2.patch, HIVE-8609.3.patch, 
 HIVE-8609.4.patch, HIVE-8609.5.patch, HIVE-8609.6.patch, HIVE-8609.7.patch, 
 HIVE-8609.patch


 We found a serious bug in jline in HIVE-8565. We should move to jline2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files

2014-11-18 Thread Ryan Blue (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216579#comment-14216579
 ] 

Ryan Blue commented on HIVE-8359:
-

I think with [~mickaellcr]'s addition, this is ready to go in.

Good catch in the SerDe code; I didn't realize that the nulls were stripped at 
that point as well. I'm a little confused about why we're translating the 
ArrayWritable again, though: isn't it properly constructed by the Converter 
code? Why can't we just pass the ArrayWritable that was created already? It 
seems like we're doing a lot of unnecessary work here that we might be able to 
remove (in future patches). Ideally, we would detect that the structure matches 
what the following Hive code expects and pass it along.

 Map containing null values are not correctly written in Parquet files
 -

 Key: HIVE-8359
 URL: https://issues.apache.org/jira/browse/HIVE-8359
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.13.1
Reporter: Frédéric TERRAZZONI
Assignee: Sergio Peña
 Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, 
 HIVE-8359.5.patch, map_null_val.avro


 Tried to write a map<string,string> column in a Parquet file. The table should 
 contain:
 {code}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {"key1":null,"key2":"val2"}
 {"key3":"val3","key4":null}
 {"key3":"val3","key4":null}
 {code}
 ... and when you do a query like {code}SELECT * from mytable{code}
 we can see that the table is corrupted:
 {code}
 {"key3":"val3"}
 {"key4":"val3"}
 {"key3":"val2"}
 {"key4":"val3"}
 {"key1":"val3"}
 {code}
 I haven't been able to read the Parquet file in our software afterwards, and 
 consequently I suspect it to be corrupted. 
 For those who are interested, I generated this Parquet table from an Avro 
 file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8122) Make use of SearchArgument classes for Parquet SERDE

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8122:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you [~Ferd] for the patch and [~mohitsabharwal] for the review! I have 
committed this to trunk!!

 Make use of SearchArgument classes for Parquet SERDE
 

 Key: HIVE-8122
 URL: https://issues.apache.org/jira/browse/HIVE-8122
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
 Fix For: 0.15.0

 Attachments: HIVE-8122.1.patch, HIVE-8122.2.patch, HIVE-8122.3.patch, 
 HIVE-8122.4.patch, HIVE-8122.patch


 ParquetSerde could be much cleaner if we used SearchArgument and associated 
 classes like ORC does:
 https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27699: HIVE-8435

2014-11-18 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27699/
---

(Updated Nov. 18, 2014, 7:13 p.m.)


Review request for hive and Ashutosh Chauhan.


Summary (updated)
-

HIVE-8435


Repository: hive-git


Description (updated)
---

HIVE-8435


HIVE-8435


Diffs (updated)
-

  accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
254eeaba4b8d633c63c706c0c74bb1165089 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
a8411c9edb2f2db84cf2540deb20133c36152103 
  contrib/src/test/results/clientpositive/lateral_view_explode2.q.out 
74a7e1719f8e026aaecd53fc147258620a75ccc4 
  hbase-handler/src/test/results/positive/hbase_queries.q.out 
b1e7936738b1121c14132909178646290ee8b4d5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 
95d2d76c80aa59b62e9464f704523d921302d401 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 
5be0e4540a6843c6b40cb5c22db6e90e1f0da922 
  ql/src/test/queries/clientpositive/identity_proj_remove.q PRE-CREATION 
  ql/src/test/results/clientpositive/identity_proj_remove.q.out PRE-CREATION 
  ql/src/test/results/compiler/plan/groupby1.q.xml PRE-CREATION 

Diff: https://reviews.apache.org/r/27699/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2

2014-11-18 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-8829:
--
Attachment: HIVE-8829.1.patch

 Upgrade to Thrift 0.9.2
 ---

 Key: HIVE-8829
 URL: https://issues.apache.org/jira/browse/HIVE-8829
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
  Labels: HiveServer2, metastore
 Fix For: 0.15.0

 Attachments: HIVE-8829.1.patch


 Apache Thrift 0.9.2 was released recently 
 (https://thrift.apache.org/download). It includes a fix for THRIFT-2660, a bug 
 which can cause HS2 (TCP mode) and Metastore processes to go OOM on receiving 
 a non-Thrift request when they use the SASL transport. The reason ([thrift 
 code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]):
 {code}
 protected SaslResponse receiveSaslMessage() throws TTransportException {
   underlyingTransport.readAll(messageHeader, 0, messageHeader.length);
   byte statusByte = messageHeader[0];
   byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader,
       STATUS_BYTES)];
   underlyingTransport.readAll(payload, 0, payload.length);
   NegotiationStatus status = NegotiationStatus.byValue(statusByte);
   if (status == null) {
     sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte);
   } else if (status == NegotiationStatus.BAD
       || status == NegotiationStatus.ERROR) {
     try {
       String remoteMessage = new String(payload, "UTF-8");
       throw new TTransportException("Peer indicated failure: " + remoteMessage);
     } catch (UnsupportedEncodingException e) {
       throw new TTransportException(e);
     }
   }
 {code}
 Basically, since there are no message format or size checks before creating 
 the byte array, receiving a non-SASL message creates a huge byte array from 
 some garbage size.
 For HS2, an attempt was made to fix it in HIVE-6468, which never went in. 
 I think for 0.15.0 it's best to upgrade to Thrift 0.9.2.
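 For context, the class of fix involved is a sanity check before the 
 allocation; a minimal sketch (MAX_SASL_MESSAGE_BYTES is an assumed cap, not 
 Thrift's actual constant):
 {code}
 // Sketch of a defensive bound on the decoded length before allocating.
 int payloadLength = EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES);
 if (payloadLength < 0 || payloadLength > MAX_SASL_MESSAGE_BYTES) {
   throw new TTransportException("Invalid SASL message size: " + payloadLength);
 }
 byte[] payload = new byte[payloadLength];
 {code}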



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode

2014-11-18 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216636#comment-14216636
 ] 

Szehon Ho commented on HIVE-8893:
-

Thanks for the changes!  Latest patch looks good, +1

 Implement whitelist for builtin UDFs to avoid untrusted code execution in 
 multiuser mode
 ---

 Key: HIVE-8893
 URL: https://issues.apache.org/jira/browse/HIVE-8893
 Project: Hive
  Issue Type: Bug
  Components: Authorization, HiveServer2, SQL
Affects Versions: 0.14.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch


 UDFs like reflect() or java_method() enable executing a Java method as a 
 UDF. While this offers a lot of flexibility in standalone mode, it can 
 become a security loophole in a secure multiuser environment. For example, in 
 HiveServer2 one can execute any available Java code with the hive user's 
 credentials.
 We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2

2014-11-18 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-8829:
--
Status: Patch Available  (was: Open)

 Upgrade to Thrift 0.9.2
 ---

 Key: HIVE-8829
 URL: https://issues.apache.org/jira/browse/HIVE-8829
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
  Labels: HiveServer2, metastore
 Fix For: 0.15.0

 Attachments: HIVE-8829.1.patch


 Apache Thrift 0.9.2 was released recently 
 (https://thrift.apache.org/download). It includes a fix for THRIFT-2660, a bug 
 which can cause HS2 (TCP mode) and Metastore processes to go OOM on receiving 
 a non-Thrift request when they use the SASL transport. The reason ([thrift 
 code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]):
 {code}
 protected SaslResponse receiveSaslMessage() throws TTransportException {
   underlyingTransport.readAll(messageHeader, 0, messageHeader.length);
   byte statusByte = messageHeader[0];
   byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader,
       STATUS_BYTES)];
   underlyingTransport.readAll(payload, 0, payload.length);
   NegotiationStatus status = NegotiationStatus.byValue(statusByte);
   if (status == null) {
     sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte);
   } else if (status == NegotiationStatus.BAD
       || status == NegotiationStatus.ERROR) {
     try {
       String remoteMessage = new String(payload, "UTF-8");
       throw new TTransportException("Peer indicated failure: " + remoteMessage);
     } catch (UnsupportedEncodingException e) {
       throw new TTransportException(e);
     }
   }
 {code}
 Basically, since there are no message format or size checks before creating 
 the byte array, receiving a non-SASL message creates a huge byte array from 
 some garbage size.
 For HS2, an attempt was made to fix it in HIVE-6468, which never went in. 
 I think for 0.15.0 it's best to upgrade to Thrift 0.9.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27699: HIVE-8435

2014-11-18 Thread Jesús Camacho Rodríguez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27699/
---

(Updated Nov. 18, 2014, 7:21 p.m.)


Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description (updated)
---

HIVE-8435


Patch with the most conservative approach of project remover optimization.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
a8411c9edb2f2db84cf2540deb20133c36152103 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 
95d2d76c80aa59b62e9464f704523d921302d401 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 
5be0e4540a6843c6b40cb5c22db6e90e1f0da922 

Diff: https://reviews.apache.org/r/27699/diff/


Testing
---


Thanks,

Jesús Camacho Rodríguez



[jira] [Commented] (HIVE-8829) Upgrade to Thrift 0.9.2

2014-11-18 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216640#comment-14216640
 ] 

Prasad Mujumdar commented on HIVE-8829:
---

[~vgumashta] I didn't notice that the ticket is assigned to you.  If you 
already have a patch, please feel free to ignore this one.


 Upgrade to Thrift 0.9.2
 ---

 Key: HIVE-8829
 URL: https://issues.apache.org/jira/browse/HIVE-8829
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
  Labels: HiveServer2, metastore
 Fix For: 0.15.0

 Attachments: HIVE-8829.1.patch


 Apache Thrift 0.9.2 was released recently 
 (https://thrift.apache.org/download). It includes a fix for THRIFT-2660, a bug 
 which can cause HS2 (TCP mode) and Metastore processes to go OOM on receiving 
 a non-Thrift request when they use the SASL transport. The reason ([thrift 
 code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]):
 {code}
 protected SaslResponse receiveSaslMessage() throws TTransportException {
   underlyingTransport.readAll(messageHeader, 0, messageHeader.length);
   byte statusByte = messageHeader[0];
   byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader,
       STATUS_BYTES)];
   underlyingTransport.readAll(payload, 0, payload.length);
   NegotiationStatus status = NegotiationStatus.byValue(statusByte);
   if (status == null) {
     sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte);
   } else if (status == NegotiationStatus.BAD
       || status == NegotiationStatus.ERROR) {
     try {
       String remoteMessage = new String(payload, "UTF-8");
       throw new TTransportException("Peer indicated failure: " + remoteMessage);
     } catch (UnsupportedEncodingException e) {
       throw new TTransportException(e);
     }
   }
 {code}
 Basically, since there are no message format or size checks before creating 
 the byte array, receiving a non-SASL message creates a huge byte array from 
 some garbage size.
 For HS2, an attempt was made to fix it in HIVE-6468, which never went in. 
 I think for 0.15.0 it's best to upgrade to Thrift 0.9.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8829) Upgrade to Thrift 0.9.2

2014-11-18 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216644#comment-14216644
 ] 

Vaibhav Gumashta commented on HIVE-8829:


[~prasadm] No issues - I didn't have a patch anyway. Assigning it to you - 
thanks for the patch.

 Upgrade to Thrift 0.9.2
 ---

 Key: HIVE-8829
 URL: https://issues.apache.org/jira/browse/HIVE-8829
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
  Labels: HiveServer2, metastore
 Fix For: 0.15.0

 Attachments: HIVE-8829.1.patch


 Apache Thrift 0.9.2 was released recently 
 (https://thrift.apache.org/download). It includes a fix for THRIFT-2660, a bug 
 which can cause HS2 (TCP mode) and Metastore processes to go OOM on receiving 
 a non-Thrift request when they use the SASL transport. The reason ([thrift 
 code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]):
 {code}
 protected SaslResponse receiveSaslMessage() throws TTransportException {
   underlyingTransport.readAll(messageHeader, 0, messageHeader.length);
   byte statusByte = messageHeader[0];
   byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader,
       STATUS_BYTES)];
   underlyingTransport.readAll(payload, 0, payload.length);
   NegotiationStatus status = NegotiationStatus.byValue(statusByte);
   if (status == null) {
     sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte);
   } else if (status == NegotiationStatus.BAD
       || status == NegotiationStatus.ERROR) {
     try {
       String remoteMessage = new String(payload, "UTF-8");
       throw new TTransportException("Peer indicated failure: " + remoteMessage);
     } catch (UnsupportedEncodingException e) {
       throw new TTransportException(e);
     }
   }
 {code}
 Basically, since there are no message format or size checks before creating 
 the byte array, receiving a non-SASL message creates a huge byte array from 
 some garbage size.
 For HS2, an attempt was made to fix it in HIVE-6468, which never went in. 
 I think for 0.15.0 it's best to upgrade to Thrift 0.9.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 27699: HIVE-8435

2014-11-18 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27699/#review61985
---



contrib/src/test/results/clientpositive/lateral_view_explode2.q.out
https://reviews.apache.org/r/27699/#comment103905

Results are changed. Looks suspicious.



ql/src/test/results/compiler/plan/groupby1.q.xml
https://reviews.apache.org/r/27699/#comment103904

    You need to rebase your git repo; these test cases were deleted via 
HIVE-8862.


- Ashutosh Chauhan


On Nov. 18, 2014, 7:21 p.m., Jesús Camacho Rodríguez wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/27699/
 ---
 
 (Updated Nov. 18, 2014, 7:21 p.m.)
 
 
 Review request for hive and Ashutosh Chauhan.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 HIVE-8435
 
 
 Patch with the most conservative approach of project remover optimization.
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
 a8411c9edb2f2db84cf2540deb20133c36152103 
   ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 
 95d2d76c80aa59b62e9464f704523d921302d401 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java 
 PRE-CREATION 
   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 
 5be0e4540a6843c6b40cb5c22db6e90e1f0da922 
 
 Diff: https://reviews.apache.org/r/27699/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jesús Camacho Rodríguez
 




[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2

2014-11-18 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-8829:
---
Assignee: Prasad Mujumdar  (was: Vaibhav Gumashta)

 Upgrade to Thrift 0.9.2
 ---

 Key: HIVE-8829
 URL: https://issues.apache.org/jira/browse/HIVE-8829
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.15.0
Reporter: Vaibhav Gumashta
Assignee: Prasad Mujumdar
  Labels: HiveServer2, metastore
 Fix For: 0.15.0

 Attachments: HIVE-8829.1.patch


 Apache Thrift 0.9.2 was released recently 
 (https://thrift.apache.org/download). It includes a fix for THRIFT-2660, a bug 
 which can cause HS2 (TCP mode) and Metastore processes to go OOM on receiving 
 a non-Thrift request when they use the SASL transport. The reason ([thrift 
 code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]):
 {code}
 protected SaslResponse receiveSaslMessage() throws TTransportException {
   underlyingTransport.readAll(messageHeader, 0, messageHeader.length);
   byte statusByte = messageHeader[0];
   byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader,
       STATUS_BYTES)];
   underlyingTransport.readAll(payload, 0, payload.length);
   NegotiationStatus status = NegotiationStatus.byValue(statusByte);
   if (status == null) {
     sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte);
   } else if (status == NegotiationStatus.BAD
       || status == NegotiationStatus.ERROR) {
     try {
       String remoteMessage = new String(payload, "UTF-8");
       throw new TTransportException("Peer indicated failure: " + remoteMessage);
     } catch (UnsupportedEncodingException e) {
       throw new TTransportException(e);
     }
   }
 {code}
 Basically, since there are no message format or size checks before creating 
 the byte array, receiving a non-SASL message creates a huge byte array from 
 some garbage size.
 For HS2, an attempt was made to fix it in HIVE-6468, which never went in. 
 I think for 0.15.0 it's best to upgrade to Thrift 0.9.2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesús Camacho Rodríguez updated HIVE-8435:
--
Attachment: HIVE-8435.08.patch

Starting over.

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. It is 
 better to optimize it away, to avoid evaluating it at runtime for no benefit.
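 For illustration, the check at the heart of such an optimization could look 
 like this sketch (not the actual patch; ExprNodeColumnDesc and ColumnInfo are 
 real Hive classes, the method itself is invented):
 {code}
 import java.util.List;
 import org.apache.hadoop.hive.ql.exec.ColumnInfo;
 import org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc;
 import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;

 // A SELECT is an identity project when every output expression is a plain
 // column reference to the same position of the input schema.
 static boolean isIdentitySelect(List<ExprNodeDesc> colList,
     List<ColumnInfo> inputSchema) {
   if (colList.size() != inputSchema.size()) {
     return false;
   }
   for (int i = 0; i < colList.size(); i++) {
     ExprNodeDesc e = colList.get(i);
     if (!(e instanceof ExprNodeColumnDesc)) {
       return false;  // a computed expression does real work
     }
     if (!((ExprNodeColumnDesc) e).getColumn()
         .equals(inputSchema.get(i).getInternalName())) {
       return false;  // column is reordered or renamed
     }
   }
   return true;  // safe to remove the operator from the plan
 }
 {code}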



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8639) Convert SMBJoin to MapJoin [Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216653#comment-14216653
 ] 

Brock Noland commented on HIVE-8639:


Hi [~chinnalalam],

How is the progress on this going? If you are not working on it we'll have 
folks freeing up soon who can take it over.

 Convert SMBJoin to MapJoin [Spark Branch]
 -

 Key: HIVE-8639
 URL: https://issues.apache.org/jira/browse/HIVE-8639
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Szehon Ho
Assignee: Chinna Rao Lalam

 HIVE-8202 supports auto-conversion of SMB join. However, if the tables are 
 partitioned, there could be a slowdown, as each mapper would need to get a 
 very small chunk of a partition which has a single key. Thus, in some 
 scenarios it's beneficial to convert an SMB join to a map join.
 The task is to research and support the conversion from SMB join to map join 
 for the Spark execution engine. See the MapReduce equivalent in 
 SortMergeJoinResolver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8803) DESC SCHEMA DATABASE-NAME is not working

2014-11-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8803:

Labels: TODOC15  (was: )

It's already doc'ed 
[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe]
 as part of Hive 0.14 and HIVE-6601, but we can change the link to this one and 
Hive 0.15.

 DESC SCHEMA DATABASE-NAME is not working
 --

 Key: HIVE-8803
 URL: https://issues.apache.org/jira/browse/HIVE-8803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
  Labels: TODOC15
 Attachments: HIVE-8803.1.patch.txt, HIVE-8803.1.patch.txt


 Found while documenting HIVE-6601



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216655#comment-14216655
 ] 

Hive QA commented on HIVE-8887:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682207/HIVE-8887.1-spark.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7180 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/392/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/392/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-392/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682207 - PreCommit-HIVE-SPARK-Build

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Attachments: HIVE-8887.1-spark.patch


 These tests all failed with the same error; see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.

[jira] [Commented] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216656#comment-14216656
 ] 

Ashutosh Chauhan commented on HIVE-8435:


+1 for code changes, which look good. Let's see how tests go.

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. It is 
 better to optimize it away to avoid evaluating it without any benefit at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8803) DESC SCHEMA DATABASE-NAME is not working

2014-11-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8803:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk, thanks Navis.

 DESC SCHEMA DATABASE-NAME is not working
 --

 Key: HIVE-8803
 URL: https://issues.apache.org/jira/browse/HIVE-8803
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-8803.1.patch.txt, HIVE-8803.1.patch.txt


 Found while documenting HIVE-6601.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8906) Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT artifacts

2014-11-18 Thread Carl Steinbach (JIRA)
Carl Steinbach created HIVE-8906:


 Summary: Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT 
artifacts
 Key: HIVE-8906
 URL: https://issues.apache.org/jira/browse/HIVE-8906
 Project: Hive
  Issue Type: Bug
Reporter: Carl Steinbach


The Hive 0.14.0 release depends on SNAPSHOT versions of tez-0.5.2 and 
calcite-0.9.2. I believe this violates Apache release policy (can't find the 
reference, but I seem to remember this being a problem with HCatalog before the 
merger), and it implies that the folks who tested the release weren't 
necessarily testing the same thing. It also means that people who try to build 
Hive using the 0.14.0 src release will encounter errors unless they configure 
Maven to pull artifacts from the snapshot repository.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8839) Support alter table .. add/replace columns cascade

2014-11-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8839:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk, thanks Chaoyu for the contribution.

 Support alter table .. add/replace columns cascade
 

 Key: HIVE-8839
 URL: https://issues.apache.org/jira/browse/HIVE-8839
 Project: Hive
  Issue Type: Improvement
  Components: SQL
 Environment: 
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8839.1.patch, HIVE-8839.2.patch, HIVE-8839.2.patch, 
 HIVE-8839.patch


 We often run into issues like HIVE-6131, which is due to inconsistent 
 column descriptors between table and partitions after alter table. 
 HIVE-8441/HIVE-7971 provided the flexibility to alter the table at the 
 partition level, but in most cases we need to change the table and its 
 partitions at the same time. In addition, alter table is usually required 
 prior to alter table partition .. since querying partition data also goes 
 through the table. Instead of doing that in two steps, here we provide a 
 convenient DDL, alter table ... cascade, to cascade table changes to the 
 partitions as well. The changes are limited and applicable to add/replace 
 columns and changing column name, datatype, position and comment.
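
As a quick illustration of the intended usage (the table and column names below are hypothetical, not from this ticket), the CASCADE form could be issued over JDBC roughly like this:
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class AlterCascadeExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical HiveServer2 endpoint; adjust host/port/database.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            // CASCADE propagates the new column to all existing partitions in
            // one statement, keeping table and partition descriptors consistent.
            stmt.execute("ALTER TABLE sales ADD COLUMNS (discount DOUBLE) CASCADE");
        }
    }
}
{code}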



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8839) Support alter table .. add/replace columns cascade

2014-11-18 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-8839:

Environment: 

  was:
Need to add this in 
[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe],
 and also explain the nuances

 Support alter table .. add/replace columns cascade
 

 Key: HIVE-8839
 URL: https://issues.apache.org/jira/browse/HIVE-8839
 Project: Hive
  Issue Type: Improvement
  Components: SQL
 Environment: 
Reporter: Chaoyu Tang
Assignee: Chaoyu Tang
  Labels: TODOC15
 Fix For: 0.15.0

 Attachments: HIVE-8839.1.patch, HIVE-8839.2.patch, HIVE-8839.2.patch, 
 HIVE-8839.patch


 We often run into issues like HIVE-6131, which is due to inconsistent 
 column descriptors between table and partitions after alter table. 
 HIVE-8441/HIVE-7971 provided the flexibility to alter the table at the 
 partition level, but in most cases we need to change the table and its 
 partitions at the same time. In addition, alter table is usually required 
 prior to alter table partition .. since querying partition data also goes 
 through the table. Instead of doing that in two steps, here we provide a 
 convenient DDL, alter table ... cascade, to cascade table changes to the 
 partitions as well. The changes are limited and applicable to add/replace 
 columns and changing column name, datatype, position and comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8611) grant/revoke syntax should support additional objects for authorization plugins

2014-11-18 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216731#comment-14216731
 ] 

Prasad Mujumdar commented on HIVE-8611:
---

[~leftylev] Updated the wiki for config change. Thanks!

 grant/revoke syntax should support additional objects for authorization 
 plugins
 ---

 Key: HIVE-8611
 URL: https://issues.apache.org/jira/browse/HIVE-8611
 Project: Hive
  Issue Type: Bug
  Components: Authentication, SQL
Affects Versions: 0.13.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8611.1.patch, HIVE-8611.2.patch, HIVE-8611.2.patch, 
 HIVE-8611.3.patch, HIVE-8611.4.patch


 The authorization framework supports URI and global objects. The SQL syntax 
 however doesn't allow granting privileges on these objects. We should allow 
 the compiler to parse these so that it can be handled by authorization 
 plugins.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8612) Support metadata result filter hooks

2014-11-18 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216744#comment-14216744
 ] 

Prasad Mujumdar commented on HIVE-8612:
---

[~leftylev] Documented the new config property on the metastore admin page. 
Thanks!

 Support metadata result filter hooks
 

 Key: HIVE-8612
 URL: https://issues.apache.org/jira/browse/HIVE-8612
 Project: Hive
  Issue Type: Bug
  Components: Authorization, Metastore
Affects Versions: 0.13.1
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
 Fix For: 0.15.0

 Attachments: HIVE-8612.1.patch, HIVE-8612.2.patch, HIVE-8612.3.patch


 Support metadata filter hook for metastore client. This will be useful for 
 authorization plugins on hiveserver2 to filter metadata results, especially 
 in case of non-impersonation mode where the metastore doesn't know the end 
 user's identity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez

2014-11-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8888:
-
Attachment: HIVE-8888.4.patch

 Mapjoin with LateralViewJoin generates wrong plan in Tez
 

 Key: HIVE-8888
 URL: https://issues.apache.org/jira/browse/HIVE-8888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, 
 HIVE-8888.4.patch


 Queries like these 
 {code}
 with sub1 as
 (select aid, avalue from expod1 lateral view explode(av) avs as avalue ),
 sub2 as
 (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue)
 select sub1.aid, sub1.avalue, sub2.bvalue
 from sub1,sub2
 where sub1.aid=sub2.bid;
 {code}
 generates twice the number of rows in Tez when compared to MR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez

2014-11-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8888:
-
Status: Open  (was: Patch Available)

 Mapjoin with LateralViewJoin generates wrong plan in Tez
 

 Key: HIVE-8888
 URL: https://issues.apache.org/jira/browse/HIVE-8888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.13.0, 0.14.0, 0.15.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, 
 HIVE-8888.4.patch


 Queries like these 
 {code}
 with sub1 as
 (select aid, avalue from expod1 lateral view explode(av) avs as avalue ),
 sub2 as
 (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue)
 select sub1.aid, sub1.avalue, sub2.bvalue
 from sub1,sub2
 where sub1.aid=sub2.bid;
 {code}
 generates twice the number of rows in Tez when compared to MR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez

2014-11-18 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8888:
-
Status: Patch Available  (was: Open)

 Mapjoin with LateralViewJoin generates wrong plan in Tez
 

 Key: HIVE-8888
 URL: https://issues.apache.org/jira/browse/HIVE-8888
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.13.0, 0.14.0, 0.15.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, 
 HIVE-8888.4.patch


 Queries like these 
 {code}
 with sub1 as
 (select aid, avalue from expod1 lateral view explode(av) avs as avalue ),
 sub2 as
 (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue)
 select sub1.aid, sub1.avalue, sub2.bvalue
 from sub1,sub2
 where sub1.aid=sub2.bid;
 {code}
 generates twice the number of rows in Tez when compared to MR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8904:
---
Comment: was deleted

(was: 

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682124/HIVE-8904.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1834/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1834/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 
'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java'
Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java'
Reverted 'service/src/java/org/apache/hive/service/cli/CLIService.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/metastore/TestMetastoreExpr.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java'
Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java'
Reverted 
'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExpressionEvaluator.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java'
Reverted 
'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target shims/scheduler/target 
packaging/target hbase-handler/target testutils/target jdbc/target 
metastore/target itests/target itests/hcatalog-unit/target 
itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target 
itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/core/target 
hcatalog/streaming/target hcatalog/server-extensions/target 
hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target 
hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target 
common/target common/src/gen contrib/target service/target serde/target 
beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target
+ svn update

Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1640306.

At revision 1640306.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12682124 - PreCommit-HIVE-TRUNK-Build)

 Hive should support multiple 

[jira] [Created] (HIVE-8907) Partition Condition Remover doesn't remove conditions involving cast on partition column

2014-11-18 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-8907:
--

 Summary: Partition Condition Remover doesn't remove conditions 
involving cast on partition column
 Key: HIVE-8907
 URL: https://issues.apache.org/jira/browse/HIVE-8907
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Reporter: Ashutosh Chauhan
 Fix For: 0.14.0


e.g.,
{code}
create table partition_test_partitioned(key string, value string) partitioned 
by (dt string)
 explain select * from partition_test_partitioned where cast(dt as double) 
>= 100.0 and cast(dt as double) <= 102.0
{code}

For queries like above, although {{PartitionPruner}} is able to prune 
partitions correctly, filter is still not optimized away by PCR, where it could.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8907) Partition Condition Remover doesn't remove conditions involving cast on partition column

2014-11-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216822#comment-14216822
 ] 

Ashutosh Chauhan commented on HIVE-8907:


It runs into this if-block of {{PcrExprProcFactory}}
{code}
 if (result == null) {
// if the result is not boolean and not all partition agree on the
// result, we don't remove the condition. Potentially, it can miss
// the case like where ds % 3 == 1 or ds % 3 == 2
// TODO: handle this case by making result vector to handle all
// constant values.
    return new NodeInfoWrapper(WalkState.UNKNOWN, null, getOutExpr(fd, nodeOutputs));
  }
{code}
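
A minimal standalone sketch of the unanimity check that comment describes (hypothetical helper, not the actual PcrExprProcFactory code):
{code}
import java.util.Arrays;
import java.util.List;

public class PcrUnanimity {
    // PCR may drop a filter only when evaluating it against every partition
    // yields the same boolean constant; mixed or non-boolean results must
    // keep the condition (the WalkState.UNKNOWN case above).
    static Boolean unanimousResult(List<?> perPartitionResults) {
        Object first = perPartitionResults.get(0); // assumes at least one partition
        if (!(first instanceof Boolean)) {
            return null; // non-boolean result: cannot remove the condition
        }
        for (Object r : perPartitionResults) {
            if (!first.equals(r)) {
                return null; // partitions disagree: cannot remove the condition
            }
        }
        return (Boolean) first;
    }

    public static void main(String[] args) {
        System.out.println(unanimousResult(Arrays.asList(true, true)));  // true -> filter removable
        System.out.println(unanimousResult(Arrays.asList(true, false))); // null -> filter stays
    }
}
{code}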

 Partition Condition Remover doesn't remove conditions involving cast on 
 partition column
 

 Key: HIVE-8907
 URL: https://issues.apache.org/jira/browse/HIVE-8907
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer
Reporter: Ashutosh Chauhan
 Fix For: 0.14.0


 e.g.,
 {code}
 create table partition_test_partitioned(key string, value string) partitioned 
 by (dt string)
  explain select * from partition_test_partitioned where cast(dt as double) 
 >= 100.0 and cast(dt as double) <= 102.0
 {code}
 For queries like above, although {{PartitionPruner}} is able to prune 
 partitions correctly, filter is still not optimized away by PCR, where it 
 could.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Propose to put JIRA traffic on separate hive list

2014-11-18 Thread Sergey Shelukhin
+1

On Tue, Nov 18, 2014 at 2:12 AM, Lars Francke lars.fran...@gmail.com
wrote:

 +1

 That's a great idea Alan.

 On Tue, Nov 18, 2014 at 9:49 AM, Lefty Leverenz leftylever...@gmail.com
 wrote:

  +1
 
  Would it be possible to send commits to the dev list, as well as creates?
  Or maybe all changes to the Resolution or Status?
 
  -- Lefty
 
  On Mon, Nov 17, 2014 at 2:27 PM, Alan Gates ga...@hortonworks.com
 wrote:
 
   The hive dev list generates a lot of traffic.  The average for October
  was
   192 messages per day.  As a result no one sends hive dev directly to
  their
   inbox.  They either unsubscribe or they build filters that ship most or
  all
   of it to a folder.  Chasing people off the dev list is obviously not
 what
   we want.  Sending messages to folders means missing messages or not
  seeing
   them until you get unbusy enough to go read back mail in folders.
  
   The vast majority of this traffic is comments on JIRA tickets.  The way
   I've seen other very active Apache projects manage this is JIRA creates
  go
   to the dev list, but all other JIRA operations go to a separate list.
  Then
   everyone can see new tickets, and if they are interested they can watch
   that JIRA.  If not, they are not burdened with the email from it.
  
   I propose we do this same thing in Hive.
  
   Alan.
   --
   Sent with Postbox http://www.getpostbox.com
  
 




[jira] [Commented] (HIVE-7790) Update privileges to check for update and delete

2014-11-18 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216829#comment-14216829
 ] 

Alan Gates commented on HIVE-7790:
--

Most required changes were already made to 
https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-PrivilegesRequiredforHiveOperations
and I made a few more.

No changes are required to the legacy auth, as this didn't change that.

I don't think any more doc work is needed for this JIRA.

 Update privileges to check for update and delete
 

 Key: HIVE-7790
 URL: https://issues.apache.org/jira/browse/HIVE-7790
 Project: Hive
  Issue Type: Sub-task
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.14.0

 Attachments: HIVE-7790.2.patch, HIVE-7790.3.patch, HIVE-7790.patch


 In the new SQLStdAuth scheme, we need to add UPDATE and DELETE as operations 
 and add ability check for them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8104) Insert statements against ACID tables NPE when vectorization is on

2014-11-18 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216835#comment-14216835
 ] 

Alan Gates commented on HIVE-8104:
--

[~leftylev], what did you want transactions and vectorization to say about each 
other?  They work together, mostly.  And the transaction code handles turning 
off vectorization in the cases where they don't work together, so it is all 
transparent to the user.  So I'm not sure there's anything to put in the user 
docs.

 Insert statements against ACID tables NPE when vectorization is on
 --

 Key: HIVE-8104
 URL: https://issues.apache.org/jira/browse/HIVE-8104
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Vectorization
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8104.2.patch, HIVE-8104.patch


 Doing an insert against a table that is using ACID format with the 
 transaction manager set to DbTxnManager and vectorization turned on results 
 in an NPE.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8779) Tez in-place progress UI can show wrong estimated time for sub-second queries

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8779:
---
Fix Version/s: (was: 0.15.0)

 Tez in-place progress UI can show wrong estimated time for sub-second queries
 -

 Key: HIVE-8779
 URL: https://issues.apache.org/jira/browse/HIVE-8779
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-8779.1.patch


 The in-place progress update UI added as part of HIVE-8495 can show wrong 
 estimated time for AM only job which goes from INITED to SUCCEEDED DAG state 
 directly without going to RUNNING state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8778) ORC split elimination can cause NPE when column statistics is null

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8778:
---
Fix Version/s: (was: 0.15.0)

 ORC split elimination can cause NPE when column statistics is null
 --

 Key: HIVE-8778
 URL: https://issues.apache.org/jira/browse/HIVE-8778
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8778.1.patch


 Row group elimination has protection for NULL statistics values in 
 RecordReaderImpl.evaluatePredicate(), which then calls 
 evaluatePredicateRange(). But split elimination directly calls 
 evaluatePredicateRange() without NULL protection. This can lead to a 
 NullPointerException when a column is NULL in the entire stripe. 
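
A minimal sketch of the missing protection (hypothetical types, not the actual RecordReaderImpl API):
{code}
public class SplitEliminationGuard {
    // Stand-in for ORC column statistics; a stripe whose column is entirely
    // NULL has no usable min/max.
    interface ColumnStats {
        Object getMin();
        Object getMax();
    }

    // Mirrors the null protection described above: without statistics we
    // cannot prove the stripe is filtered out, so the split must be kept
    // rather than dereferencing null and throwing an NPE.
    static boolean safeToEvaluateRange(ColumnStats stats) {
        return stats != null && stats.getMin() != null && stats.getMax() != null;
    }

    public static void main(String[] args) {
        System.out.println(safeToEvaluateRange(null)); // false: keep the split
    }
}
{code}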



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8740) Sorted dynamic partition does not work correctly with constant folding

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8740:
---
Fix Version/s: (was: 0.15.0)

 Sorted dynamic partition does not work correctly with constant folding
 --

 Key: HIVE-8740
 URL: https://issues.apache.org/jira/browse/HIVE-8740
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.14.0

 Attachments: HIVE-8740.1.patch, HIVE-8740.2.patch, HIVE-8740.3.patch, 
 HIVE-8740.4.patch


 Sorted dynamic partition optimization looks for partition columns from the 
 operator above FileSinkOperator. As per hive convention it expects partition 
 columns at the last. But with HIVE-8585 equality filters on partition columns 
 gets folded to constant. The column pruner then prunes the constant 
 expression as they don't reference any columns. This in some cases will yield 
 unexpected results (throw ArrayIndexOutOfBounds exception) with sorted 
 dynamic partition insert optimization. In such cases we don't really need 
 sorted dynamic partition optimization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8727:
---
Fix Version/s: (was: 0.15.0)

 Dag summary has incorrect row counts and duration per vertex
 

 Key: HIVE-8727
 URL: https://issues.apache.org/jira/browse/HIVE-8727
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
 Fix For: 0.14.0

 Attachments: HIVE-8727.1.patch


 During the code review for HIVE-8495 some code was reworked which broke some 
 of INPUT/OUTPUT counters and duration.
 Patch attached which fixes that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6468) HS2 & Metastore using SASL out of memory error when curl sends a get request

2014-11-18 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6468:
---
Status: Open  (was: Patch Available)

 HS2 & Metastore using SASL out of memory error when curl sends a get request
 

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 0.13.1, 0.13.0, 0.12.0, 0.14.0
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Fix For: 0.14.1

 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt, 
 HIVE-6468.3.patch, HIVE-6468.4.patch


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
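
For context on the failure mode (an inference from the stack trace above, not the committed fix): a plain HTTP request hitting the binary SASL port makes Thrift read the ASCII bytes "GET " as a 4-byte frame length (roughly 1.1 GB) and try to allocate that buffer. A sketch of the kind of length sanity check that prevents this:
{code}
public class SaslFrameGuard {
    static final int MAX_SASL_MESSAGE = 64 * 1024; // assumed cap, not Thrift's actual limit

    // Reject implausible frame lengths before allocating, instead of OOM-ing.
    static byte[] allocateFrame(int declaredLength) {
        if (declaredLength < 0 || declaredLength > MAX_SASL_MESSAGE) {
            throw new IllegalArgumentException("Invalid SASL frame length: " + declaredLength);
        }
        return new byte[declaredLength];
    }

    public static void main(String[] args) {
        System.out.println(allocateFrame(128).length);   // 128: fine
        try {
            allocateFrame(0x47455420);                   // the bytes of "GET "
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());          // rejected instead of OOM
        }
    }
}
{code}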



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 28007: HS2 & Metastore using SASL out of memory error when curl sends a get request

2014-11-18 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28007/
---

(Updated Nov. 18, 2014, 9:51 p.m.)


Review request for hive, Navis Ryu and Thejas Nair.


Bugs: HIVE-6468
https://issues.apache.org/jira/browse/HIVE-6468


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-6468


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ea5aed8 
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
d1ef305 
  service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 23ba79c 
  service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java afc1441 
  
shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 624ac6b 
  
shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 d011c67 

Diff: https://reviews.apache.org/r/28007/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-6468) HS2 & Metastore using SASL out of memory error when curl sends a get request

2014-11-18 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6468:
---
Attachment: HIVE-6468.5.patch

Revised patch for 14.1. I'll upload one based on trunk just for precommit run 
(we're upgrading thrift version for trunk - not planning to use this patch).

 HS2 & Metastore using SASL out of memory error when curl sends a get request
 

 Key: HIVE-6468
 URL: https://issues.apache.org/jira/browse/HIVE-6468
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Metastore
Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.13.1
 Environment: Centos 6.3, hive 12, hadoop-2.2
Reporter: Abin Shahab
Assignee: Navis
 Fix For: 0.14.1

 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt, 
 HIVE-6468.3.patch, HIVE-6468.4.patch, HIVE-6468.5.patch


 We see an out of memory error when we run simple beeline calls.
 (The hive.server2.transport.mode is binary)
 curl localhost:1
 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap 
 space
   at 
 org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181)
   at 
 org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
   at 
 org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
   at 
 org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
   at 
 org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
   at 
 org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8685:
---
Fix Version/s: (was: 0.15.0)

 DDL operations in WebHCat set proxy user to null in unsecure mode
 ---

 Key: HIVE-8685
 URL: https://issues.apache.org/jira/browse/HIVE-8685
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch


 This makes DDL commands fail.
 This was stupidly broken in HIVE-8643.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8643:
---
Fix Version/s: (was: 0.15.0)

 DDL operations via WebHCat with doAs parameter in secure cluster fail
 -

 Key: HIVE-8643
 URL: https://issues.apache.org/jira/browse/HIVE-8643
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8643.2.patch, HIVE-8643.3.patch, HIVE-8643.patch


 webhcat handles DDL commands by forking to 'hcat', i.e. HCatCli.
 This starts a session.
 SessionState.start() creates scratch dirs based on the current user name
 via startSs.createSessionDirs(sessionUGI.getShortUserName());
 This UGI is not aware of the doAs param, so the name of the dir always ends up 
 'hcat', but because a delegation token is generated in WebHCat for HDFS 
 access, the owner of the scratch dir is the calling user.  Thus the next time a 
 session is started (because of a new DDL call from a different user), it ends 
 up trying to use the same scratch dir but cannot, as it has 700 permissions set.
 We need to pass the doAs user into SessionState.
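
A rough sketch of what passing the doAs user through could look like (hypothetical wiring using the standard Hadoop proxy-user API; not the actual patch):
{code}
import org.apache.hadoop.security.UserGroupInformation;

public class DoAsSessionSketch {
    public static void main(String[] args) throws Exception {
        String doAsUser = "alice"; // assumed value of the doAs request parameter
        // Create a proxy UGI for the calling user so session scratch dirs are
        // created under (and owned by) that name instead of the service user 'hcat'.
        UserGroupInformation proxyUgi = UserGroupInformation.createProxyUser(
            doAsUser, UserGroupInformation.getLoginUser());
        System.out.println("Scratch dir would be created for: "
            + proxyUgi.getShortUserName());
    }
}
{code}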



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8588:
---
Fix Version/s: (was: 0.15.0)

 sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
 

 Key: HIVE-8588
 URL: https://issues.apache.org/jira/browse/HIVE-8588
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Critical
  Labels: TODOC14
 Fix For: 0.14.0

 Attachments: HIVE-8588.1.patch, HIVE-8588.2.patch


 This is originally discovered by [~deepesh]
 When running a Sqoop integration test from WebHCat
 {noformat}
 curl --show-error -d command=export -libjars 
 hdfs:///tmp/mysql-connector-java.jar --connect 
 jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop 
 --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person 
 -d statusdir=sqoop.output -X POST 
 http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa;
 {noformat}
 the job is failing with the following error:
 {noformat}
 $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr
 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897
 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the 
 command-line is insecure. Consider using -P instead.
 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL 
 streaming resultset.
 14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation
 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: 
 java.lang.RuntimeException: Could not load db driver class: 
 com.mysql.jdbc.Driver
 java.lang.RuntimeException: Could not load db driver class: 
 com.mysql.jdbc.Driver
   at 
 org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848)
   at 
 org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
   at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736)
   at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759)
   at 
 org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269)
   at 
 org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240)
   at 
 org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226)
   at 
 org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295)
   at 
 org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773)
   at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578)
   at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96)
   at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)
   at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
   at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
   at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
   at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
 {noformat}
 Note that the Sqoop tar bundle does not contain the JDBC connector jar. I 
 think the problem here may be that the mysql connector jar added to libjars 
 isn't available to the Sqoop tool, which first connects to the database 
 through the JDBC driver to collect some table information before running the MR 
 job. libjars will only add the connector jar for the MR job and not the local 
 one.
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8478) Vectorized Reduce-Side Group By doesn't handle Decimal type correctly

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8478:
---
Fix Version/s: (was: 0.15.0)

 Vectorized Reduce-Side Group By doesn't handle Decimal type correctly
 -

 Key: HIVE-8478
 URL: https://issues.apache.org/jira/browse/HIVE-8478
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8478.01.patch, HIVE-8478.02.patch, 
 HIVE-8478.03.patch, HIVE-8478.04.patch


 Note that DecimalColumnVector is different from LongColumnVector because it 
 keeps an instance reference to a Decimal128 object, whereas the latter stores 
 a long primitive value. So there is trouble if you set the reference instead 
 of updating the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8497) StatsNoJobTask doesn't close RecordReader, FSDataInputStream of which keeps open to prevent stale data clean

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8497:
---
Fix Version/s: (was: 0.15.0)

 StatsNoJobTask doesn't close RecordReader, FSDataInputStream of which keeps 
 open to prevent stale data clean
 

 Key: HIVE-8497
 URL: https://issues.apache.org/jira/browse/HIVE-8497
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment: Windows
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8497.1.patch, HIVE-8497.2.patch


 run the test
 {noformat}
 mvn -Phadoop-2  test -Dtest=TestCliDriver -Dqfile=alter_merge_stats_orc.q
 {noformat}
 to reproduce it. Simply, this query does three data loads, which generate 
 three base orc files.
 ANALYZE TABLE...COMPUTE STATISTICS NOSCAN will execute StatsNoJobTask to get 
 stats, during which a file handle is held, so the base files cannot be cleaned 
 up. As a result, after running ALTER TABLE..CONCATENATE, follow-up queries read 
 both the stale base files and the merged file.
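
The general shape of the fix is deterministic handle cleanup; a minimal sketch with a stand-in reader (hypothetical class, not the ORC RecordReader API):
{code}
import java.io.Closeable;
import java.io.IOException;

public class ReaderCloseSketch {
    // Stand-in for the reader whose open FSDataInputStream pins the file.
    static class FakeReader implements Closeable {
        @Override
        public void close() throws IOException {
            System.out.println("file handle released");
        }
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources releases the handle even if stats collection
        // fails, so the stale base files can later be deleted.
        try (FakeReader reader = new FakeReader()) {
            // ... read footer / collect statistics ...
        }
    }
}
{code}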



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8455:
---
Fix Version/s: (was: 0.15.0)
   spark-branch

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Fix For: spark-branch

 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console; users may be confused about 
 what the progress info means, so I would like to add the job progress format 
 info here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8454) Select Operator does not rename column stats properly in case of select star

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8454:
---
Fix Version/s: (was: 0.15.0)

 Select Operator does not rename column stats properly in case of select star
 

 Key: HIVE-8454
 URL: https://issues.apache.org/jira/browse/HIVE-8454
 Project: Hive
  Issue Type: Sub-task
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Prasanth J
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8454.1.patch, HIVE-8454.2.patch, HIVE-8454.3.patch, 
 HIVE-8454.3.patch, HIVE-8454.4.patch, HIVE-8454.5.patch, HIVE-8454.6.patch, 
 HIVE-8454.7.patch


 The estimated data size of some Select Operators is 0. BytesBytesHashMap uses 
 the data size to determine the estimated initial number of entries in the 
 hashmap. If this data size is 0 then an exception is thrown (see below).
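
A small sketch of the sizing invariant involved (hypothetical helper; the actual BytesBytesHashMap sizing logic may differ):
{code}
public class CapacitySketch {
    // Derive a valid hash table capacity from an estimated entry count by
    // clamping to at least 1 and rounding up to the next power of two, so a
    // zero estimate can never trip "Capacity must be a power of two".
    static int nextPowerOfTwo(int estimated) {
        int n = Math.max(1, estimated);
        if (n == 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(0));   // 1
        System.out.println(nextPowerOfTwo(100)); // 128
    }
}
{code}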
 Query 
 {code}
 select count(*) from
  store_sales
 JOIN store_returns ON store_sales.ss_item_sk = 
 store_returns.sr_item_sk and store_sales.ss_ticket_number = 
 store_returns.sr_ticket_number
 JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
 JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
 JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
 JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
 JOIN store ON store_sales.ss_store_sk = store.s_store_sk
   JOIN item ON store_sales.ss_item_sk = item.i_item_sk
   JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= 
 cd1.cd_demo_sk
 JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = 
 cd2.cd_demo_sk
 JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
 JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = 
 hd1.hd_demo_sk
 JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = 
 hd2.hd_demo_sk
 JOIN customer_address ad1 ON store_sales.ss_addr_sk = 
 ad1.ca_address_sk
 JOIN customer_address ad2 ON customer.c_current_addr_sk = 
 ad2.ca_address_sk
 JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
 JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
 JOIN
  (select cs_item_sk
 ,sum(cs_ext_list_price) as 
 sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
   from catalog_sales JOIN catalog_returns
   ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk
 and catalog_sales.cs_order_number = catalog_returns.cr_order_number
   group by cs_item_sk
   having 
 sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit))
  cs_ui
 ON store_sales.ss_item_sk = cs_ui.cs_item_sk
   WHERE  
  cd1.cd_marital_status <> cd2.cd_marital_status and
  i_color in ('maroon','burnished','dim','steel','navajo','chocolate') 
 and
  i_current_price between 35 and 35 + 10 and
  i_current_price between 35 + 1 and 35 + 15
and d1.d_year = 2001;
 {code}
 {code}
 ], TaskAttempt 3 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 java.lang.AssertionError: Capacity must be a power of two
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
   at 
 org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity 
 must be a power of two
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:93)
   at 
 

[jira] [Updated] (HIVE-8401) OrcFileMergeOperator only close last orc file it opened, which resulted in stale data in table directory

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8401:
---
Fix Version/s: (was: 0.15.0)

 OrcFileMergeOperator only close last orc file it opened, which resulted in 
 stale data in table directory
 

 Key: HIVE-8401
 URL: https://issues.apache.org/jira/browse/HIVE-8401
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
 Environment: Windows Server
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
Priority: Critical
 Fix For: 0.14.0

 Attachments: HIVE-8401.1.patch, alter_merge_2_orc.q.out


 run the test
 {noformat}
 mvn -Phadoop-2  test -Dtest=TestCliDriver -Dqfile=alter_merge_2_orc.q
 {noformat}
 to reproduce it. Simply, this query does three data loads, which generate 
 three orc files; ALTER TABLE CONCATENATE tries to merge the orc pieces into a 
 single file, which is the final file to be queried.
 Output 
 \hive\itests\qtest\target\qfile-results\clientpositive\alter_merge_2_orc.q.out
  shows # records as 600, which is wrong, as opposed to the expected 610.
 Because OrcFileMergeOperator only closes the last orc file, the 1st and 2nd orc 
 files still remain in the table directory: deleting the unclosed files fails 
 during old-data cleanup when MoveTask tries to copy the merged orc file from 
 the scratch dir to the table dir. Eventually the query reads the old data (1st 
 and 2nd orc files).
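
A minimal sketch of the pattern the description implies, closing every opened output rather than only the last one (hypothetical writers, not the ORC writer API):
{code}
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CloseAllOutputs {
    public static void main(String[] args) throws IOException {
        // Track every writer opened during the merge, keyed by path.
        Map<String, Closeable> writers = new HashMap<>();
        writers.put("part-1.orc", () -> System.out.println("closed part-1.orc"));
        writers.put("part-2.orc", () -> System.out.println("closed part-2.orc"));

        // Close them all on operator close; any writer left open keeps its
        // file pinned so the stale data cannot be cleaned up.
        for (Closeable w : writers.values()) {
            w.close();
        }
    }
}
{code}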



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8372) Potential NPE in Tez MergeFileRecordProcessor

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8372:
---
Fix Version/s: (was: 0.15.0)

 Potential NPE in Tez MergeFileRecordProcessor
 -

 Key: HIVE-8372
 URL: https://issues.apache.org/jira/browse/HIVE-8372
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Fix For: 0.14.0

 Attachments: HIVE-8372.1.patch


 MergeFileRecordProcessor retrieves the map work from a cache. This map work 
 can be an instance of merge file work. When the merge file work already exists 
 in the cache, the cast from map work to merge file work is missing, which will 
 result in a NullPointerException.
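
A self-contained sketch of the missing check (hypothetical classes standing in for Hive's MapWork/MergeFileWork):
{code}
public class WorkCacheLookup {
    static class MapWork {}
    static class MergeFileWork extends MapWork {}

    // Cast the cached plan only after an instanceof test, so a plain MapWork
    // is handled explicitly instead of surfacing later as an NPE.
    static MergeFileWork asMergeFileWork(MapWork cached) {
        if (cached instanceof MergeFileWork) {
            return (MergeFileWork) cached;
        }
        return null; // caller must handle the non-merge case
    }

    public static void main(String[] args) {
        System.out.println(asMergeFileWork(new MergeFileWork()) != null); // true
        System.out.println(asMergeFileWork(new MapWork()) != null);       // false
    }
}
{code}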



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8435:
---
Assignee: Jesús Camacho Rodríguez  (was: Ashutosh Chauhan)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Jesús Camacho Rodríguez
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. It is 
 better to optimize it away to avoid evaluating it without any benefit at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8435:
---
Status: Patch Available  (was: In Progress)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.12.0, 0.11.0, 0.10.0, 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. It is 
 better to optimize it away to avoid evaluating it without any benefit at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-8435) Add identity project remover optimization

2014-11-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-8435:
--

Assignee: Ashutosh Chauhan  (was: Jesús Camacho Rodríguez)

 Add identity project remover optimization
 -

 Key: HIVE-8435
 URL: https://issues.apache.org/jira/browse/HIVE-8435
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, 
 HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, 
 HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, 
 HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch


 In some cases there is an identity project in the plan which is useless. It is 
 better to optimize it away to avoid evaluating it without any benefit at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216958#comment-14216958
 ] 

Brock Noland commented on HIVE-8455:


Thank you [~ashutoshc], sorry for putting the wrong fixVersion.

 Print Spark job progress format info on the console[Spark Branch]
 -

 Key: HIVE-8455
 URL: https://issues.apache.org/jira/browse/HIVE-8455
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M3
 Fix For: spark-branch

 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive 
 on spark job status.PNG


 We added support for Spark job status monitoring in HIVE-7439, but did not 
 print the job progress format info on the console; users may be confused about 
 what the progress info means, so I would like to add the job progress format 
 info here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216970#comment-14216970
 ] 

Brock Noland commented on HIVE-8904:


+1

 Hive should support multiple Key provider modes
 ---

 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-8904.patch


 In the hadoop cryptographic filesystem, JavaKeyStoreProvider and 
 KMSClientProvider are both supported. Although KMS is preferable in a 
 production environment, we should enable both of them on the Hive side.
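
Both modes are selected through the standard Hadoop KeyProviderFactory URI scheme; a short sketch (the keystore path and KMS address below are made-up examples):
{code}
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

public class KeyProviderModes {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Local Java keystore mode:
        conf.set(KeyProviderFactory.KEY_PROVIDER_PATH, "jceks://file/tmp/test.jceks");
        // KMS mode would instead use a URI like:
        //   kms://http@kms-host:16000/kms
        List<KeyProvider> providers = KeyProviderFactory.getProviders(conf);
        System.out.println("Resolved providers: " + providers);
    }
}
{code}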



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216971#comment-14216971
 ] 

Brock Noland commented on HIVE-8887:


+1

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Attachments: HIVE-8887.1-spark.patch


 These tests all failed with the same error, see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.
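 Judging from the trace, the column list handed to PlanUtils is null during 
 map-join conversion. A minimal sketch of the kind of defensive guard that 
 would surface this earlier (class and method names are hypothetical 
 stand-ins, not the actual patch):
 {noformat}
 import java.util.ArrayList;
 import java.util.List;

 public class FieldSchemaGuard {
   // Fail fast with a descriptive message instead of an NPE deep in PlanUtils.
   static List<String> fieldNamesFromColumnList(List<String> cols, String prefix) {
     if (cols == null) {
       throw new IllegalStateException(
           "column list is null while converting join to map join");
     }
     List<String> names = new ArrayList<String>(cols.size());
     for (int i = 0; i < cols.size(); i++) {
       names.add(prefix + i);
     }
     return names;
   }

   public static void main(String[] args) {
     List<String> cols = new ArrayList<String>();
     cols.add("key");
     cols.add("value");
     System.out.println(fieldNamesFromColumnList(cols, "_col"));
   }
 }
 {noformat}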



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8904:
---
   Resolution: Fixed
Fix Version/s: encryption-branch
   Status: Resolved  (was: Patch Available)

Thank you very much! I have committed this to the encryption branch!

 Hive should support multiple Key provider modes
 ---

 Key: HIVE-8904
 URL: https://issues.apache.org/jira/browse/HIVE-8904
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Fix For: encryption-branch

 Attachments: HIVE-8904.patch


 In the Hadoop cryptographic filesystem, both JavaKeyStoreProvider and 
 KMSClientProvider are supported. Although KMS is preferable in a production 
 environment, we should enable both of them on the Hive side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8887:
---
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Thank you Chao! I have committed this to spark!

 Investigate test failures on auto_join6, auto_join7, auto_join18, 
 auto_join18_multi_distinct [Spark Branch]
 ---

 Key: HIVE-8887
 URL: https://issues.apache.org/jira/browse/HIVE-8887
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Fix For: spark-branch

 Attachments: HIVE-8887.1-spark.patch


 These tests all failed with the same error, see below:
 {noformat}
 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver 
 (SessionState.java:printError(837)) - FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
   at 
 org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
   at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
   at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 This happens at compile time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8883) Investigate test failures on auto_join30.q [Spark Branch]

2014-11-18 Thread Chao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216996#comment-14216996
 ] 

Chao commented on HIVE-8883:


Talked with [~szehon], and we need to put {{currentInputPath}} back. Will 
submit a patch later.
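For context, a minimal sketch of the idea behind keeping {{currentInputPath}} 
(all names below are hypothetical; the actual patch may differ): the record 
handler remembers which input file it is reading so the map-join hash tables 
are reloaded when the input changes, instead of being left unloaded as in the 
trace below.
{noformat}
public class InputPathTracker {
  private String currentInputPath;

  // Reload small-table hash maps only when the input file changes; if
  // currentInputPath is never tracked, the tables are never loaded and the
  // map-join operator hits a NullPointerException.
  void onRow(String inputPath) {
    if (!inputPath.equals(currentInputPath)) {
      currentInputPath = inputPath;
      loadHashTablesFor(currentInputPath);
    }
  }

  void loadHashTablesFor(String path) {
    System.out.println("loading map-join hash tables for " + path);
  }

  public static void main(String[] args) {
    InputPathTracker t = new InputPathTracker();
    t.onRow("/warehouse/src/part-0");
    t.onRow("/warehouse/src/part-0"); // same input, no reload
    t.onRow("/warehouse/src/part-1"); // new input, reload
  }
}
{noformat}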

 Investigate test failures on auto_join30.q [Spark Branch]
 -

 Key: HIVE-8883
 URL: https://issues.apache.org/jira/browse/HIVE-8883
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao
 Fix For: spark-branch

 Attachments: HIVE-8883.1-spark.patch, HIVE-8883.2-spark.patch


 This test fails with the following stack trace:
 {noformat}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
   at 
 scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
   at 
 org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
   at 
 org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
   at org.apache.spark.scheduler.Task.run(Task.scala:56)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: 
 spark.SparkReduceRecordHandler 
 (SparkReduceRecordHandler.java:processRow(285)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row (tag=0) 
 {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
   at 
 org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
   at 
 scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
   at 
 org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
   at 
 org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
   at 
 org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
   at org.apache.spark.scheduler.Task.run(Task.scala:56)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
 exception: null
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
   ... 14 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
   ... 17 more
 {noformat}
 {{auto_join27.q}} and {{auto_join31.q}} 

[jira] [Updated] (HIVE-8894) Move calcite.version to root pom

2014-11-18 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8894:
---
   Resolution: Fixed
Fix Version/s: 0.15.0
   Status: Resolved  (was: Patch Available)

Thank you guys for the review! I have committed this to trunk!

 Move calcite.version to root pom
 

 Key: HIVE-8894
 URL: https://issues.apache.org/jira/browse/HIVE-8894
 Project: Hive
  Issue Type: Task
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: 0.15.0

 Attachments: HIVE-8894.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8908) Investigate test failure on join34.q

2014-11-18 Thread Chao (JIRA)
Chao created HIVE-8908:
--

 Summary: Investigate test failure on join34.q
 Key: HIVE-8908
 URL: https://issues.apache.org/jira/browse/HIVE-8908
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao
Assignee: Chao


For this query, the plan doesn't look correct:

{noformat}
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-1 depends on stages: Stage-5, Stage-4
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0
  Stage-5 is a root stage

STAGE PLANS:
  Stage: Stage-4
Spark
  DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
  Vertices:
Map 4 
Map Operator Tree:
TableScan
  alias: x
  Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE 
Column stats: NONE
  Filter Operator
predicate: key is not null (type: boolean)
Statistics: Num rows: 1 Data size: 216 Basic stats: 
COMPLETE Column stats: NONE
Spark HashTable Sink Operator
  condition expressions:
0 {_col1}
1 {value}
  keys:
0 _col0 (type: string)
1 key (type: string)
Reduce Output Operator
  key expressions: key (type: string)
  sort order: +
  Map-reduce partition columns: key (type: string)
  Statistics: Num rows: 1 Data size: 216 Basic stats: 
COMPLETE Column stats: NONE
  value expressions: value (type: string)
Local Work:
  Map Reduce Local Work

  Stage: Stage-1
Spark
  Edges:
Union 2 - Map 1 (NONE, 0), Map 3 (NONE, 0)
  DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: x
  Filter Operator
predicate: (key < 20) (type: boolean)
Select Operator
  expressions: key (type: string), value (type: string)
  outputColumnNames: _col0, _col1
  Map Join Operator
condition map:
 Inner Join 0 to 1
condition expressions:
  0 {_col1}
  1 {key} {value}
keys:
  0 _col0 (type: string)
  1 key (type: string)
outputColumnNames: _col1, _col2, _col3
input vertices:
  1 Map 4
Select Operator
  expressions: _col2 (type: string), _col3 (type: 
string), _col1 (type: string)
  outputColumnNames: _col0, _col1, _col2
  File Output Operator
compressed: false
table:
input format: 
org.apache.hadoop.mapred.TextInputFormat
output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
name: default.dest_j1
Local Work:
  Map Reduce Local Work
Map 3 
Map Operator Tree:
TableScan
  alias: x1
  Filter Operator
predicate: (key > 100) (type: boolean)
Select Operator
  expressions: key (type: string), value (type: string)
  outputColumnNames: _col0, _col1
  Map Join Operator
condition map:
 Inner Join 0 to 1
condition expressions:
  0 {_col1}
  1 {key} {value}
keys:
  0 _col0 (type: string)
  1 key (type: string)
outputColumnNames: _col1, _col2, _col3
input vertices:
  1 Map 4
Select Operator
  expressions: _col2 (type: string), _col3 (type: 
string), _col1 (type: string)
  outputColumnNames: _col0, _col1, _col2
  File Output Operator
compressed: false
table:
input format: 
