[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215870#comment-14215870 ] Hive QA commented on HIVE-8395:
---
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682045/HIVE-8395.24.patch
{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 6647 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_gby_star
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_type_in_plan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_gby_star
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_gby_star2
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1831/testReport 
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1831/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1831/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 19 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12682045 - PreCommit-HIVE-TRUNK-Build CBO: enable by default -- Key: HIVE-8395 URL: https://issues.apache.org/jira/browse/HIVE-8395 Project: Hive Issue Type: Improvement Components: CBO Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.15.0 Attachments: HIVE-8395.01.patch, HIVE-8395.02.patch, HIVE-8395.03.patch, HIVE-8395.04.patch, HIVE-8395.05.patch, HIVE-8395.06.patch, HIVE-8395.07.patch, HIVE-8395.08.patch, HIVE-8395.09.patch, HIVE-8395.10.patch, HIVE-8395.11.patch, HIVE-8395.12.patch, HIVE-8395.12.patch, HIVE-8395.13.patch, HIVE-8395.13.patch, HIVE-8395.14.patch, HIVE-8395.15.patch, HIVE-8395.16.patch, HIVE-8395.17.patch, HIVE-8395.18.patch, HIVE-8395.18.patch, HIVE-8395.19.patch, HIVE-8395.20.patch, HIVE-8395.21.patch, HIVE-8395.22.patch, HIVE-8395.23.patch, HIVE-8395.23.withon.patch, HIVE-8395.24.patch, HIVE-8395.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8903) downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8903: Status: Open (was: Patch Available) downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch] - Key: HIVE-8903 URL: https://issues.apache.org/jira/browse/HIVE-8903 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8903.1-spark.patch Hive trunk depends on guava 11.0.2, the same as Hadoop and Tez. Spark depends on guava 14.0.1, so we shaded guava in the Spark assembly jar to avoid conflicts for Hive on Spark (HIVE-7387). The guava version was upgraded to 14.0.1 in the Hive spark branch, which should be unnecessary and leads to guava conflicts (HIVE-8854). We should downgrade the guava dependency from 14.0.1 to 11.0.2 to stay consistent with Hive trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
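The downgrade described above is the sort of change that, in a Maven build like Hive's, is usually expressed by pinning the dependency version in the parent pom. A minimal sketch, assuming a `guava.version` property — the property name and pom location are assumptions for illustration, not taken from the actual HIVE-8903 patch:
{code}
<!-- Hypothetical sketch: pin guava back to the trunk version in the parent pom.
     The guava.version property name is an assumption about the Hive build. -->
<properties>
  <guava.version>11.0.2</guava.version>
</properties>
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>${guava.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}
With the version managed centrally like this, every submodule that declares guava without an explicit version inherits 11.0.2, matching Hadoop and Tez.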
[jira] [Commented] (HIVE-8903) downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215880#comment-14215880 ] Chengxiang Li commented on HIVE-8903: - Hi, [~szehon], I think Marcelo means that although the spark assembly jar includes shaded guava 14, spark-core, as an independent build, still depends on guava 14. Our qtests depend on spark-core, so I want to check here whether the qtests succeed in local mode with guava 11. downgrade guava version for spark branch from 14.0.1 to 11.0.2.[Spark Branch] - Key: HIVE-8903 URL: https://issues.apache.org/jira/browse/HIVE-8903 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8903.1-spark.patch Hive trunk depends on guava 11.0.2, the same as Hadoop and Tez. Spark depends on guava 14.0.1, so we shaded guava in the Spark assembly jar to avoid conflicts for Hive on Spark (HIVE-7387). The guava version was upgraded to 14.0.1 in the Hive spark branch, which should be unnecessary and leads to guava conflicts (HIVE-8854). We should downgrade the guava dependency from 14.0.1 to 11.0.2 to stay consistent with Hive trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5775) Introduce Cost Based Optimizer to Hive
[ https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-5775: - Labels: (was: TODOC14) Introduce Cost Based Optimizer to Hive -- Key: HIVE-5775 URL: https://issues.apache.org/jira/browse/HIVE-5775 Project: Hive Issue Type: New Feature Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.14.0 Attachments: CBO-2.pdf, HIVE-5775.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5775) Introduce Cost Based Optimizer to Hive
[ https://issues.apache.org/jira/browse/HIVE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14215900#comment-14215900 ] Lefty Leverenz commented on HIVE-5775: -- Thanks [~jpullokkaran], I removed the TODOC14 label on the assumption that no updates are needed at this time. Introduce Cost Based Optimizer to Hive -- Key: HIVE-5775 URL: https://issues.apache.org/jira/browse/HIVE-5775 Project: Hive Issue Type: New Feature Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 0.14.0 Attachments: CBO-2.pdf, HIVE-5775.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[ https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-8893: -- Attachment: HIVE-8893.5.patch Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode --- Key: HIVE-8893 URL: https://issues.apache.org/jira/browse/HIVE-8893 Project: Hive Issue Type: Bug Components: Authorization, HiveServer2, SQL Affects Versions: 0.14.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch UDFs like reflect() or java_method() enable executing an arbitrary java method as a udf. While this offers a lot of flexibility in standalone mode, it can become a security loophole in a secure multiuser environment. For example, in HiveServer2 one can execute any available java code with user hive's credentials. We need a whitelist and blacklist to restrict builtin udfs in HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[ https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-8893: -- Attachment: (was: HIVE-8893.2.patch) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode --- Key: HIVE-8893 URL: https://issues.apache.org/jira/browse/HIVE-8893 Project: Hive Issue Type: Bug Components: Authorization, HiveServer2, SQL Affects Versions: 0.14.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch UDFs like reflect() or java_method() enable executing an arbitrary java method as a udf. While this offers a lot of flexibility in standalone mode, it can become a security loophole in a secure multiuser environment. For example, in HiveServer2 one can execute any available java code with user hive's credentials. We need a whitelist and blacklist to restrict builtin udfs in HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
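The restriction discussed in this issue boils down to consulting a blacklist and an optional whitelist before resolving a UDF by name. A minimal, self-contained sketch of that check — the class name `UdfAllowCheck`, its method names, and the sample UDF lists are hypothetical and are not the names used in the actual HIVE-8893 patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the whitelist/blacklist idea; not Hive's real API.
public class UdfAllowCheck {
    private final Set<String> whitelist;
    private final Set<String> blacklist;

    public UdfAllowCheck(Set<String> whitelist, Set<String> blacklist) {
        this.whitelist = whitelist;
        this.blacklist = blacklist;
    }

    /**
     * A UDF is allowed if it is not blacklisted and, when a whitelist is
     * configured (non-empty), it appears on the whitelist.
     */
    public boolean isAllowed(String udfName) {
        String name = udfName.toLowerCase();
        if (blacklist.contains(name)) {
            return false;
        }
        return whitelist.isEmpty() || whitelist.contains(name);
    }

    public static void main(String[] args) {
        UdfAllowCheck check = new UdfAllowCheck(
            new HashSet<>(Arrays.asList("upper", "lower", "concat")),
            new HashSet<>(Arrays.asList("reflect", "java_method")));
        System.out.println(check.isAllowed("upper"));   // true
        System.out.println(check.isAllowed("reflect")); // false: blacklisted
    }
}
```

The blacklist-wins ordering means an administrator can blanket-allow the builtins and still carve out reflect()-style escape hatches.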
[jira] [Commented] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215915#comment-14215915 ] Hive QA commented on HIVE-8835:
---
{color:red}Overall{color}: -1 at least one tests failed
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12682104/HIVE-8835.1-spark.patch
{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7180 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/390/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/390/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-390/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12682104 - PreCommit-HIVE-SPARK-Build identify dependency scope for Remote Spark Context.[Spark Branch] - Key: HIVE-8835 URL: https://issues.apache.org/jira/browse/HIVE-8835 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8835.1-spark.patch When submitting a job through the Remote Spark Context, spark RDD graph generation and job submission are executed on the remote side, so we have to add hive-related dependencies into its classpath with spark.driver.extraClassPath.
Instead of adding all hive/hadoop dependencies, we should narrow the scope and identify which dependencies the remote spark context requires. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8904) Hive should support multiple Key provider modes
Ferdinand Xu created HIVE-8904: -- Summary: Hive should support multiple Key provider modes Key: HIVE-8904 URL: https://issues.apache.org/jira/browse/HIVE-8904 Project: Hive Issue Type: Bug Reporter: Ferdinand Xu Assignee: Ferdinand Xu In the hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider are both supported. Although KMS is preferable in a production environment, we should enable both of them on the hive side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-8904: --- Issue Type: Sub-task (was: Bug) Parent: HIVE-8065 Hive should support multiple Key provider modes --- Key: HIVE-8904 URL: https://issues.apache.org/jira/browse/HIVE-8904 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu In the hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider are both supported. Although KMS is preferable in a production environment, we should enable both of them on the hive side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
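Hadoop distinguishes key provider implementations by the scheme of the configured provider URI — `jceks://` for JavaKeyStoreProvider, `kms://` for KMSClientProvider — so supporting both modes on the Hive side essentially amounts to dispatching on that scheme. A dependency-free sketch of the idea; the class and enum names here are illustrative and do not come from the HIVE-8904 patch:

```java
import java.net.URI;

// Illustrative sketch: choose a key provider kind from the provider URI
// scheme, the way Hadoop's KeyProviderFactory does. Not Hive's real API.
public class KeyProviderMode {
    enum Kind { JAVA_KEYSTORE, KMS, UNKNOWN }

    static Kind fromUri(String providerUri) {
        String scheme = URI.create(providerUri).getScheme();
        if (scheme == null) {
            return Kind.UNKNOWN;
        }
        switch (scheme) {
            case "jceks": return Kind.JAVA_KEYSTORE; // JavaKeyStoreProvider
            case "kms":   return Kind.KMS;           // KMSClientProvider
            default:      return Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // Example provider URIs in the two supported formats.
        System.out.println(fromUri("jceks://file/tmp/test.jceks"));   // JAVA_KEYSTORE
        System.out.println(fromUri("kms://http@localhost:16000/kms")); // KMS
    }
}
```

Dispatching on the URI scheme rather than a boolean flag is what makes it cheap to accept both providers (and any future ones) from a single configuration value.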
Re: Propose to put JIRA traffic on separate hive list
+1 Would it be possible to send commits to the dev list, as well as creates? Or maybe all changes to the Resolution or Status? -- Lefty On Mon, Nov 17, 2014 at 2:27 PM, Alan Gates ga...@hortonworks.com wrote: The hive dev list generates a lot of traffic. The average for October was 192 messages per day. As a result no one sends hive dev directly to their inbox. They either unsubscribe or they build filters that ship most or all of it to a folder. Chasing people off the dev list is obviously not what we want. Sending messages to folders means missing messages or not seeing them until you get unbusy enough to go read back mail in folders. The vast majority of this traffic is comments on JIRA tickets. The way I've seen other very active Apache projects manage this is JIRA creates go to the dev list, but all other JIRA operations go to a separate list. Then everyone can see new tickets, and if they are interested they can watch that JIRA. If not, they are not burdened with the email from it. I propose we do this same thing in Hive. Alan. -- Sent with Postbox http://www.getpostbox.com -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-8904: --- Status: Patch Available (was: Open) Hive should support multiple Key provider modes --- Key: HIVE-8904 URL: https://issues.apache.org/jira/browse/HIVE-8904 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-8904.patch In the hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider are both supported. Although KMS is preferable in a production environment, we should enable both of them on the hive side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-8904: --- Attachment: HIVE-8904.patch Hive should support multiple Key provider modes --- Key: HIVE-8904 URL: https://issues.apache.org/jira/browse/HIVE-8904 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-8904.patch In the hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider are both supported. Although KMS is preferable in a production environment, we should enable both of them on the hive side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7073) Implement Binary in ParquetSerDe
[ https://issues.apache.org/jira/browse/HIVE-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14215974#comment-14215974 ] Hive QA commented on HIVE-7073: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682063/HIVE-7073.patch {color:green}SUCCESS:{color} +1 6647 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1832/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1832/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1832/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12682063 - PreCommit-HIVE-TRUNK-Build Implement Binary in ParquetSerDe Key: HIVE-7073 URL: https://issues.apache.org/jira/browse/HIVE-7073 Project: Hive Issue Type: Sub-task Reporter: David Chen Assignee: Ferdinand Xu Attachments: HIVE-7073.patch The ParquetSerDe currently does not support the BINARY data type. This ticket is to implement the BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215985#comment-14215985 ] Mickael Lacour commented on HIVE-8359: -- [~brocknoland], as far as I know I picked the patch that [~rdblue] told me about (the review on the Review Board), but maybe not the latest version. [~rdblue] wanted me to update this patch to handle HIVE-6994 instead of having two patches with the same behavior/code. And I like the way [~spena] wrote the solution (better than mine, in my opinion). [~spena], basically I modified the ArrayWritableGroupConverter to clear the current value. If you don't do that, you never get a null value inside an array; you get the previous value instead.
{code}
diff --git ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
index 582a5df..052b36d 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java
@@ -54,6 +54,7 @@ public void start() {
     if (isMap) {
       mapPairContainer = new Writable[2];
     }
+    currentValue = null;
   }

   @Override
{code}
And the second part was to add null values from the ParquetHiveSerDe (values that I was skipping before for no valid reason).
{code}
diff --git ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
index b689336..4b36767 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java
@@ -202,13 +202,11 @@ private ArrayWritable createArray(final Object obj, final ListObjectInspector in
     if (sourceArray != null) {
       for (final Object curObj : sourceArray) {
-        final Writable newObj = createObject(curObj, subInspector);
-        if (newObj != null) {
-          array.add(newObj);
-        }
+        array.add(createObject(curObj, subInspector));
       }
     }
     if (array.size() > 0) {
-      final ArrayWritable subArray = new ArrayWritable(array.get(0).getClass(),
+      final ArrayWritable subArray = new ArrayWritable(Writable.class,
           array.toArray(new Writable[array.size()]));
       return new ArrayWritable(Writable.class, new Writable[] {subArray});
     } else {
{code}
And the qtest was just to be sure to handle an empty array, a null array, an array with null, and the same for map.
{code}
+++ data/files/parquet_array_null_element.txt
@@ -0,0 +1,3 @@
+1|,7|CARRELAGE,MOQUETTE|key11:value11,key12:value12,key13:value13
+2|,|CAILLEBOTIS,|
+3|,42,||key11:value11,key12:,key13:
{code}
If you want to integrate them into your patch, feel free to do it; otherwise I might duplicate your patch (:p) and add this fix. Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, map_null_val.avro Tried to write a map<string,string> column in a Parquet file. 
The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * from mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
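The `ArrayWritable(array.get(0).getClass(), ...)` to `ArrayWritable(Writable.class, ...)` change in the diff above matters precisely because null elements are now kept: deriving the element class from the first element throws a NullPointerException once that element can be null. A plain-Java illustration of that failure mode, using stdlib types as stand-ins for the Hadoop Writable classes (no Hive or Hadoop code here):

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in illustration of the bug the ParquetHiveSerDe diff fixes: once null
// elements are kept in the list, taking the element class from the first
// element can throw NullPointerException.
public class NullElementType {
    static Class<?> elementClassOfFirst(List<Object> values) {
        return values.get(0).getClass(); // NPE when values.get(0) == null
    }

    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        values.add(null);     // a legitimate null map value / array element
        values.add("val2");
        try {
            elementClassOfFirst(values);
        } catch (NullPointerException e) {
            // Using a fixed supertype (Writable.class in the patch,
            // Object.class here) avoids touching the possibly-null element.
            System.out.println("fixed element class: " + Object.class.getSimpleName());
        }
    }
}
```

This is why the patch types the sub-array as `Writable.class` unconditionally instead of inspecting the first element.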
[jira] [Updated] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8868: Status: Patch Available (was: Open) SparkSession and SparkClient mapping[Spark Branch] -- Key: HIVE-8868 URL: https://issues.apache.org/jira/browse/HIVE-8868 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3, TODOC-SPARK Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch There should be a separate spark context for each user session; currently we share a singleton local spark context across all user sessions with local spark, and create a remote spark context for each spark job with a spark cluster. To bind one spark context to each user session, we may construct the spark client on session open. One thing to note: is SparkSession::conf consistent with Context::getConf? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8750) Commit initial encryption work
[ https://issues.apache.org/jira/browse/HIVE-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14215992#comment-14215992 ] Lefty Leverenz commented on HIVE-8750: -- Doc note: This adds configuration parameters *hive.exec.stagingdir* and *hive.exec.copyfile.maxsize* to HiveConf.java in encryption-branch. (When the branch gets merged into trunk, the parameters will need to be documented in the wiki.) Commit initial encryption work -- Key: HIVE-8750 URL: https://issues.apache.org/jira/browse/HIVE-8750 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Fix For: encryption-branch Attachments: HIVE-8750.1.patch I believe Sergio has some work done for encryption. In this item we'll commit it to branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6977) Delete Hiveserver1
[ https://issues.apache.org/jira/browse/HIVE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216018#comment-14216018 ] Lefty Leverenz commented on HIVE-6977: -- Added same warning at beginning of the ODBC doc: * [Hive ODBC Driver | https://cwiki.apache.org/confluence/display/Hive/HiveODBC] Delete Hiveserver1 -- Key: HIVE-6977 URL: https://issues.apache.org/jira/browse/HIVE-6977 Project: Hive Issue Type: Task Components: JDBC, Server Infrastructure Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-6977.1.patch, HIVE-6977.patch See mailing list discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Propose to put JIRA traffic on separate hive list
+1 That's a great idea Alan. On Tue, Nov 18, 2014 at 9:49 AM, Lefty Leverenz leftylever...@gmail.com wrote: +1 Would it be possible to send commits to the dev list, as well as creates? Or maybe all changes to the Resolution or Status? -- Lefty
Re: Mail bounces from ebuddy.com
This is still happening. If one of the admins could give it another go that'd be great. I can also file an issue with INFRA. On Mon, Sep 1, 2014 at 4:01 PM, Damien Carol dca...@blitzbs.com wrote: There are still these annoying ebuddy.com addresses: nelshe...@ebuddy.com bsc...@ebuddy.com Is there an administrator who can send the 2 emails to *dev-unsubscribe-user=email address@hive.apache.org http://hive.apache.org* ??? Thanks in advance. Regards, Damien CAROL - tél : +33 (0)4 74 96 88 14 - fax : +33 (0)4 74 96 31 88 - email : dca...@blitzbs.com BLITZ BUSINESS SERVICE On 23/08/2014 01:22, Lars Francke wrote: Likewise. From Alan's linked documentation it seems like the correct E-Mail address to use is: dev-unsubscribe-user=email address@hive.apache.org If you could try again maybe? On Wed, Aug 20, 2014 at 9:31 PM, Nick Dimiduk ndimi...@gmail.com wrote: Not quite taken care of. I'm still getting spam about these addresses. On Mon, Aug 18, 2014 at 9:18 AM, Lars Francke lars.fran...@gmail.com wrote: Thanks Alan and Ashutosh for taking care of this! On Mon, Aug 18, 2014 at 5:45 PM, Ashutosh Chauhan hashut...@apache.org wrote: Thanks, Alan for the hint. I just unsubscribed those two email addresses from ebuddy. On Mon, Aug 18, 2014 at 8:23 AM, Alan Gates ga...@hortonworks.com wrote: Anyone who is an admin on the list (I don't know who the admins are) can do this by doing user-unsubscribe-USERNAME=ebuddy@hive.apache.org where USERNAME is the name of the bouncing user (see http://untroubled.org/ezmlm/ezman/ezman1.html) Alan. Thejas Nair the...@hortonworks.com August 17, 2014 at 17:02 I don't know how to do this. Carl, Ashutosh, do you guys know how to remove these two invalid emails from the mailing list? Lars Francke lars.fran...@gmail.com August 17, 2014 at 15:41 Hmm great, I see others mentioning this as well. 
I'm happy to contact INFRA but I'm not sure if they are even needed or if someone from the Hive team can do this? On Fri, Aug 8, 2014 at 3:43 AM, Lefty Leverenz leftylever...@gmail.com wrote: Lefty Leverenz August 7, 2014 at 18:43 (Excuse the spam.) Actually I'm getting two bounces per message, but gmail concatenates them so I didn't notice the second one. -- Lefty On Thu, Aug 7, 2014 at 9:36 PM, Lefty Leverenz leftylever...@gmail.com wrote: Lefty Leverenz August 7, 2014 at 18:36 Curious, I've only been getting one bounce per message. Anyway thanks for bringing this up. -- Lefty Lars Francke lars.fran...@gmail.com August 7, 2014 at 4:38 Hi, every time I send a mail to dev@ I get two bounce mails from two people at ebuddy.com. I don't want to post the E-Mail addresses publicly but I can send them on if needed (and it can be triggered easily by just replying to this mail I guess). Could we maybe remove them from the list? Cheers, Lars
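For reference, the ezmlm admin-unsubscribe convention quoted in this thread encodes the bouncing subscriber's address into the To: address, replacing its '@' with '='. A small shell sketch of constructing such an address; the subscriber address below is a placeholder, not one of the actual bouncing users:

```shell
# Sketch of the ezmlm convention from the thread above:
# dev-unsubscribe-<user>=<domain>@hive.apache.org
# The bouncing address here is an illustrative placeholder.
bouncing_user="someone@example.com"
list_domain="hive.apache.org"

# 'someone@example.com' -> 'someone=example.com'
encoded=$(printf '%s' "$bouncing_user" | tr '@' '=')
to_address="dev-unsubscribe-${encoded}@${list_domain}"
echo "$to_address"   # dev-unsubscribe-someone=example.com@hive.apache.org
# A list admin sends any mail (even empty) to this address to trigger
# the unsubscribe confirmation flow for that subscriber.
```

The same pattern works for the user list by swapping the `dev-` prefix, which is why the thread shows both `dev-unsubscribe-...` and `user-unsubscribe-...` forms.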
[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[ https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216052#comment-14216052 ] Hive QA commented on HIVE-8893: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682121/HIVE-8893.5.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6650 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1833/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1833/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1833/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12682121 - PreCommit-HIVE-TRUNK-Build Implement whitelist for builtin UDFs to avoid untrused code execution in multiuser mode --- Key: HIVE-8893 URL: https://issues.apache.org/jira/browse/HIVE-8893 Project: Hive Issue Type: Bug Components: Authorization, HiveServer2, SQL Affects Versions: 0.14.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch The udfs like reflect() or java_method() enables executing a java method as udf. While this offers lot of flexibility in the standalone mode, it can become a security loophole in a secure multiuser environment. For example, in HiveServer2 one can execute any available java code with user hive's credentials. 
We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
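The whitelist/blacklist decision described in HIVE-8893 can be sketched as follows. This is only an illustrative sketch, not Hive's actual FunctionRegistry code; the function name and semantics (blacklist wins, empty whitelist allows everything not blacklisted, case-insensitive names) are assumptions for demonstration.

```python
def is_udf_allowed(udf_name, whitelist, blacklist):
    """Decide whether a builtin UDF may run in a multiuser server.

    Assumed semantics for this sketch: the blacklist always wins; an empty
    whitelist means "allow everything not blacklisted"; names compare
    case-insensitively, since Hive function names are case-insensitive.
    """
    name = udf_name.lower()
    if name in {u.lower() for u in blacklist}:
        return False
    if whitelist and name not in {u.lower() for u in whitelist}:
        return False
    return True

# reflect() is rejected once blacklisted, regardless of the whitelist
print(is_udf_allowed("reflect", whitelist=[], blacklist=["reflect", "java_method"]))  # False
print(is_udf_allowed("upper", whitelist=[], blacklist=["reflect", "java_method"]))    # True
```

The point of checking the blacklist first is that an administrator can always revoke a dangerous UDF such as reflect() even if a broad whitelist would otherwise admit it.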
[jira] [Commented] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216055#comment-14216055 ] Hive QA commented on HIVE-8904: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682124/HIVE-8904.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1834/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1834/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d 
apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'service/src/java/org/apache/hive/service/cli/CLIService.java' Reverted 'ql/src/test/org/apache/hadoop/hive/metastore/TestMetastoreExpr.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExpressionEvaluator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target 
hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1640306. At revision 1640306. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12682124 - PreCommit-HIVE-TRUNK-Build Hive should
[jira] [Commented] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216065#comment-14216065 ] Hive QA commented on HIVE-8868: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682113/HIVE-8868.2-spark.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7180 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/391/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/391/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-391/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12682113 - PreCommit-HIVE-SPARK-Build SparkSession and SparkClient mapping[Spark Branch] -- Key: HIVE-8868 URL: https://issues.apache.org/jira/browse/HIVE-8868 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3, TODOC-SPARK Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch There should be a separate Spark context for each user session. Currently we share a singleton local Spark context across all user sessions in local mode, and create a remote Spark context for each Spark job when running against a Spark cluster. To bind one Spark context to each user session, we may construct the Spark client on session open. One thing to verify is whether SparkSession::conf is consistent with Context::getConf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
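The one-client-per-session model proposed in HIVE-8868 can be sketched like this. It is a toy illustration only: `client_factory` stands in for whatever client construction happens on session open (in real Hive, a local or remote HiveSparkClient depending on deployment mode), and the class name is invented for this example.

```python
class SparkSessionMapping:
    """Toy sketch: bind one Spark client to each user session,
    created eagerly when the session opens, instead of sharing
    a singleton context across all sessions."""

    def __init__(self, client_factory):
        self._factory = client_factory
        self._clients = {}

    def open_session(self, session_id):
        # Construct the client on session open, as the JIRA proposes;
        # reopening an already-open session reuses its existing client.
        if session_id not in self._clients:
            self._clients[session_id] = self._factory(session_id)
        return self._clients[session_id]

    def close_session(self, session_id):
        self._clients.pop(session_id, None)

mapping = SparkSessionMapping(client_factory=lambda sid: object())
c1 = mapping.open_session("session-1")
c2 = mapping.open_session("session-2")
# c1 and c2 are distinct clients: no shared singleton between sessions.
```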
[jira] [Created] (HIVE-8905) Servlet classes signer information does not match[Spark branch]
Chengxiang Li created HIVE-8905: --- Summary: Servlet classes signer information does not match[Spark branch] Key: HIVE-8905 URL: https://issues.apache.org/jira/browse/HIVE-8905 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li {noformat} 2014-11-18 02:36:04,168 DEBUG spark.HttpFileServer (Logging.scala:logDebug(63)) - HTTP file server started at: http://10.203.137.143:46436 2014-11-18 02:36:04,172 ERROR session.TestSparkSessionManagerImpl (TestSparkSessionManagerImpl.java:run(127)) - Error executing 'Session thread 5' org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client. at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55) at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:122) at org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl$SessionThread.run(TestSparkSessionManagerImpl.java:112) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.SecurityException: class javax.servlet.FilterRegistration's signer information does not match signer information of other classes in the same package at java.lang.ClassLoader.checkCerts(ClassLoader.java:952) at java.lang.ClassLoader.preDefineClass(ClassLoader.java:666) at java.lang.ClassLoader.defineClass(ClassLoader.java:794) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) at java.net.URLClassLoader.access$100(URLClassLoader.java:71) at java.net.URLClassLoader$1.run(URLClassLoader.java:361) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at 
org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:136) at org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:129) at org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:98) at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:96) at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:87) at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:67) at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60) at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:60) at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:49) at org.apache.spark.ui.SparkUI.init(SparkUI.scala:60) at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:150) at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:105) at org.apache.spark.SparkContext.init(SparkContext.scala:237) at org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.scala:58) at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.init(LocalHiveSparkClient.java:107) at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.getInstance(LocalHiveSparkClient.java:69) at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:52) at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:53) ... 3 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
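The SecurityException above is the classic symptom of having two copies of the `javax.servlet` classes on the classpath, one from a signed jar and one unsigned: the JVM refuses to define `javax.servlet.FilterRegistration` because its signer information differs from classes already loaded in that package. A common remedy is to exclude the duplicate servlet-api from whichever dependency drags it in. The fragment below is a hedged illustration only; the `groupId`/`artifactId` of the offending dependency is an assumption and must be identified from Hive's actual dependency tree (e.g. via `mvn dependency:tree`).

```xml
<!-- Hypothetical pom.xml fragment: exclude the conflicting servlet-api copy
     so only one javax.servlet jar remains on the classpath. The dependency
     shown here is an example, not the confirmed culprit. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
  <exclusions>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```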
[jira] [Commented] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216145#comment-14216145 ] Chengxiang Li commented on HIVE-8868: - From the Hive log, the session-related tests failed due to a servlet class-loading exception; HIVE-8905 has been created to track it. SparkSession and SparkClient mapping[Spark Branch] -- Key: HIVE-8868 URL: https://issues.apache.org/jira/browse/HIVE-8868 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3, TODOC-SPARK Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch There should be a separate Spark context for each user session. Currently we share a singleton local Spark context across all user sessions in local mode, and create a remote Spark context for each Spark job when running against a Spark cluster. To bind one Spark context to each user session, we may construct the Spark client on session open. One thing to verify is whether SparkSession::conf is consistent with Context::getConf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8868) SparkSession and SparkClient mapping[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8868: Status: Open (was: Patch Available) SparkSession and SparkClient mapping[Spark Branch] -- Key: HIVE-8868 URL: https://issues.apache.org/jira/browse/HIVE-8868 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3, TODOC-SPARK Attachments: HIVE-8868.1-spark.patch, HIVE-8868.2-spark.patch There should be a separate Spark context for each user session. Currently we share a singleton local Spark context across all user sessions in local mode, and create a remote Spark context for each Spark job when running against a Spark cluster. To bind one Spark context to each user session, we may construct the Spark client on session open. One thing to verify is whether SparkSession::conf is consistent with Context::getConf. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216336#comment-14216336 ] Sergio Peña commented on HIVE-8359: --- Thanks [~mickaellcr]. Sorry for the confusion. I did not see that you had uploaded another patch here. I just added two extra lines to the patch you uploaded. I will integrate your fixes there and upload the patch again. Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, map_null_val.avro Tried to write a map<string,string> column to a Parquet file. The table should contain: {code} {key3:val3,key4:null} {key3:val3,key4:null} {key1:null,key2:val2} {key3:val3,key4:null} {key3:val3,key4:null} {code} ... and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {key3:val3} {key4:val3} {key3:val2} {key4:val3} {key1:val3} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
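The corruption pattern reported above (values sliding under the wrong keys) is what you get when a writer silently drops null map *values* while still emitting every key. The sketch below is only an illustration of that failure mode, not Parquet's or Hive's actual writer code; the function name and the two-stream model are assumptions for demonstration.

```python
def write_map_dropping_null_values(maps):
    """Illustrates the suspected bug pattern: keys and values are written
    as two parallel streams, but null values are skipped while their keys
    are not, so on read the surviving values pair with the wrong keys."""
    keys, values = [], []
    for m in maps:
        for k, v in m.items():
            keys.append(k)
            if v is not None:       # bug: key already written, value skipped
                values.append(v)
    # A reader zips the two streams back together, now misaligned.
    return list(zip(keys, values))

rows = [{"key3": "val3", "key4": None}, {"key1": None, "key2": "val2"}]
print(write_map_dropping_null_values(rows))
# [('key3', 'val3'), ('key4', 'val2')] -- key4 stole key2's value
```

A correct writer must either emit an explicit null marker for the value or skip the key and value together, so the two streams stay aligned.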
[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8359: -- Status: Open (was: Patch Available) Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, HIVE-8359.5.patch, map_null_val.avro Tried to write a map<string,string> column to a Parquet file. The table should contain: {code} {key3:val3,key4:null} {key3:val3,key4:null} {key1:null,key2:val2} {key3:val3,key4:null} {key3:val3,key4:null} {code} ... and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {key3:val3} {key4:val3} {key3:val2} {key4:val3} {key1:val3} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8359: -- Status: Patch Available (was: Open) Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, HIVE-8359.5.patch, map_null_val.avro Tried to write a map<string,string> column to a Parquet file. The table should contain: {code} {key3:val3,key4:null} {key3:val3,key4:null} {key1:null,key2:val2} {key3:val3,key4:null} {key3:val3,key4:null} {code} ... and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {key3:val3} {key4:val3} {key3:val2} {key4:val3} {key1:val3} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8359: -- Attachment: HIVE-8359.5.patch Attached a new patch that integrates Mickael Lacour's HIVE-6994 fix. Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, HIVE-8359.5.patch, map_null_val.avro Tried to write a map<string,string> column to a Parquet file. The table should contain: {code} {key3:val3,key4:null} {key3:val3,key4:null} {key1:null,key2:val2} {key3:val3,key4:null} {key3:val3,key4:null} {code} ... and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {key3:val3} {key4:val3} {key3:val2} {key4:val3} {key1:val3} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] Apache Hive 0.14.0 Released
Congrats! It's definitely the most interesting release of Hive. Keep up the good work. I know that Santa will add some boys in the good list this year. Regards, Damien CAROL - tél : +33 (0)4 74 96 88 14 - email : dca...@blitzbs.com BLITZ BUSINESS SERVICE 2014-11-18 0:33 GMT+01:00 Suhas Gogate vgog...@pivotal.io: Congrats! This is a big step for Hive! --Suhas On Mon, Nov 17, 2014 at 3:05 PM, Thejas Nair the...@hortonworks.com wrote: The link to the download page is now - https://hive.apache.org/downloads.html (I have also corrected the email template in the how-to-release wiki with the new URL). On Mon, Nov 17, 2014 at 1:59 PM, Roshan Naik ros...@hortonworks.com wrote: 1) fyi.. this link is broken: http://hive.apache.org/releases.html 2) Java docs were not published for 0.14.0 https://hive.apache.org/javadoc.html On Sun, Nov 16, 2014 at 7:04 PM, Clark Yang (杨卓荦) yangzhuo...@gmail.com wrote: Great job! Congrats! Thanks, Zhuoluo (Clark) Yang 2014-11-13 8:55 GMT+08:00 Gunther Hagleitner gunt...@apache.org: The Apache Hive team is proud to announce the release of Apache Hive version 0.14.0. The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides: * Tools to enable easy data extract/transform/load (ETL). * A mechanism to impose structure on a variety of data formats. * Access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM) or Apache Accumulo (TM). * Query execution via Apache Hadoop MapReduce and Apache Tez frameworks. * Cost-based query planning via Apache Calcite For Hive release details and downloads, please visit: http://hive.apache.org/releases.html Hive 0.14.0 Release Notes are available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12326450styleName=TextprojectId=12310843 We would like to thank the many contributors who made this release possible. 
Regards, The Apache Hive Team -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Hive-0.14 - Build # 734 - Still Failing
Changes for Build #696 [rohini] PIG-4186: Fix e2e run against new build of pig and some enhancements (rohini) Changes for Build #697 Changes for Build #698 Changes for Build #699 Changes for Build #700 Changes for Build #701 Changes for Build #702 Changes for Build #703 [daijy] HIVE-8484: HCatalog throws an exception if Pig job is of type 'fetch' (Lorand Bendig via Daniel Dai) Changes for Build #704 [gunther] HIVE-8781: Nullsafe joins are busted on Tez (Gunther Hagleitner, reviewed by Prasanth J) Changes for Build #705 [gunther] HIVE-8760: Pass a copy of HiveConf to hooks (Gunther Hagleitner, reviewed by Gopal V) Changes for Build #706 [thejas] HIVE-8772 : zookeeper info logs are always printed from beeline with service discovery mode (Thejas Nair, reviewed by Vaibhav Gumashta) Changes for Build #707 [gunther] HIVE-8782: HBase handler doesn't compile with hadoop-1 (Jimmy Xiang, reviewed by Xuefu and Sergey) Changes for Build #708 Changes for Build #709 [thejas] HIVE-8785 : HiveServer2 LogDivertAppender should be more selective for beeline getLogs (Thejas Nair, reviewed by Gopal V) Changes for Build #710 [vgumashta] HIVE-8764: Windows: HiveServer2 TCP SSL cannot recognize localhost (Vaibhav Gumashta reviewed by Thejas Nair) Changes for Build #711 [gunther] HIVE-8768: CBO: Fix filter selectivity for 'in clause' '' (Laljo John Pullokkaran via Gunther Hagleitner) Changes for Build #712 [gunther] HIVE-8794: Hive on Tez leaks AMs when killed before first dag is run (Gunther Hagleitner, reviewed by Gopal V) Changes for Build #713 [gunther] HIVE-8798: Some Oracle deadlocks not being caught in TxnHandler (Alan Gates via Gunther Hagleitner) Changes for Build #714 [gunther] HIVE-8800: Update release notes and notice for hive .14 (Gunther Hagleitner, reviewed by Prasanth J) [gunther] HIVE-8799: boatload of missing apache headers (Gunther Hagleitner, reviewed by Thejas M Nair) Changes for Build #715 [gunther] Preparing for release 0.14.0 Changes for Build #716 [gunther] 
Preparing for release 0.14.0 [gunther] Preparing for release 0.14.0 Changes for Build #717 Changes for Build #718 Changes for Build #719 Changes for Build #720 [gunther] HIVE-8811: Dynamic partition pruning can result in NPE during query compilation (Gunther Hagleitner, reviewed by Gopal V) Changes for Build #721 [gunther] HIVE-8805: CBO skipped due to SemanticException: Line 0:-1 Both left and right aliases encountered in JOIN 'avg_cs_ext_discount_amt' (Laljo John Pullokkaran via Gunther Hagleitner) [sershe] HIVE-8715 : Hive 14 upgrade scripts can fail for statistics if database was created using auto-create ADDENDUM (Sergey Shelukhin, reviewed by Ashutosh Chauhan and Gunther Hagleitner) Changes for Build #722 Changes for Build #723 Changes for Build #724 [gunther] HIVE-8845: Switch to Tez 0.5.2 (Gunther Hagleitner, reviewed by Gopal V) Changes for Build #725 [sershe] HIVE-8295 : Add batch retrieve partition objects for metastore direct sql (Selina Zhang and Sergey Shelukhin, reviewed by Ashutosh Chauhan) Changes for Build #726 Changes for Build #727 [gunther] HIVE-8873: Switch to calcite 0.9.2 (Gunther Hagleitner, reviewed by Gopal V) Changes for Build #728 [thejas] HIVE-8830 : hcatalog process don't exit because of non daemon thread (Thejas Nair, reviewed by Eugene Koifman, Sushanth Sowmyan) Changes for Build #729 Changes for Build #730 Changes for Build #731 Changes for Build #732 Changes for Build #733 Changes for Build #734 No tests ran. The Apache Jenkins build system has built Hive-0.14 (build #734) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-0.14/734/ to view the results.
[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[ https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216472#comment-14216472 ] Prasad Mujumdar commented on HIVE-8893: --- The failed test optimize_nullscan passes in my setup. Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode --- Key: HIVE-8893 URL: https://issues.apache.org/jira/browse/HIVE-8893 Project: Hive Issue Type: Bug Components: Authorization, HiveServer2, SQL Affects Versions: 0.14.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch UDFs like reflect() or java_method() enable executing a Java method as a UDF. While this offers a lot of flexibility in standalone mode, it can become a security loophole in a secure multiuser environment. For example, in HiveServer2 one can execute any available Java code with the hive user's credentials. We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216474#comment-14216474 ] Chao commented on HIVE-8887: This error happens when SparkMapJoinOptimizer cannot find a big table candidate, and hence the big table position is -1. In this case, instead of continuing into the map join processing, we should fall back to a common join. The code is there, but commented out. Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch] --- Key: HIVE-8887 URL: https://issues.apache.org/jira/browse/HIVE-8887 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Chao These tests all failed with the same error, see below: {noformat} 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at 
org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} This happens at compile time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
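The guard Chao describes (fall back to a common join when no big-table candidate exists, rather than proceeding with position -1 into an NPE) can be sketched as follows. This is a simplified illustration, not Hive's SparkMapJoinOptimizer logic; the qualification rule used here (all *other* tables must fit under a size threshold) is an assumption for demonstration.

```python
def choose_join_strategy(table_sizes, threshold):
    """Pick a big table for a map join, or fall back to a common
    (shuffle) join when no candidate qualifies -- the fallback whose
    absence caused the NullPointerException in the report above.

    A table qualifies as the big table if the combined size of all
    other tables fits under `threshold` (they must fit in memory)."""
    big_table_pos = -1
    for i in range(len(table_sizes)):
        small_total = sum(s for j, s in enumerate(table_sizes) if j != i)
        if small_total <= threshold:
            big_table_pos = i
            break
    if big_table_pos == -1:
        # No candidate: do NOT continue into map-join conversion.
        return ("common-join", None)
    return ("map-join", big_table_pos)

print(choose_join_strategy([100, 5], threshold=10))   # ('map-join', 0)
print(choose_join_strategy([100, 90], threshold=10))  # ('common-join', None)
```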
[jira] [Commented] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216485#comment-14216485 ] Brock Noland commented on HIVE-8835: +1 identify dependency scope for Remote Spark Context.[Spark Branch] - Key: HIVE-8835 URL: https://issues.apache.org/jira/browse/HIVE-8835 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8835.1-spark.patch When submitting a job through the Remote Spark Context, Spark RDD graph generation and job submission are executed on the remote side, so we have to add Hive-related dependencies to its classpath via spark.driver.extraClassPath. Instead of adding all Hive/Hadoop dependencies, we should narrow the scope and identify which dependencies the Remote Spark Context actually requires. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
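The narrowed-scope idea above amounts to listing only the jars the Remote Spark Context actually needs on the remote driver's classpath. The fragment below is a hedged illustration only: `spark.driver.extraClassPath` is a real Spark property, but the specific jar paths are placeholders, not the dependency list this JIRA sets out to identify.

```
# Hypothetical spark-defaults.conf fragment: a minimal classpath for the
# remote driver instead of every Hive/Hadoop jar. The jar names shown are
# illustrative placeholders.
spark.driver.extraClassPath=/opt/hive/lib/hive-exec.jar:/opt/hive/lib/hive-common.jar
```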
Re: Review Request 28147: HIVE-7073:Implement Binary in ParquetSerDe
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28147/#review61946 --- LGTM. Mickaël Lacour's suggestion of adding a null value test is a great one. - Mohit Sabharwal On Nov. 18, 2014, 1:58 a.m., cheng xu wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28147/ --- (Updated Nov. 18, 2014, 1:58 a.m.) Review request for hive. Repository: hive-git Description --- This patch includes: 1. binary support for ParquetHiveSerde 2. related test cases in both unit and ql tests Diffs - data/files/parquet_types.txt d342062 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 472de8f ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java d5aae3b ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java c57dd99 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 8ac7864 ql/src/test/queries/clientpositive/parquet_types.q 22585c3 ql/src/test/results/clientpositive/parquet_types.q.out 275897c Diff: https://reviews.apache.org/r/28147/diff/ Testing --- related UT and QL tests passed Thanks, cheng xu
[jira] [Updated] (HIVE-8835) identify dependency scope for Remote Spark Context.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8835: --- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Thank you! I have committed this to spark. identify dependency scope for Remote Spark Context.[Spark Branch] - Key: HIVE-8835 URL: https://issues.apache.org/jira/browse/HIVE-8835 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Fix For: spark-branch Attachments: HIVE-8835.1-spark.patch When submitting a job through the Remote Spark Context, Spark RDD graph generation and job submission are executed on the remote side, so we have to add Hive-related dependencies to its classpath with spark.driver.extraClassPath. Instead of adding all Hive/Hadoop dependencies, we should narrow the scope and identify which dependencies the Remote Spark Context actually requires. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-8887: --- Attachment: HIVE-8887.1-spark.patch Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch] --- Key: HIVE-8887 URL: https://issues.apache.org/jira/browse/HIVE-8887 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Chao Attachments: HIVE-8887.1-spark.patch These tests all failed with the same error, see below: {noformat} 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} This happens at compile time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-8887: --- Status: Patch Available (was: Open) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch] --- Key: HIVE-8887 URL: https://issues.apache.org/jira/browse/HIVE-8887 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Chao Attachments: HIVE-8887.1-spark.patch These tests all failed with the same error, see below: {noformat} 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} This happens at compile time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216562#comment-14216562 ] Hive QA commented on HIVE-8359: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682178/HIVE-8359.5.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6659 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1835/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1835/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1835/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12682178 - PreCommit-HIVE-TRUNK-Build Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, HIVE-8359.5.patch, map_null_val.avro Tried to write a map<string,string> column in a Parquet file. The table should contain: {code} {"key3":"val3","key4":null} {"key3":"val3","key4":null} {"key1":null,"key2":"val2"} {"key3":"val3","key4":null} {"key3":"val3","key4":null} {code} ... 
and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {"key3":"val3"} {"key4":"val3"} {"key3":"val2"} {"key4":"val3"} {"key1":"val3"} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
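The corrupted output above is the classic symptom of a writer that silently drops null map values while the reader still pairs keys and values by position. This is an illustrative simulation of that failure mode, not the actual Parquet writer code; the class and method names are ours:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullMapSkew {
    // Simulates the bug pattern: keys and values are written as two
    // parallel lists, but null values are silently skipped, so the
    // lists fall out of step.
    static Map<String, String> roundTrip(Map<String, String> input) {
        List<String> keys = new ArrayList<>();
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> e : input.entrySet()) {
            keys.add(e.getKey());
            if (e.getValue() != null) {
                values.add(e.getValue()); // null silently dropped
            }
        }
        // The reader pairs keys with values positionally, so every
        // entry after a dropped null gets the wrong value.
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i < values.size(); i++) {
            out.put(keys.get(i), values.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("key1", null);
        row.put("key2", "val2");
        // key1 steals key2's value; key2 vanishes entirely.
        System.out.println(roundTrip(row)); // {key1=val2}
    }
}
```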
[jira] [Updated] (HIVE-8905) Servlet classes signer information does not match [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8905: --- Summary: Servlet classes signer information does not match [Spark branch] (was: Servlet classes signer information does not match[Spark branch] ) Servlet classes signer information does not match [Spark branch] - Key: HIVE-8905 URL: https://issues.apache.org/jira/browse/HIVE-8905 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Labels: Spark-M3 {noformat} 2014-11-18 02:36:04,168 DEBUG spark.HttpFileServer (Logging.scala:logDebug(63)) - HTTP file server started at: http://10.203.137.143:46436 2014-11-18 02:36:04,172 ERROR session.TestSparkSessionManagerImpl (TestSparkSessionManagerImpl.java:run(127)) - Error executing 'Session thread 5' org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client. at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55) at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:122) at org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl$SessionThread.run(TestSparkSessionManagerImpl.java:112) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.SecurityException: class javax.servlet.FilterRegistration's signer information does not match signer information of other classes in the same package at java.lang.ClassLoader.checkCerts(ClassLoader.java:952) at java.lang.ClassLoader.preDefineClass(ClassLoader.java:666) at java.lang.ClassLoader.defineClass(ClassLoader.java:794) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) at java.net.URLClassLoader.access$100(URLClassLoader.java:71) at java.net.URLClassLoader$1.run(URLClassLoader.java:361) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at 
java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:136) at org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:129) at org.eclipse.jetty.servlet.ServletContextHandler.init(ServletContextHandler.java:98) at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:96) at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:87) at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:67) at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60) at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:60) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:60) at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:49) at org.apache.spark.ui.SparkUI.init(SparkUI.scala:60) at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:150) at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:105) at org.apache.spark.SparkContext.init(SparkContext.scala:237) at org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.scala:58) at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.init(LocalHiveSparkClient.java:107) at org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.getInstance(LocalHiveSparkClient.java:69) at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:52) at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:53) ... 3 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
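The SecurityException in the trace above comes from the JVM's rule that all classes in one runtime package must carry the same signer certificates; it fires when a signed javax.servlet jar and an unsigned bundle (here pulled in via Spark/Jetty) both contribute classes to the javax.servlet package. A simplified sketch of the comparison the classloader performs (the helper name is ours, not the JDK's):

```java
import java.util.Arrays;

public class SignerCheckDemo {
    // Approximates the check in ClassLoader.checkCerts: classes defined
    // into one runtime package must all share the same signers.
    static boolean sameSigners(Object[] a, Object[] b) {
        if (a == null) return b == null; // unsigned matches unsigned
        if (b == null) return false;     // signed vs. unsigned: mismatch
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // JDK classes are unsigned, so their signer arrays are both null
        // and coexist happily; a signed servlet-api jar next to an
        // unsigned jetty bundle is where the mismatch bites.
        Object[] s1 = String.class.getSigners();
        Object[] s2 = Integer.class.getSigners();
        System.out.println(sameSigners(s1, s2)); // true
    }
}
```

The usual fix is to ensure only one (consistently signed or unsigned) provider of the javax.servlet package is on the classpath.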
[jira] [Updated] (HIVE-8609) Move beeline to jline2
[ https://issues.apache.org/jira/browse/HIVE-8609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8609: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you very much [~Ferd]! I have committed this to trunk! Move beeline to jline2 -- Key: HIVE-8609 URL: https://issues.apache.org/jira/browse/HIVE-8609 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Ferdinand Xu Priority: Blocker Fix For: 0.15.0 Attachments: HIVE-8609.1.patch, HIVE-8609.2.patch, HIVE-8609.3.patch, HIVE-8609.4.patch, HIVE-8609.5.patch, HIVE-8609.6.patch, HIVE-8609.7.patch, HIVE-8609.patch We found a serious bug in jline in HIVE-8565. We should move to jline2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216579#comment-14216579 ] Ryan Blue commented on HIVE-8359: - I think with [~mickaellcr]'s addition, this is ready to go in. Good catch in the SerDe code, I didn't realize that the nulls were stripped at that point as well. I'm a little confused about why we're translating the ArrayWritable again, though: isn't this properly constructed by the Converter code? Why can't we just pass the ArrayWritable that was created already? It seems like we're doing a lot of unnecessary work here that we might be able to remove (in future patches). Ideally, we would detect that the structure matches what is expected by the following Hive code and pass it along. Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Assignee: Sergio Peña Attachments: HIVE-8359.1.patch, HIVE-8359.2.patch, HIVE-8359.4.patch, HIVE-8359.5.patch, map_null_val.avro Tried to write a map<string,string> column in a Parquet file. The table should contain: {code} {"key3":"val3","key4":null} {"key3":"val3","key4":null} {"key1":null,"key2":"val2"} {"key3":"val3","key4":null} {"key3":"val3","key4":null} {code} ... and when you run a query like {code}SELECT * from mytable{code} we can see that the table is corrupted: {code} {"key3":"val3"} {"key4":"val3"} {"key3":"val2"} {"key4":"val3"} {"key1":"val3"} {code} I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8122) Make use of SearchArgument classes for Parquet SERDE
[ https://issues.apache.org/jira/browse/HIVE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8122: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you [~Ferd] for the patch and [~mohitsabharwal] for the review! I have committed this to trunk!! Make use of SearchArgument classes for Parquet SERDE Key: HIVE-8122 URL: https://issues.apache.org/jira/browse/HIVE-8122 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: 0.15.0 Attachments: HIVE-8122.1.patch, HIVE-8122.2.patch, HIVE-8122.3.patch, HIVE-8122.4.patch, HIVE-8122.patch ParquetSerde could be much cleaner if we used SearchArgument and associated classes like ORC does: https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27699: HIVE-8435
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27699/ --- (Updated Nov. 18, 2014, 7:13 p.m.) Review request for hive and Ashutosh Chauhan. Summary (updated) - HIVE-8435 Repository: hive-git Description (updated) --- HIVE-8435 HIVE-8435 Diffs (updated) - accumulo-handler/src/test/results/positive/accumulo_queries.q.out 254eeaba4b8d633c63c706c0c74bb1165089 common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8411c9edb2f2db84cf2540deb20133c36152103 contrib/src/test/results/clientpositive/lateral_view_explode2.q.out 74a7e1719f8e026aaecd53fc147258620a75ccc4 hbase-handler/src/test/results/positive/hbase_queries.q.out b1e7936738b1121c14132909178646290ee8b4d5 ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 95d2d76c80aa59b62e9464f704523d921302d401 ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 5be0e4540a6843c6b40cb5c22db6e90e1f0da922 ql/src/test/queries/clientpositive/identity_proj_remove.q PRE-CREATION ql/src/test/results/clientpositive/identity_proj_remove.q.out PRE-CREATION ql/src/test/results/compiler/plan/groupby1.q.xml PRE-CREATION Diff: https://reviews.apache.org/r/27699/diff/ Testing --- Thanks, Jesús Camacho Rodríguez
[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-8829: -- Attachment: HIVE-8829.1.patch Upgrade to Thrift 0.9.2 --- Key: HIVE-8829 URL: https://issues.apache.org/jira/browse/HIVE-8829 Project: Hive Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Labels: HiveServer2, metastore Fix For: 0.15.0 Attachments: HIVE-8829.1.patch Apache Thrift 0.9.2 was released recently (https://thrift.apache.org/download). It has a fix for THRIFT-2660, a bug that can cause HS2 (tcp mode) and Metastore processes to go OOM on receiving a non-thrift request when they use the SASL transport. The reason ([thrift code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]): {code} protected SaslResponse receiveSaslMessage() throws TTransportException { underlyingTransport.readAll(messageHeader, 0, messageHeader.length); byte statusByte = messageHeader[0]; byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES)]; underlyingTransport.readAll(payload, 0, payload.length); NegotiationStatus status = NegotiationStatus.byValue(statusByte); if (status == null) { sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte); } else if (status == NegotiationStatus.BAD || status == NegotiationStatus.ERROR) { try { String remoteMessage = new String(payload, "UTF-8"); throw new TTransportException("Peer indicated failure: " + remoteMessage); } catch (UnsupportedEncodingException e) { throw new TTransportException(e); } } {code} Basically, since there are no message format or size checks before creating the byte array, receiving a non-SASL message creates a huge byte array from some garbage size. For HS2, an attempt was made to fix it here: HIVE-6468, which never went in. I think for 0.15.0 it's best to upgrade to Thrift 0.9.2. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
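The "huge byte array from some garbage size" failure is easy to reproduce arithmetically: whatever arrives on the SASL port, the first byte is treated as the status and the next four as a big-endian payload length. A sketch, with the decode logic mirroring what EncodingUtils.decodeBigEndian does (the plain HTTP probe is a hypothetical non-thrift client, not from the ticket):

```java
import java.nio.charset.StandardCharsets;

public class SaslLengthDemo {
    // Mirrors the big-endian decode TSaslTransport applies to the four
    // bytes that follow the status byte in a SASL message header.
    static int decodeBigEndian(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
             | ((buf[offset + 1] & 0xFF) << 16)
             | ((buf[offset + 2] & 0xFF) << 8)
             |  (buf[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        // A plain HTTP probe hitting the SASL port: 'G' is read as the
        // status byte, and "ET /" is decoded as the payload length.
        byte[] header = "GET /".getBytes(StandardCharsets.US_ASCII);
        int bogusLength = decodeBigEndian(header, 1);
        System.out.println(bogusLength); // 1163141167, i.e. a ~1.1 GB allocation
    }
}
```

So a single stray HTTP request can trigger a gigabyte-scale allocation, which is exactly the OOM the description reports.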
[jira] [Commented] (HIVE-8893) Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode
[ https://issues.apache.org/jira/browse/HIVE-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216636#comment-14216636 ] Szehon Ho commented on HIVE-8893: - Thanks for the changes! Latest patch looks good, +1 Implement whitelist for builtin UDFs to avoid untrusted code execution in multiuser mode --- Key: HIVE-8893 URL: https://issues.apache.org/jira/browse/HIVE-8893 Project: Hive Issue Type: Bug Components: Authorization, HiveServer2, SQL Affects Versions: 0.14.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8893.3.patch, HIVE-8893.4.patch, HIVE-8893.5.patch UDFs like reflect() or java_method() enable executing a Java method as a UDF. While this offers a lot of flexibility in standalone mode, it can become a security loophole in a secure multiuser environment. For example, in HiveServer2 one can execute any available Java code with the hive user's credentials. We need a whitelist and blacklist to restrict builtin UDFs in HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
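To make the risk concrete, this is roughly the mechanism reflect() and java_method() expose to queries: resolving and invoking an arbitrary static Java method named in the query's arguments. A simplified sketch (the helper name and single-String-argument restriction are ours, not Hive's):

```java
import java.lang.reflect.Method;

public class ReflectUdfRisk {
    // Resolves a class and a static method by caller-supplied name and
    // invokes it -- the core of what reflect()/java_method() allow a
    // query author to do with the server process's credentials.
    static Object invokeByName(String className, String methodName, String arg)
            throws Exception {
        Class<?> clazz = Class.forName(className);
        Method m = clazz.getMethod(methodName, String.class);
        return m.invoke(null, arg); // static method, fully caller-controlled
    }

    public static void main(String[] args) throws Exception {
        // Harmless here, but inside HiveServer2 the same call path runs
        // with the hive service user's credentials -- hence the whitelist.
        System.out.println(invokeByName("java.lang.System", "getProperty", "java.version"));
    }
}
```

A whitelist restricts which built-in UDFs can appear in a query at all, so entry points like this never reach the reflection machinery.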
[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Mujumdar updated HIVE-8829: -- Status: Patch Available (was: Open) Upgrade to Thrift 0.9.2 --- Key: HIVE-8829 URL: https://issues.apache.org/jira/browse/HIVE-8829 Project: Hive Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Labels: HiveServer2, metastore Fix For: 0.15.0 Attachments: HIVE-8829.1.patch Apache Thrift 0.9.2 was released recently (https://thrift.apache.org/download). It has a fix for THRIFT-2660, a bug that can cause HS2 (tcp mode) and Metastore processes to go OOM on receiving a non-thrift request when they use the SASL transport. The reason ([thrift code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]): {code} protected SaslResponse receiveSaslMessage() throws TTransportException { underlyingTransport.readAll(messageHeader, 0, messageHeader.length); byte statusByte = messageHeader[0]; byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES)]; underlyingTransport.readAll(payload, 0, payload.length); NegotiationStatus status = NegotiationStatus.byValue(statusByte); if (status == null) { sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte); } else if (status == NegotiationStatus.BAD || status == NegotiationStatus.ERROR) { try { String remoteMessage = new String(payload, "UTF-8"); throw new TTransportException("Peer indicated failure: " + remoteMessage); } catch (UnsupportedEncodingException e) { throw new TTransportException(e); } } {code} Basically, since there are no message format or size checks before creating the byte array, receiving a non-SASL message creates a huge byte array from some garbage size. For HS2, an attempt was made to fix it here: HIVE-6468, which never went in. I think for 0.15.0 it's best to upgrade to Thrift 0.9.2. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27699: HIVE-8435
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27699/ --- (Updated Nov. 18, 2014, 7:21 p.m.) Review request for hive and Ashutosh Chauhan. Repository: hive-git Description (updated) --- HIVE-8435 Patch with the most conservative approach of project remover optimization. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8411c9edb2f2db84cf2540deb20133c36152103 ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 95d2d76c80aa59b62e9464f704523d921302d401 ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 5be0e4540a6843c6b40cb5c22db6e90e1f0da922 Diff: https://reviews.apache.org/r/27699/diff/ Testing --- Thanks, Jesús Camacho Rodríguez
[jira] [Commented] (HIVE-8829) Upgrade to Thrift 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216640#comment-14216640 ] Prasad Mujumdar commented on HIVE-8829: --- [~vgumashta] I didn't notice that the ticket is assigned to you. If you already have a patch, please feel free to ignore this one. Upgrade to Thrift 0.9.2 --- Key: HIVE-8829 URL: https://issues.apache.org/jira/browse/HIVE-8829 Project: Hive Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Labels: HiveServer2, metastore Fix For: 0.15.0 Attachments: HIVE-8829.1.patch Apache Thrift 0.9.2 was released recently (https://thrift.apache.org/download). It has a fix for THRIFT-2660, a bug that can cause HS2 (tcp mode) and Metastore processes to go OOM on receiving a non-thrift request when they use the SASL transport. The reason ([thrift code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]): {code} protected SaslResponse receiveSaslMessage() throws TTransportException { underlyingTransport.readAll(messageHeader, 0, messageHeader.length); byte statusByte = messageHeader[0]; byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES)]; underlyingTransport.readAll(payload, 0, payload.length); NegotiationStatus status = NegotiationStatus.byValue(statusByte); if (status == null) { sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte); } else if (status == NegotiationStatus.BAD || status == NegotiationStatus.ERROR) { try { String remoteMessage = new String(payload, "UTF-8"); throw new TTransportException("Peer indicated failure: " + remoteMessage); } catch (UnsupportedEncodingException e) { throw new TTransportException(e); } } {code} Basically, since there are no message format or size checks before creating the byte array, receiving a non-SASL message creates a huge byte array from some garbage size. 
For HS2, an attempt was made to fix it here: HIVE-6468, which never went in. I think for 0.15.0 it's best to upgrade to Thrift 0.9.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8829) Upgrade to Thrift 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216644#comment-14216644 ] Vaibhav Gumashta commented on HIVE-8829: [~prasadm] No issues - I didn't have a patch anyway. Assigning it to you - thanks for the patch. Upgrade to Thrift 0.9.2 --- Key: HIVE-8829 URL: https://issues.apache.org/jira/browse/HIVE-8829 Project: Hive Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Vaibhav Gumashta Assignee: Prasad Mujumdar Labels: HiveServer2, metastore Fix For: 0.15.0 Attachments: HIVE-8829.1.patch Apache Thrift 0.9.2 was released recently (https://thrift.apache.org/download). It has a fix for THRIFT-2660, a bug that can cause HS2 (tcp mode) and Metastore processes to go OOM on receiving a non-thrift request when they use the SASL transport. The reason ([thrift code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]): {code} protected SaslResponse receiveSaslMessage() throws TTransportException { underlyingTransport.readAll(messageHeader, 0, messageHeader.length); byte statusByte = messageHeader[0]; byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES)]; underlyingTransport.readAll(payload, 0, payload.length); NegotiationStatus status = NegotiationStatus.byValue(statusByte); if (status == null) { sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte); } else if (status == NegotiationStatus.BAD || status == NegotiationStatus.ERROR) { try { String remoteMessage = new String(payload, "UTF-8"); throw new TTransportException("Peer indicated failure: " + remoteMessage); } catch (UnsupportedEncodingException e) { throw new TTransportException(e); } } {code} Basically, since there are no message format or size checks before creating the byte array, receiving a non-SASL message creates a huge byte array from some garbage size. 
For HS2, an attempt was made to fix it here: HIVE-6468, which never went in. I think for 0.15.0 it's best to upgrade to Thrift 0.9.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 27699: HIVE-8435
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27699/#review61985 --- contrib/src/test/results/clientpositive/lateral_view_explode2.q.out https://reviews.apache.org/r/27699/#comment103905 Results are changed. Looks suspicious. ql/src/test/results/compiler/plan/groupby1.q.xml https://reviews.apache.org/r/27699/#comment103904 you need to rebase your git repo. These test cases were deleted via HIVE-8862 - Ashutosh Chauhan On Nov. 18, 2014, 7:21 p.m., Jesús Camacho Rodríguez wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/27699/ --- (Updated Nov. 18, 2014, 7:21 p.m.) Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- HIVE-8435 Patch with the most conservative approach of project remover optimization. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a8411c9edb2f2db84cf2540deb20133c36152103 ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 95d2d76c80aa59b62e9464f704523d921302d401 ql/src/java/org/apache/hadoop/hive/ql/optimizer/IdentityProjectRemover.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 5be0e4540a6843c6b40cb5c22db6e90e1f0da922 Diff: https://reviews.apache.org/r/27699/diff/ Testing --- Thanks, Jesús Camacho Rodríguez
[jira] [Updated] (HIVE-8829) Upgrade to Thrift 0.9.2
[ https://issues.apache.org/jira/browse/HIVE-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-8829: --- Assignee: Prasad Mujumdar (was: Vaibhav Gumashta) Upgrade to Thrift 0.9.2 --- Key: HIVE-8829 URL: https://issues.apache.org/jira/browse/HIVE-8829 Project: Hive Issue Type: Improvement Affects Versions: 0.15.0 Reporter: Vaibhav Gumashta Assignee: Prasad Mujumdar Labels: HiveServer2, metastore Fix For: 0.15.0 Attachments: HIVE-8829.1.patch Apache Thrift 0.9.2 was released recently (https://thrift.apache.org/download). It has a fix for THRIFT-2660, which can cause HS2 (tcp mode) and Metastore processes to go OOM on getting a non-thrift request when they use SASL transport. The reason ([thrift code|https://github.com/apache/thrift/blob/0.9.x/lib/java/src/org/apache/thrift/transport/TSaslTransport.java#L177]): {code} protected SaslResponse receiveSaslMessage() throws TTransportException { underlyingTransport.readAll(messageHeader, 0, messageHeader.length); byte statusByte = messageHeader[0]; byte[] payload = new byte[EncodingUtils.decodeBigEndian(messageHeader, STATUS_BYTES)]; underlyingTransport.readAll(payload, 0, payload.length); NegotiationStatus status = NegotiationStatus.byValue(statusByte); if (status == null) { sendAndThrowMessage(NegotiationStatus.ERROR, "Invalid status " + statusByte); } else if (status == NegotiationStatus.BAD || status == NegotiationStatus.ERROR) { try { String remoteMessage = new String(payload, "UTF-8"); throw new TTransportException("Peer indicated failure: " + remoteMessage); } catch (UnsupportedEncodingException e) { throw new TTransportException(e); } } {code} Basically, since there are no message format checks / size checks before creating the byte array, on getting a non-SASL message this creates a huge byte array from some garbage size. For HS2, an attempt was made to fix it here: HIVE-6468, which never went in. I think for 0.15.0 it's best to upgrade to Thrift 0.9.2. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesús Camacho Rodríguez updated HIVE-8435: -- Attachment: HIVE-8435.08.patch Starting over. Add identity project remover optimization - Key: HIVE-8435 URL: https://issues.apache.org/jira/browse/HIVE-8435 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0 Reporter: Ashutosh Chauhan Assignee: Jesús Camacho Rodríguez Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch In some cases there is an identity project in plan which is useless. Better to optimize it away to avoid evaluating it without any benefit at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8639) Convert SMBJoin to MapJoin [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216653#comment-14216653 ] Brock Noland commented on HIVE-8639: Hi [~chinnalalam], How is the progress on this going? If you are not working on it we'll have folks freeing up soon who can take it over. Convert SMBJoin to MapJoin [Spark Branch] - Key: HIVE-8639 URL: https://issues.apache.org/jira/browse/HIVE-8639 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Szehon Ho Assignee: Chinna Rao Lalam HIVE-8202 supports auto-conversion of SMB Join. However, if the tables are partitioned, there could be a slow down as each mapper would need to get a very small chunk of a partition which has a single key. Thus, in some scenarios it's beneficial to convert SMB join to map join. The task is to research and support the conversion from SMB join to map join for Spark execution engine. See the equivalent of MapReduce in SortMergeJoinResolver. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8803) DESC SCHEMA DATABASE-NAME is not working
[ https://issues.apache.org/jira/browse/HIVE-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-8803: Labels: TODOC15 (was: ) It's already doc'ed [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe] as part of Hive 0.14 and HIVE-6601, but we can change the link to this one and Hive 0.15. DESC SCHEMA DATABASE-NAME is not working -- Key: HIVE-8803 URL: https://issues.apache.org/jira/browse/HIVE-8803 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Labels: TODOC15 Attachments: HIVE-8803.1.patch.txt, HIVE-8803.1.patch.txt Found this while documenting HIVE-6601 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216655#comment-14216655 ] Hive QA commented on HIVE-8887: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682207/HIVE-8887.1-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7180 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/392/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/392/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-392/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12682207 - PreCommit-HIVE-SPARK-Build Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch] --- Key: HIVE-8887 URL: https://issues.apache.org/jira/browse/HIVE-8887 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Chao Attachments: HIVE-8887.1-spark.patch These tests all failed with the same error, see below: {noformat} 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
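The trace bottoms out in PlanUtils.getFieldSchemasFromColumnList, which iterates a column list that can evidently arrive null during the SMB-to-mapjoin conversion. A defensive version of that conversion can be sketched as follows; the class and types are stand-ins for illustration, not Hive's real PlanUtils/FieldSchema, and the eventual fix for HIVE-8887 may well differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: guard the column-list-to-field-schema conversion
// against the null list that produced the NPE in the stack trace above.
public class FieldSchemaGuard {
    static class ColumnInfo {
        final String name;
        final String type;
        ColumnInfo(String name, String type) { this.name = name; this.type = type; }
    }

    static List<String> getFieldSchemasFromColumnList(List<ColumnInfo> cols) {
        if (cols == null) {
            // Failing soft (or throwing a descriptive error) beats an
            // unexplained NullPointerException deep inside the optimizer.
            return Collections.emptyList();
        }
        List<String> schemas = new ArrayList<>();
        for (ColumnInfo c : cols) {
            schemas.add(c.name + ":" + c.type);
        }
        return schemas;
    }

    public static void main(String[] args) {
        System.out.println(getFieldSchemasFromColumnList(null));
        System.out.println(getFieldSchemasFromColumnList(
                List.of(new ColumnInfo("aid", "int"))));
    }
}
```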
[jira] [Commented] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216656#comment-14216656 ] Ashutosh Chauhan commented on HIVE-8435: +1 for code changes, which look good. Let's see how tests go. Add identity project remover optimization - Key: HIVE-8435 URL: https://issues.apache.org/jira/browse/HIVE-8435 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0 Reporter: Ashutosh Chauhan Assignee: Jesús Camacho Rodríguez Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch In some cases there is an identity project in the plan which is useless. Better to optimize it away to avoid evaluating it without any benefit at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8803) DESC SCHEMA DATABASE-NAME is not working
[ https://issues.apache.org/jira/browse/HIVE-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-8803: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk, thanks Navis. DESC SCHEMA DATABASE-NAME is not working -- Key: HIVE-8803 URL: https://issues.apache.org/jira/browse/HIVE-8803 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-8803.1.patch.txt, HIVE-8803.1.patch.txt Found this while documenting HIVE-6601 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8906) Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT artifacts
Carl Steinbach created HIVE-8906: Summary: Hive 0.14.0 release depends on Tez and Calcite SNAPSHOT artifacts Key: HIVE-8906 URL: https://issues.apache.org/jira/browse/HIVE-8906 Project: Hive Issue Type: Bug Reporter: Carl Steinbach The Hive 0.14.0 release depends on SNAPSHOT versions of tez-0.5.2 and calcite-0.9.2. I believe this violates Apache release policy (can't find the reference, but I seem to remember this being a problem with HCatalog before the merger), and it implies that the folks who tested the release weren't necessarily testing the same thing. It also means that people who try to build Hive using the 0.14.0 src release will encounter errors unless they configure Maven to pull artifacts from the snapshot repository. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8839) Support alter table .. add/replace columns cascade
[ https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-8839: Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk, thanks Chaoyu for the contribution. Support alter table .. add/replace columns cascade Key: HIVE-8839 URL: https://issues.apache.org/jira/browse/HIVE-8839 Project: Hive Issue Type: Improvement Components: SQL Environment: Reporter: Chaoyu Tang Assignee: Chaoyu Tang Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8839.1.patch, HIVE-8839.2.patch, HIVE-8839.2.patch, HIVE-8839.patch We often run into issues like HIVE-6131, which is due to inconsistent column descriptors between table and partitions after alter table. HIVE-8441/HIVE-7971 provided the flexibility to alter the table at partition level. But in most cases we need to change the table and partitions at the same time. In addition, alter table is usually required prior to alter table partition .., since querying table partition data also goes through the table. Instead of doing that in two steps, here we provide a convenient DDL like alter table ... cascade to cascade table changes to partitions as well. The changes are limited and applicable to add/replace columns and changing column name, datatype, position and comment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
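To make the committed syntax concrete, a hedged sketch of the DDL this issue describes (table and column names are invented for illustration; the wiki DDL page is authoritative):

{code}
-- add a column to the table metadata and to every existing partition in one step
ALTER TABLE sales ADD COLUMNS (region STRING) CASCADE;
-- change a column's type on the table and cascade to all partitions
ALTER TABLE sales CHANGE COLUMN amount amount DECIMAL(10,2) CASCADE;
{code}

Without CASCADE the change applies to the table descriptor only, which is exactly the table/partition metadata mismatch HIVE-6131 describes.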
[jira] [Updated] (HIVE-8839) Support alter table .. add/replace columns cascade
[ https://issues.apache.org/jira/browse/HIVE-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-8839: Environment: was: Need to add this in [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Describe], and also explain the nuances Support alter table .. add/replace columns cascade Key: HIVE-8839 URL: https://issues.apache.org/jira/browse/HIVE-8839 Project: Hive Issue Type: Improvement Components: SQL Environment: Reporter: Chaoyu Tang Assignee: Chaoyu Tang Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-8839.1.patch, HIVE-8839.2.patch, HIVE-8839.2.patch, HIVE-8839.patch We often run into issues like HIVE-6131, which is due to inconsistent column descriptors between table and partitions after alter table. HIVE-8441/HIVE-7971 provided the flexibility to alter the table at partition level. But in most cases we need to change the table and partitions at the same time. In addition, alter table is usually required prior to alter table partition .., since querying table partition data also goes through the table. Instead of doing that in two steps, here we provide a convenient DDL like alter table ... cascade to cascade table changes to partitions as well. The changes are limited and applicable to add/replace columns and changing column name, datatype, position and comment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8611) grant/revoke syntax should support additional objects for authorization plugins
[ https://issues.apache.org/jira/browse/HIVE-8611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216731#comment-14216731 ] Prasad Mujumdar commented on HIVE-8611: --- [~leftylev] Updated the wiki for config change. Thanks! grant/revoke syntax should support additional objects for authorization plugins --- Key: HIVE-8611 URL: https://issues.apache.org/jira/browse/HIVE-8611 Project: Hive Issue Type: Bug Components: Authentication, SQL Affects Versions: 0.13.0 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8611.1.patch, HIVE-8611.2.patch, HIVE-8611.2.patch, HIVE-8611.3.patch, HIVE-8611.4.patch The authorization framework supports URI and global objects. The SQL syntax however doesn't allow granting privileges on these objects. We should allow the compiler to parse these so that it can be handled by authorization plugins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8612) Support metadata result filter hooks
[ https://issues.apache.org/jira/browse/HIVE-8612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216744#comment-14216744 ] Prasad Mujumdar commented on HIVE-8612: --- [~leftylev] Documented the new config property on the metastore admin page. Thanks! Support metadata result filter hooks Key: HIVE-8612 URL: https://issues.apache.org/jira/browse/HIVE-8612 Project: Hive Issue Type: Bug Components: Authorization, Metastore Affects Versions: 0.13.1 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.15.0 Attachments: HIVE-8612.1.patch, HIVE-8612.2.patch, HIVE-8612.3.patch Support metadata filter hook for metastore client. This will be useful for authorization plugins on hiveserver2 to filter metadata results, especially in case of non-impersonation mode where the metastore doesn't know the end user's identity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez
[ https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8888: - Attachment: HIVE-8888.4.patch Mapjoin with LateralViewJoin generates wrong plan in Tez Key: HIVE-8888 URL: https://issues.apache.org/jira/browse/HIVE-8888 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, HIVE-8888.4.patch Queries like these {code} with sub1 as (select aid, avalue from expod1 lateral view explode(av) avs as avalue ), sub2 as (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue) select sub1.aid, sub1.avalue, sub2.bvalue from sub1,sub2 where sub1.aid=sub2.bid; {code} generates twice the number of rows in Tez when compared to MR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez
[ https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8888: - Status: Open (was: Patch Available) Mapjoin with LateralViewJoin generates wrong plan in Tez Key: HIVE-8888 URL: https://issues.apache.org/jira/browse/HIVE-8888 Project: Hive Issue Type: Bug Affects Versions: 0.13.1, 0.13.0, 0.14.0, 0.15.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, HIVE-8888.4.patch Queries like these {code} with sub1 as (select aid, avalue from expod1 lateral view explode(av) avs as avalue ), sub2 as (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue) select sub1.aid, sub1.avalue, sub2.bvalue from sub1,sub2 where sub1.aid=sub2.bid; {code} generates twice the number of rows in Tez when compared to MR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8888) Mapjoin with LateralViewJoin generates wrong plan in Tez
[ https://issues.apache.org/jira/browse/HIVE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8888: - Status: Patch Available (was: Open) Mapjoin with LateralViewJoin generates wrong plan in Tez Key: HIVE-8888 URL: https://issues.apache.org/jira/browse/HIVE-8888 Project: Hive Issue Type: Bug Affects Versions: 0.13.1, 0.13.0, 0.14.0, 0.15.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-8888.1.patch, HIVE-8888.2.patch, HIVE-8888.3.patch, HIVE-8888.4.patch Queries like these {code} with sub1 as (select aid, avalue from expod1 lateral view explode(av) avs as avalue ), sub2 as (select bid, bvalue from expod2 lateral view explode(bv) bvs as bvalue) select sub1.aid, sub1.avalue, sub2.bvalue from sub1,sub2 where sub1.aid=sub2.bid; {code} generates twice the number of rows in Tez when compared to MR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8904: --- Comment: was deleted (was: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12682124/HIVE-8904.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1834/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1834/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1834/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d 
apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniHS2.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'service/src/java/org/apache/hive/service/cli/CLIService.java' Reverted 'ql/src/test/org/apache/hadoop/hive/metastore/TestMetastoreExpr.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestExpressionEvaluator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/SqlFunctionConverter.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionInfo.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target 
hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target accumulo-handler/target hwi/target common/target common/src/gen contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1640306. At revision 1640306. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12682124 - PreCommit-HIVE-TRUNK-Build) Hive should support multiple
[jira] [Created] (HIVE-8907) Partition Condition Remover doesn't remove conditions involving cast on partition column
Ashutosh Chauhan created HIVE-8907: -- Summary: Partition Condition Remover doesn't remove conditions involving cast on partition column Key: HIVE-8907 URL: https://issues.apache.org/jira/browse/HIVE-8907 Project: Hive Issue Type: Improvement Components: Logical Optimizer Reporter: Ashutosh Chauhan Fix For: 0.14.0 e.g, {code} create table partition_test_partitioned(key string, value string) partitioned by (dt string) explain select * from partition_test_partitioned where cast(dt as double) =100.0 and cast(dt as double) = 102.0 {code} For queries like above, although {{PartitionPruner}} is able to prune partitions correctly, filter is still not optimized away by PCR, where it could. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8907) Partition Condition Remover doesn't remove conditions involving cast on partition column
[ https://issues.apache.org/jira/browse/HIVE-8907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216822#comment-14216822 ] Ashutosh Chauhan commented on HIVE-8907: It runs into this if-block of {{PcrExprProcFactory}} {code} if (result == null) { // if the result is not boolean and not all partition agree on the // result, we don't remove the condition. Potentially, it can miss // the case like where ds % 3 == 1 or ds % 3 == 2 // TODO: handle this case by making result vector to handle all // constant values. return new NodeInfoWrapper(WalkState.UNKNOWN, null, getOutExpr(fd, nodeOutputs)); {code} Partition Condition Remover doesn't remove conditions involving cast on partition column Key: HIVE-8907 URL: https://issues.apache.org/jira/browse/HIVE-8907 Project: Hive Issue Type: Improvement Components: Logical Optimizer Reporter: Ashutosh Chauhan Fix For: 0.14.0 e.g, {code} create table partition_test_partitioned(key string, value string) partitioned by (dt string) explain select * from partition_test_partitioned where cast(dt as double) =100.0 and cast(dt as double) = 102.0 {code} For queries like above, although {{PartitionPruner}} is able to prune partitions correctly, filter is still not optimized away by PCR, where it could. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Propose to put JIRA traffic on separate hive list
+1 On Tue, Nov 18, 2014 at 2:12 AM, Lars Francke lars.fran...@gmail.com wrote: +1 That's a great idea Alan. On Tue, Nov 18, 2014 at 9:49 AM, Lefty Leverenz leftylever...@gmail.com wrote: +1 Would it be possible to send commits to the dev list, as well as creates? Or maybe all changes to the Resolution or Status? -- Lefty On Mon, Nov 17, 2014 at 2:27 PM, Alan Gates ga...@hortonworks.com wrote: The hive dev list generates a lot of traffic. The average for October was 192 messages per day. As a result no one sends hive dev directly to their inbox. They either unsubscribe or they build filters that ship most or all of it to a folder. Chasing people off the dev list is obviously not what we want. Sending messages to folders means missing messages or not seeing them until you get unbusy enough to go read back mail in folders. The vast majority of this traffic is comments on JIRA tickets. The way I've seen other very active Apache projects manage this is JIRA creates go to the dev list, but all other JIRA operations go to a separate list. Then everyone can see new tickets, and if they are interested they can watch that JIRA. If not, they are not burdened with the email from it. I propose we do this same thing in Hive. Alan. -- Sent with Postbox http://www.getpostbox.com -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. 
[jira] [Commented] (HIVE-7790) Update privileges to check for update and delete
[ https://issues.apache.org/jira/browse/HIVE-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216829#comment-14216829 ] Alan Gates commented on HIVE-7790: -- Most required changes were already made to https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-PrivilegesRequiredforHiveOperations and I made a few more. No changes are required to the legacy auth, as this didn't change that. I don't think any more doc work is needed for this JIRA. Update privileges to check for update and delete Key: HIVE-7790 URL: https://issues.apache.org/jira/browse/HIVE-7790 Project: Hive Issue Type: Sub-task Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.14.0 Attachments: HIVE-7790.2.patch, HIVE-7790.3.patch, HIVE-7790.patch In the new SQLStdAuth scheme, we need to add UPDATE and DELETE as operations and add the ability to check for them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8104) Insert statements against ACID tables NPE when vectorization is on
[ https://issues.apache.org/jira/browse/HIVE-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216835#comment-14216835 ] Alan Gates commented on HIVE-8104: -- [~leftylev], what did you want transactions and vectorization to say about each other? They work together, mostly. And the transaction code handles turning off vectorization in the cases where they don't work together, so it is all transparent to the user. So I'm not sure there's anything to put in the user docs. Insert statements against ACID tables NPE when vectorization is on -- Key: HIVE-8104 URL: https://issues.apache.org/jira/browse/HIVE-8104 Project: Hive Issue Type: Bug Components: Query Processor, Vectorization Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8104.2.patch, HIVE-8104.patch Doing an insert against a table that is using ACID format with the transaction manager set to DbTxnManager and vectorization turned on results in an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8779) Tez in-place progress UI can show wrong estimated time for sub-second queries
[ https://issues.apache.org/jira/browse/HIVE-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8779: --- Fix Version/s: (was: 0.15.0) Tez in-place progress UI can show wrong estimated time for sub-second queries - Key: HIVE-8779 URL: https://issues.apache.org/jira/browse/HIVE-8779 Project: Hive Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Trivial Fix For: 0.14.0 Attachments: HIVE-8779.1.patch The in-place progress update UI added as part of HIVE-8495 can show wrong estimated time for AM only job which goes from INITED to SUCCEEDED DAG state directly without going to RUNNING state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8778) ORC split elimination can cause NPE when column statistics is null
[ https://issues.apache.org/jira/browse/HIVE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8778: --- Fix Version/s: (was: 0.15.0) ORC split elimination can cause NPE when column statistics is null -- Key: HIVE-8778 URL: https://issues.apache.org/jira/browse/HIVE-8778 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8778.1.patch Row group elimination has protection for NULL statistics values in RecordReaderImpl.evaluatePredicate() which then calls evaluatePredicateRange(). But split elimination directly calls evaluatePredicateRange() without NULL protection. This can lead to NullPointerException when a column is NULL in entire stripe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
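The HIVE-8778 report above describes the missing NULL guard: evaluatePredicate() protects against null statistics before delegating to evaluatePredicateRange(), but split elimination calls the range method directly. A minimal sketch of that guard pattern follows; the class, method, and enum names are illustrative stand-ins, not Hive's actual RecordReaderImpl API.

```java
// Illustrative sketch of the NULL-statistics guard described in HIVE-8778.
// Names are stand-ins, not Hive's actual RecordReaderImpl API.
public class SplitEliminationSketch {
    enum TruthValue { NO, MAYBE }

    // Range check assumes non-null bounds; calling this directly with null
    // statistics is the NPE path the issue describes.
    static TruthValue evaluatePredicateRange(long min, long max, long literal) {
        return (literal < min || literal > max) ? TruthValue.NO : TruthValue.MAYBE;
    }

    // Guarded entry point: null statistics (e.g. a stripe whose column is
    // entirely NULL and so has no min/max) must never eliminate a split.
    static TruthValue evaluatePredicate(Long min, Long max, long literal) {
        if (min == null || max == null) {
            return TruthValue.MAYBE; // nothing provable; keep the split
        }
        return evaluatePredicateRange(min, max, literal);
    }

    public static void main(String[] args) {
        System.out.println(evaluatePredicate(null, null, 42)); // MAYBE: split kept, no NPE
        System.out.println(evaluatePredicate(10L, 20L, 42));   // NO: split eliminated
    }
}
```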
[jira] [Updated] (HIVE-8740) Sorted dynamic partition does not work correctly with constant folding
[ https://issues.apache.org/jira/browse/HIVE-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8740: --- Fix Version/s: (was: 0.15.0) Sorted dynamic partition does not work correctly with constant folding -- Key: HIVE-8740 URL: https://issues.apache.org/jira/browse/HIVE-8740 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.14.0 Attachments: HIVE-8740.1.patch, HIVE-8740.2.patch, HIVE-8740.3.patch, HIVE-8740.4.patch Sorted dynamic partition optimization looks for partition columns from the operator above FileSinkOperator. As per hive convention it expects partition columns at the last. But with HIVE-8585 equality filters on partition columns gets folded to constant. The column pruner then prunes the constant expression as they don't reference any columns. This in some cases will yield unexpected results (throw ArrayIndexOutOfBounds exception) with sorted dynamic partition insert optimization. In such cases we don't really need sorted dynamic partition optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8727) Dag summary has incorrect row counts and duration per vertex
[ https://issues.apache.org/jira/browse/HIVE-8727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8727: --- Fix Version/s: (was: 0.15.0) Dag summary has incorrect row counts and duration per vertex Key: HIVE-8727 URL: https://issues.apache.org/jira/browse/HIVE-8727 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Fix For: 0.14.0 Attachments: HIVE-8727.1.patch During the code review for HIVE-8495 some code was reworked which broke some of INPUT/OUTPUT counters and duration. Patch attached which fixes that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6468) HS2 Metastore using SASL out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6468: --- Status: Open (was: Patch Available) HS2 Metastore using SASL out of memory error when curl sends a get request Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Affects Versions: 0.13.1, 0.13.0, 0.12.0, 0.14.0 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab Assignee: Navis Fix For: 0.14.1 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt, HIVE-6468.3.patch, HIVE-6468.4.patch We see an out of memory error when we run simple beeline calls. (The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28007: HS2 Metastore using SASL out of memory error when curl sends a get request
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28007/ --- (Updated Nov. 18, 2014, 9:51 p.m.) Review request for hive, Navis Ryu and Thejas Nair. Bugs: HIVE-6468 https://issues.apache.org/jira/browse/HIVE-6468 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-6468 Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ea5aed8 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java d1ef305 service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 23ba79c service/src/java/org/apache/hive/service/auth/PlainSaslHelper.java afc1441 shims/common-secure/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 624ac6b shims/common/src/main/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java d011c67 Diff: https://reviews.apache.org/r/28007/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Updated] (HIVE-6468) HS2 Metastore using SASL out of memory error when curl sends a get request
[ https://issues.apache.org/jira/browse/HIVE-6468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6468: --- Attachment: HIVE-6468.5.patch Revised patch for 14.1. I'll upload one based on trunk just for precommit run (we're upgrading thrift version for trunk - not planning to use this patch). HS2 Metastore using SASL out of memory error when curl sends a get request Key: HIVE-6468 URL: https://issues.apache.org/jira/browse/HIVE-6468 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.13.1 Environment: Centos 6.3, hive 12, hadoop-2.2 Reporter: Abin Shahab Assignee: Navis Fix For: 0.14.1 Attachments: HIVE-6468.1.patch.txt, HIVE-6468.2.patch.txt, HIVE-6468.3.patch, HIVE-6468.4.patch, HIVE-6468.5.patch We see an out of memory error when we run simple beeline calls. (The hive.server2.transport.mode is binary) curl localhost:1 Exception in thread pool-2-thread-8 java.lang.OutOfMemoryError: Java heap space at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:181) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
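The HIVE-6468 stack trace above points at TSaslTransport.receiveSaslMessage: when a plain HTTP request hits the SASL port, arbitrary request bytes are interpreted as a SASL frame header, so four garbage bytes become the payload length and a multi-gigabyte buffer allocation triggers the OutOfMemoryError. The sketch below illustrates that failure mode and a bounded-length check; it is not Thrift's actual code, and MAX_MESSAGE_SIZE is a hypothetical cap.

```java
// Illustrative sketch (not Thrift's actual TSaslTransport code) of the
// HIVE-6468 failure mode and a defensive frame-length bound.
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class SaslFrameSketch {
    static final int MAX_MESSAGE_SIZE = 100 * 1024 * 1024; // hypothetical cap

    static byte[] readFrame(DataInputStream in) throws IOException {
        in.readByte();          // SASL status byte
        int len = in.readInt(); // payload length, garbage-controlled here
        if (len < 0 || len > MAX_MESSAGE_SIZE) {
            throw new IOException("Invalid SASL frame length: " + len);
        }
        byte[] payload = new byte[len]; // unbounded, this is the OOM allocation
        in.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // "GET /" as raw bytes: 'G' is consumed as the status byte, then
        // "ET /" is read as a 4-byte length of roughly 1.1 GB.
        byte[] httpBytes = "GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII);
        try {
            readFrame(new DataInputStream(new ByteArrayInputStream(httpBytes)));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```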
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8685: --- Fix Version/s: (was: 0.15.0) DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8685.2.patch, HIVE-8685.3.patch, HIVE-8685.patch This makes DDL commands fail. This was broken in HIVE-8643. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8643: --- Fix Version/s: (was: 0.15.0) DDL operations via WebHCat with doAs parameter in secure cluster fail - Key: HIVE-8643 URL: https://issues.apache.org/jira/browse/HIVE-8643 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8643.2.patch, HIVE-8643.3.patch, HIVE-8643.patch webhcat handles DDL command by forking to 'hcat', i.e. HCatCli This starts a session. SessionState.start() creates scratch dir based on current user name via startSs.createSessionDirs(sessionUGI.getShortUserName()); This UGI is not aware of doAs param, so the name of the dir always ends up 'hcat', but because a delegation token is generated in WebHCat for HDFS access, the owner of the scratch dir is the calling user. Thus next time a session is started (because of a new DDL call from different user), it ends up trying to use the same scratch dir but cannot as it has 700 permission set. We need to pass in doAs user into SessionState -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8588: --- Fix Version/s: (was: 0.15.0) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster Key: HIVE-8588 URL: https://issues.apache.org/jira/browse/HIVE-8588 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-8588.1.patch, HIVE-8588.2.patch This is originally discovered by [~deepesh] When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person -d statusdir=sqoop.output -X POST http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa; {noformat} the job is failing with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {noformat} Note that the Sqoop tar bundle does not contain the JDBC connector jar. I think the problem here maybe that the mysql connector jar added to libjars isn't available to the Sqoop tool which first connects to the database through JDBC driver to collect some table information before running the MR job. libjars will only add the connector jar for the MR job and not the local one. 
NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8478) Vectorized Reduce-Side Group By doesn't handle Decimal type correctly
[ https://issues.apache.org/jira/browse/HIVE-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8478: --- Fix Version/s: (was: 0.15.0) Vectorized Reduce-Side Group By doesn't handle Decimal type correctly - Key: HIVE-8478 URL: https://issues.apache.org/jira/browse/HIVE-8478 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8478.01.patch, HIVE-8478.02.patch, HIVE-8478.03.patch, HIVE-8478.04.patch Note that DecimalColumnVector is different than LongColumnVector because it keeps (an instance) reference to a Decimal128 class whereas the latter stores a long primitive value. So, trouble if you set the reference instead of updating the object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8497) StatsNoJobTask doesn't close RecordReader, FSDataInputStream of which keeps open to prevent stale data clean
[ https://issues.apache.org/jira/browse/HIVE-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8497: --- Fix Version/s: (was: 0.15.0) StatsNoJobTask doesn't close RecordReader, FSDataInputStream of which keeps open to prevent stale data clean Key: HIVE-8497 URL: https://issues.apache.org/jira/browse/HIVE-8497 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8497.1.patch, HIVE-8497.2.patch Run the test {noformat} mvn -Phadoop-2 test -Dtest=TestCliDriver -Dqfile=alter_merge_stats_orc.q {noformat} to reproduce it. Simply put, this query does three data loads, which generate three base orc files. ANALYZE TABLE...COMPUTE STATISTICS NOSCAN executes StatsNoJobTask to get stats, which holds a file handle and so prevents the base files from being cleaned up. As a result, after running ALTER TABLE..CONCATENATE, follow-up queries go to the stale base files as well as the merged file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
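The fix pattern the HIVE-8497 report implies is to close the RecordReader (and thereby its underlying input stream) in a finally block, so the handle is released even if stats collection fails. A minimal sketch, with simplified stand-in types rather than Hive's actual RecordReader/FSDataInputStream API:

```java
// Sketch of the close-in-finally pattern implied by HIVE-8497.
// TrackingReader is a simplified stand-in, not Hive's real reader type.
import java.io.Closeable;
import java.io.IOException;

public class StatsReaderSketch {
    static class TrackingReader implements Closeable {
        boolean closed = false;
        long rawDataSize() { return 1024L; }  // pretend stats from the footer
        @Override public void close() { closed = true; } // releases the stream
    }

    static long collectStats(TrackingReader reader) throws IOException {
        try {
            return reader.rawDataSize();
        } finally {
            reader.close(); // an open handle would block deleting the base file
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingReader r = new TrackingReader();
        System.out.println(collectStats(r)); // 1024
        System.out.println(r.closed);        // true: handle released either way
    }
}
```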
[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8455: --- Fix Version/s: (was: 0.15.0) spark-branch Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Fix For: spark-branch Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but did not print the job progress format info on the console; users may be confused about what the progress info means, so I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8454) Select Operator does not rename column stats properly in case of select star
[ https://issues.apache.org/jira/browse/HIVE-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8454: --- Fix Version/s: (was: 0.15.0) Select Operator does not rename column stats properly in case of select star Key: HIVE-8454 URL: https://issues.apache.org/jira/browse/HIVE-8454 Project: Hive Issue Type: Sub-task Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8454.1.patch, HIVE-8454.2.patch, HIVE-8454.3.patch, HIVE-8454.3.patch, HIVE-8454.4.patch, HIVE-8454.5.patch, HIVE-8454.6.patch, HIVE-8454.7.patch The estimated data size of some Select Operators is 0. BytesBytesHashMap uses data size to determine the estimated initial number of entries in the hashmap. If this data size is 0 then exception is thrown (refer below) Query {code} select count(*) from store_sales JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk and store_sales.ss_ticket_number = store_returns.sr_ticket_number JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk JOIN store ON store_sales.ss_store_sk = store.s_store_sk JOIN item ON store_sales.ss_item_sk = item.i_item_sk JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= cd1.cd_demo_sk JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk JOIN income_band ib1 ON 
hd1.hd_income_band_sk = ib1.ib_income_band_sk JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk JOIN (select cs_item_sk ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund from catalog_sales JOIN catalog_returns ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk and catalog_sales.cs_order_number = catalog_returns.cr_order_number group by cs_item_sk having sum(cs_ext_list_price)2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)) cs_ui ON store_sales.ss_item_sk = cs_ui.cs_item_sk WHERE cd1.cd_marital_status cd2.cd_marital_status and i_color in ('maroon','burnished','dim','steel','navajo','chocolate') and i_current_price between 35 and 35 + 10 and i_current_price between 35 + 1 and 35 + 15 and d1.d_year = 2001; {code} {code} ], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:187) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:93) at
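The "Capacity must be a power of two" assertion in the HIVE-8454 trace fires because the hash map derives its initial capacity from the estimated data size, and a 0 estimate yields an invalid capacity. A defensive sketch of the derivation follows; this is an illustration of why the assertion trips, not Hive's actual BytesBytesHashMap code (the real fix addressed the column-stats renaming itself).

```java
// Hypothetical guard, not Hive's BytesBytesHashMap: clamp a possibly-zero
// size estimate to a minimum, then round up to the next power of two so
// the capacity assertion can never fire.
public class CapacitySketch {
    static int safeCapacity(int estimatedEntries) {
        int n = Math.max(estimatedEntries, 1);
        // smallest power of two >= n (for n = 1 this yields 1)
        return 1 << (32 - Integer.numberOfLeadingZeros(n - 1));
    }

    public static void main(String[] args) {
        System.out.println(safeCapacity(0));    // 1, instead of an invalid 0
        System.out.println(safeCapacity(1000)); // 1024
    }
}
```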
[jira] [Updated] (HIVE-8401) OrcFileMergeOperator only close last orc file it opened, which resulted in stale data in table directory
[ https://issues.apache.org/jira/browse/HIVE-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8401: --- Fix Version/s: (was: 0.15.0) OrcFileMergeOperator only close last orc file it opened, which resulted in stale data in table directory Key: HIVE-8401 URL: https://issues.apache.org/jira/browse/HIVE-8401 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Environment: Windows Server Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8401.1.patch, alter_merge_2_orc.q.out Run the test {noformat} mvn -Phadoop-2 test -Dtest=TestCliDriver -Dqfile=alter_merge_2_orc.q {noformat} to reproduce it. Simply put, this query does three data loads, which generate three orc files; ALTER TABLE CONCATENATE then merges the orc pieces into a single file, which is the final file to be queried. Output \hive\itests\qtest\target\qfile-results\clientpositive\alter_merge_2_orc.q.out shows the record count as 600, which is wrong; 610 is expected. Because OrcFileMergeOperator only closes the last orc file, the 1st and 2nd orc files still remain in the table directory: the unclosed files cannot be deleted during old-data cleanup when MoveTask copies the merged orc file from the scratch dir to the table dir. Eventually the query goes to the old data (the 1st and 2nd orc files). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
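The fix pattern HIVE-8401 suggests is to track every output file the merge operator opens and close all of them when the operator finishes, rather than keeping only the most recently opened writer in a field. A sketch, with illustrative names rather than Hive's OrcFileMergeOperator API:

```java
// Sketch of closing all opened writers (HIVE-8401 fix pattern).
// Names are illustrative, not Hive's OrcFileMergeOperator API.
import java.util.LinkedHashMap;
import java.util.Map;

public class MergeCloseSketch {
    static class Writer implements java.io.Closeable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    // One writer per output path. A single field holding only the latest
    // writer is exactly what leaked the earlier files.
    final Map<String, Writer> writers = new LinkedHashMap<>();

    Writer writerFor(String path) {
        return writers.computeIfAbsent(path, p -> new Writer());
    }

    void closeOp() {
        for (Writer w : writers.values()) {
            w.close(); // unclosed files cannot be deleted during cleanup
        }
    }

    public static void main(String[] args) {
        MergeCloseSketch op = new MergeCloseSketch();
        op.writerFor("part-1"); op.writerFor("part-2"); op.writerFor("part-3");
        op.closeOp();
        System.out.println(op.writers.values().stream().allMatch(w -> w.closed)); // true
    }
}
```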
[jira] [Updated] (HIVE-8372) Potential NPE in Tez MergeFileRecordProcessor
[ https://issues.apache.org/jira/browse/HIVE-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8372: --- Fix Version/s: (was: 0.15.0) Potential NPE in Tez MergeFileRecordProcessor - Key: HIVE-8372 URL: https://issues.apache.org/jira/browse/HIVE-8372 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Fix For: 0.14.0 Attachments: HIVE-8372.1.patch MergeFileRecordProcessor retrieves map work from cache. This map work can be instance of merge file work. When the merge file work already exists in the cache casting the map work to merge file work is missing which will result in NullPointerException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
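HIVE-8372 above describes a cached map work that may actually be a merge-file work subtype; using it without the appropriate type check/cast leads to the NPE. The guard can be sketched as an instanceof check before the downcast; the class names below are simplified stand-ins for Hive's MapWork/MergeFileWork.

```java
// Sketch of the type guard described in HIVE-8372; MapWork/MergeFileWork
// here are simplified stand-ins for Hive's plan classes.
public class CachedWorkSketch {
    static class MapWork { }
    static class MergeFileWork extends MapWork {
        String mergedOutputPath() { return "/warehouse/tbl/merged"; } // illustrative path
    }

    static String outputPathFor(MapWork cached) {
        // The cache may hand back either type; check before downcasting
        // instead of assuming and dereferencing null state later.
        if (cached instanceof MergeFileWork) {
            return ((MergeFileWork) cached).mergedOutputPath();
        }
        throw new IllegalStateException("cached work is not a MergeFileWork");
    }

    public static void main(String[] args) {
        System.out.println(outputPathFor(new MergeFileWork()));
    }
}
```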
[jira] [Updated] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8435: --- Assignee: Jesús Camacho Rodríguez (was: Ashutosh Chauhan) Add identity project remover optimization - Key: HIVE-8435 URL: https://issues.apache.org/jira/browse/HIVE-8435 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0 Reporter: Ashutosh Chauhan Assignee: Jesús Camacho Rodríguez Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch In some cases there is an identity project in plan which is useless. Better to optimize it away to avoid evaluating it without any benefit at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
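The "identity project" HIVE-8435 targets is a projection that selects exactly its input columns, in order, with no computed expressions; evaluating it at runtime buys nothing. The detection idea can be sketched as follows; this illustrates the concept only, not Hive's actual operator-tree code.

```java
// Conceptual sketch of identity-project detection (HIVE-8435), not Hive's
// actual optimizer code: a projection is removable when its expression list
// is exactly the input schema, in order, with no computed expressions.
import java.util.List;

public class IdentityProjectSketch {
    // colExprs holds projected expressions as strings; a bare column
    // reference is just the column name.
    static boolean isIdentityProject(List<String> inputCols, List<String> colExprs) {
        return inputCols.equals(colExprs);
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "name", "ts");
        System.out.println(isIdentityProject(schema, List.of("id", "name", "ts")));        // true: removable
        System.out.println(isIdentityProject(schema, List.of("name", "id", "ts")));        // false: reorders
        System.out.println(isIdentityProject(schema, List.of("id", "upper(name)", "ts"))); // false: computes
    }
}
```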
[jira] [Updated] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8435: --- Status: Patch Available (was: In Progress) Add identity project remover optimization - Key: HIVE-8435 URL: https://issues.apache.org/jira/browse/HIVE-8435 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.13.0, 0.12.0, 0.11.0, 0.10.0, 0.9.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch In some cases there is an identity project in plan which is useless. Better to optimize it away to avoid evaluating it without any benefit at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-8435) Add identity project remover optimization
[ https://issues.apache.org/jira/browse/HIVE-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan reassigned HIVE-8435: -- Assignee: Ashutosh Chauhan (was: Jesús Camacho Rodríguez) Add identity project remover optimization - Key: HIVE-8435 URL: https://issues.apache.org/jira/browse/HIVE-8435 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8435.02.patch, HIVE-8435.03.patch, HIVE-8435.03.patch, HIVE-8435.04.patch, HIVE-8435.05.patch, HIVE-8435.05.patch, HIVE-8435.06.patch, HIVE-8435.07.patch, HIVE-8435.08.patch, HIVE-8435.1.patch, HIVE-8435.patch In some cases there is an identity project in plan which is useless. Better to optimize it away to avoid evaluating it without any benefit at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216958#comment-14216958 ] Brock Noland commented on HIVE-8455: Thank you [~ashutoshc], sorry for putting the wrong fixVersion. Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Fix For: spark-branch Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We have add support of spark job status monitoring on HIVE-7439, but not print job progress format info on the console, user may confuse about what the progress info means, so I would like to add job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216970#comment-14216970 ] Brock Noland commented on HIVE-8904: +1 Hive should support multiple Key provider modes --- Key: HIVE-8904 URL: https://issues.apache.org/jira/browse/HIVE-8904 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-8904.patch In the Hadoop cryptographic filesystem, JavaKeyStoreProvider and KMSClientProvider are both supported. Although in a production environment KMS is preferable, we should enable both of them on the Hive side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216971#comment-14216971 ] Brock Noland commented on HIVE-8887: +1 Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch] --- Key: HIVE-8887 URL: https://issues.apache.org/jira/browse/HIVE-8887 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Chao Assignee: Chao Attachments: HIVE-8887.1-spark.patch These tests all failed with the same error, see below: {noformat} 2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177) at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412) at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} This happens at compile time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8904) Hive should support multiple Key provider modes
[ https://issues.apache.org/jira/browse/HIVE-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-8904:
-------------------------------
        Resolution: Fixed
     Fix Version/s: encryption-branch
            Status: Resolved  (was: Patch Available)

Thank you very much! I have committed this to the encryption branch!

Hive should support multiple Key provider modes
-----------------------------------------------
                 Key: HIVE-8904
                 URL: https://issues.apache.org/jira/browse/HIVE-8904
             Project: Hive
          Issue Type: Sub-task
            Reporter: Ferdinand Xu
            Assignee: Ferdinand Xu
             Fix For: encryption-branch
         Attachments: HIVE-8904.patch

The Hadoop cryptographic filesystem supports both JavaKeyStoreProvider and KMSClientProvider. Although KMS is preferable in production environments, we should enable both of them on the Hive side.
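Supporting both provider modes comes down to dispatching on the configured key-provider URI scheme (`jceks://` selects JavaKeyStoreProvider, `kms://` selects KMSClientProvider in Hadoop). A plain-Java sketch of that dispatch, with a hypothetical enum standing in for the real Hadoop provider classes:

```java
import java.net.URI;

public class KeyProviderSelector {
    // Hypothetical provider kinds standing in for Hadoop's
    // JavaKeyStoreProvider and KMSClientProvider.
    enum ProviderKind { JAVA_KEYSTORE, KMS, UNKNOWN }

    // Pick a provider mode from a key-provider URI, the way Hadoop
    // interprets hadoop.security.key.provider.path.
    static ProviderKind select(String providerPath) {
        String scheme = URI.create(providerPath).getScheme();
        if (scheme == null) {
            return ProviderKind.UNKNOWN;
        }
        switch (scheme) {
            case "jceks": return ProviderKind.JAVA_KEYSTORE;
            case "kms":   return ProviderKind.KMS;
            default:      return ProviderKind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(select("jceks://file/tmp/test.jceks"));    // prints JAVA_KEYSTORE
        System.out.println(select("kms://http@localhost:16000/kms")); // prints KMS
    }
}
```

In real code this selection is done for you: Hadoop's `KeyProviderFactory.getProviders(conf)` instantiates whichever providers the configuration names, so Hive can stay agnostic about the concrete mode.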
[jira] [Updated] (HIVE-8887) Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-8887:
-------------------------------
        Resolution: Fixed
     Fix Version/s: spark-branch
            Status: Resolved  (was: Patch Available)

Thank you Chao! I have committed this to spark!

Investigate test failures on auto_join6, auto_join7, auto_join18, auto_join18_multi_distinct [Spark Branch]
-----------------------------------------------------------------------------------------------------------
                 Key: HIVE-8887
                 URL: https://issues.apache.org/jira/browse/HIVE-8887
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
    Affects Versions: spark-branch
            Reporter: Chao
            Assignee: Chao
             Fix For: spark-branch
         Attachments: HIVE-8887.1-spark.patch

These tests all failed with the same error, see below:

{noformat}
2014-11-14 19:09:11,330 ERROR [main]: ql.Driver (SessionState.java:printError(837)) - FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.plan.PlanUtils.getFieldSchemasFromColumnList(PlanUtils.java:535)
    at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.getMapJoinDesc(MapJoinProcessor.java:1177)
    at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertJoinOpMapJoinOp(MapJoinProcessor.java:392)
    at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.convertJoinMapJoin(SparkMapJoinOptimizer.java:412)
    at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:165)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:61)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:131)
    at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10169)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1169)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}

This happens at compile time.
[jira] [Commented] (HIVE-8883) Investigate test failures on auto_join30.q [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216996#comment-14216996 ]

Chao commented on HIVE-8883:
----------------------------

Talked with [~szehon] and we need to put the {{currentInputPath}} back. Will submit a patch later.

Investigate test failures on auto_join30.q [Spark Branch]
---------------------------------------------------------
                 Key: HIVE-8883
                 URL: https://issues.apache.org/jira/browse/HIVE-8883
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
    Affects Versions: spark-branch
            Reporter: Chao
            Assignee: Chao
             Fix For: spark-branch
         Attachments: HIVE-8883.1-spark.patch, HIVE-8883.2-spark.patch

This test fails with the following stack trace:

{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
    at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
    at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) -
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:val_0},value:{_col0:0}}
    at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
    at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
    at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
    ... 14 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
    ... 17 more
{noformat}

{{auto_join27.q}} and {{auto_join31.q}}
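The NPE at {{MapJoinOperator.processOp}} is the signature of probing a small-table hash container that was never loaded, which is consistent with the comment about restoring {{currentInputPath}} (the key used to decide when to (re)load hash tables). A self-contained sketch of that failure mode, with hypothetical names rather than Hive's actual fields:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapJoinProbeDemo {
    // Hypothetical small-table container; null models the hash table
    // never having been loaded for the current input.
    static Map<String, List<String>> smallTable;

    // Naive probe: NPEs when the container was not loaded,
    // like the MapJoinOperator.processOp frame in the trace above.
    static List<String> probe(String key) {
        return smallTable.get(key);
    }

    // Guarded probe: fail fast with a descriptive error instead.
    static List<String> probeSafe(String key) {
        if (smallTable == null) {
            throw new IllegalStateException("map-join hash table not loaded for current input");
        }
        return smallTable.get(key);
    }

    public static void main(String[] args) {
        try {
            probe("val_0");
        } catch (NullPointerException e) {
            System.out.println("NPE, as in MapJoinOperator.processOp");
        }
        smallTable = new HashMap<>();
        smallTable.put("val_0", List.of("0"));
        System.out.println(probeSafe("val_0")); // prints [0]
    }
}
```

The real fix is presumably to reload (or retain) the container when the input path changes, not merely to guard the probe; the sketch only illustrates why the symptom is an NPE at the probe site.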
[jira] [Updated] (HIVE-8894) Move calcite.version to root pom
[ https://issues.apache.org/jira/browse/HIVE-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-8894:
-------------------------------
        Resolution: Fixed
     Fix Version/s: 0.15.0
            Status: Resolved  (was: Patch Available)

Thank you guys for the review! I have committed this to trunk!

Move calcite.version to root pom
--------------------------------
                 Key: HIVE-8894
                 URL: https://issues.apache.org/jira/browse/HIVE-8894
             Project: Hive
          Issue Type: Task
            Reporter: Brock Noland
            Assignee: Brock Noland
             Fix For: 0.15.0
         Attachments: HIVE-8894.patch
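Hoisting a dependency version into the root pom is a standard Maven refactoring: declare the version once as a property, then reference it from module poms. A sketch of the shape such a change typically takes (the `X.Y.Z` placeholder and the exact modules touched are assumptions, not the contents of the actual patch):

{noformat}
<!-- root pom.xml: declare the version once as a property -->
<properties>
  <calcite.version>X.Y.Z</calcite.version>
</properties>

<!-- module pom.xml (e.g. ql): reference the shared property -->
<dependency>
  <groupId>org.apache.calcite</groupId>
  <artifactId>calcite-core</artifactId>
  <version>${calcite.version}</version>
</dependency>
{noformat}

This keeps all modules on one Calcite version and makes future bumps a single-line change in the root pom.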
[jira] [Created] (HIVE-8908) Investigate test failure on join34.q
Chao created HIVE-8908:
-----------------------
             Summary: Investigate test failure on join34.q
                 Key: HIVE-8908
                 URL: https://issues.apache.org/jira/browse/HIVE-8908
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
    Affects Versions: spark-branch
            Reporter: Chao
            Assignee: Chao

For this query, the plan doesn't look correct:

{noformat}
OK
STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-1 depends on stages: Stage-5, Stage-4
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-0
  Stage-5 is a root stage

STAGE PLANS:
  Stage: Stage-4
    Spark
      DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:6
      Vertices:
        Map 4
            Map Operator Tree:
                TableScan
                  alias: x
                  Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      condition expressions:
                        0 {_col1}
                        1 {value}
                      keys:
                        0 _col0 (type: string)
                        1 key (type: string)
                    Reduce Output Operator
                      key expressions: key (type: string)
                      sort order: +
                      Map-reduce partition columns: key (type: string)
                      Statistics: Num rows: 1 Data size: 216 Basic stats: COMPLETE Column stats: NONE
                      value expressions: value (type: string)
            Local Work:
              Map Reduce Local Work

  Stage: Stage-1
    Spark
      Edges:
        Union 2 <- Map 1 (NONE, 0), Map 3 (NONE, 0)
      DagName: chao_20141118150101_a47a2d7b-e750-4764-be66-5ba95ebbe433:4
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: x
                  Filter Operator
                    predicate: (key < 20) (type: boolean)
                    Select Operator
                      expressions: key (type: string), value (type: string)
                      outputColumnNames: _col0, _col1
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        condition expressions:
                          0 {_col1}
                          1 {key} {value}
                        keys:
                          0 _col0 (type: string)
                          1 key (type: string)
                        outputColumnNames: _col1, _col2, _col3
                        input vertices:
                          1 Map 4
                        Select Operator
                          expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                          outputColumnNames: _col0, _col1, _col2
                          File Output Operator
                            compressed: false
                            table:
                                input format: org.apache.hadoop.mapred.TextInputFormat
                                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                                name: default.dest_j1
            Local Work:
              Map Reduce Local Work
        Map 3
            Map Operator Tree:
                TableScan
                  alias: x1
                  Filter Operator
                    predicate: (key > 100) (type: boolean)
                    Select Operator
                      expressions: key (type: string), value (type: string)
                      outputColumnNames: _col0, _col1
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        condition expressions:
                          0 {_col1}
                          1 {key} {value}
                        keys:
                          0 _col0 (type: string)
                          1 key (type: string)
                        outputColumnNames: _col1, _col2, _col3
                        input vertices:
                          1 Map 4
                        Select Operator
                          expressions: _col2 (type: string), _col3 (type: string), _col1 (type: string)
                          outputColumnNames: _col0, _col1, _col2
                          File Output Operator
                            compressed: false
                            table:
                                input format: