Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations :)

On Thu, Jan 29, 2015 at 10:23 AM, Chaoyu Tang ct...@cloudera.com wrote:
Congratulations to everyone.

On Thu, Jan 29, 2015 at 10:05 AM, Aihua Xu a...@cloudera.com wrote:
+1. Congrats, everyone!

On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com wrote:
Congratulations everyone!

On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote:
I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members!

Thanks.
- Carl

--
Philippe Kernévez
Directeur technique (Suisse), pkerne...@octo.com
+41 79 888 33 32
Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.com
[jira] [Commented] (HIVE-8136) Reduce table locking
[ https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297131#comment-14297131 ] Hive QA commented on HIVE-8136:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12695251/HIVE-8136.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.metastore.TestMetaStoreAuthorization.testMetaStoreAuthorization
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2571/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2571/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2571/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12695251 - PreCommit-HIVE-TRUNK-Build

Reduce table locking
Key: HIVE-8136
URL: https://issues.apache.org/jira/browse/HIVE-8136
Project: Hive
Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Ferdinand Xu
Attachments: HIVE-8136.patch

When using ZK for concurrency control, some statements that must be atomic, such as setting a table's location, require an exclusive table lock.
This JIRA is to analyze the scope of statements like ALTER TABLE and see if we can reduce the locking required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
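To make the locking analysis concrete, here is a minimal sketch of the shared/exclusive compatibility rule such an analysis starts from: reads can share a lock, while atomic metadata changes such as ALTER TABLE ... SET LOCATION need exclusive access. All names here (TableLockCheck, canAcquire) are hypothetical illustrations, not Hive's actual ZooKeeperHiveLockManager implementation.

```java
import java.util.List;

// Hypothetical sketch of shared/exclusive lock compatibility; not Hive's
// actual ZooKeeperHiveLockManager logic.
public class TableLockCheck {
    public enum LockMode { SHARED, EXCLUSIVE }

    // An EXCLUSIVE request needs the table free; a SHARED request is
    // compatible as long as every held lock is also SHARED.
    public static boolean canAcquire(List<LockMode> held, LockMode requested) {
        if (requested == LockMode.EXCLUSIVE) {
            return held.isEmpty();
        }
        return held.stream().allMatch(m -> m == LockMode.SHARED);
    }

    public static void main(String[] args) {
        List<LockMode> readers = List.of(LockMode.SHARED, LockMode.SHARED);
        System.out.println(canAcquire(readers, LockMode.SHARED));    // true
        System.out.println(canAcquire(readers, LockMode.EXCLUSIVE)); // false
    }
}
```

Reducing locking then amounts to reclassifying statements that currently request EXCLUSIVE but do not actually need atomic visibility of the whole table.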
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296967#comment-14296967 ] Xuefu Zhang commented on HIVE-9487: --- +1 Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296975#comment-14296975 ] Xuefu Zhang commented on HIVE-9487: --- The patch here seems to contain more changes (such as the itests/hive-jmh folder) than are shown on RB. [~vanzin], could you check? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297139#comment-14297139 ] Thejas M Nair commented on HIVE-9500:

[~aihuaxu] Thanks for clarifying that the input is actually in Avro format. In what part of query processing is the error happening? Is it during some internal serialization that LazySimpleSerde is getting used? If that is the case, maybe we should fix Hive to use a better serde there.

Support nested structs over 24 levels.
--
Key: HIVE-9500
URL: https://issues.apache.org/jira/browse/HIVE-9500
Project: Hive
Issue Type: Improvement
Reporter: Aihua Xu
Labels: SerDe

The customer has a deeply nested Avro structure and receives the following error when performing queries:

15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24

Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while customers require support for more than that. It would be better to make the supported levels configurable or to remove the limit completely (i.e., support any number of levels).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
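For illustration, a minimal sketch of the kind of depth check behind the error quoted above. The names here (NestingDepthCheck, maxDepth, check) are hypothetical; LazySimpleSerde's real limit comes from its fixed table of separator bytes, not from scanning the type string.

```java
// Hypothetical sketch: compute the nesting depth of a Hive-style type string
// and enforce a configurable limit, mirroring the SerDeException above.
public class NestingDepthCheck {
    /** Maximum angle-bracket nesting depth of a type string such as
     *  "struct<a:struct<b:int>>" (which has depth 2). */
    public static int maxDepth(String typeString) {
        int depth = 0, max = 0;
        for (char c : typeString.toCharArray()) {
            if (c == '<') { depth++; max = Math.max(max, depth); }
            else if (c == '>') { depth--; }
        }
        return max;
    }

    public static void check(String typeString, int supportedLevels) {
        int d = maxDepth(typeString);
        if (d > supportedLevels) {
            throw new IllegalArgumentException(
                "Number of levels of nesting supported is " + supportedLevels
                + "; unable to work with level " + d);
        }
    }

    public static void main(String[] args) {
        System.out.println(maxDepth("struct<a:struct<b:int>>")); // 2
        check("struct<a:struct<b:int>>", 23);                    // passes
    }
}
```

Making `supportedLevels` a configuration value, rather than a compile-time constant, is the "configurable" option the issue proposes.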
Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds
On Ene. 28, 2015, 5:23 a.m., cheng xu wrote:

ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java, lines 218-225
https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line218

How about the following code snippet?

recordConsumer.startField(fieldName, i);
if (i % 2 == 0) {
    writeValue(keyElement, keyInspector, fieldType);
} else {
    writeValue(valueElement, valueInspector, fieldType);
}
recordConsumer.endField(fieldName, i);

Sergio Pena wrote:

The Parquet API does not accept NULL values inside startField/endField. This is why I had to check whether the key or value is null before starting the field. With the change I made, we check for null values everywhere, and then call startField/endField in writePrimitive. You can see the TestDataWritableWriter.testMapType() method for how null values should work. This is how Parquet adds the map value 'key3 = null':

startGroup();
startField(key, 0);
addString(key3);
endField(key, 0);
endGroup();

cheng xu wrote:

I see. Parquet does not handle null values well in the startField/endField methods. Sorry for missing this point. How about this?

{noformat}
Object elementValue = (i % 2 == 0) ? keyElement : valueElement;
if (elementValue == null) {
    // the field cannot be NULL
    continue;
}
ObjectInspector elementInspector = (i % 2 == 0) ? keyInspector : valueInspector;
recordConsumer.startField(fieldName, i);
writeValue(elementValue, elementInspector, fieldType);
recordConsumer.endField(fieldName, i);
{noformat}

Thanks Ferd. I liked your change.

On Ene. 28, 2015, 5:23 a.m., Sergio Pena wrote:

Hi Sergio, thank you for your changes. I have a few new comments left.

Sergio Pena wrote:

Thanks Ferd for your comments. I'll wait for your feedback before updating the other changes to see how we can make this code better.

cheng xu wrote:

Thank you for your reply. I prefer the previous one because it matches the method name better. For the WriteMap method, I have one little suggestion for the code. Please see my inline comments.
Thanks Ferd for your comments. I uploaded another patch. - Sergio --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/#review69935 --- On Ene. 27, 2015, 6:47 p.m., Sergio Pena wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/ --- (Updated Ene. 27, 2015, 6:47 p.m.) Review request for hive, Ryan Blue, cheng xu, and Dong Chen. Bugs: HIVE-9333 https://issues.apache.org/jira/browse/HIVE-9333 Repository: hive-git Description --- This patch moves the ParquetHiveSerDe.serialize() implementation to DataWritableWriter class in order to save time in materializing data on serialize(). Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java ea4109d358f7c48d1e2042e5da299475de4a0a29 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 060b1b722d32f3b2f88304a1a73eb249e150294b ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c3b0ab43f734f8a211e3e03d5060c75434 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java a693aff18516d133abf0aae4847d3fe00b9f1c96 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java 667d3671547190d363107019cd9a2d105d26d336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 007a665529857bcec612f638a157aa5043562a15 serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java PRE-CREATION Diff: https://reviews.apache.org/r/30281/diff/ Testing --- The tests run were the following: 1. JMH (Java microbenchmark) This benchmark called parquet serialize/write methods using text writable objects. 
Class.method                 Before Change (ops/s)   After Change (ops/s)
---
ParquetHiveSerDe.serialize:  19,113                  249,528  (~13x speed increase)
DataWritableWriter.write:    5,033                   5,201    (3.34% speed increase)

2. Write 20 million rows (~1GB file) from Text to Parquet

I wrote a ~1Gb
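The null-handling rule discussed in the review above (Parquet's record consumer must never see startField/endField around a null element) can be sketched as follows. MockConsumer and its string events are hypothetical stand-ins for Parquet's real RecordConsumer API, used only so the pattern is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the map-entry writing pattern from the review:
// an even index selects the key, an odd index selects the value, and null
// elements are skipped so startField/endField never wrap a null.
public class MapEntryWriteSketch {
    public static class MockConsumer {
        public final List<String> events = new ArrayList<>();
        public void startField(String name, int i) { events.add("start:" + name + ":" + i); }
        public void addValue(Object v)             { events.add("value:" + v); }
        public void endField(String name, int i)   { events.add("end:" + name + ":" + i); }
    }

    public static void writeEntry(MockConsumer consumer, Object key, Object value) {
        String[] fieldNames = {"key", "value"};
        for (int i = 0; i < 2; i++) {
            Object element = (i % 2 == 0) ? key : value;
            if (element == null) {
                continue; // a field cannot be NULL inside startField/endField
            }
            consumer.startField(fieldNames[i], i);
            consumer.addValue(element);
            consumer.endField(fieldNames[i], i);
        }
    }

    public static void main(String[] args) {
        MockConsumer c = new MockConsumer();
        writeEntry(c, "key3", null); // only the key field is emitted
        System.out.println(c.events); // [start:key:0, value:key3, end:key:0]
    }
}
```

The key design point from the thread: the null check happens before startField, so a 'key3 = null' entry emits the key field and nothing else.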
Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/ ---

(Updated Ene. 29, 2015, 5:12 p.m.)

Review request for hive, Ryan Blue, cheng xu, and Dong Chen.

Changes
---
Patch with Ferd's recommended changes. I am also checking the inspector category in writeValue() in order to pass the correct object inspector to the rest of the methods. I think this makes the other methods cleaner.

Bugs: HIVE-9333
https://issues.apache.org/jira/browse/HIVE-9333

Repository: hive-git

Description
---
This patch moves the ParquetHiveSerDe.serialize() implementation to the DataWritableWriter class in order to save time in materializing data on serialize().

Diffs (updated)
---
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java ea4109d358f7c48d1e2042e5da299475de4a0a29
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 060b1b722d32f3b2f88304a1a73eb249e150294b
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c3b0ab43f734f8a211e3e03d5060c75434
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java a693aff18516d133abf0aae4847d3fe00b9f1c96
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java 667d3671547190d363107019cd9a2d105d26d336
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 007a665529857bcec612f638a157aa5043562a15
serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java PRE-CREATION

Diff: https://reviews.apache.org/r/30281/diff/

Testing
---
The tests run were the following:

1. JMH (Java microbenchmark)

This benchmark called parquet serialize/write methods using text writable objects.
Class.method                 Before Change (ops/s)   After Change (ops/s)
---
ParquetHiveSerDe.serialize:  19,113                  249,528  (~13x speed increase)
DataWritableWriter.write:    5,033                   5,201    (3.34% speed increase)

2. Write 20 million rows (~1GB file) from Text to Parquet

I wrote a ~1GB file in Textfile format, then converted it to Parquet format using the following statement:

CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text;

Time (s) to write the whole file BEFORE the changes: 93.758 s
Time (s) to write the whole file AFTER the changes: 83.903 s

That is about a 10% speed increase.

Thanks,
Sergio Pena
[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297147#comment-14297147 ] Brock Noland commented on HIVE-9211:

Hi [~chengxiang li],

When we [moved over to a non-SNAPSHOT version of Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab] I used a tarball which does not include the hadoop jars in the spark assembly. This can be seen by extracting the spark assembly in [our tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz]. You'll see I put {{\$\{test.hive.hadoop.classpath\}}} in the classpath to replace the missing hadoop jars from the spark assembly. As such, I have the following questions:
# which class are you not finding that is required?
# when you say you need the latest branch-1.2, do you mean a released version of spark? We can have a snapshot on the spark branch but not on trunk.

Research on build mini HoS cluster on YARN for unit test[Spark Branch]
--
Key: HIVE-9211
URL: https://issues.apache.org/jira/browse/HIVE-9211
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
Labels: Spark-M5
Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch

HoS on YARN is a common use case in production environments; we should enable unit tests for this case.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297147#comment-14297147 ] Brock Noland edited comment on HIVE-9211 at 1/29/15 5:14 PM: - Hi [~chengxiang li], When we [moved over to a none SNAPSHOT version of Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab] I used a tarball which does not include the hadoop jars in the spark assembly. This can been seen by extracting the spark assembly in [our tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.2.0-bin-hadoop2-without-hive.tgz]. As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath to replace the missing hadoop jars from the spark assembly. As such, I have the following questions: # which class are you not finding that is required? # when you say you need the latest branch-1.2, do you mean a released version of spark? We can have a snapshot on the spark branch but not on trunk. was (Author: brocknoland): Hi [~chengxiang li], When we [moved over to a none SNAPSHOT version of Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab] I used a tarball which does not include the hadoop jars in the spark assembly. This can been seen by extracting the spark assembly in [our tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz]. As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath to replace the missing hadoop jars from the spark assembly. As such, I have the following questions: # which class are you not finding that is required? # when you say you need the latest branch-1.2, do you mean a released version of spark? We can have a snapshot on the spark branch but not on trunk. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9489:

Attachment: HIVE-9489.3.patch

Incorporated additional changes suggested by Lefty. Lefty, even for fixes and feedback - it's better late than never! Thanks again for looking into it!

add javadoc for UDFType annotation
--
Key: HIVE-9489
URL: https://issues.apache.org/jira/browse/HIVE-9489
Project: Hive
Issue Type: Bug
Components: Documentation, UDF
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 1.2.0
Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch

It is not clearly described when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9489: Attachment: HIVE-9489.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297147#comment-14297147 ] Brock Noland edited comment on HIVE-9211 at 1/29/15 5:13 PM: - Hi [~chengxiang li], When we [moved over to a none SNAPSHOT version of Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab] I used a tarball which does not include the hadoop jars in the spark assembly. This can been seen by extracting the spark assembly in [our tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz]. As such you'll see I put {{$\{test.hive.hadoop.classpath\}}} in the classpath to replace the missing hadoop jars from the spark assembly. As such, I have the following questions: # which class are you not finding that is required? # when you say you need the latest branch-1.2, do you mean a released version of spark? We can have a snapshot on the spark branch but not on trunk. was (Author: brocknoland): Hi [~chengxiang li], When we [moved over to a none SNAPSHOT version of Spark|https://github.com/apache/hive/commit/dab416b2c492d22ab76fa2782f434d165c1144ab] I used a tarball which does not include the hadoop jars in the spark assembly. This can been seen by extracting the spark assembly in [our tarball|http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-${spark.version}-bin-hadoop2-without-hive.tgz]. As such you'll see I put {{\$\{test.hive.hadoop.classpath\}}} in the classpath to replace the missing hadoop jars from the spark assembly. As such, I have the following questions: # which class are you not finding that is required? # when you say you need the latest branch-1.2, do you mean a released version of spark? We can have a snapshot on the spark branch but not on trunk. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9489: Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the reviews [~leftylev] [~ashutoshc] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9451) Add max size of column dictionaries to ORC metadata
[ https://issues.apache.org/jira/browse/HIVE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297178#comment-14297178 ] Owen O'Malley commented on HIVE-9451: - We should also record the stripe size that was used as the file was written. That gives a strict upper bound on the size of memory in the writer. Add max size of column dictionaries to ORC metadata --- Key: HIVE-9451 URL: https://issues.apache.org/jira/browse/HIVE-9451 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley To predict the amount of memory required to read an ORC file we need to know the size of the dictionaries for the columns that we are reading. I propose adding the number of bytes for each column's dictionary to the stripe's column statistics. The file's column statistics would have the maximum dictionary size for each column. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
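The proposal above, recording each stripe's dictionary size and keeping the file-level maximum per column, amounts to a simple aggregation. The sketch below uses hypothetical names (DictionarySizeStats, recordStripe), not ORC's actual ColumnStatistics classes.

```java
// Hypothetical sketch of aggregating per-stripe dictionary sizes into the
// file-level per-column maximum proposed above; names are illustrative,
// not ORC's actual writer/statistics API.
public class DictionarySizeStats {
    private final long[] fileMaxDictBytes;

    public DictionarySizeStats(int numColumns) {
        fileMaxDictBytes = new long[numColumns];
    }

    /** Called once per stripe with that stripe's dictionary bytes per column. */
    public void recordStripe(long[] stripeDictBytes) {
        for (int col = 0; col < fileMaxDictBytes.length; col++) {
            fileMaxDictBytes[col] = Math.max(fileMaxDictBytes[col], stripeDictBytes[col]);
        }
    }

    /** File-level maximum dictionary size for one column: the amount of
     *  dictionary memory a reader must budget for that column. */
    public long maxDictBytes(int col) {
        return fileMaxDictBytes[col];
    }

    public static void main(String[] args) {
        DictionarySizeStats stats = new DictionarySizeStats(2);
        stats.recordStripe(new long[]{1024, 0});
        stats.recordStripe(new long[]{512, 2048});
        System.out.println(stats.maxDictBytes(0)); // 1024
        System.out.println(stats.maxDictBytes(1)); // 2048
    }
}
```

Summing maxDictBytes over the columns actually being read gives the memory-prediction figure the issue is after.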
[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9489: Attachment: (was: HIVE-9489.3.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called
[ https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296582#comment-14296582 ] Hive QA commented on HIVE-9473:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12694865/HIVE-9473.1.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7407 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2564/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2564/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2564/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12694865 - PreCommit-HIVE-TRUNK-Build

sql std auth should disallow built-in udfs that allow any java methods to be called
---
Key: HIVE-9473
URL: https://issues.apache.org/jira/browse/HIVE-9473
Project: Hive
Issue Type: Bug
Components: Authorization, SQLStandardAuthorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Attachments: HIVE-9473.1.patch

As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java methods. This should be disallowed when sql standard authorization is used.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Attachment: HIVE-9471.3.patch Here's the same, with the LENGTH stream suppressed. Bad seek in uncompressed ORC, at row-group boundary. Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group. {code:title=stacktrace} java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305) ... 
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112) at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852) {code} I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Status: Open (was: Patch Available) Modifying the comment for the second null-check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-9471:
---------------------------------------
    Attachment:  (was: HIVE-9471.3.patch)

Bad seek in uncompressed ORC, at row-group boundary.
----------------------------------------------------
                 Key: HIVE-9471
                 URL: https://issues.apache.org/jira/browse/HIVE-9471
             Project: Hive
          Issue Type: Bug
          Components: File Formats, Serializers/Deserializers
    Affects Versions: 0.14.0
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan
         Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive

Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.

{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
    ...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
    at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
    at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}

I'll attach the script to reproduce the problem herewith.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
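The "Seek in Stream ... is outside of the data" message comes from a bounds check: an uncompressed ORC stream is a window into the file, and a seek target taken from the row-group index must fall inside that window. The following is a simplified Python model of that check (an illustration only, not Hive's actual InStream code; the class and field names here are made up):

```python
class BadSeekError(ValueError):
    """Models the IllegalArgumentException thrown by UncompressedStream.seek."""


class UncompressedStream:
    """Toy model of an uncompressed ORC stream: a byte window [base, base+length)."""

    def __init__(self, column, kind, base, length):
        self.column = column
        self.kind = kind
        self.base = base      # file offset where this stream's data begins
        self.length = length  # number of bytes the stream holds

    def seek(self, offset):
        # A row-group index entry positions the reader at an offset; if the
        # index and the stream disagree (the HIVE-9471 symptom), the requested
        # offset lands outside the stream's window and the seek is rejected.
        if not (self.base <= offset < self.base + self.length):
            raise BadSeekError(
                "Seek in Stream for column %s kind %s to %s is outside of the data"
                % (self.column, self.kind, offset))
        return offset - self.base  # position relative to the stream's start
```

In this model, a stream whose data starts at offset 100 accepts a seek to 120 but rejects a seek to 0, mirroring the "to 0 is outside of the data" failure at a row-group boundary.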
[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296631#comment-14296631 ]

Jesus Camacho Rodriguez commented on HIVE-9431:
-----------------------------------------------

[~jpullokkaran], the test failures are not related to the patch (HIVE-9498). I think it can go in. Thanks

CBO (Calcite Return Path): Removing AST from ParseContext
---------------------------------------------------------
                 Key: HIVE-9431
                 URL: https://issues.apache.org/jira/browse/HIVE-9431
             Project: Hive
          Issue Type: Sub-task
          Components: CBO
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez
             Fix For: 0.15.0
         Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.03.patch, HIVE-9431.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-9471:
---------------------------------------
    Status: Open  (was: Patch Available)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated HIVE-9471:
---------------------------------------
    Status: Patch Available  (was: Open)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Moustafa Aboul Atta updated HIVE-9507:
--------------------------------------
    Attachment: parial_log.log

Make LATERAL VIEW inline(expression) mytable tolerant to nulls
--------------------------------------------------------------
                 Key: HIVE-9507
                 URL: https://issues.apache.org/jira/browse/HIVE-9507
             Project: Hive
          Issue Type: Bug
          Components: Query Processor, UDF
    Affects Versions: 0.14.0
         Environment: hdp 2.2 Windows server 2012 R2 64-bit
            Reporter: Moustafa Aboul Atta
            Assignee: Navis
            Priority: Minor
         Attachments: HIVE-9507.1.patch.txt, parial_log.log

I have tweets stored with avro on hdfs with the default twitter status (tweet) schema. There's an object called entities that contains arrays of structs. When I run

{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception found hereunder, however if I add

{{WHERE entities.media IS NOT NULL}}

it runs perfectly. Here's the partial log:

2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) - Status: Running (Executing on YARN cluster with App id application_1422267635031_0618)
2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: -/-
2015-01-29 10:15:02,526 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:05,551 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:08,722 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:12,095 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:12,354 INFO log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
2015-01-29 10:15:12,354 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+5)/13
2015-01-29 10:15:12,557 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6)/13
2015-01-29 10:15:15,691 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6)/13
2015-01-29 10:15:18,892 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-1)/13
2015-01-29 10:15:19,094 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-3)/13
2015-01-29 10:15:19,304 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-5)/13
2015-01-29 10:15:19,507 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-6)/13
2015-01-29 10:15:22,641 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-6)/13
2015-01-29 10:15:24,704 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:27,735 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:30,957 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:34,095 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:35,138 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-9)/13
2015-01-29 10:15:36,503 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-10)/13
2015-01-29 10:15:36,710 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-11)/13
2015-01-29 10:15:37,971 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-12)/13
2015-01-29 10:15:39,800 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-13)/13
2015-01-29 10:15:41,175 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-14)/13
2015-01-29 10:15:44,414 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-14)/13
2015-01-29 10:15:45,447 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-15)/13
2015-01-29 10:15:47,413 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-16)/13
2015-01-29 10:15:47,618 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-17)/13
2015-01-29 10:15:49,568 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-18)/13
2015-01-29 10:15:51,099 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+0,-19)/13
2015-01-29 10:15:51,331 ERROR SessionState (SessionState.java:printError(833)) - Status: Failed
2015-01-29 10:15:51,417 ERROR SessionState (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while
[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Moustafa Aboul Atta updated HIVE-9507:
--------------------------------------
    Description:
I have tweets stored with avro on hdfs with the default twitter status (tweet) schema. There's an object called entities that contains arrays of structs. When I run

{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception attached as partial_log.log, however, if I add

{{WHERE entities.media IS NOT NULL}}

it runs perfectly.
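The behavior HIVE-9507 asks for can be illustrated with a small Python model of the inline() table-generating function (a sketch only; Hive's real implementation is the Java GenericUDTFInline, and these function names are made up for illustration). The strict variant mirrors the reported failure on a NULL array; the tolerant variant behaves as if {{WHERE entities.media IS NOT NULL}} were always applied:

```python
def inline_strict(arrays_of_structs):
    # Mirrors the reported behavior: a NULL array aborts processing,
    # analogous to "Hive Runtime Error while processing row".
    for arr in arrays_of_structs:
        if arr is None:
            raise RuntimeError("Hive Runtime Error while processing row")
        yield from arr  # one output row per struct in the array


def inline_tolerant(arrays_of_structs):
    # The requested behavior: a NULL (or empty) array simply contributes
    # no output rows, so no explicit NULL guard is needed in the query.
    for arr in arrays_of_structs:
        if arr:
            yield from arr
```

With input rows where entities.media is sometimes NULL, the tolerant version yields only the structs from non-NULL arrays, while the strict version raises as soon as the NULL row is consumed.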
[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Moustafa Aboul Atta updated HIVE-9507:
--------------------------------------
    Priority: Minor  (was: Major)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296557#comment-14296557 ]

Lefty Leverenz commented on HIVE-9489:
--------------------------------------

+1 ... although two more quibbles could be fixed (@return true if the udf is deterministic - UDF; non deterministic - non-deterministic). Sorry I missed them the first time.

add javadoc for UDFType annotation
----------------------------------
                 Key: HIVE-9489
                 URL: https://issues.apache.org/jira/browse/HIVE-9489
             Project: Hive
          Issue Type: Bug
          Components: Documentation, UDF
            Reporter: Thejas M Nair
            Assignee: Thejas M Nair
             Fix For: 1.2.0
         Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch

It is not clearly described when a UDF should be marked as deterministic, stateful, or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
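The annotation being documented matters because the optimizer treats deterministic and non-deterministic UDFs differently. As a rough analogue (a Python sketch under stated assumptions, not Hive's Java @UDFType annotation; `udf_type` and `fold_constants` are hypothetical names), a planner may precompute a call to a deterministic function on constant arguments, but must re-evaluate a non-deterministic one per row:

```python
def udf_type(deterministic=True, stateful=False):
    # Toy analogue of @UDFType: record the traits on the function object
    # so an optimizer can inspect them. A stateful function is never
    # treated as deterministic.
    def mark(fn):
        fn.deterministic = deterministic and not stateful
        return fn
    return mark


def fold_constants(fn, *args):
    # Constant folding is only safe for deterministic functions;
    # something like rand() must be evaluated once per row.
    if getattr(fn, "deterministic", False):
        return ("constant", fn(*args))
    return ("evaluate-per-row", None)


@udf_type(deterministic=True)
def upper(s):
    return s.upper()


@udf_type(deterministic=False)
def rand_like():
    import random
    return random.random()
```

Here `fold_constants(upper, "hive")` can be replaced by a constant, while `fold_constants(rand_like)` must be deferred, which is exactly the distinction the javadoc needs to spell out.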
[jira] [Commented] (HIVE-8136) Reduce table locking
[ https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296607#comment-14296607 ]

Ferdinand Xu commented on HIVE-8136:
------------------------------------

Currently the following ALTER TABLE write types acquire an exclusive lock (DDL_EXCLUSIVE):
RENAMECOLUMN, ADDCLUSTERSORTCOLUMN, ADDFILEFORMAT, DROPPROPS, REPLACECOLS, ARCHIVE, UNARCHIVE, ALTERPROTECTMODE, ALTERPARTITIONPROTECTMODE, ALTERLOCATION, DROPPARTITION, RENAMEPARTITION, ADDSKEWEDBY, ALTERSKEWEDLOCATION, ALTERBUCKETNUM, ALTERPARTITION, ADDCOLS, RENAME, TRUNCATE, MERGEFILES

The following use a shared lock:
ADDSERDE, ADDPARTITION, ADDSERDEPROPS, ADDPROPS

The following take no lock:
COMPACT, TOUCH

For changing a table's structure, an exclusive lock is a must, and most of these cases use the exclusive lock because they change the table or partition structure. For adding cluster and sort columns, however, we can use a shared lock, for the following reason:

{quote}
The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be careful to insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query.
{quote}

For changing properties, I think we can use no lock when the change doesn't affect the structure of the table. We can do that in a follow-up JIRA. Any thoughts, [~brocknoland]?

Reduce table locking
--------------------
                 Key: HIVE-8136
                 URL: https://issues.apache.org/jira/browse/HIVE-8136
             Project: Hive
          Issue Type: Sub-task
            Reporter: Brock Noland
            Assignee: Ferdinand Xu

When using ZK for concurrency control, some statements require an exclusive table lock because they are atomic, such as setting a table's location. This JIRA is to analyze the scope of statements like ALTER TABLE and see if we can reduce the locking required.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
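The lock analysis in the comment above amounts to a table mapping each write type to a lock mode, plus the usual shared/exclusive compatibility rule. A minimal Python sketch of that idea (an illustration of the proposal, not Hive's actual code; the table below already encodes the suggested relaxation of ADDCLUSTERSORTCOLUMN to a shared lock, which differs from current behavior):

```python
EXCLUSIVE, SHARED, NO_LOCK = "EXCLUSIVE", "SHARED", "NO_LOCK"

# Subset of ALTER TABLE write types and the lock each would take.
LOCK_FOR_OP = {
    # Structure-changing operations keep the exclusive lock.
    "RENAMECOLUMN": EXCLUSIVE, "ADDFILEFORMAT": EXCLUSIVE,
    "REPLACECOLS": EXCLUSIVE, "ALTERLOCATION": EXCLUSIVE,
    "DROPPARTITION": EXCLUSIVE, "RENAME": EXCLUSIVE, "TRUNCATE": EXCLUSIVE,
    # Proposed relaxation: clustering/sorting metadata only affects reads.
    "ADDCLUSTERSORTCOLUMN": SHARED,
    "ADDSERDE": SHARED, "ADDPARTITION": SHARED, "ADDPROPS": SHARED,
    "COMPACT": NO_LOCK, "TOUCH": NO_LOCK,
}


def conflicts(op_a, op_b):
    """Return True if two concurrent operations on one table would block."""
    a, b = LOCK_FOR_OP[op_a], LOCK_FOR_OP[op_b]
    if NO_LOCK in (a, b):
        return False          # a lock-free operation never blocks anything
    return EXCLUSIVE in (a, b)  # shared locks are mutually compatible
```

Under this model, RENAME blocks a concurrent ADDPROPS, but two property/serde updates proceed together, which is the reduction in table locking the JIRA is after.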
[jira] [Created] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
Moustafa Aboul Atta created HIVE-9507:
--------------------------------------
             Summary: Make LATERAL VIEW inline(expression) mytable tolerant to nulls
                 Key: HIVE-9507
                 URL: https://issues.apache.org/jira/browse/HIVE-9507
             Project: Hive
          Issue Type: Bug
          Components: Query Processor, UDF
    Affects Versions: 0.14.0
         Environment: hdp 2.2 Windows server 2012 R2 64-bit
            Reporter: Moustafa Aboul Atta

I have tweets stored with avro on hdfs with the default twitter status (tweet) schema. There's an object called entities that contains arrays of structs. When I run

{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception found hereunder, however if I add

{{WHERE entities.media IS NOT NULL}}

it runs perfectly.
[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-9507:
------------------------
    Assignee: Navis
      Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9507: Attachment: HIVE-9507.1.patch.txt

Make LATERAL VIEW inline(expression) mytable tolerant to nulls
Key: HIVE-9507 URL: https://issues.apache.org/jira/browse/HIVE-9507 Project: Hive Issue Type: Bug Components: Query Processor, UDF Affects Versions: 0.14.0 Environment: hdp 2.2, Windows Server 2012 R2 64-bit Reporter: Moustafa Aboul Atta Attachments: HIVE-9507.1.patch.txt

I have tweets stored with Avro on HDFS with the default Twitter status (tweet) schema. There is an object called entities that contains arrays of structs. When I run

{{SELECT mytable.*}}
{{FROM tweets}}
{{LATERAL VIEW INLINE(entities.media) mytable}}

I get the exception below; however, if I add {{WHERE entities.media IS NOT NULL}} it runs perfectly. Here is the partial log:

2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) - Status: Running (Executing on YARN cluster with App id application_1422267635031_0618)
2015-01-29 10:15:00,879 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: -/-
2015-01-29 10:15:02,526 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:05,551 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:08,722 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:12,095 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0/13
2015-01-29 10:15:12,354 INFO log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TezRunVertex.Map 1 from=org.apache.hadoop.hive.ql.exec.tez.TezJobMonitor
2015-01-29 10:15:12,354 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+5)/13
2015-01-29 10:15:12,557 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6)/13
2015-01-29 10:15:15,691 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6)/13
2015-01-29 10:15:18,892 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-1)/13
2015-01-29 10:15:19,094 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-3)/13
2015-01-29 10:15:19,304 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-5)/13
2015-01-29 10:15:19,507 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-6)/13
2015-01-29 10:15:22,641 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-6)/13
2015-01-29 10:15:24,704 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:27,735 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:30,957 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:34,095 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-8)/13
2015-01-29 10:15:35,138 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-9)/13
2015-01-29 10:15:36,503 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-10)/13
2015-01-29 10:15:36,710 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-11)/13
2015-01-29 10:15:37,971 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-12)/13
2015-01-29 10:15:39,800 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-13)/13
2015-01-29 10:15:41,175 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-14)/13
2015-01-29 10:15:44,414 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-14)/13
2015-01-29 10:15:45,447 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-15)/13
2015-01-29 10:15:47,413 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-16)/13
2015-01-29 10:15:47,618 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-17)/13
2015-01-29 10:15:49,568 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+6,-18)/13
2015-01-29 10:15:51,099 INFO SessionState (SessionState.java:printInfo(824)) - Map 1: 0(+0,-19)/13
2015-01-29 10:15:51,331 ERROR SessionState (SessionState.java:printError(833)) - Status: Failed
2015-01-29 10:15:51,417 ERROR SessionState (SessionState.java:printError(833)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1422267635031_0618_1_00, diagnostics=[Task failed, taskId=task_1422267635031_0618_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException:
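Outside Hive, the behaviour the patch aims for is easy to sketch: inline() explodes an array of structs into rows, and a NULL array should simply contribute no rows rather than crash the task. A minimal Python model of that null-tolerant semantics (the function and data shapes are illustrative, not Hive's actual UDTF code):

```python
def inline(media):
    """Explode an array of structs into rows. A None (NULL) array yields no
    rows instead of raising, the tolerant behaviour HIVE-9507 asks for."""
    if media is None:
        return []
    return [tuple(struct.values()) for struct in media]

# Tweets whose entities.media is NULL contribute no output rows, which would
# make the explicit WHERE entities.media IS NOT NULL workaround unnecessary.
tweets = [
    {"media": [{"id": 1, "url": "a"}, {"id": 2, "url": "b"}]},
    {"media": None},
]
rows = [row for tweet in tweets for row in inline(tweet["media"])]
print(rows)  # [(1, 'a'), (2, 'b')]
```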
Re: Review Request 30254: HIVE-9444
On Jan. 28, 2015, 10:45 p.m., John Pullokkaran wrote:

ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java, line 155
https://reviews.apache.org/r/30254/diff/1/?file=833501#file833501line155

How are we carrying forward the assumptions? ClusterBy, DistributeBy, OrderBy... is empty?

OrderBy, SortBy, and ClusterBy are covered by the condition that if there is an RS in the tree, its order is empty (line 154 in the patched code). DistributeBy is covered by the condition that if there is an RS in the tree, its partitionCols are empty (lines 150-151 in the patched code).

- Jesús

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30254/#review70102 ---

On Jan. 25, 2015, 1:11 p.m., Jesús Camacho Rodríguez wrote:

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30254/ ---

(Updated Jan. 25, 2015, 1:11 p.m.)

Review request for hive and John Pullokkaran.

Bugs: HIVE-9444 https://issues.apache.org/jira/browse/HIVE-9444
Repository: hive-git
Description: HIVE-9444
Diffs: ql/src/java/org/apache/hadoop/hive/ql/optimizer/GlobalLimitOptimizer.java c9848dacd1a02db321583c2b91eb6d7317c295ff
Diff: https://reviews.apache.org/r/30254/diff/
Testing: Existing tests.

Thanks, Jesús Camacho Rodríguez
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Status: Patch Available (was: Open)

Bad seek in uncompressed ORC, at row-group boundary.
Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive

Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.

{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
	...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}

I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
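The {{IllegalArgumentException}} boils down to a bounds check: an uncompressed ORC stream owns a byte range, and a seek must land inside it; here the reader asked for offset 0 when the row-group's data starts later. A toy Python model of that invariant (not the actual InStream code):

```python
class UncompressedStream:
    """Toy model of a stream slice: it owns bytes for the half-open range
    [offset, offset + length), and any seek must land inside that range."""

    def __init__(self, data, offset, length):
        self.data, self.offset, self.length = data, offset, length
        self.pos = offset

    def seek(self, position):
        # Mirrors the failure mode in the stack trace: a position outside
        # the buffered range is rejected instead of corrupting reads.
        if not (self.offset <= position < self.offset + self.length):
            raise ValueError(
                "Seek to %d is outside of the data [%d, %d)"
                % (position, self.offset, self.offset + self.length))
        self.pos = position

stream = UncompressedStream(b"rowgroupbytes", offset=100, length=13)
stream.seek(105)            # inside the range: fine
bad_seek = None
try:
    stream.seek(0)          # the reported case: a seek "to 0" before the range
except ValueError as err:
    bad_seek = err
print(bad_seek)
```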
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Attachment: HIVE-9471.3.patch

Bad seek in uncompressed ORC, at row-group boundary.
Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive

Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.

{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
	...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}

I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-9508: --- Fix Version/s: (was: 0.15.0) 1.2.0

MetaStore client socket connection should have a lifetime
Key: HIVE-9508 URL: https://issues.apache.org/jira/browse/HIVE-9508 Project: Hive Issue Type: Improvement Components: CLI, Metastore Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: metastore, rolling_upgrade Fix For: 1.2.0

Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore server until the connection is closed or there is a problem. I would like to introduce the concept of a MetaStore client socket lifetime: the client will reconnect once the socket lifetime is reached. This will help during rolling upgrades of the Metastore. When there are multiple Metastore servers behind a VIP (load balancer), it is easy to take one server out of rotation, wait 10+ minutes for all existing connections to die down (if the lifetime is, say, 5 minutes), and then update the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
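The proposed behaviour is simple to sketch: on each call, a client whose connection has outlived a configured lifetime drops it and reconnects, so connections drain off a server that has been taken out of rotation. A toy Python sketch; the class and method names are illustrative, not HiveMetaStoreClient's actual API:

```python
import time

class MetaStoreClientSketch:
    """Toy client that transparently reconnects once its socket exceeds a
    configured lifetime (0 disables the feature, matching the proposal's
    backward-compatible default). Names are illustrative only."""

    def __init__(self, lifetime_secs=0):
        self.lifetime = lifetime_secs
        self.reconnects = 0
        self._open()

    def _open(self):
        # A real client would open a new Thrift socket here.
        self.connected_at = time.monotonic()

    def call(self, method):
        # Checked on every RPC: a connection older than the lifetime is
        # retired, so a server pulled from the VIP eventually drains.
        if self.lifetime and time.monotonic() - self.connected_at > self.lifetime:
            self._open()
            self.reconnects += 1
        return "ok:" + method

client = MetaStoreClientSketch(lifetime_secs=0.05)
client.call("get_table")
time.sleep(0.1)              # the connection outlives its configured lifetime
client.call("get_table")     # triggers a transparent reconnect
print(client.reconnects)     # 1
```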
[jira] [Updated] (HIVE-8136) Reduce table locking
[ https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-8136: --- Status: Patch Available (was: In Progress)

Reduce table locking
Key: HIVE-8136 URL: https://issues.apache.org/jira/browse/HIVE-8136 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-8136.patch

When using ZK for concurrency control, some statements require an exclusive table lock even though they are atomic, such as setting a table's location. This JIRA is to analyze the scope of statements like ALTER TABLE and see if we can reduce the locking required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-9508: --- Attachment: HIVE-9508.1.patch

Attaching a basic patch. The connection lifetime is disabled by default, so existing users should not be affected.

MetaStore client socket connection should have a lifetime
Key: HIVE-9508 URL: https://issues.apache.org/jira/browse/HIVE-9508 Project: Hive Issue Type: Improvement Components: CLI, Metastore Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: metastore, rolling_upgrade Fix For: 1.2.0 Attachments: HIVE-9508.1.patch

Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore server until the connection is closed or there is a problem. I would like to introduce the concept of a MetaStore client socket lifetime: the client will reconnect once the socket lifetime is reached. This will help during rolling upgrades of the Metastore. When there are multiple Metastore servers behind a VIP (load balancer), it is easy to take one server out of rotation, wait 10+ minutes for all existing connections to die down (if the lifetime is, say, 5 minutes), and then update the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296653#comment-14296653 ] Hive QA commented on HIVE-9489: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695157/HIVE-9489.2.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7405 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2565/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2565/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2565/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695157 - PreCommit-HIVE-TRUNK-Build

add javadoc for UDFType annotation
Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch

It is not clearly described when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9508) MetaStore client socket connection should have a lifetime
Thiruvel Thirumoolan created HIVE-9508: -- Summary: MetaStore client socket connection should have a lifetime Key: HIVE-9508 URL: https://issues.apache.org/jira/browse/HIVE-9508 Project: Hive Issue Type: Improvement Components: CLI, Metastore Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.15.0

Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore server until the connection is closed or there is a problem. I would like to introduce the concept of a MetaStore client socket lifetime: the client will reconnect once the socket lifetime is reached. This will help during rolling upgrades of the Metastore. When there are multiple Metastore servers behind a VIP (load balancer), it is easy to take one server out of rotation, wait 10+ minutes for all existing connections to die down (if the lifetime is, say, 5 minutes), and then update the server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9416) Get rid of Extract Operator
[ https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296707#comment-14296707 ] Hive QA commented on HIVE-9416: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695166/HIVE-9416.6.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2566/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2566/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2566/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695166 - PreCommit-HIVE-TRUNK-Build Get rid of Extract Operator --- Key: HIVE-9416 URL: https://issues.apache.org/jira/browse/HIVE-9416 Project: Hive Issue Type: Task Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.patch {{Extract Operator}} has been there for legacy reasons. 
But there is no functionality it provides that can't be provided by {{Select Operator}}. Instead of having two operators, one being a subset of the other, we should just get rid of {{Extract}} and simplify our codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8136) Reduce table locking
[ https://issues.apache.org/jira/browse/HIVE-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-8136: --- Attachment: HIVE-8136.patch

Reduce table locking
Key: HIVE-8136 URL: https://issues.apache.org/jira/browse/HIVE-8136 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: HIVE-8136.patch

When using ZK for concurrency control, some statements require an exclusive table lock even though they are atomic, such as setting a table's location. This JIRA is to analyze the scope of statements like ALTER TABLE and see if we can reduce the locking required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297013#comment-14297013 ] Hive QA commented on HIVE-9471: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695233/HIVE-9471.3.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7405 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testMetastoreProxyUser org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2569/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2569/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2569/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695233 - PreCommit-HIVE-TRUNK-Build Bad seek in uncompressed ORC, at row-group boundary. 
Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive

Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.

{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
	...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}

I'll attach the script to reproduce the problem herewith.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296780#comment-14296780 ] Hive QA commented on HIVE-5472: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695173/HIVE-5472.4.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7406 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2567/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2567/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2567/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695173 - PreCommit-HIVE-TRUNK-Build

support a simple scalar which returns the current timestamp
Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, HIVE-5472.4.patch

ISO SQL has two forms of these functions, LOCALTIMESTAMP and CURRENT_TIMESTAMP, where the former is a TIMESTAMP WITHOUT TIME ZONE and the latter a TIMESTAMP WITH TIME ZONE.

select cast ( unix_timestamp() as timestamp ) from T

Implement a function which computes LOCALTIMESTAMP, which would be the current timestamp for the user's session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
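The distinction the ticket draws maps onto most date-time libraries: a "without time zone" value is naive wall-clock time in the session's zone, while a "with time zone" value carries its zone along. A small Python illustration of that difference (not Hive UDF code):

```python
from datetime import datetime, timezone

# LOCALTIMESTAMP-like: naive wall-clock time with no zone attached
# (TIMESTAMP WITHOUT TIME ZONE).
local_ts = datetime.now()

# CURRENT_TIMESTAMP-like: an instant that carries its time zone
# (TIMESTAMP WITH TIME ZONE).
zoned_ts = datetime.now(timezone.utc)

print(local_ts.tzinfo)  # None
print(zoned_ts.tzinfo)  # UTC
```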
[jira] [Assigned] (HIVE-9252) Linking custom SerDe jar to table definition.
[ https://issues.apache.org/jira/browse/HIVE-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-9252: -- Assignee: Ferdinand Xu

Linking custom SerDe jar to table definition.
Key: HIVE-9252 URL: https://issues.apache.org/jira/browse/HIVE-9252 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Niels Basjes Assignee: Ferdinand Xu

In HIVE-6047 the option was created that a jar file can be hooked to the definition of a function. (See: [Language Manual DDL: Permanent Functions|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions]) I propose to add something similar that can be used when defining an external table that relies on a custom SerDe (I expect to usually only have the Deserializer). Something like this:

{code}
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  ...
  STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
{code}

Using this you can define (and share !!!) a Hive table on top of a custom file format without the need to have the IT operations people deploy a custom SerDe jar file on all nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Apache Hive 1.0 Release Candidate 1
+1. Downloaded it, checked out the signatures, did a build, checked there were no snapshot dependencies. Alan. Vikram Dixit K mailto:vikram.di...@gmail.com January 27, 2015 at 14:28 Apache Hive 1.0 Release Candidate 1 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0-rc1/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1020/ Source tag for RC1 is at: http://svn.apache.org/repos/asf/hive/branches/branch-1.0/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297200#comment-14297200 ] Marcelo Vanzin commented on HIVE-9487: -- Hmm, weird. I definitely did not touch those. Maybe some merge issue, I'll take a look. Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file
[ https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297216#comment-14297216 ] Alan Gates commented on HIVE-9317: -- I think we're ok without this in 1.0. It's already been in several releases. If we need to roll a new RC I agree this should go in. move Microsoft copyright to NOTICE file --- Key: HIVE-9317 URL: https://issues.apache.org/jira/browse/HIVE-9317 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.15.0, 1.0.0 Attachments: hive-9327.txt There are a set of files that still have the Microsoft copyright notices. Those notices need to be moved into NOTICES and replaced with the standard Apache headers. {code} ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297297#comment-14297297 ] Ashutosh Chauhan commented on HIVE-8307: I will put up a patch to remove comments from serde properties. null character in columns.comments schema property breaks jobconf.xml - Key: HIVE-8307 URL: https://issues.apache.org/jira/browse/HIVE-8307 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0, 0.13.1 Reporter: Carl Laird It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table: [Fatal Error] :336:51: Character reference # 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # Exception in thread main java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063) at org.apache.hadoop.conf.Configuration.get(Configuration.java:416) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: org.xml.sax.SAXParseException; lineNumber: 
336; columnNumber: 51; Character reference # at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181) ... 11 more Execution failed with exit status: 1 Line 336 of jobconf.xml: <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property> See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
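For context on why a single null character reference breaks the whole job configuration: U+0000 is not a legal XML 1.0 character even when escaped as a character reference, so a conforming parser must reject the entire document. A quick check with Python's stdlib parser (used here purely for illustration):

```python
import xml.etree.ElementTree as ET

# A jobconf-style property whose value contains null character references,
# as produced by the columns.comments bug described above.
snippet = ("<property><name>columns.comments</name>"
           "<value>&#0;&#0;&#0;</value></property>")

try:
    ET.fromstring(snippet)
    parse_ok = True
except ET.ParseError as err:
    parse_ok = False
    print("jobconf-style snippet rejected:", err)

# XML 1.0 cannot represent NUL at all, escaped or raw, so the only fix is
# to keep null characters out of the property value entirely -- e.g. by
# dropping the comments from the serde properties, as proposed above.
```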
[jira] [Commented] (HIVE-9392) JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName
[ https://issues.apache.org/jira/browse/HIVE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296867#comment-14296867 ] Hive QA commented on HIVE-9392: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695176/HIVE-9392.2.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7407 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2568/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2568/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2568/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695176 - PreCommit-HIVE-TRUNK-Build JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName Key: HIVE-9392 URL: https://issues.apache.org/jira/browse/HIVE-9392 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth Jayachandran Priority: Critical Fix For: 0.15.0 Attachments: HIVE-9392.1.patch, HIVE-9392.2.patch In JoinStatsRule.process, the join column statistics are stored in the HashMap joinedColStats. The key used, ColStatistics.fqColName, is duplicated between join columns in the same vertex; as a result, distinctVals ends up holding duplicated values, which negatively affects the join cardinality estimation. The duplicate keys are usually named KEY.reducesinkkey0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
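The effect of the duplicated fqColName keys can be illustrated with a toy version of the standard join-cardinality estimate. This is a sketch, not Hive's actual JoinStatsRule code; only the names joinedColStats and KEY.reducesinkkey0 come from the report above:

```python
def join_cardinality(rows_left, rows_right, ndv_left, ndv_right):
    # Classic estimate: |R JOIN S| ~= |R| * |S| / max(NDV(R.key), NDV(S.key))
    return rows_left * rows_right // max(ndv_left, ndv_right)

# Both sides of the join key report the same fully qualified column name.
col_stats = [("KEY.reducesinkkey0", 1000), ("KEY.reducesinkkey0", 10)]

joined_col_stats = {}
for fq_name, ndv in col_stats:
    # Duplicate key: the second NDV silently overwrites the first.
    joined_col_stats[fq_name] = ndv

# With distinct keys, the estimate would divide by max(1000, 10) = 1000.
correct = join_cardinality(1_000_000, 1_000_000, 1000, 10)

# After the overwrite, both lookups return 10, so the divisor collapses
# and the join cardinality is overestimated by 100x.
ndv = joined_col_stats["KEY.reducesinkkey0"]
buggy = join_cardinality(1_000_000, 1_000_000, ndv, ndv)

print(correct, buggy)  # 1000000000 100000000000
```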
[jira] [Updated] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-9500: --- Description: A customer has a deeply nested Avro structure and is receiving the following error when performing queries. 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while customers require support for more than that. It would be better to make the supported levels configurable or to remove the limit entirely (i.e., support any number of levels). was: Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while the customers have the requirement to support more than that. It would be better to make the supported levels configurable or completely removed (i.e., we can support any number of levels). Support nested structs over 24 levels. -- Key: HIVE-9500 URL: https://issues.apache.org/jira/browse/HIVE-9500 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Labels: SerDe A customer has a deeply nested Avro structure and is receiving the following error when performing queries. 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while customers require support for more than that. It would be better to make the supported levels configurable or to remove the limit entirely (i.e., support any number of levels). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
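A minimal sketch of what "make the supported levels configurable" could look like: a depth check over a nested struct type with a tunable limit. All names here are hypothetical; the real limit in LazySimpleSerDe stems from the pool of distinct separator characters available per nesting level, not from a simple counter like this.

```python
def check_nesting_depth(type_info, max_levels=24):
    """Hypothetical check: reject struct types nested deeper than max_levels.

    type_info is modeled as nested dicts (structs) with strings at the
    leaves (primitive types)."""
    def depth(t):
        if not isinstance(t, dict):
            return 0
        return 1 + max(depth(child) for child in t.values())

    d = depth(type_info)
    if d > max_levels:
        raise ValueError(f"nesting level {d} exceeds supported maximum {max_levels}")
    return d

def nested_struct(levels):
    # Build a struct type nested `levels` deep.
    t = "string"
    for _ in range(levels):
        t = {"f": t}
    return t

print(check_nesting_depth(nested_struct(24)))                  # accepted at the default limit
print(check_nesting_depth(nested_struct(30), max_levels=64))   # accepted with a raised limit
```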
[jira] [Updated] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9211: Attachment: HIVE-9211.4-spark.patch [~brocknoland], what code base is our current Spark installation built upon? I ran into some inconsistent jar dependency issues in testing, and updating the Spark installation based on the latest Spark branch-1.2 code fixed it. The Hive Spark branch now depends on Hadoop 2.6.0 for hadoop2, so we may need to build Spark consistently with it. Research on build mini HoS cluster on YARN for unit test[Spark Branch] -- Key: HIVE-9211 URL: https://issues.apache.org/jira/browse/HIVE-9211 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M5 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch HoS on YARN is a common use case in production environments; we'd better enable unit tests for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297050#comment-14297050 ] Hive QA commented on HIVE-9211: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695277/HIVE-9211.4-spark.patch {color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 7404 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_dyn_part org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_join1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter 
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/691/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/691/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-691/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations to everyone. On Thu, Jan 29, 2015 at 10:05 AM, Aihua Xu a...@cloudera.com wrote: +1. Cong~ everyone! On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com wrote: Congratulations everyone ! On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl -- Philippe Kernévez Directeur technique (Suisse), pkerne...@octo.com +41 79 888 33 32 Retrouvez OCTO sur OCTO Talk : http://blog.octo.com OCTO Technology http://www.octo.com
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations everyone ! On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl -- Philippe Kernévez Directeur technique (Suisse), pkerne...@octo.com +41 79 888 33 32 Retrouvez OCTO sur OCTO Talk : http://blog.octo.com OCTO Technology http://www.octo.com
Re: Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]
On Jan. 29, 2015, 4:20 a.m., Xuefu Zhang wrote: ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java, line 295 https://reviews.apache.org/r/30388/diff/1/?file=839499#file839499line295 childrenBackupTasks or backChildrenTasks? I suggest more consistent variable/method names. Since the noun is task, I suggest child. Good point. Will change. On Jan. 29, 2015, 4:20 a.m., Xuefu Zhang wrote: ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java, line 110 https://reviews.apache.org/r/30388/diff/1/?file=839504#file839504line110 In Spark branch - For Spark Will change. - Chao --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30388/#review70150 --- On Jan. 29, 2015, 1:05 a.m., Chao Sun wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30388/ --- (Updated Jan. 29, 2015, 1:05 a.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-9103 https://issues.apache.org/jira/browse/HIVE-9103 Repository: hive-git Description --- This patch adds a backup task to the map join task. The backup task, which uses common join, will be triggered in case the mapjoin task fails. Note that, no matter how many map joins there are in the SparkTask, we will only generate one backup task. This means that if the original task failed at the very last map join, the whole task will be re-executed. The handling of the backup task is a little different from what MR does, mostly because we convert JOIN to MAPJOIN during the operator plan optimization phase, at which time no task/work exists yet. In this patch, we clone the whole operator tree before the JOIN operator is converted. The cloned operator tree is then processed to generate a separate work tree for a separate backup SparkTask.
Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java 69004dc ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java 79c3e02 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java d57ceff ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 9ff47c7 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java 6e0ac38 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 773cfbd ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a Diff: https://reviews.apache.org/r/30388/diff/ Testing --- auto_join25.q Thanks, Chao Sun
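The backup-task flow described in this review — clone the operator tree before the JOIN is converted to MAPJOIN, and fall back to the common-join clone if the mapjoin task fails — can be sketched generically. These are hypothetical names, not Hive's actual classes:

```python
import copy

class Task:
    def __init__(self, plan):
        self.plan = plan
        self.backup = None  # one whole fallback task, not a per-join retry

    def execute(self, runner):
        try:
            return runner(self.plan)
        except RuntimeError:
            if self.backup is not None:
                # Any failure re-runs the entire backup plan, even if only
                # the last of several map joins failed -- matching the
                # single-backup-task behavior described above.
                return self.backup.execute(runner)
            raise

def compile_with_backup(operator_tree):
    # Clone before optimization so the backup keeps the common join.
    backup_tree = copy.deepcopy(operator_tree)
    optimized = [op.replace("JOIN", "MAPJOIN") for op in operator_tree]
    task = Task(optimized)
    task.backup = Task(backup_tree)
    return task

def runner(plan):
    # Toy executor: pretend every map join runs out of memory.
    if "MAPJOIN" in plan:
        raise RuntimeError("map join exceeded memory")
    return plan

task = compile_with_backup(["TS", "JOIN", "FS"])
result = task.execute(runner)
print(result)  # ['TS', 'JOIN', 'FS'] -- the common-join fallback ran
```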
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
+1. Cong~ everyone! On Jan 29, 2015, at 9:43 AM, Philippe Kernévez pkerne...@octo.com wrote: Congratulations everyone ! On Wed, Jan 28, 2015 at 10:15 PM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl -- Philippe Kernévez Directeur technique (Suisse), pkerne...@octo.com +41 79 888 33 32 Retrouvez OCTO sur OCTO Talk : http://blog.octo.com OCTO Technology http://www.octo.com
[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file
[ https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297319#comment-14297319 ] Owen O'Malley commented on HIVE-9317: - +1 to not rolling a new RC specifically for this one. I just want to make sure it goes into any new RCs. move Microsoft copyright to NOTICE file --- Key: HIVE-9317 URL: https://issues.apache.org/jira/browse/HIVE-9317 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.15.0, 1.0.0 Attachments: hive-9327.txt There are a set of files that still have the Microsoft copyright notices. Those notices need to be moved into NOTICES and replaced with the standard Apache headers. {code} ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30388/ --- (Updated Jan. 29, 2015, 6:51 p.m.) Review request for hive and Xuefu Zhang. Changes --- Regenerated golden files (mostly plan change due to the backup task), and added auto_join25.q. Also addressed initial feedback from review board. Bugs: HIVE-9103 https://issues.apache.org/jira/browse/HIVE-9103 Repository: hive-git Description --- This patch adds backup task to map join task. The backup task, which uses common join, will be triggered in case the mapjoin task failed. Note that, no matter how many map joins there are in the SparkTask, we will only generate one backup task. This means that if the original task failed at the very last map join, the whole task will be re-executed. The handling of backup task is a little bit different from what MR does, mostly because we convert JOIN to MAPJOIN during the operator plan optimization phase, at which time no task/work exist yet. In the patch, we cloned the whole operator tree before the JOIN operator is converted. The operator tree will be processed and generate a separate work tree for a separate backup SparkTask. 
Diffs (updated) - itests/src/test/resources/testconfiguration.properties f583aaf ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java 69004dc ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java 79c3e02 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java d57ceff ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 9ff47c7 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java 6e0ac38 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 773cfbd ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 ql/src/test/results/clientpositive/spark/auto_join0.q.out 7f8eb63 ql/src/test/results/clientpositive/spark/auto_join1.q.out b640b9d ql/src/test/results/clientpositive/spark/auto_join10.q.out f01765c ql/src/test/results/clientpositive/spark/auto_join11.q.out 69c10e6 ql/src/test/results/clientpositive/spark/auto_join12.q.out bc763ed ql/src/test/results/clientpositive/spark/auto_join13.q.out 935ebf5 ql/src/test/results/clientpositive/spark/auto_join14.q.out 830314e ql/src/test/results/clientpositive/spark/auto_join15.q.out 780540b ql/src/test/results/clientpositive/spark/auto_join16.q.out f705339 ql/src/test/results/clientpositive/spark/auto_join17.q.out 3144db6 ql/src/test/results/clientpositive/spark/auto_join19.q.out f2b0140 ql/src/test/results/clientpositive/spark/auto_join2.q.out 2424cca ql/src/test/results/clientpositive/spark/auto_join20.q.out 9258f3b ql/src/test/results/clientpositive/spark/auto_join21.q.out aa8f6dd ql/src/test/results/clientpositive/spark/auto_join22.q.out d49dda9 ql/src/test/results/clientpositive/spark/auto_join23.q.out a179d87 
ql/src/test/results/clientpositive/spark/auto_join24.q.out cfb076e ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a ql/src/test/results/clientpositive/spark/auto_join26.q.out 58821e9 ql/src/test/results/clientpositive/spark/auto_join28.q.out d30133b ql/src/test/results/clientpositive/spark/auto_join29.q.out 780c6cb ql/src/test/results/clientpositive/spark/auto_join3.q.out 54e24f3 ql/src/test/results/clientpositive/spark/auto_join30.q.out 4c832e2 ql/src/test/results/clientpositive/spark/auto_join31.q.out 5980814 ql/src/test/results/clientpositive/spark/auto_join32.q.out 9629f53 ql/src/test/results/clientpositive/spark/auto_join4.q.out 3366f75 ql/src/test/results/clientpositive/spark/auto_join5.q.out b6d8798 ql/src/test/results/clientpositive/spark/auto_join8.q.out 5b6cc80 ql/src/test/results/clientpositive/spark/auto_join9.q.out 6daf348 ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 ql/src/test/results/clientpositive/spark/auto_join_nulls.q.out 1f37c75 ql/src/test/results/clientpositive/spark/auto_join_stats.q.out 1fa1a74 ql/src/test/results/clientpositive/spark/auto_join_stats2.q.out c6473d3 ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 3d465db ql/src/test/results/clientpositive/spark/auto_sortmerge_join_10.q.out fe7b96d ql/src/test/results/clientpositive/spark/auto_sortmerge_join_11.q.out f4e889a ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out c358721
[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
[ https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297317#comment-14297317 ] Sushanth Sowmyan commented on HIVE-9501: +1, works correctly now, I'm able to see the dbname and tablename and able to filter on them appropriately. The test failure reported is unrelated, will go ahead and commit. DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification - Key: HIVE-9501 URL: https://issues.apache.org/jira/browse/HIVE-9501 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-9501.patch This is a hold over from the JMS stuff, where create database is sent on the general topic and create table on the db topic. But since DbNotificationListener isn't for JMS, keeping this semantic doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
[ https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-9501: --- Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification - Key: HIVE-9501 URL: https://issues.apache.org/jira/browse/HIVE-9501 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 1.2.0 Attachments: HIVE-9501.patch This is a hold over from the JMS stuff, where create database is sent on the general topic and create table on the db topic. But since DbNotificationListener isn't for JMS, keeping this semantic doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
[ https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297321#comment-14297321 ] Sushanth Sowmyan commented on HIVE-9501: Committed to trunk. Thanks, Alan. (Doc note: No docs required on this as well - DbNotificationListener is internal, and this is adding additional info that is needed for filters on it to work correctly.) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification - Key: HIVE-9501 URL: https://issues.apache.org/jira/browse/HIVE-9501 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 1.2.0 Attachments: HIVE-9501.patch This is a hold over from the JMS stuff, where create database is sent on the general topic and create table on the db topic. But since DbNotificationListener isn't for JMS, keeping this semantic doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9103: --- Attachment: HIVE-9103.2-spark.patch Regenerated golden files (mostly plan change due to the backup task), and added auto_join25.q. Also addressed initial feedback from review board. Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch, HIVE-9103.2-spark.patch In MR, backup task can be executed if the original task, which probably contains certain (join) optimization fails. This JIRA is to track this topic for Spark. We need to determine if we need this and implement if necessary. This is a followup of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9454) Test failures due to new Calcite version
[ https://issues.apache.org/jira/browse/HIVE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9454: -- Attachment: HIVE-9454.02.patch New patch based on Julien's, containing also the changes on golden files using Calcite-1.0.0-RC2. Test failures due to new Calcite version Key: HIVE-9454 URL: https://issues.apache.org/jira/browse/HIVE-9454 Project: Hive Issue Type: Bug Reporter: Brock Noland Attachments: HIVE-9454.02.patch, HIVE-9454.1.patch A bunch of failures have started appearing in patches which seem unrelated. I am thinking we've picked up a new version of Calcite. E.g.: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2488/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_auto_join12/ {noformat} Running: diff -a /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/auto_join12.q.out /home/hiveptest/54.147.202.89-hiveptest-1/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/auto_join12.q.out 32c32 $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src --- $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 35c35 $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src --- $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src 39c39 $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src --- $hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:$hdt$_0:src 54c54 $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:src --- $hdt$_0:$hdt$_0:$hdt$_1:$hdt$_1:$hdt$_1:$hdt$_1:src {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9445) Revert HIVE-5700 - enforce single date format for partition column storage
[ https://issues.apache.org/jira/browse/HIVE-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297384#comment-14297384 ] Sergey Shelukhin commented on HIVE-9445: Hmm, looks like I missed the java part of the change that was not merely code move. Partition spec validation should not have been thrown out... Let me file a JIRA to add it back Revert HIVE-5700 - enforce single date format for partition column storage -- Key: HIVE-9445 URL: https://issues.apache.org/jira/browse/HIVE-9445 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0, 0.14.0, 0.13.1, 0.15.0, 0.14.1 Reporter: Brock Noland Assignee: Brock Noland Priority: Blocker Fix For: 0.15.0 Attachments: HIVE-9445.1.patch, HIVE-9445.1.patch HIVE-5700 has the following issues: * HIVE-8730 - fails mysql upgrades * Does not upgrade all metadata, e.g. {{PARTITIONS.PART_NAME}} See comments in HIVE-5700. * Completely corrupts postgres, see below. With a postgres metastore on 0.12, I executed the following: {noformat} CREATE TABLE HIVE5700_DATE_PARTED (line string) PARTITIONED BY (ddate date); CREATE TABLE HIVE5700_STRING_PARTED (line string) PARTITIONED BY (ddate string); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_DATE_PARTED ADD PARTITION (ddate='2015-01-23'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='NOT_DATE'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150121'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='20150122'); ALTER TABLE HIVE5700_STRING_PARTED ADD PARTITION (ddate='2015-01-23'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH 
'/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_DATE_PARTED PARTITION (ddate='2015-01-23'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='NOT_DATE'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150121'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='20150122'); LOAD DATA LOCAL INPATH '/tmp/single-line-of-data' INTO TABLE HIVE5700_STRING_PARTED PARTITION (ddate='2015-01-23'); hive show partitions HIVE5700_DATE_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.052 seconds, Fetched: 4 row(s) hive show partitions HIVE5700_STRING_PARTED; OK ddate=20150121 ddate=20150122 ddate=2015-01-23 ddate=NOT_DATE Time taken: 0.051 seconds, Fetched: 4 row(s) {noformat} I then took a dump of the database named {{postgres-pre-upgrade.sql}} and the data in the dump looks good: {noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-pre-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5 1421943648 0 ddate=20150122 8 2 6 1421943664 0 ddate=NOT_DATE 9 3 7 1421943664 0 ddate=20150121 10 3 8 1421943665 0 ddate=20150122 11 3 9 1421943694 0 ddate=2015-01-2312 2 101421943695 0 ddate=2015-01-2313 3 \. -- COPY PARTITION_KEY_VALS (PART_ID, PART_KEY_VAL, INTEGER_IDX) FROM stdin; 3 NOT_DATE0 4 201501210 5 201501220 6 NOT_DATE0 7 201501210 8 201501220 9 2015-01-23 0 102015-01-23 0 \. {noformat} I then upgraded to 0.13 and subsequently upgraded the MS with the following command: {{schematool -dbType postgres -upgradeSchema -verbose}} The file {{postgres-post-upgrade.sql}} is the post-upgrade db dump. 
As you can see, the data is completely corrupt. {noformat} [root@hive5700-1-1 ~]# egrep -A9 '^COPY PARTITIONS|^COPY PARTITION_KEY_VALS' postgres-post-upgrade.sql COPY PARTITIONS (PART_ID, CREATE_TIME, LAST_ACCESS_TIME, PART_NAME, SD_ID, TBL_ID) FROM stdin; 3 1421943647 0 ddate=NOT_DATE 6 2 4 1421943647 0 ddate=20150121 7 2 5
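The root of the corruption above is that partition values were never checked against the partition column's declared DATE type, so mixed formats like '20150121' and '2015-01-23' (and even 'NOT_DATE') end up stored side by side. A minimal sketch of such a check — illustrative only, with an invented function name, not Hive's actual validator — could look like:

```python
import re
from datetime import date

# Canonical form a DATE-typed partition value should take; anything else
# ('NOT_DATE', '20150121') would be rejected up front rather than stored
# alongside properly formatted values like '2015-01-23'.
_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def is_valid_date_partition_value(value):
    """Return True only for well-formed, in-range YYYY-MM-DD strings."""
    if not _DATE_RE.match(value):
        return False
    year, month, day = map(int, value.split("-"))
    try:
        date(year, month, day)  # rejects e.g. month 13 or day 32
        return True
    except ValueError:
        return False
```

Rejecting non-canonical values at ADD PARTITION time avoids having to rewrite stored metadata later, which is exactly the upgrade step that went wrong here.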
[jira] [Created] (HIVE-9509) Restore partition spec validation removed by HIVE-9445
Sergey Shelukhin created HIVE-9509: -- Summary: Restore partition spec validation removed by HIVE-9445 Key: HIVE-9509 URL: https://issues.apache.org/jira/browse/HIVE-9509 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9273) Add option to fire metastore event on insert
[ https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297398#comment-14297398 ] Sushanth Sowmyan commented on HIVE-9273: a) I like that you changed the return type to FireResponseType from void - that allows for future growth if we need to ACK anything. b) In the FireEventRequest thrift definition, I wondered about whether tableName should really be optional, but I think that is important for future listener events which might not map to table events exactly. But reasoning along that line, shouldn't dbName also be optional? We could have warehouse-level events we might want to fire. c) Given that FireEventRequestData data in FireEventRequest is marked as optional, I think there should be a null-guard on HiveMetaStore.fire_listener_event when switching on rqst.getData().getSetField()? d) This can be tackled as a separate bug, but we should fire a FireEventRequest from HCatalog appends as well. [~ashutoshc], could I have a backup review on the changes this patch makes to ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java? To me, the changes look reasonable, but I'm unsure if this is exhaustive in all the places we would need to change to ensure we trigger this event for new files/data being added to a table which does not result in a metadata change (i.e. append cases). Add option to fire metastore event on insert Key: HIVE-9273 URL: https://issues.apache.org/jira/browse/HIVE-9273 Project: Hive Issue Type: New Feature Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-9273.patch HIVE-9271 adds the ability for the client to request firing metastore events. This can be used in the MoveTask to fire events when an insert is done that does not add partitions to a table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
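Point (c) above — guarding against an unset optional `data` field before switching on it — amounts to the following defensive pattern. The class and method names here only mimic the Thrift-generated objects for illustration; they are stand-ins, not Hive's actual generated code:

```python
# Stand-ins for the Thrift-generated request objects (names are
# illustrative; real Thrift code differs).
class FireEventRequestData:
    def __init__(self, set_field=None):
        self.set_field = set_field

    def get_set_field(self):
        return self.set_field

class FireEventRequest:
    def __init__(self, data=None):
        self.data = data

    def get_data(self):
        return self.data

def fire_listener_event(rqst):
    data = rqst.get_data()
    if data is None:  # the null-guard: an optional field may simply be absent
        raise ValueError("FireEventRequest.data is required but was not set")
    field = data.get_set_field()
    if field == "insertData":
        return "fired insert event"
    raise ValueError("unsupported event type: %s" % field)
```

Without the guard, an unset optional field surfaces as a NullPointerException deep inside the switch rather than as a clear client error.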
[jira] [Moved] (HIVE-9510) Throwing null point exception , when get join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Hyde moved CALCITE-578 to HIVE-9510: --- Affects Version/s: (was: 1.0.0-incubating) Workflow: no-reopen-closed, patch-avail (was: jira) Key: HIVE-9510 (was: CALCITE-578) Project: Hive (was: Calcite) Throwing null point exception , when get join distinct row count from RelMdUtil.java class -- Key: HIVE-9510 URL: https://issues.apache.org/jira/browse/HIVE-9510 Project: Hive Issue Type: Bug Reporter: asko Assignee: Julian Hyde Attachments: log3_cbo5 Setting the log level in the logging.properties file as follows: handlers=java.util.logging.ConsoleHandler .level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL Running Q3 on TPCH-full after this modification, in order to test join reordering, but the run failed. QL: set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; LOG: Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner fireRule FINE: call#15: Apply rule [FilterProjectTransposeRule] to [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveFilter#138 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new
HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner notifyTransformation FINE: call#15: Rule FilterProjectTransposeRule arguments [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] produced HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#140 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveProject#141 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#142 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($0, $4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, '1995-03-15')]) HiveTableScan(table=[[default.orders]]) 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key join: fkSide = 1 FKInfo:FKInfo(rowCount=1.00,ndv=-1.00) PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00) isPKSideSimple:false NDV Scaling Factor:1.00 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($8, $5)], joinType=[inner]) HiveJoin(condition=[=($0, $4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, 
'1995-03-15')]) HiveTableScan(table=[[default.orders]]) HiveProject(c_custkey=[$0], c_mktsegment=[$6]) HiveFilter(condition=[=($6, 'BUILDING')]) HiveTableScan(table=[[default.customer]]) 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key join: fkSide = 1 FKInfo:FKInfo(rowCount=1.00,ndv=-1.00) PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00) isPKSideSimple:false NDV Scaling Factor:1.00 Jan 29, 2015
[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297406#comment-14297406 ] Jason Dere commented on HIVE-5472: -- Looks like these test failures have been failing in other precommit runs as well. Doesn't look to be related. support a simple scalar which returns the current timestamp --- Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, HIVE-5472.4.patch ISO-SQL has two forms of functions local and current timestamp where the former is a TIMESTAMP WITHOUT TIMEZONE and the latter with TIME ZONE select cast ( unix_timestamp() as timestamp ) from T implement a function which computes LOCAL TIMESTAMP which would be the current timestamp for the users session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
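The ISO-SQL distinction described above — LOCAL TIMESTAMP and CURRENT TIMESTAMP denote the same instant, with and without time-zone information — can be illustrated outside SQL like this. The fixed UTC+5:30 session offset is an arbitrary example, not anything Hive defines:

```python
from datetime import datetime, timezone, timedelta

def current_and_local(instant_utc, session_offset_hours):
    """Render one instant as (CURRENT_TIMESTAMP-like, LOCALTIMESTAMP-like).

    The first result carries the session time zone (TIMESTAMP WITH TIME
    ZONE); the second is the same wall-clock reading with the zone
    stripped (TIMESTAMP WITHOUT TIME ZONE).
    """
    session_tz = timezone(timedelta(hours=session_offset_hours))
    current_ts = instant_utc.astimezone(session_tz)  # zone-aware
    local_ts = current_ts.replace(tzinfo=None)       # zone-naive
    return current_ts, local_ts
```

The point of the JIRA is that `cast(unix_timestamp() as timestamp)` only gives one of these two forms; a proper LOCALTIMESTAMP must apply the session's zone first.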
[jira] [Commented] (HIVE-9510) Throwing null point exception , when get join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297412#comment-14297412 ] Julian Hyde commented on HIVE-9510: --- I moved this from CALCITE to HIVE because even though the error comes from Calcite code, it should be investigated as a Hive issue. The likely cause is that Hive did not set up its metadata provider correctly. Throwing null point exception , when get join distinct row count from RelMdUtil.java class -- Key: HIVE-9510 URL: https://issues.apache.org/jira/browse/HIVE-9510 Project: Hive Issue Type: Bug Reporter: asko Assignee: Julian Hyde Attachments: log3_cbo5 Setting the log level in the logging.properties file as follows: handlers=java.util.logging.ConsoleHandler .level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL Running Q3 on TPCH-full after this modification, in order to test join reordering, but the run failed. QL: set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; LOG: Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner fireRule FINE: call#15: Apply rule [FilterProjectTransposeRule] to [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveFilter#138
Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner notifyTransformation FINE: call#15: Rule FilterProjectTransposeRule arguments [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] produced HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#140 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveProject#141 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#142 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($0, $4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, '1995-03-15')]) HiveTableScan(table=[[default.orders]]) 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key join: fkSide = 1 FKInfo:FKInfo(rowCount=1.00,ndv=-1.00) PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00) isPKSideSimple:false NDV Scaling Factor:1.00 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($8, $5)], joinType=[inner]) HiveJoin(condition=[=($0, $4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], 
o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, '1995-03-15')]) HiveTableScan(table=[[default.orders]]) HiveProject(c_custkey=[$0], c_mktsegment=[$6]) HiveFilter(condition=[=($6, 'BUILDING')]) HiveTableScan(table=[[default.customer]]) 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key join: fkSide = 1 FKInfo:FKInfo(rowCount=1.00,ndv=-1.00) PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00) isPKSideSimple:false NDV
[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8307: --- Attachment: HIVE-8307.patch Patch to remove comments from serde properties. Expecting more failures for golden file updates. [~hagleitn] Can you take a look? null character in columns.comments schema property breaks jobconf.xml - Key: HIVE-8307 URL: https://issues.apache.org/jira/browse/HIVE-8307 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Carl Laird Attachments: HIVE-8307.patch It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table: [Fatal Error] :336:51: Character reference # 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # Exception in thread main java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063) at org.apache.hadoop.conf.Configuration.get(Configuration.java:416) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0" is an invalid XML character. at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181) ... 11 more Execution failed with exit status: 1 Line 336 of jobconf.xml: <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property> See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
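The failure mode is easy to reproduce outside Hadoop: XML 1.0 has no representation for NUL, so any conformant parser must reject a character reference to code point 0 — which is what the SAXParseException above reports for the columns.comments value. A quick self-contained check (using Python's parser rather than Xerces) could be:

```python
import xml.etree.ElementTree as ET

def parses_ok(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # e.g. "reference to invalid character number": &#0; is not a
        # legal XML 1.0 character, so it cannot appear even escaped.
        return False
```

This is why the fix removes the null-separated comments from the serde properties entirely rather than trying to escape them.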
[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8307: --- Assignee: Ashutosh Chauhan Affects Version/s: 0.14.0 Status: Patch Available (was: Open) null character in columns.comments schema property breaks jobconf.xml - Key: HIVE-8307 URL: https://issues.apache.org/jira/browse/HIVE-8307 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.13.1, 0.14.0, 0.13.0 Reporter: Carl Laird Assignee: Ashutosh Chauhan Attachments: HIVE-8307.patch It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table: [Fatal Error] :336:51: Character reference # 14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # Exception in thread main java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference # at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063) at org.apache.hadoop.conf.Configuration.get(Configuration.java:416) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: 
org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0" is an invalid XML character. at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251) at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181) ... 11 more Execution failed with exit status: 1 Line 336 of jobconf.xml: <property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property> See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 30422: remove comments from serde properties.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30422/ --- Review request for hive and Gunther Hagleitner. Bugs: HIVE-8307 https://issues.apache.org/jira/browse/HIVE-8307 Repository: hive-git Description --- remove comments from serde properties. Diffs - metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 612f927 ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3204af8 ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out cea9eb5 ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out b696e83 ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out c58fa36 ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 772ccec ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out ea7b8ff ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 24011a3 ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 969189f ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out f458c33 ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 52a69e8 ql/src/test/results/clientpositive/binary_output_format.q.out a0e8e83 ql/src/test/results/clientpositive/bucket1.q.out 13ec735 ql/src/test/results/clientpositive/bucket2.q.out 32a77c3 ql/src/test/results/clientpositive/bucket3.q.out ff7173e ql/src/test/results/clientpositive/bucket_map_join_1.q.out 69a61d4 ql/src/test/results/clientpositive/bucket_map_join_2.q.out fc55855 ql/src/test/results/clientpositive/bucketcontext_1.q.out 48e9f10 ql/src/test/results/clientpositive/bucketcontext_2.q.out 695feb1 ql/src/test/results/clientpositive/bucketcontext_3.q.out b3929f3 ql/src/test/results/clientpositive/bucketcontext_4.q.out cd81f9e ql/src/test/results/clientpositive/bucketcontext_5.q.out ef45b4a ql/src/test/results/clientpositive/bucketcontext_6.q.out 62edc1d ql/src/test/results/clientpositive/bucketcontext_7.q.out bd79ff2 ql/src/test/results/clientpositive/bucketcontext_8.q.out b6a9ad2 
ql/src/test/results/clientpositive/bucketmapjoin1.q.out b8e4b41 ql/src/test/results/clientpositive/bucketmapjoin10.q.out 493e038 ql/src/test/results/clientpositive/bucketmapjoin11.q.out 3a4b2b5 ql/src/test/results/clientpositive/bucketmapjoin12.q.out 537f19f ql/src/test/results/clientpositive/bucketmapjoin13.q.out a296197 ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 ql/src/test/results/clientpositive/bucketmapjoin8.q.out 6d48156 ql/src/test/results/clientpositive/bucketmapjoin9.q.out 01d7cc9 ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out e4c87fa ql/src/test/results/clientpositive/char_serde.q.out 8f6f8ce ql/src/test/results/clientpositive/columnstats_partlvl.q.out e431b0f ql/src/test/results/clientpositive/columnstats_tbllvl.q.out de21af8 ql/src/test/results/clientpositive/date_serde.q.out ff09f70 ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out a209ae9 ql/src/test/results/clientpositive/filter_join_breaktask.q.out 3631412 ql/src/test/results/clientpositive/groupby_map_ppr.q.out 71a6578 ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 4414b79 ql/src/test/results/clientpositive/groupby_ppr.q.out 4fdcbfd ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out cd3454c ql/src/test/results/clientpositive/groupby_sort_6.q.out 4e5c96f ql/src/test/results/clientpositive/input23.q.out 0bd543b ql/src/test/results/clientpositive/input42.q.out 95e8553 ql/src/test/results/clientpositive/input_part1.q.out b71faff ql/src/test/results/clientpositive/input_part2.q.out 77da2eb ql/src/test/results/clientpositive/input_part7.q.out 6094f9c 
ql/src/test/results/clientpositive/input_part9.q.out 6e60679 ql/src/test/results/clientpositive/join17.q.out 26aabcf ql/src/test/results/clientpositive/join26.q.out 148479a ql/src/test/results/clientpositive/join32.q.out 9a24d8c ql/src/test/results/clientpositive/join32_lessSize.q.out 20858cb ql/src/test/results/clientpositive/join33.q.out 9a24d8c ql/src/test/results/clientpositive/join34.q.out a20e49f ql/src/test/results/clientpositive/join35.q.out 937539c ql/src/test/results/clientpositive/join9.q.out 8421036 ql/src/test/results/clientpositive/join_filters_overlap.q.out 00ca0e5 ql/src/test/results/clientpositive/join_map_ppr.q.out 349c9f5
Re: Review Request 30422: remove comments from serde properties.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30422/ --- (Updated Jan. 29, 2015, 7:41 p.m.) Review request for hive and Gunther Hagleitner. Bugs: HIVE-8307 https://issues.apache.org/jira/browse/HIVE-8307 Repository: hive-git Description --- remove comments from serde properties. Diffs - metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 612f927 ql/src/test/results/clientpositive/auto_join_reordering_values.q.out 3204af8 ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out cea9eb5 ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out b696e83 ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out c58fa36 ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out 772ccec ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out ea7b8ff ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out 24011a3 ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out 969189f ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out f458c33 ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 52a69e8 ql/src/test/results/clientpositive/binary_output_format.q.out a0e8e83 ql/src/test/results/clientpositive/bucket1.q.out 13ec735 ql/src/test/results/clientpositive/bucket2.q.out 32a77c3 ql/src/test/results/clientpositive/bucket3.q.out ff7173e ql/src/test/results/clientpositive/bucket_map_join_1.q.out 69a61d4 ql/src/test/results/clientpositive/bucket_map_join_2.q.out fc55855 ql/src/test/results/clientpositive/bucketcontext_1.q.out 48e9f10 ql/src/test/results/clientpositive/bucketcontext_2.q.out 695feb1 ql/src/test/results/clientpositive/bucketcontext_3.q.out b3929f3 ql/src/test/results/clientpositive/bucketcontext_4.q.out cd81f9e ql/src/test/results/clientpositive/bucketcontext_5.q.out ef45b4a ql/src/test/results/clientpositive/bucketcontext_6.q.out 62edc1d ql/src/test/results/clientpositive/bucketcontext_7.q.out bd79ff2 
ql/src/test/results/clientpositive/bucketcontext_8.q.out b6a9ad2 ql/src/test/results/clientpositive/bucketmapjoin1.q.out b8e4b41 ql/src/test/results/clientpositive/bucketmapjoin10.q.out 493e038 ql/src/test/results/clientpositive/bucketmapjoin11.q.out 3a4b2b5 ql/src/test/results/clientpositive/bucketmapjoin12.q.out 537f19f ql/src/test/results/clientpositive/bucketmapjoin13.q.out a296197 ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 ql/src/test/results/clientpositive/bucketmapjoin8.q.out 6d48156 ql/src/test/results/clientpositive/bucketmapjoin9.q.out 01d7cc9 ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out e4c87fa ql/src/test/results/clientpositive/char_serde.q.out 8f6f8ce ql/src/test/results/clientpositive/columnstats_partlvl.q.out e431b0f ql/src/test/results/clientpositive/columnstats_tbllvl.q.out de21af8 ql/src/test/results/clientpositive/date_serde.q.out ff09f70 ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out a209ae9 ql/src/test/results/clientpositive/filter_join_breaktask.q.out 3631412 ql/src/test/results/clientpositive/groupby_map_ppr.q.out 71a6578 ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 4414b79 ql/src/test/results/clientpositive/groupby_ppr.q.out 4fdcbfd ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out cd3454c ql/src/test/results/clientpositive/groupby_sort_6.q.out 4e5c96f ql/src/test/results/clientpositive/input23.q.out 0bd543b ql/src/test/results/clientpositive/input42.q.out 95e8553 ql/src/test/results/clientpositive/input_part1.q.out b71faff ql/src/test/results/clientpositive/input_part2.q.out 77da2eb 
ql/src/test/results/clientpositive/input_part7.q.out 6094f9c ql/src/test/results/clientpositive/input_part9.q.out 6e60679 ql/src/test/results/clientpositive/join17.q.out 26aabcf ql/src/test/results/clientpositive/join26.q.out 148479a ql/src/test/results/clientpositive/join32.q.out 9a24d8c ql/src/test/results/clientpositive/join32_lessSize.q.out 20858cb ql/src/test/results/clientpositive/join33.q.out 9a24d8c ql/src/test/results/clientpositive/join34.q.out a20e49f ql/src/test/results/clientpositive/join35.q.out 937539c ql/src/test/results/clientpositive/join9.q.out 8421036 ql/src/test/results/clientpositive/join_filters_overlap.q.out 00ca0e5 ql/src/test/results/clientpositive/join_map_ppr.q.out 349c9f5
[jira] [Commented] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297426#comment-14297426 ] Mithun Radhakrishnan commented on HIVE-9471: Unrelated test-failures, methinks. Bad seek in uncompressed ORC, at row-group boundary. Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.2.patch, HIVE-9471.3.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group. {code:title=stacktrace} java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305) ... 
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112) at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852) {code} I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9509) Restore partition spec validation removed by HIVE-9445
[ https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9509: --- Status: Patch Available (was: Open) Restore partition spec validation removed by HIVE-9445 -- Key: HIVE-9509 URL: https://issues.apache.org/jira/browse/HIVE-9509 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-9509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9509) Restore partition spec validation removed by HIVE-9445
[ https://issues.apache.org/jira/browse/HIVE-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9509: --- Attachment: HIVE-9509.patch [~ashutoshc] [~brocknoland] can you take a look? Restore partition spec validation removed by HIVE-9445 -- Key: HIVE-9509 URL: https://issues.apache.org/jira/browse/HIVE-9509 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-9509.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297457#comment-14297457 ] Marcelo Vanzin commented on HIVE-9487: -- I failed git branch management 101. New patch should be correct. Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Attachment: HIVE-9487.2-spark.patch Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
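Editorial note: the issue above is about replacing an ad-hoc handshake with proper authentication. As background, a minimal challenge-response exchange over a pre-shared secret can be sketched as follows (Python for illustration only; this is a generic pattern, not necessarily the mechanism the patch actually adopts):

```python
import hashlib
import hmac
import os

# Generic challenge-response authentication over a pre-shared secret.
# The server sends a random challenge; the client proves knowledge of
# the secret by returning a keyed hash of it; the secret itself is
# never transmitted.

def make_challenge() -> bytes:
    """Server side: a fresh random nonce per connection attempt."""
    return os.urandom(16)

def respond(secret: bytes, challenge: bytes) -> bytes:
    """Client side: keyed hash of the server's challenge."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: constant-time comparison against the expected HMAC."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Note that this sketch only authenticates; it does not encrypt the channel, which the JIRA explicitly calls for as well.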
[jira] [Updated] (HIVE-8379) NanoTimeUtils performs some work needlessly
[ https://issues.apache.org/jira/browse/HIVE-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8379: -- Status: Patch Available (was: Open) NanoTimeUtils performs some work needlessly --- Key: HIVE-8379 URL: https://issues.apache.org/jira/browse/HIVE-8379 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Priority: Minor Attachments: HIVE-8379.1.patch Portions of the math done with the constants can be pre-computed: https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java#L70 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8379) NanoTimeUtils performs some work needlessly
[ https://issues.apache.org/jira/browse/HIVE-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-8379: -- Attachment: HIVE-8379.1.patch Patch attached that makes the code more readable by using constant names specific to nano time. I ran some JMH micro benchmarks; the times look almost the same for both approaches. NanoTimeUtils performs some work needlessly --- Key: HIVE-8379 URL: https://issues.apache.org/jira/browse/HIVE-8379 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Sergio Peña Priority: Minor Attachments: HIVE-8379.1.patch Portions of the math done with the constants can be pre-computed: https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/NanoTimeUtils.java#L70 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
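Editorial note: the "pre-computed" suggestion in HIVE-8379 amounts to folding products of constants once, at load time, instead of on every conversion call. A language-neutral sketch (Python here; the constant names are illustrative, the real code is the Java class linked above):

```python
# Sketch of the optimization described above: instead of recomputing
# products of constants on every call, fold them into named constants
# once. (Names are illustrative, not taken from NanoTimeUtils.)
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
NANOS_PER_SECOND = 1_000_000_000

# Pre-computed once rather than per conversion call.
NANOS_PER_MINUTE = NANOS_PER_SECOND * SECONDS_PER_MINUTE
NANOS_PER_HOUR = NANOS_PER_MINUTE * MINUTES_PER_HOUR

def nanos_of_day(hours: int, minutes: int, seconds: int, nanos: int) -> int:
    """Convert a time of day to nanoseconds using the pre-folded constants."""
    return (hours * NANOS_PER_HOUR + minutes * NANOS_PER_MINUTE
            + seconds * NANOS_PER_SECOND + nanos)
```

As the benchmark comment above notes, the JIT likely folds these constants anyway, so the win is mostly readability.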
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations!! :) On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9510: --- Description: Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed. QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: was: Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed.
QL: set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query {code:sql} select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate '1995-03-15' and l_shipdate '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner fireRule FINE: call#15: Apply rule [FilterProjectTransposeRule] to [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveFilter#138 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.plan.AbstractRelOptPlanner notifyTransformation FINE: call#15: Rule FilterProjectTransposeRule arguments [rel#107:HiveFilter.HIVE.[](input=HepRelVertex#106,condition=($2, '1995-03-15')), rel#105:HiveProject.HIVE.[](input=HepRelVertex#104,o_orderkey=$0,o_custkey=$1,o_orderdate=$4,o_shippriority=$7)] produced HiveProject#139 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#140 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HiveProject#141 Jan 29, 2015 11:48:04 AM org.apache.calcite.rel.AbstractRelNode init FINEST: new HepRelVertex#142 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($0, 
$4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, '1995-03-15')]) HiveTableScan(table=[[default.orders]]) 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Primary - Foreign Key join: fkSide = 1 FKInfo:FKInfo(rowCount=1.00,ndv=-1.00) PKInfo:PKInfo(rowCount=1.00,ndv=-1.00,selectivity=1.00) isPKSideSimple:false NDV Scaling Factor:1.00 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: Identified Primary - Foreign Key relation: 15/01/29 11:48:04 [main]: DEBUG stats.HiveRelMdRowCount: HiveJoin(condition=[=($8, $5)], joinType=[inner]) HiveJoin(condition=[=($0, $4)], joinType=[inner]) HiveProject(l_orderkey=[$0], l_extendedprice=[$5], l_discount=[$6], l_shipdate=[$10]) HiveFilter(condition=[($10, '1995-03-15')]) HiveTableScan(table=[[default.lineitem]]) HiveProject(o_orderkey=[$0], o_custkey=[$1], o_orderdate=[$4], o_shippriority=[$7]) HiveFilter(condition=[($4, '1995-03-15')]) HiveTableScan(table=[[default.orders]]) HiveProject(c_custkey=[$0], c_mktsegment=[$6]) HiveFilter(condition=[=($6, 'BUILDING')])
[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9510: --- Description: Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed. QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: see log.txt was: Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed.
QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class -- Key: HIVE-9510 URL: https://issues.apache.org/jira/browse/HIVE-9510 Project: Hive Issue Type: Bug Reporter: asko Assignee: Julian Hyde Attachments: log.txt, log3_cbo5 Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed. QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: see log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9510: --- Attachment: log.txt Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class -- Key: HIVE-9510 URL: https://issues.apache.org/jira/browse/HIVE-9510 Project: Hive Issue Type: Bug Reporter: asko Assignee: Julian Hyde Attachments: log.txt, log3_cbo5 Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed. QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9511) Switch Tez to 0.6.0
Damien Carol created HIVE-9511: -- Summary: Switch Tez to 0.6.0 Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Reporter: Damien Carol Tez 0.6.0 has been released. Research switching to version 0.6.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given
[ https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña reassigned HIVE-9430: - Assignee: Sergio Peña NullPointerException on ALTER TABLE ADD PARTITION if no value given --- Key: HIVE-9430 URL: https://issues.apache.org/jira/browse/HIVE-9430 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Danny Lade Assignee: Sergio Peña ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException: {code:java} 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {code} Therefore there is currently no way to add a partition to an already existing table: {code:sql} alter table XXX add partition (YYY = 'VALUE'); FAILED: SemanticException table is not partitioned but partition spec exists: {YYY=VALUE} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
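Editorial note: the natural shape of a fix here — whatever the attached patch actually does — is to validate the partition spec up front, so a partition key without a value produces a clear semantic error instead of a NullPointerException deep in the analyzer. A hedged sketch of that idea (Python for illustration; the names are hypothetical, not Hive's actual code):

```python
# Illustrative sketch only, not Hive's Java implementation: reject a
# partition spec entry with a missing value before it reaches code that
# dereferences the value.
class SemanticException(Exception):
    """Stand-in for Hive's semantic-analysis error type."""

def validate_partition_spec(part_spec: dict) -> None:
    for key, value in part_spec.items():
        if value is None:
            # ALTER TABLE xxx ADD PARTITION (yyy) parses to {"yyy": None}
            raise SemanticException(
                f"partition key '{key}' has no value in partition spec")
```

A well-formed spec such as `{"yyy": "VALUE"}` passes through unchanged.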
[jira] [Updated] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given
[ https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9430: -- Attachment: HIVE-9430.1.patch NullPointerException on ALTER TABLE ADD PARTITION if no value given --- Key: HIVE-9430 URL: https://issues.apache.org/jira/browse/HIVE-9430 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Danny Lade Assignee: Sergio Peña Attachments: HIVE-9430.1.patch ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException: {code:java} 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {code} Therefore there is currently no way to add a partition to an already existing table: {code:sql} alter table XXX add partition (YYY = 'VALUE'); FAILED: SemanticException table is not partitioned but partition spec exists: {YYY=VALUE} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9430) NullPointerException on ALTER TABLE ADD PARTITION if no value given
[ https://issues.apache.org/jira/browse/HIVE-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9430: -- Status: Patch Available (was: Open) NullPointerException on ALTER TABLE ADD PARTITION if no value given --- Key: HIVE-9430 URL: https://issues.apache.org/jira/browse/HIVE-9430 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Danny Lade Assignee: Sergio Peña Attachments: HIVE-9430.1.patch ALTER TABLE xxx ADD PARTITION (yyy) results in NullPointerException: {code:java} 2015-01-21 10:31:12,636 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.validatePartitionValues(DDLSemanticAnalyzer.java:2999) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableAddParts(DDLSemanticAnalyzer.java:2680) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:393) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {code} Therefore there is currently no way to add a partition to an already existing table: {code:sql} alter table XXX add partition (YYY = 'VALUE'); FAILED: SemanticException table is not partitioned but partition spec exists: {YYY=VALUE} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297809#comment-14297809 ] Thejas M Nair commented on HIVE-9500: - [~aihuaxu] Is it failing for CREATE TABLE with Avro format? Does Avro use LazySimpleSerDe? Support nested structs over 24 levels. -- Key: HIVE-9500 URL: https://issues.apache.org/jira/browse/HIVE-9500 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Labels: SerDe The customer has a deeply nested Avro structure and receives the following error when performing queries. 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while customers require more than that. It would be better to make the supported level count configurable or remove the limit completely (i.e., support any number of levels). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
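Editorial note on why a hard cap exists at all: LazySimpleSerDe's text format consumes one distinct separator byte per nesting level, so a fixed separator table bounds the supported depth. A rough sketch of that constraint (illustrative byte values, not the serde's actual separator table):

```python
# Hypothetical model of a fixed separator table: one delimiter byte per
# nesting level, so a table of N entries caps nesting at N levels.
# The real serde's byte values and exact limit differ; this only
# illustrates the mechanism behind the error message quoted above.
MAX_LEVELS = 24
SEPARATORS = bytes(range(1, MAX_LEVELS + 1))  # e.g. \x01 .. \x18

def separator_for_level(level: int) -> int:
    """Delimiter byte for a 0-based nesting level."""
    if level >= MAX_LEVELS:
        raise ValueError(
            f"Number of levels of nesting supported is {MAX_LEVELS}; "
            f"unable to work with level {level + 1}")
    return SEPARATORS[level]
```

Making the table grow on demand (or configurable) is exactly what "remove the limit" would mean in this model.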
[jira] [Updated] (HIVE-9510) Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class
[ https://issues.apache.org/jira/browse/HIVE-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-9510: --- Component/s: CBO Throwing null pointer exception when getting join distinct row count from RelMdUtil.java class -- Key: HIVE-9510 URL: https://issues.apache.org/jira/browse/HIVE-9510 Project: Hive Issue Type: Bug Components: CBO Reporter: asko Assignee: Julian Hyde Attachments: log.txt, log3_cbo5 Setting the log level in logging.properties as follows: {noformat} handlers=java.util.logging.ConsoleHandler.level=INFO org.apache.calcite.plan.RelOptPlanner.level=ALL java.util.logging.ConsoleHandler.level=ALL {noformat} Running Q3 in TPCH-full after this modification, in order to test join reorder, but the run failed. QL: {code:sql} set hive.cbo.enable=true; --ANALYZE TABLE customer COMPUTE STATISTICS for columns; --ANALYZE TABLE orders COMPUTE STATISTICS for columns; --ANALYZE TABLE lineitem COMPUTE STATISTICS for columns; --Q3 -- the query select l_orderkey, sum(l_extendedprice*(1-l_discount)) as revenue, o_orderdate, o_shippriority from lineitem l join orders o on l.l_orderkey = o.o_orderkey join customer c on c.c_mktsegment = 'BUILDING' and c.c_custkey = o.o_custkey where o_orderdate < '1995-03-15' and l_shipdate > '1995-03-15' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10; {code} LOG: see log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9498) Update golden files of join38 subquery_in on trunk due to 9327
[ https://issues.apache.org/jira/browse/HIVE-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297729#comment-14297729 ] Prasanth Jayachandran commented on HIVE-9498: - +1 Update golden files of join38 subquery_in on trunk due to 9327 Key: HIVE-9498 URL: https://issues.apache.org/jira/browse/HIVE-9498 Project: Hive Issue Type: Task Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9498.patch Missed updating golden files for these tests while committing HIVE-9327 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297755#comment-14297755 ] Hive QA commented on HIVE-9487: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695341/HIVE-9487.2-spark.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7361 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/693/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/693/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-693/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695341 - PreCommit-HIVE-SPARK-Build Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297776#comment-14297776 ] Xuefu Zhang commented on HIVE-9468: --- parquet_types.q {code} 1 121 1 8 1.174970197678 2.062159062730128 --- 1 121 1 8 1.174970197678 2.0621590627301285 238c238 3 120 1 7 1.171428578240531 1.8 --- 3 120 1 7 1.171428578240531 1.7996 {code} Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport: {code} Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out 162c162 130091.0260.182 256.10355987055016 98.00.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109 --- 130091.0260.182 256.10355987055016 98.00.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
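Editorial note: last-digit differences like the ones in these diffs are characteristic of double-precision arithmetic, where addition is not associative; a distributed aggregation that combines partial results in a different order (e.g. map-side vs reduce-side) can legitimately differ in the final digits. A minimal demonstration:

```python
# Double-precision addition is not associative: regrouping the same
# three operands changes the result, because an intermediate sum
# rounds away the small term.
a, b, c = 1e16, -1e16, 1.0

left_to_right = (a + b) + c   # a + b is exactly 0.0, so this is 1.0
regrouped = a + (b + c)       # b + c rounds back to -1e16, so this is 0.0
```

This is why golden files that embed many decimal digits of a floating-point aggregate are fragile across execution engines.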
[jira] [Commented] (HIVE-9514) schematool is broken in hive 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297827#comment-14297827 ] Thejas M Nair commented on HIVE-9514: - I ran all metastore tool unit tests for this and they pass. Also manually verified schema initialization and upgrade with derby. Also ran queries with hive.metastore.schema.verification=true schematool is broken in hive 1.0.0 -- Key: HIVE-9514 URL: https://issues.apache.org/jira/browse/HIVE-9514 Project: Hive Issue Type: Bug Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.0.0 Attachments: HIVE-9514.1.patch Schematool gives following error - {code} bin/schematool -dbType derby -initSchema Starting metastore schema initialization to 1.0 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 1.0 {code} Metastore schema hasn't changed from 0.14.0 to 1.0.0. So there is no need for new .sql files for 1.0.0. However, schematool needs to be made aware of the metastore schema equivalence. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
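Editorial note: one way to express the "schema equivalence" the report asks for is a lookup that maps a release that shipped no metastore schema change onto the last version that did, so the tool can resolve "1.0.0" without new .sql scripts. A hedged sketch of that idea (Python for illustration; the function and map names are hypothetical, not the patch's actual code):

```python
# Hypothetical sketch of the fix idea: releases with no metastore schema
# change resolve to the last version that did ship schema scripts.
EQUIVALENT_VERSIONS = {
    "1.0.0": "0.14.0",  # per the JIRA: schema unchanged from 0.14.0 to 1.0.0
}

def resolve_schema_version(hive_version: str) -> str:
    """Return the schema-script version to use for a given Hive release."""
    return EQUIVALENT_VERSIONS.get(hive_version, hive_version)
```

With such a mapping, `-initSchema` on 1.0.0 would simply run the 0.14.0 scripts instead of failing with "Unknown version specified for initialization".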
[jira] [Commented] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297605#comment-14297605 ] Hive QA commented on HIVE-9103: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695326/HIVE-9103.2-spark.patch {color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 7358 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_rearrange org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join29 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/692/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/692/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-692/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 30 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695326 - PreCommit-HIVE-SPARK-Build Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch, HIVE-9103.2-spark.patch In MR, backup task can be executed if the original task, which probably contains certain (join) optimization fails. 
This JIRA tracks the same topic for Spark: we need to determine whether a backup task is needed and, if so, implement it. This is a follow-up of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
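The backup-task pattern referenced above can be illustrated with a small sketch. This is an assumption-laden simplification, not Hive's actual Task API (in Hive, backup tasks hang off of Task objects in the compiled plan); it only shows the try-the-optimized-plan-then-fall-back idea:

```java
import java.util.concurrent.Callable;

// Illustrative sketch only: run the optimized task (e.g. a map-join plan),
// and if it fails, execute the safe backup plan instead of failing the query.
public class BackupTaskRunner {
    public static <T> T runWithBackup(Callable<T> optimized, Callable<T> backup) throws Exception {
        try {
            return optimized.call();
        } catch (Exception primaryFailure) {
            // e.g. a map-join that ran out of memory: retry with the
            // unoptimized (common-join) plan
            return backup.call();
        }
    }
}
```

For Spark the open question in this JIRA is whether the engine needs the same fallback hook, since the set of join optimizations (and their failure modes) differs from MR.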
[jira] [Updated] (HIVE-9514) schematool is broken in hive 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9514: Attachment: HIVE-9514.1.patch schematool is broken in hive 1.0.0 -- Key: HIVE-9514 URL: https://issues.apache.org/jira/browse/HIVE-9514 Project: Hive Issue Type: Bug Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.0.0 Attachments: HIVE-9514.1.patch Schematool gives the following error: {code} bin/schematool -dbType derby -initSchema Starting metastore schema initialization to 1.0 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 1.0 {code} The metastore schema hasn't changed from 0.14.0 to 1.0.0, so there is no need for new .sql files for 1.0.0. However, schematool needs to be made aware of the metastore schema equivalence. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
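One way to express the schema-equivalence idea described above is a small version-mapping table that resolves a Hive release to the metastore schema scripts it should reuse. The class and method names below are hypothetical illustrations, not the actual schematool patch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: resolve a Hive release version to the metastore
// schema version whose .sql scripts it should reuse.
public class SchemaVersionResolver {
    private static final Map<String, String> EQUIVALENT = new HashMap<>();
    static {
        // 1.0.0 shipped with an unchanged metastore schema, so reuse 0.14.0
        EQUIVALENT.put("1.0.0", "0.14.0");
    }

    public static String scriptVersionFor(String hiveVersion) {
        // Versions without an equivalence entry keep their own scripts
        return EQUIVALENT.getOrDefault(hiveVersion, hiveVersion);
    }
}
```

With a lookup like this, `schematool -initSchema` for 1.0.0 would initialize against the existing 0.14.0 scripts instead of failing with "Unknown version specified for initialization".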
[jira] [Commented] (HIVE-9512) HIVE-9327 causing regression in stats annotation
[ https://issues.apache.org/jira/browse/HIVE-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297643#comment-14297643 ] Prasanth Jayachandran commented on HIVE-9512: - [~jcamachorodriguez] Any idea why? HIVE-9327 causing regression in stats annotation Key: HIVE-9512 URL: https://issues.apache.org/jira/browse/HIVE-9512 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran HIVE-9327 causes a regression in the statistics annotation test case. The regression can be seen here: https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/annotate_stats_select.q.out#L1065 The expected data size is 194, but 0 is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9512) HIVE-9327 causing regression in stats annotation
Prasanth Jayachandran created HIVE-9512: --- Summary: HIVE-9327 causing regression in stats annotation Key: HIVE-9512 URL: https://issues.apache.org/jira/browse/HIVE-9512 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran HIVE-9327 causes a regression in the statistics annotation test case. The regression can be seen here: https://github.com/apache/hive/blob/trunk/ql/src/test/results/clientpositive/annotate_stats_select.q.out#L1065 The expected data size is 194, but 0 is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297676#comment-14297676 ] Aihua Xu commented on HIVE-9500: It fails at exactly the same place as in HIVE-3253, during SerDe initialization, for both the table creation and the query. Support nested structs over 24 levels. -- Key: HIVE-9500 URL: https://issues.apache.org/jira/browse/HIVE-9500 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Labels: SerDe A customer has a deeply nested Avro structure and receives the following error when running queries: 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, but customers need more than that. It would be better to make the supported levels configurable or to remove the limit entirely (i.e., support any number of levels). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
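The limit exists because LazySimpleSerDe reserves one distinct single-byte delimiter per nesting level in its text encoding, so the level count is bounded by the pool of reserved bytes. A rough sketch of that allocation idea follows; the exact byte choices and the simple sequential scheme here are assumptions for illustration, not the SerDe's real table:

```java
// Rough sketch of per-level delimiter allocation. One distinct control byte
// is reserved per nesting level, which is why the level count is bounded.
// The byte values below are illustrative assumptions.
public class NestingSeparators {
    public static byte[] separatorsFor(int levels) {
        if (levels > 24) {
            throw new IllegalArgumentException(
                "only 24 single-byte delimiters are reserved for nesting");
        }
        byte[] seps = new byte[levels];
        for (int i = 0; i < levels; i++) {
            // \001 (Ctrl-A) for level 0, \002 for level 1, and so on
            seps[i] = (byte) (i + 1);
        }
        return seps;
    }
}
```

Making the limit configurable would mean either enlarging the delimiter pool or moving away from one-byte-per-level delimiters altogether, which is why the JIRA frames it as removing the limit rather than just raising it.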
[jira] [Updated] (HIVE-9498) Update golden files of join38 subquery_in on trunk due to 9327
[ https://issues.apache.org/jira/browse/HIVE-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9498: --- Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk. Update golden files of join38 subquery_in on trunk due to 9327 Key: HIVE-9498 URL: https://issues.apache.org/jira/browse/HIVE-9498 Project: Hive Issue Type: Task Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.0 Attachments: HIVE-9498.patch Missed updating golden files for these tests while committing HIVE-9327 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9392) JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName
[ https://issues.apache.org/jira/browse/HIVE-9392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297825#comment-14297825 ] Prasanth Jayachandran commented on HIVE-9392: - There is another case where the data size becomes 0. I suspect it is caused by HIVE-9512. JoinStatsRule miscalculates join cardinality as incorrect NDV is used due to column names having duplicated fqColumnName Key: HIVE-9392 URL: https://issues.apache.org/jira/browse/HIVE-9392 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth Jayachandran Priority: Critical Fix For: 0.15.0 Attachments: HIVE-9392.1.patch, HIVE-9392.2.patch In JoinStatsRule.process, the join column statistics are stored in the HashMap joinedColStats. The key used, ColStatistics.fqColName, is duplicated between join columns in the same vertex; as a result, distinctVals ends up with duplicated values, which negatively affects the join cardinality estimation. The duplicate keys are usually named KEY.reducesinkkey0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
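The collision described above can be reproduced with a plain HashMap: when two join keys share the fully-qualified name "KEY.reducesinkkey0", the second put() silently overwrites the first, so the estimator sees only one NDV. The class, field names, and the qualified-key remedy below are illustrative assumptions, not the committed fix:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative reproduction of the bug pattern: two different join keys
// share the name "KEY.reducesinkkey0", so the map keeps only one NDV and
// the join cardinality estimate uses the wrong value.
public class FqColNameCollision {
    public static Map<String, Long> collide() {
        Map<String, Long> joinedColStats = new HashMap<>();
        joinedColStats.put("KEY.reducesinkkey0", 10000L); // NDV of the left join key
        joinedColStats.put("KEY.reducesinkkey0", 5L);     // right join key clobbers it
        return joinedColStats;
    }

    // One possible remedy (an assumption, not the actual patch): qualify the
    // key with the producing operator's id so entries stay distinct.
    public static String qualify(String operatorId, String fqColName) {
        return operatorId + ":" + fqColName;
    }
}
```

After the overwrite, a cardinality formula dividing by the surviving NDV (5 instead of 10000) would overestimate the join output by orders of magnitude, which matches the symptom described in the issue.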