[jira] [Updated] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Kumar Singh updated HIVE-6806:
-------------------------------------
    Release Note: Adds support for inferring the Avro schema from the Hive table schema. Avro-backed tables can now be created simply by using STORED AS AVRO in the DDL statement. AvroSerDe takes care of creating the appropriate Avro schema from the Hive table schema, a big win for Avro usability in Hive.

CREATE TABLE should support STORED AS AVRO
------------------------------------------
                Key: HIVE-6806
                URL: https://issues.apache.org/jira/browse/HIVE-6806
            Project: Hive
         Issue Type: New Feature
         Components: Serializers/Deserializers
   Affects Versions: 0.12.0
           Reporter: Jeremy Beard
           Assignee: Ashish Kumar Singh
           Priority: Minor
             Labels: Avro, TODOC14
            Fix For: 0.14.0
        Attachments: HIVE-6806.1.patch, HIVE-6806.2.patch, HIVE-6806.3.patch, HIVE-6806.patch

Avro is well established and widely used within Hive; however, creating Avro-backed tables requires a messy listing of the SerDe, InputFormat and OutputFormat classes. Similarly to HIVE-5783 for Parquet, Hive would be easier to use if it had native Avro support.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
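To make the release note concrete, here is the before-and-after DDL this feature enables, as described in the AvroSerDe wiki page linked below; the table and column names are illustrative only:

```sql
-- Before HIVE-6806: all three classes had to be spelled out explicitly.
CREATE TABLE episodes (title STRING, air_date STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat';

-- With HIVE-6806 (Hive 0.14.0+): the same table, with the Avro schema
-- inferred from the Hive table schema.
CREATE TABLE episodes (title STRING, air_date STRING)
STORED AS AVRO;
```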
[jira] [Commented] (HIVE-6806) CREATE TABLE should support STORED AS AVRO
[ https://issues.apache.org/jira/browse/HIVE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088879#comment-14088879 ]

Ashish Kumar Singh commented on HIVE-6806:
------------------------------------------
[~leftylev] Thanks for the detailed info here. I have updated the JIRA's release note and the documentation on Avro usage in Hive: https://cwiki.apache.org/confluence/display/Hive/AvroSerDe. Feel free to make it better.
[jira] [Commented] (HIVE-7618) TestDDLWithRemoteMetastoreSecondNamenode unit test failure
[ https://issues.apache.org/jira/browse/HIVE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088923#comment-14088923 ]

Hive QA commented on HIVE-7618:
-------------------------------
{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12660244/HIVE-7618.2.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5883 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/201/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/201/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-201/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12660244

TestDDLWithRemoteMetastoreSecondNamenode unit test failure
----------------------------------------------------------
                Key: HIVE-7618
                URL: https://issues.apache.org/jira/browse/HIVE-7618
            Project: Hive
         Issue Type: Bug
         Components: Tests
           Reporter: Jason Dere
           Assignee: Jason Dere
        Attachments: HIVE-7618.1.patch, HIVE-7618.2.patch

Looks like TestDDLWithRemoteMetastoreSecondNamenode started failing after HIVE-6584 was committed.
{noformat}
TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode:272->createTableAndCheck:201->createTableAndCheck:219 Table should be located in the second filesystem expected:[hdfs] but was:[pfile]
{noformat}
[jira] [Commented] (HIVE-7223) Support generic PartitionSpecs in Metastore partition-functions
[ https://issues.apache.org/jira/browse/HIVE-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088926#comment-14088926 ]

Hive QA commented on HIVE-7223:
-------------------------------
{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12660271/HIVE-7223.1.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/202/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/202/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-202/

Messages:
{noformat}
This message was trimmed, see log for full details
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-metastore: Compilation failure: Compilation failure:
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[117,44] cannot find symbol
[ERROR]   symbol:   class PartitionListComposingSpec
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[118,44] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[119,44] cannot find symbol
[ERROR]   symbol:   class PartitionSpecWithSharedSD
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[120,44] cannot find symbol
[ERROR]   symbol:   class PartitionWithoutSD
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/partition/spec/PartitionSpecProxy.java:[5,44] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[1990,48] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[2004,58] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[2719,17] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[2779,18] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[2844,93] cannot find symbol
[ERROR]   symbol:   class PartitionWithoutSD
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[2844,13] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[4073,17] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.HiveMetaStore.HMSHandler
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/partition/spec/PartitionSpecProxy.java:[55,24] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: class org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy
[ERROR] /data/hive-ptest/working/apache-svn-trunk-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:[100,44] cannot find symbol
[ERROR]   symbol:   class PartitionSpec
[ERROR]   location: package org.apache.hadoop.hive.metastore.api
[ERROR]
Re: Review Request 23799: HIVE-7390: refactor CSV output format in RFC mode and add one more option to support formatting as the CSV format in the Hive CLI
On Aug. 6, 2014, 5:50 a.m., Lars Francke wrote:
> I don't think this latest patch is the one you wanted to upload?

I rebased the code on the latest trunk, and it has some differences compared to v6.

- cheng

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23799/#review49698
-----------------------------------------------------------

On Aug. 6, 2014, 5:41 a.m., cheng xu wrote:
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23799/
-----------------------------------------------------------
(Updated Aug. 6, 2014, 5:41 a.m.)

Review request for hive.

Bugs: HIVE-7390
    https://issues.apache.org/jira/browse/HIVE-7390

Repository: hive-git

Description
-----------
HIVE-7390: refactor CSV output format in RFC mode and add one more option to support formatting as the CSV format in the Hive CLI

Diffs
-----
  beeline/pom.xml 6ec1d1aff3f35c097aa6054aae84faf2d63854f1
  beeline/src/java/org/apache/hive/beeline/BeeLine.java 10fd2e2daac78ca43d45c74fcbad6b720a8d28ad
  beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java a1e07a090c222be42a381fa2052f0bda97867432
  beeline/src/java/org/apache/hive/beeline/SeparatedValuesOutputFormat.java 7853c3f38f3c3fb9ae0b9939c714f1dc940ba053
  beeline/src/main/resources/BeeLine.properties ddb0ba74d4bd3f0255eeb7d791eb28108c474ae1
  itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java bd97aff5959fd9040fc0f0a1f6b782f2aa6f
  pom.xml b3216e1ea4abcf692a933024d415754322210d59

Diff: https://reviews.apache.org/r/23799/diff/

Testing
-------

Thanks,
cheng xu
[jira] [Created] (HIVE-7642) Set hive input format by configuration.
Chengxiang Li created HIVE-7642:
-----------------------------------
             Summary: Set hive input format by configuration.
                 Key: HIVE-7642
                 URL: https://issues.apache.org/jira/browse/HIVE-7642
             Project: Hive
          Issue Type: Bug
          Components: Spark
            Reporter: Chengxiang Li
            Assignee: Chengxiang Li

Currently the Hive input format is hard-coded to HiveInputFormat; we should set this parameter from the configuration.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (HIVE-7642) Set hive input format by configuration.
[ https://issues.apache.org/jira/browse/HIVE-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated HIVE-7642:
--------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-7642) Set hive input format by configuration.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated HIVE-7642:
--------------------------------
    Summary: Set hive input format by configuration.[Spark Branch]  (was: Set hive input format by configuration.)
[jira] [Updated] (HIVE-7642) Set hive input format by configuration.
[ https://issues.apache.org/jira/browse/HIVE-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated HIVE-7642:
--------------------------------
    Attachment: HIVE-7642.1-spark.patch
Review Request 24445: HIVE-7642, Set hive input format by configuration.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24445/
-----------------------------------------------------------

Review request for hive, Brock Noland and Szehon Ho.

Bugs: HIVE-7642
    https://issues.apache.org/jira/browse/HIVE-7642

Repository: hive-git

Description
-----------
Currently the Hive input format is hard-coded to HiveInputFormat; we should set this parameter from the configuration.

Diffs
-----
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 45eff67

Diff: https://reviews.apache.org/r/24445/diff/

Testing
-------

Thanks,
chengxiang li
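The fix being reviewed is the standard read-with-default pattern: consult the configuration first and fall back to a default class name only when nothing is set, instead of hard-coding the class. The sketch below illustrates the idea with plain `java.util.Properties` rather than Hive's actual `JobConf`/`HiveConf` plumbing; the key name mirrors Hive's `hive.input.format` setting, but the class and method names here are hypothetical:

```java
import java.util.Properties;

public class InputFormatConfig {
    // Key and fallback mirroring Hive's "hive.input.format" setting
    // (illustrative only; not the real SparkPlanGenerator code).
    static final String KEY = "hive.input.format";
    static final String DEFAULT = "org.apache.hadoop.hive.ql.io.HiveInputFormat";

    // Prefer the user-configured value; fall back to the default class name.
    static String resolveInputFormat(Properties conf) {
        return conf.getProperty(KEY, DEFAULT);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing configured: the hard-coded behavior becomes a mere default.
        System.out.println(resolveInputFormat(conf));
        // Configured: the user's choice wins.
        conf.setProperty(KEY, "org.apache.hadoop.hive.ql.io.CombineHiveInputFormat");
        System.out.println(resolveInputFormat(conf));
    }
}
```

The point of the patch is exactly this inversion: the class name becomes data read from the job configuration rather than a compile-time constant.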
[jira] [Commented] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088968#comment-14088968 ]

Lefty Leverenz commented on HIVE-6578:
--------------------------------------
This adds the configuration parameter *hive.stats.gather.num.threads*, which is documented in the wiki here:
* [Configuration Properties -- hive.stats.gather.num.threads | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.gather.num.threads]

Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
-----------------------------------------------------------------------------------------------
                Key: HIVE-6578
                URL: https://issues.apache.org/jira/browse/HIVE-6578
            Project: Hive
         Issue Type: New Feature
   Affects Versions: 0.13.0
           Reporter: Prasanth J
           Assignee: Prasanth J
             Labels: TODOC13, orcfile
            Fix For: 0.13.0
        Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt

ORC provides file-level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which, when implemented, should provide stats that are gathered by the underlying file format.
[jira] [Commented] (HIVE-5425) Provide a configuration option to control the default stripe size for ORC
[ https://issues.apache.org/jira/browse/HIVE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088978#comment-14088978 ]

Lefty Leverenz commented on HIVE-5425:
--------------------------------------
The ORC parameters have their own section in Configuration Properties now, and *hive.exec.orc.default.stripe.size* is documented here:
* [Configuration Properties -- hive.exec.orc.default.stripe.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.stripe.size]
* [Configuration Properties -- ORC File Format | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat]

Provide a configuration option to control the default stripe size for ORC
-------------------------------------------------------------------------
                Key: HIVE-5425
                URL: https://issues.apache.org/jira/browse/HIVE-5425
            Project: Hive
         Issue Type: Bug
         Components: File Formats
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley
             Labels: TODOC13
            Fix For: 0.13.0
        Attachments: D13233.1.patch

We should provide a configuration option to control the default stripe size.
[jira] [Updated] (HIVE-5425) Provide a configuration option to control the default stripe size for ORC
[ https://issues.apache.org/jira/browse/HIVE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-5425:
---------------------------------
    Labels:   (was: TODOC13)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-6578:
---------------------------------
    Labels: orcfile  (was: TODOC13 orcfile)
[jira] [Assigned] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li reassigned HIVE-7624:
----------------------------
    Assignee: Rui Li

Reduce operator initialization failed when running multiple MR query on spark
-----------------------------------------------------------------------------
                Key: HIVE-7624
                URL: https://issues.apache.org/jira/browse/HIVE-7624
            Project: Hive
         Issue Type: Bug
         Components: Spark
           Reporter: Rui Li
           Assignee: Rui Li

The following error occurs when I try to run a query with multiple reduce works (M-R-R):
{quote}
14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1)
java.lang.RuntimeException: Reduce operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53)
    at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:54)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from [0:_col0]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
    ...
{quote}
I suspect we're applying the reduce functions in the wrong order.
[jira] [Commented] (HIVE-7490) Revert ORC stripe size
[ https://issues.apache.org/jira/browse/HIVE-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088985#comment-14088985 ]

Lefty Leverenz commented on HIVE-7490:
--------------------------------------
*hive.exec.orc.default.stripe.size* is documented in Configuration Properties with its change of default value in 0.14.0:
* [Configuration Properties -- hive.exec.orc.default.stripe.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.stripe.size]

Revert ORC stripe size
----------------------
                Key: HIVE-7490
                URL: https://issues.apache.org/jira/browse/HIVE-7490
            Project: Hive
         Issue Type: Bug
   Affects Versions: 0.14.0
           Reporter: Prasanth J
           Assignee: Prasanth J
           Priority: Trivial
             Labels: orcfile
            Fix For: 0.14.0
        Attachments: HIVE-7490.1.patch

HIVE-6037 reverted the changes to ORC stripe size introduced by HIVE-7231.
[jira] [Updated] (HIVE-7490) Revert ORC stripe size
[ https://issues.apache.org/jira/browse/HIVE-7490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-7490:
---------------------------------
    Labels: orcfile  (was: TODOC14 orcfile)
[jira] [Updated] (HIVE-7600) ConstantPropagateProcFactory uses reference equality on Boolean
[ https://issues.apache.org/jira/browse/HIVE-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

KangHS updated HIVE-7600:
-------------------------
    Attachment: HIVE-7600.patch

Changed the comparison method for Boolean values in:
- shortcutFunction()
- process()

ConstantPropagateProcFactory uses reference equality on Boolean
---------------------------------------------------------------
                Key: HIVE-7600
                URL: https://issues.apache.org/jira/browse/HIVE-7600
            Project: Hive
         Issue Type: Bug
           Reporter: Ted Yu
        Attachments: HIVE-7600.patch

shortcutFunction() has the following code:
{code}
if (c.getValue() == Boolean.FALSE) {
{code}
Boolean.FALSE.equals() should be used. There are a few other occurrences of reference equality on Boolean in this class.
[jira] [Updated] (HIVE-7600) ConstantPropagateProcFactory uses reference equality on Boolean
[ https://issues.apache.org/jira/browse/HIVE-7600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

KangHS updated HIVE-7600:
-------------------------
    Status: Patch Available  (was: Open)
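The pitfall behind HIVE-7600 is easy to demonstrate in isolation. The snippet below is a minimal standalone illustration, not Hive code: `==` against `Boolean.FALSE` compares object identity, so a logically false `Boolean` that is a distinct object (e.g. produced by deserialization or `new Boolean(...)`) slips through, while `Boolean.FALSE.equals(...)` compares values:

```java
public class BooleanEquality {
    public static void main(String[] args) {
        // A Boolean equal to FALSE but not the same object as the
        // canonical cached Boolean.FALSE instance. (new Boolean(...) is
        // deprecated in modern Java, which is itself a hint that distinct
        // boxed instances are a trap.)
        Boolean fromElsewhere = new Boolean(false);

        // Reference comparison: only matches the canonical instance.
        System.out.println(fromElsewhere == Boolean.FALSE);      // false

        // Value comparison, as the patch uses: matches any false Boolean.
        System.out.println(Boolean.FALSE.equals(fromElsewhere)); // true
    }
}
```

Putting the constant first (`Boolean.FALSE.equals(x)`) has the added benefit of being null-safe: it returns false for a null `x` instead of throwing a NullPointerException.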
[jira] [Updated] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-7231:
---------------------------------
    Labels: orcfile  (was: TODOC14 orcfile)

Improve ORC padding
-------------------
                Key: HIVE-7231
                URL: https://issues.apache.org/jira/browse/HIVE-7231
            Project: Hive
         Issue Type: Improvement
         Components: File Formats
   Affects Versions: 0.14.0
           Reporter: Prasanth J
           Assignee: Prasanth J
             Labels: orcfile
            Fix For: 0.14.0
        Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch

Current ORC padding is not optimal because of fixed stripe sizes within a block; the padding overhead can be significant in some cases. Also, the padding percentage relative to the stripe size is not configurable.
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088990#comment-14088990 ]

Lefty Leverenz commented on HIVE-7231:
--------------------------------------
The wiki now documents *hive.exec.orc.default.block.size*, *hive.exec.orc.block.padding.tolerance*, and the changed default for *hive.exec.orc.default.stripe.size* in 0.14.0:
* [Configuration Properties -- hive.exec.orc.default.block.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.block.size]
* [Configuration Properties -- hive.exec.orc.block.padding.tolerance | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.block.padding.tolerance]
* [Configuration Properties -- hive.exec.orc.default.stripe.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.orc.default.stripe.size]
[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088993#comment-14088993 ]

Chao commented on HIVE-7624:
----------------------------
Hi [~ruili], I spent some time looking at this bug today. What I found is that, even with cloned JobConfs, {{Utilities.setBaseWork}} will still create the same {{planPath}} for different reduce plans, so only one reduce plan is left. I think we need to find some way to allow multiple reduce plan files to co-exist. Hope this helps.
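Chao's diagnosis (two reduce plans serialized under the same planPath, so the second write clobbers the first) can be sketched generically. In the toy model below, a map stands in for the scratch directory; all path and plan names are hypothetical illustrations, not Hive's actual layout:

```java
import java.util.HashMap;
import java.util.Map;

public class PlanPathCollision {
    public static void main(String[] args) {
        // Model the scratch directory as a map from plan path to
        // serialized plan (illustrative only).
        Map<String, String> scratchDir = new HashMap<>();

        // Both reduce works serialized under the same fixed name:
        // the second put overwrites the first, so one plan is lost.
        scratchDir.put("scratch/reduce.xml", "plan for Reducer 1");
        scratchDir.put("scratch/reduce.xml", "plan for Reducer 2");
        System.out.println(scratchDir.size()); // 1

        // Versus a per-work path, which lets both plans co-exist,
        // which is the kind of fix the comment suggests.
        scratchDir.clear();
        scratchDir.put("scratch/Reducer_1/reduce.xml", "plan for Reducer 1");
        scratchDir.put("scratch/Reducer_2/reduce.xml", "plan for Reducer 2");
        System.out.println(scratchDir.size()); // 2
    }
}
```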
[jira] [Commented] (HIVE-7532) allow disabling direct sql per query with external metastore
[ https://issues.apache.org/jira/browse/HIVE-7532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088994#comment-14088994 ]

Hive QA commented on HIVE-7532:
-------------------------------
{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12660281/HIVE-7532.5.patch.txt

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5883 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/203/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/203/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-203/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12660281

allow disabling direct sql per query with external metastore
------------------------------------------------------------
                Key: HIVE-7532
                URL: https://issues.apache.org/jira/browse/HIVE-7532
            Project: Hive
         Issue Type: Improvement
           Reporter: Sergey Shelukhin
           Assignee: Navis
        Attachments: HIVE-7532.1.patch.txt, HIVE-7532.2.nogen, HIVE-7532.2.patch.txt, HIVE-7532.3.patch.txt, HIVE-7532.4.patch.txt, HIVE-7532.5.patch.txt

Currently, with an external metastore, direct SQL can only be disabled globally via the metastore config. Perhaps it makes sense to have the ability to propagate the setting per query from the client to override the metastore setting, e.g. if one particular query causes it to fail.
[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running a multi-MR query on Spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089001#comment-14089001 ] Rui Li commented on HIVE-7624: -- Thanks very much [~csun]. After some debugging, I found that this issue originates in GenMapRedUtils.setKeyAndValueDescForTaskTree, which is called after the task is compiled. In that method we always set the keyDesc of the leaf reduce work according to the root map work. I believe this is both incorrect and redundant, because when a reduce work is created we already call GenSparkUtils.setupReduceSink to set the keyDesc. I removed that code and the exception is gone. However, I ran into another problem: no result is returned for the multi-MR query. (I cloned the jobConf and set a new plan path for the clone.) Reduce operator initialization failed when running a multi-MR query on Spark - Key: HIVE-7624 URL: https://issues.apache.org/jira/browse/HIVE-7624 Project: Hive Issue Type: Bug Components: Spark Reporter: Rui Li Assignee: Rui Li The following error occurs when I try to run a query with multiple reduce works (M-R-R): {quote}
14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1)
java.lang.RuntimeException: Reduce operator initialization failed
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170)
at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53)
at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164)
at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:54)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from [0:_col0]
at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147)
…
{quote} I suspect we're applying the reduce function in the wrong order. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7553) avoid scheduling a maintenance window for every jar change
[ https://issues.apache.org/jira/browse/HIVE-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089008#comment-14089008 ] Ferdinand Xu commented on HIVE-7553: I am now working on this issue, but before putting up a patch I want to present the approach so that I can get some feedback. To my understanding, this issue is about hot-swapping the jar files on HIVE_AUX_JARS_PATH. I did some POCs locally. The main workflow of the jar loader is as follows:
1. Read the env variable HIVE_AUX_JARS_PATH from the system and parse it, adding the jar files under the directory one by one.
2. The system classloader loads the jar files from step 1.
3. When creating a UDF based on the added aux jars, FunctionTask tries to get the class from the loaded classes by calling the getUdfClass method.
In my view, the key to solving this problem is the classloader. The classloader has some limitations which appear to be by design:
a) When resolving a class, it first checks whether the parent classloader has already loaded it, and only then the current classloader.
b) The classloader provides no mechanism for us to reload a cached class.
Based on this, I have come up with solutions in three categories:
1. Change the order of loading classes. As mentioned in section 1, the auxiliary classpath is parsed and loaded when HiveServer2 boots up. Can we postpone the loading phase until needed, i.e. load on the fly when creating a UDF? In addition, reloading cached jars is a thorny issue. To resolve it, we could create a new classloader each time, then call Thread.setContextClassLoader(refreshedCL) and HiveConf.setClassLoader(refreshedCL).
2. Override the standard classloader. Keep the current loading order, but make the classloader check the child classloader first and then the parent (this still requires creating a new classloader on the fly).
3. Others: use OSGi? JRebel?
If I have anything wrong, please feel free to point it out. Thanks! avoid scheduling a maintenance window for every jar change Key: HIVE-7553 URL: https://issues.apache.org/jira/browse/HIVE-7553 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Ferdinand Xu Assignee: Ferdinand Xu When a user needs to refresh an existing jar or add a new one to HS2, HS2 must be restarted. As HS2 is a service exposed to clients, this requires scheduling a maintenance window for every jar change. It would be great if we could avoid that. -- This message was sent by Atlassian JIRA (v6.2#6252)
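Option 1 in the comment above can be sketched in plain Java: build a fresh URLClassLoader over whatever jars are currently in the aux directory and swap it in as the thread context classloader. A new loader instance is what makes "reloading" possible, since a ClassLoader never re-reads a class it has already defined. This is a minimal illustration of the idea, not Hive's actual implementation; the HiveConf.setClassLoader call mentioned in the comment is Hive-specific and appears here only as a comment.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class AuxJarReloader {
    // Build a brand-new classloader over the jars currently in the aux dir.
    // A new loader is required because an existing ClassLoader cannot
    // reload a class it has already cached.
    static ClassLoader refreshedLoader(Path auxJarDir, ClassLoader parent) throws Exception {
        List<URL> urls = new ArrayList<>();
        if (Files.isDirectory(auxJarDir)) {
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(auxJarDir, "*.jar")) {
                for (Path jar : jars) {
                    urls.add(jar.toUri().toURL());
                }
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        Path auxDir = Files.createTempDirectory("aux-jars"); // stand-in for HIVE_AUX_JARS_PATH
        ClassLoader old = Thread.currentThread().getContextClassLoader();
        ClassLoader refreshed = refreshedLoader(auxDir, old);
        // Swap in the new loader; in Hive one would also call
        // HiveConf.setClassLoader(refreshed) so later lookups see it.
        Thread.currentThread().setContextClassLoader(refreshed);
        System.out.println(Thread.currentThread().getContextClassLoader() == refreshed);
    }
}
```

Classes loaded through the old loader remain cached there; only lookups going through the refreshed loader pick up changed jars, which is why the context classloader (and any Hive-side reference) must be swapped together.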
[jira] [Updated] (HIVE-7521) Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process()
[ https://issues.apache.org/jira/browse/HIVE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KangHS updated HIVE-7521: - Attachment: HIVE-7521.patch Changed the comparison method for the Boolean value. Reference equality is used on Boolean in NullScanOptimizer#WhereFalseProcessor#process() Key: HIVE-7521 URL: https://issues.apache.org/jira/browse/HIVE-7521 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7521.patch {code} ExprNodeConstantDesc c = (ExprNodeConstantDesc) condition; if (c.getValue() != Boolean.FALSE) { return null; } {code} equals() should be called instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
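The pitfall is easy to demonstrate in isolation: != on a Boolean compares references, so a Boolean produced by deserialization or an explicit constructor never matches the Boolean.FALSE singleton even when its value is false. A minimal sketch, where the local variable value stands in for ExprNodeConstantDesc.getValue():

```java
public class BooleanEqualityDemo {
    public static void main(String[] args) {
        // A false value that is NOT the Boolean.FALSE singleton, e.g. one
        // produced by deserialization or an explicit constructor call.
        Object value = new Boolean(false);

        // Reference comparison, as in the original code: wrongly "not false".
        boolean byReference = (value != Boolean.FALSE);

        // Value comparison, as the patch suggests: correctly seen as false.
        boolean byEquals = !Boolean.FALSE.equals(value);

        System.out.println(byReference + " " + byEquals);
    }
}
```

Note that Boolean.FALSE.equals(value) is also null-safe, which c.getValue().equals(Boolean.FALSE) would not be.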
[jira] [Created] (HIVE-7643) ExecMapper static state leads to unpredictable query results [Spark Branch]
Chengxiang Li created HIVE-7643: --- Summary: ExecMapper static state leads to unpredictable query results [Spark Branch] Key: HIVE-7643 URL: https://issues.apache.org/jira/browse/HIVE-7643 Project: Hive Issue Type: Bug Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7643) ExecMapper static state leads to unpredictable query results [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-7643: Description: ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. To reproduce, execute select count(*) from test tablesample(1 rows) s; test should be a table whose source data spans several blocks. (was: ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results.) ExecMapper static state leads to unpredictable query results [Spark Branch] -- Key: HIVE-7643 URL: https://issues.apache.org/jira/browse/HIVE-7643 Project: Hive Issue Type: Bug Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. To reproduce, execute select count(*) from test tablesample(1 rows) s; test should be a table whose source data spans several blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7643) ExecMapper static state leads to unpredictable query results [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-7643: Description: ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. To reproduce, execute {code:sql} SELECT COUNT(*) FROM TEST TABLESAMPLE(1 ROWS) s {code}; TEST should be a table whose source data spans several blocks. was: ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. To reproduce, execute select count(*) from test tablesample(1 rows) s; test should be a table whose source data spans several blocks. ExecMapper static state leads to unpredictable query results [Spark Branch] -- Key: HIVE-7643 URL: https://issues.apache.org/jira/browse/HIVE-7643 Project: Hive Issue Type: Bug Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li ExecMapper contains static state, the static variable done for example. A Spark executor may execute multiple tasks concurrently, and ExecMapper static state updated by one task can influence the logic of another task, which may lead to unpredictable results. To reproduce, execute {code:sql} SELECT COUNT(*) FROM TEST TABLESAMPLE(1 ROWS) s {code}; TEST should be a table whose source data spans several blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
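The hazard described in HIVE-7643 can be reproduced without Spark: if two "tasks" in the same JVM share a static done flag, the first task finishing makes the second stop early. This is a simplified sketch of the failure mode, not ExecMapper's actual code:

```java
public class StaticStateDemo {
    // Shared across all "tasks" in the same JVM, like ExecMapper's static flag.
    static boolean done = false;

    // Pretend each call is a task processing `rows` input rows.
    static int runTask(int rows) {
        int processed = 0;
        for (int i = 0; i < rows; i++) {
            if (done) {
                break;       // another task's state leaks in here
            }
            processed++;
        }
        done = true;         // e.g. set when the task hits its row limit
        return processed;
    }

    public static void main(String[] args) {
        // Two tasks run in the same executor JVM. The first sets the static
        // flag; the second then processes zero rows and the count is wrong.
        System.out.println(runTask(5) + " " + runTask(5));
    }
}
```

Making done an instance field (one ExecMapper-like object per task) removes the interference, since each task then owns its own flag.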
[jira] [Resolved] (HIVE-42) Enable findbugs and hive tests from hadoop build.xml
[ https://issues.apache.org/jira/browse/HIVE-42?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Francke resolved HIVE-42. -- Resolution: Invalid Invalid at least since the switch to Maven Enable findbugs and hive tests from hadoop build.xml Key: HIVE-42 URL: https://issues.apache.org/jira/browse/HIVE-42 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Ashish Thusoo Enable findbugs on hive code and also enable hive tests to be run as part of hadoop tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7421: --- Attachment: (was: TestWithORC.zip) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate --- Key: HIVE-7421 URL: https://issues.apache.org/jira/browse/HIVE-7421 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7421.1.patch One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7421: --- Attachment: (was: fail_62.sql) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate --- Key: HIVE-7421 URL: https://issues.apache.org/jira/browse/HIVE-7421 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7421.1.patch One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7421: --- Attachment: (was: fail_47.sql) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate --- Key: HIVE-7421 URL: https://issues.apache.org/jira/browse/HIVE-7421 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7421.1.patch One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at 
org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7421: --- Description: One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at 
org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} was: One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Query: {code} SELECT FLOOR((7 + DATEDIFF(`Staples`.`order_date_`, CONCAT(CAST(YEAR(`Staples`.`order_date_`) AS STRING), '-01-01 00:00:00')) +pmod(8 + pmod(datediff(CONCAT(CAST(YEAR(`Staples`.`order_date_`) AS STRING), '-01-01 00:00:00'), '1995-01-01'), 7) - 2, 7) ) / 7) AS `wk_order_date_ok`, SUM(`Staples`.`sales_total`) AS `sum_sales_total_ok` FROM `default`.`testv1_Staples` `Staples` GROUP BY FLOOR((7 + DATEDIFF(`Staples`.`order_date_`, CONCAT(CAST(YEAR(`Staples`.`order_date_`) AS STRING), '-01-01 00:00:00')) +pmod(8 + pmod(datediff(CONCAT(CAST(YEAR(`Staples`.`order_date_`) AS STRING), '-01-01 00:00:00'), '1995-01-01'), 7) - 2, 7) ) / 7) ; {code} Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7421: --- Description: One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at 
org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43) {code} was: One of several found by Raj Bains. M/R or Tez. {code} set hive.vectorized.execution.enabled=true; {code} Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649 Stack trace: {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190) at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52) at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112) at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47) at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147) at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289) at
[jira] [Updated] (HIVE-7421) Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate
[ https://issues.apache.org/jira/browse/HIVE-7421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7421:
-------------------------------
    Attachment: (was: fail_932.sql)

Make VectorUDFDateString use the same date parsing and formatting as GenericUDFDate

Key: HIVE-7421
URL: https://issues.apache.org/jira/browse/HIVE-7421
Project: Hive
Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Attachments: HIVE-7421.1.patch

One of several found by Raj Bains. M/R or Tez.
{code}
set hive.vectorized.execution.enabled=true;
{code}
Seems very similar to https://issues.apache.org/jira/browse/HIVE-6649
Stack trace:
{code}
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setConcat(BytesColumnVector.java:190)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.StringConcatColScalar.evaluate(StringConcatColScalar.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorUDFDateDiffColCol.evaluate(VectorUDFDateDiffColCol.java:59)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongScalarAddLongColumn.evaluate(LongScalarAddLongColumn.java:65)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.LongColAddLongColumn.evaluate(LongColAddLongColumn.java:52)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.LongColDivideLongScalar.evaluate(LongColDivideLongScalar.java:52)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:112)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FuncFloorDoubleToLong.evaluate(FuncFloorDoubleToLong.java:47)
    at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:147)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:289)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)
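The top two frames of the trace above show BytesColumnVector.setConcat copying string bytes with System.arraycopy, and the NullPointerException coming out of the native arraycopy call itself, which happens when a source byte array is null. The following plain-Java sketch (not Hive's actual code; the concat helper and its signature are invented for illustration) reproduces that shape: lengths come from separate metadata, so a NULL value's buffer can still be null when the copy runs.

```java
public class NullConcatSketch {
    // Hypothetical analogue of a setConcat-style copy: the byte lengths
    // come from batch metadata, so a NULL string's backing buffer can be
    // null even when its recorded length is 0. System.arraycopy throws
    // NullPointerException for a null source array regardless of length.
    static byte[] concat(byte[] left, int leftLen, byte[] right, int rightLen) {
        byte[] out = new byte[leftLen + rightLen];
        System.arraycopy(left, 0, out, 0, leftLen);          // NPE if left == null
        System.arraycopy(right, 0, out, leftLen, rightLen);  // NPE if right == null
        return out;
    }

    public static void main(String[] args) {
        try {
            // A NULL string slot: null buffer, length 0.
            concat(null, 0, " ".getBytes(), 1);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from arraycopy");
        }
    }
}
```

A guard on the null/isNull state before copying (or propagating NULL instead of concatenating) avoids the crash; the sketch only shows why the failure surfaces inside arraycopy rather than in Hive code.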
[jira] [Updated] (HIVE-7422) Array out of bounds exception involving ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble
[ https://issues.apache.org/jira/browse/HIVE-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7422:
-------------------------------
    Attachment: (was: fail_119.sql)

Array out of bounds exception involving ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble

Key: HIVE-7422
URL: https://issues.apache.org/jira/browse/HIVE-7422
Project: Hive
Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Fix For: 0.14.0
Attachments: HIVE-7422.1.patch, HIVE-7422.2.patch, HIVE-7422.3.patch

One of several found by Raj Bains. M/R or Tez.
{code}
set hive.vectorized.execution.enabled=true;
{code}
Stack trace:
{code}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 50
    at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.CastLongToDouble.evaluate(CastLongToDouble.java:50)
    at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble.aggregateInputSelection(VectorUDAFAvgDouble.java:139)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:295)
    at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:711)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
{code}
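The exception index (50) matches the literal in the failing query's AVG(CAST(50 AS DOUBLE)), suggesting the constant operand ends up indexing an array it should not. One plausible failure mode, sketched below in plain Java, is a cast loop writing past an output vector that was sized too small for the batch; this is an invented illustration, not Hive's actual CastLongToDouble code, and castLongToDouble is a hypothetical name.

```java
public class AvgCastSketch {
    // Hypothetical sketch, not Hive code: a vectorized long-to-double cast
    // writes one double per row. If the output vector is allocated with
    // only 50 slots for a 64-row batch, the write at row index 50 runs
    // past the array end, surfacing as ArrayIndexOutOfBoundsException: 50.
    static void castLongToDouble(long[] in, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = (double) in[i]; // throws once i reaches out.length
        }
    }

    public static void main(String[] args) {
        long[] batch = new long[64];
        double[] undersized = new double[50]; // mis-sized output vector
        try {
            castLongToDouble(batch, undersized, batch.length);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("ArrayIndexOutOfBoundsException at row 50");
        }
    }
}
```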
[jira] [Updated] (HIVE-7422) Array out of bounds exception involving ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble
[ https://issues.apache.org/jira/browse/HIVE-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7422:
-------------------------------
    Description: (the query below was removed; the repro setting and stack trace are unchanged)
    was:
Query:
{code}
SELECT `Starbucks`.`product` AS `none_product_nk`,
  AVG(CAST(50 AS DOUBLE)) AS `avg_x_ok`,
  AVG(CAST(50 AS DOUBLE)) AS `avg_y_ok`
FROM `default`.`testv1_Starbucks` `Starbucks`
GROUP BY `Starbucks`.`product`;
{code}
[jira] [Updated] (HIVE-7422) Array out of bounds exception involving ql.exec.vector.expressions.aggregates.gen.VectorUDAFAvgDouble
[ https://issues.apache.org/jira/browse/HIVE-7422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7422:
-------------------------------
    Attachment: (was: TestWithORC.zip)
[jira] [Updated] (HIVE-7426) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
[ https://issues.apache.org/jira/browse/HIVE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7426:
-------------------------------
    Attachment: (was: fail_856.sql)

ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate

Key: HIVE-7426
URL: https://issues.apache.org/jira/browse/HIVE-7426
Project: Hive
Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Fix For: 0.14.0
Attachments: HIVE-7426.1.patch, HIVE-7426.2.patch, HIVE-7426.3.patch

One of several found by Raj Bains. M/R or Tez. Query does not vectorize, so this is not vector related.
Stack Trace:
{code}
Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBasePad.evaluate(GenericUDFBasePad.java:65)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.evaluate(GenericUDFToUnixTimeStamp.java:121)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUnixTimeStamp.evaluate(GenericUDFUnixTimeStamp.java:52)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:177)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:78)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
    at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778)
{code}
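The top frame shows GenericUDFBasePad.evaluate failing on an unchecked cast: the pad UDF expects a Text argument but the planner hands it an IntWritable (the query in the later description update calls LPAD(PMOD(...), 2, '0'), whose first argument is an int). The plain-Java sketch below mirrors that failure shape with String standing in for Text; PadCastSketch and lpad are invented names, not Hive's implementation.

```java
public class PadCastSketch {
    // Hypothetical analogue of the bug: a pad function that blindly casts
    // its first argument to String (standing in for org.apache.hadoop.io.Text).
    // Passing a boxed int, as LPAD(PMOD(...), 2, '0') effectively does,
    // makes the unchecked cast fail at runtime with ClassCastException.
    static String lpad(Object arg, int len, char pad) {
        String s = (String) arg; // mirrors the unchecked Text cast
        StringBuilder sb = new StringBuilder();
        while (sb.length() + s.length() < len) sb.append(pad);
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(lpad("7", 2, '0')); // string argument works
        try {
            lpad(Integer.valueOf(7), 2, '0');  // int argument, as in the failing query
        } catch (ClassCastException e) {
            System.out.println("ClassCastException");
        }
    }
}
```

The usual fix for this class of bug is to convert the argument through an ObjectInspector/converter rather than cast it; the sketch only shows why the cast itself is the crash site.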
[jira] [Updated] (HIVE-7426) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
[ https://issues.apache.org/jira/browse/HIVE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7426:
-------------------------------
    Attachment: (was: fail_366.sql)
[jira] [Updated] (HIVE-7424) HiveException: Error evaluating concat(concat(' ', str2), ' ') in ql.exec.vector.VectorSelectOperator.processOp
[ https://issues.apache.org/jira/browse/HIVE-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7424:
-------------------------------
    Description: (the query below was removed; the repro setting and stack trace are unchanged)
    was:
Query:
{code}
SELECT `testv1_Calcs`.`key` AS `none_key_nk`,
  CONCAT(CONCAT(' ', `testv1_Calcs`.`str2`), ' ') AS `none_padded_str2_nk`,
  CONCAT(CONCAT('|', RTRIM(CONCAT(CONCAT(' ', `testv1_Calcs`.`str2`), ' '))), '|') AS `none_z_rtrim_str_nk`
FROM `default`.`testv1_Calcs` `testv1_Calcs`;
{code}

HiveException: Error evaluating concat(concat(' ', str2), ' ') in ql.exec.vector.VectorSelectOperator.processOp

Key: HIVE-7424
URL: https://issues.apache.org/jira/browse/HIVE-7424
Project: Hive
Issue Type: Bug
Affects Versions: 0.13.0, 0.13.1
Reporter: Matt McCline
Assignee: Matt McCline
Fix For: 0.14.0
Attachments: HIVE-7424.1.patch, HIVE-7424.2.patch, HIVE-7424.3.patch

One of several found by Raj Bains. M/R or Tez.
{code}
set hive.vectorized.execution.enabled=true;
{code}
Stack trace:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating concat(concat(' ', str2), ' ')
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
{code}
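The HiveException in this trace is a wrapper: the select operator catches whatever the child vector expression threw and rethrows it with the expression's display string, which is why the message names concat(concat(' ', str2), ' ') rather than the root cause. A rough plain-Java analogue of that wrapping pattern follows; OperatorWrapSketch, HiveLikeException, and the tiny interface are invented for the sketch and are not Hive's classes.

```java
public class OperatorWrapSketch {
    // Invented stand-ins for HiveException and VectorExpression.
    static class HiveLikeException extends RuntimeException {
        HiveLikeException(String msg, Throwable cause) { super(msg, cause); }
    }
    interface VectorExpression {
        void evaluate();
        String displayString();
    }

    // Hypothetical analogue of VectorSelectOperator.processOp: any failure
    // from a child expression is wrapped in a message naming the expression,
    // while the root cause (e.g. an NPE on a NULL string) rides along as
    // the chained cause.
    static void processOp(VectorExpression e) {
        try {
            e.evaluate();
        } catch (RuntimeException ex) {
            throw new HiveLikeException("Error evaluating " + e.displayString(), ex);
        }
    }

    public static void main(String[] args) {
        VectorExpression concat = new VectorExpression() {
            public void evaluate() { throw new NullPointerException(); }
            public String displayString() { return "concat(concat(' ', str2), ' ')"; }
        };
        try {
            processOp(concat);
        } catch (HiveLikeException ex) {
            System.out.println(ex.getMessage());
            System.out.println("cause: " + ex.getCause().getClass().getSimpleName());
        }
    }
}
```

When debugging reports like this one, the chained cause under the wrapper is the interesting part; the wrapper message mainly identifies which projected expression failed.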
[jira] [Updated] (HIVE-7424) HiveException: Error evaluating concat(concat(' ', str2), ' ') in ql.exec.vector.VectorSelectOperator.processOp
[ https://issues.apache.org/jira/browse/HIVE-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7424:
-------------------------------
    Attachment: (was: TestWithORC.zip)
[jira] [Updated] (HIVE-7426) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
[ https://issues.apache.org/jira/browse/HIVE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7426:
-------------------------------
    Attachment: (was: fail_750.sql)
[jira] [Updated] (HIVE-7424) HiveException: Error evaluating concat(concat(' ', str2), ' ') in ql.exec.vector.VectorSelectOperator.processOp
[ https://issues.apache.org/jira/browse/HIVE-7424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7424:
-------------------------------
    Attachment: (was: fail_401.sql)
[jira] [Updated] (HIVE-7426) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
[ https://issues.apache.org/jira/browse/HIVE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-7426:
-------------------------------
    Description: (the query below was removed; the rest of the description is unchanged)
    was:
Query:
{code}
SELECT `Calcs`.`datetime0` AS `none_datetime0_ok`, `Calcs`.`int1` AS `none_int1_ok`, `Calcs`.`key` AS `none_key_nk`, CASE WHEN (`Calcs`.`datetime0` IS NOT NULL AND `Calcs`.`int1` IS NOT NULL) THEN FROM_UNIXTIME(UNIX_TIMESTAMP(CONCAT((YEAR(`Calcs`.`datetime0`)+FLOOR((MONTH(`Calcs`.`datetime0`)+`Calcs`.`int1`)/12)), CONCAT('-', CONCAT(LPAD(PMOD(MONTH(`Calcs`.`datetime0`)+`Calcs`.`int1`, 12), 2, '0'), SUBSTR(`Calcs`.`datetime0`, 8, SUBSTR('-MM-dd HH:mm:ss',0,LENGTH(`Calcs`.`datetime0`))), '-MM-dd HH:mm:ss') END AS `none_z_dateadd_month_ok` FROM `default`.`testv1_Calcs`
{code}
[jira] [Updated] (HIVE-7426) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate
[ https://issues.apache.org/jira/browse/HIVE-7426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7426: --- Attachment: (was: TestWithORC.zip) ClassCastException: ...IntWritable cannot be cast to ...Text involving ql.udf.generic.GenericUDFBasePad.evaluate Key: HIVE-7426 URL: https://issues.apache.org/jira/browse/HIVE-7426 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Fix For: 0.14.0 Attachments: HIVE-7426.1.patch, HIVE-7426.2.patch, HIVE-7426.3.patch One of several found by Raj Bains. M/R or Tez. Query does not vectorize, so this is not vector related. Stack Trace: {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBasePad.evaluate(GenericUDFBasePad.java:65) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159) at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.stringEvaluate(GenericUDFConcat.java:189) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFConcat.evaluate(GenericUDFConcat.java:159) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.evaluate(GenericUDFToUnixTimeStamp.java:121) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUnixTimeStamp.evaluate(GenericUDFUnixTimeStamp.java:52) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:177) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:77) at org.apache.hadoop.hive.ql.udf.generic.GenericUDFWhen.evaluate(GenericUDFWhen.java:78) at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.KeyWrapperFactory$ListKeyWrapper.getNewKey(KeyWrapperFactory.java:113) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:778) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
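The trace shows GenericUDFBasePad receiving an IntWritable where it expects Text: the numeric result of PMOD reaches LPAD without being converted to a string first. A minimal sketch of the same failure pattern, using plain JDK types in place of the Hadoop Writables (the class below is illustrative, not Hive code):

```java
public class CastFailureDemo {
    public static void main(String[] args) {
        // GenericUDFBasePad casts its pad argument to Text; here an int-valued
        // object stands in for the IntWritable that actually arrived.
        Object value = Integer.valueOf(42);   // analogue of an IntWritable
        try {
            String padded = (String) value;   // analogue of the cast to Text
            System.out.println(padded);
        } catch (ClassCastException e) {
            // This branch is taken: Integer cannot be cast to String.
            System.out.println("ClassCastException caught");
        }
    }
}
```

At the query level, a common workaround for this class of error is to make the conversion explicit, e.g. `LPAD(CAST(expr AS STRING), 2, '0')` instead of `LPAD(expr, 2, '0')`, though the underlying fix belongs in the UDF's argument handling.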
[jira] [Updated] (HIVE-7452) Boolean comparison is done through reference equality rather than using equals
[ https://issues.apache.org/jira/browse/HIVE-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KangHS updated HIVE-7452: - Status: Patch Available (was: Open) Boolean comparison is done through reference equality rather than using equals -- Key: HIVE-7452 URL: https://issues.apache.org/jira/browse/HIVE-7452 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7452.patch In Driver#doAuthorization(): {code} if (tbl != null && !tableAuthChecked.contains(tbl.getTableName()) && !(tableUsePartLevelAuth.get(tbl.getTableName()) == Boolean.TRUE)) { {code} The above comparison should be done using the .equals() method. The comparison below doesn't evaluate to true: {code} Boolean b = new Boolean(true); if (b == Boolean.TRUE) { {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7452) Boolean comparison is done through reference equality rather than using equals
[ https://issues.apache.org/jira/browse/HIVE-7452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KangHS updated HIVE-7452: - Attachment: HIVE-7452.patch Changed the boolean comparison method in: - doAuthorization() - getTablePartitionUsedColumns() Boolean comparison is done through reference equality rather than using equals -- Key: HIVE-7452 URL: https://issues.apache.org/jira/browse/HIVE-7452 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7452.patch In Driver#doAuthorization(): {code} if (tbl != null && !tableAuthChecked.contains(tbl.getTableName()) && !(tableUsePartLevelAuth.get(tbl.getTableName()) == Boolean.TRUE)) { {code} The above comparison should be done using the .equals() method. The comparison below doesn't evaluate to true: {code} Boolean b = new Boolean(true); if (b == Boolean.TRUE) { {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
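The pitfall described above can be reproduced in a few lines of plain Java: new Boolean(true) allocates a fresh object, so == against the cached Boolean.TRUE compares references and fails, while equals() compares values. A minimal sketch:

```java
public class BooleanCompareDemo {
    public static void main(String[] args) {
        Boolean b = new Boolean(true);               // a fresh instance, not the cached constant
        System.out.println(b == Boolean.TRUE);       // false: reference comparison
        System.out.println(b.equals(Boolean.TRUE));  // true: value comparison
        System.out.println(Boolean.TRUE.equals(b));  // true, and safe even if b were null
    }
}
```

Note that values obtained via autoboxing or Boolean.valueOf(true) are the cached constants, so == would happen to work for them; relying on that is fragile, which is why .equals() is the safe choice.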
[jira] [Updated] (HIVE-7229) String is compared using equal in HiveMetaStore#HMSHandler#init()
[ https://issues.apache.org/jira/browse/HIVE-7229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] KangHS updated HIVE-7229: - Attachment: HIVE-7229.patch String is compared using equal in HiveMetaStore#HMSHandler#init() - Key: HIVE-7229 URL: https://issues.apache.org/jira/browse/HIVE-7229 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7229.1.patch, HIVE-7229.patch, HIVE-7229.patch Around line 423: {code} if (partitionValidationRegex != null && partitionValidationRegex != "") { partitionValidationPattern = Pattern.compile(partitionValidationRegex); {code} !partitionValidationRegex.isEmpty() can be used instead. -- This message was sent by Atlassian JIRA (v6.2#6252)
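The same reference-versus-value confusion applies to strings: != against "" only compares references, which happens to work for interned literals but not for strings constructed at runtime. A minimal sketch of why isEmpty() is the right check:

```java
public class StringCompareDemo {
    public static void main(String[] args) {
        String a = "";                    // interned literal
        String b = new String("");        // same value, distinct object
        System.out.println(a != "");      // false: both refer to the interned ""
        System.out.println(b != "");      // true: different reference despite equal value
        System.out.println(b.isEmpty());  // true: value-based check, immune to interning
    }
}
```

A config value read from a properties file or Thrift call arrives as a runtime-constructed string, so the reference comparison in the original code would silently misbehave for exactly the inputs it was meant to filter out.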
[jira] [Commented] (HIVE-7637) Change throws clause for Hadoop23Shims.ProxyFileSystem23.access()
[ https://issues.apache.org/jira/browse/HIVE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089066#comment-14089066 ] Hive QA commented on HIVE-7637: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660284/HIVE-7637.1.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5883 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dyn_part_max org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/204/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/204/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-204/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12660284 Change throws clause for Hadoop23Shims.ProxyFileSystem23.access() - Key: HIVE-7637 URL: https://issues.apache.org/jira/browse/HIVE-7637 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-7637.1.patch Looks like the changes from HIVE-7583 don't build correctly with Hadoop-2.6.0 because the ProxyFileSystem23 version of access() throws Exception, which is not one of the exceptions listed in the throws clause of FileSystem.access(). The method in ProxyFileSystem23 should have its throws clause modified to match FileSystem's. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HIVE-7640) Support Hive TABLESAMPLE
[ https://issues.apache.org/jira/browse/HIVE-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li resolved HIVE-7640. - Resolution: Fixed Support Hive TABLESAMPLE Key: HIVE-7640 URL: https://issues.apache.org/jira/browse/HIVE-7640 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Research and verify TABLESAMPLE support in Hive on Spark, and research whether it can be merged with Spark sample features. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7640) Support Hive TABLESAMPLE
[ https://issues.apache.org/jira/browse/HIVE-7640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089085#comment-14089085 ] Chengxiang Li commented on HIVE-7640: - Most table sampling features work, except for 2 issues: # part of block sampling requires CombineHiveInputFormat, while Hive on Spark hard-codes the input format to HiveInputFormat. # a few block sampling queries return unexpected results. I've created 2 related JIRAs, linked above, to track these 2 issues. Support Hive TABLESAMPLE Key: HIVE-7640 URL: https://issues.apache.org/jira/browse/HIVE-7640 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Research and verify TABLESAMPLE support in Hive on Spark, and research whether it can be merged with Spark sample features. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7644) hive custom udf cannot be used in the join_condition(on)
Hayok created HIVE-7644: --- Summary: hive custom udf cannot be used in the join_condition(on) Key: HIVE-7644 URL: https://issues.apache.org/jira/browse/HIVE-7644 Project: Hive Issue Type: Bug Components: Clients Affects Versions: 0.12.0 Reporter: Hayok hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]/hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. 
Instead, use mapreduce.job.reduces 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative 14/08/07 17:38:05 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log 2014-08-07 05:38:05 Starting to launch local task to process map join; maximum memory = 2027290624 Execution failed with exit status: 2 Obtaining error information Task failed! Task ID: Stage-4 Logs: /tmp/[username]/hive.log FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask Then I watch the log named /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log, it writes: 2014-08-07 16:46:59,105 INFO mr.MapredLocalTask (SessionState.java:printInfo(417)) - 2014-08-07 04:46:59 Starting to launch local task to process map join; maximum memory = 2027290624 2014-08-07 16:46:59,114 INFO mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(389)) - fetchoperator for tmp_compete created 2014-08-07 16:46:59,196 INFO exec.TableScanOperator (Operator.java:initialize(338)) - Initializing Self 0 TS 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(403)) - Operator 0 TS initialized 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(407)) - Initializing children of 0 TS 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(442)) - Initializing child 1 HASHTABLESINK 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(338)) - Initializing Self 1 HASHTABLESINK 2014-08-07 16:46:59,198 INFO mapjoin.MapJoinMemoryExhaustionHandler (MapJoinMemoryExhaustionHandler.java:init(72)) - JVM Max 
Heap Size: 2027290624 2014-08-07 16:46:59,222 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(324)) - Hive Runtime Error: Map local work failed org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'xxx' is not present in the class path at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:142) at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:127) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:66) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:140) at
[jira] [Created] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
Xiaoyu Wang created HIVE-7645: - Summary: Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7644) hive custom udf cannot be used in the join_condition(on)
[ https://issues.apache.org/jira/browse/HIVE-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hayok updated HIVE-7644: Description: console: hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]/hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 14/08/07 17:38:04 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. 
Instead, use mapreduce.reduce.speculative 14/08/07 17:38:05 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log 2014-08-07 05:38:05 Starting to launch local task to process map join; maximum memory = 2027290624 Execution failed with exit status: 2 Obtaining error information Task failed! Task ID: Stage-4 Logs: /tmp/[username]/hive.log FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask -- Then I watch the log named /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log, it writes: 2014-08-07 16:46:59,105 INFO mr.MapredLocalTask (SessionState.java:printInfo(417)) - 2014-08-07 04:46:59 Starting to launch local task to process map join; maximum memory = 2027290624 2014-08-07 16:46:59,114 INFO mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(389)) - fetchoperator for tmp_compete created 2014-08-07 16:46:59,196 INFO exec.TableScanOperator (Operator.java:initialize(338)) - Initializing Self 0 TS 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(403)) - Operator 0 TS initialized 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(407)) - Initializing children of 0 TS 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(442)) - Initializing child 1 HASHTABLESINK 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(338)) - Initializing Self 1 HASHTABLESINK 2014-08-07 16:46:59,198 INFO mapjoin.MapJoinMemoryExhaustionHandler (MapJoinMemoryExhaustionHandler.java:init(72)) - JVM Max Heap Size: 2027290624 2014-08-07 16:46:59,222 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(324)) - Hive Runtime Error: 
Map local work failed org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'xxx' is not present in the class path at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:142) at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:127) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:66) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:140) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at
[jira] [Updated] (HIVE-7644) hive custom udf cannot be used in the join_condition(on)
[ https://issues.apache.org/jira/browse/HIVE-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hayok updated HIVE-7644: Description: console: hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log 2014-08-07 05:38:05 Starting to launch local task to process map join; maximum memory = 2027290624 Execution failed with exit status: 2 Obtaining error information Task failed! Task ID: Stage-4 Logs: /tmp/[username]/hive.log FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask -- Then I watch the log named /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log, it writes: 2014-08-07 16:46:59,105 INFO mr.MapredLocalTask (SessionState.java:printInfo(417)) - 2014-08-07 04:46:59 Starting to launch local task to process map join; maximum memory = 2027290624 2014-08-07 16:46:59,114 INFO mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(389)) - fetchoperator for tmp_compete created 2014-08-07 16:46:59,196 INFO exec.TableScanOperator (Operator.java:initialize(338)) - Initializing Self 0 TS 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(403)) - Operator 0 TS initialized 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(407)) - Initializing children of 0 TS 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(442)) - Initializing child 1 HASHTABLESINK 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(338)) - Initializing Self 1 HASHTABLESINK 2014-08-07 16:46:59,198 INFO mapjoin.MapJoinMemoryExhaustionHandler 
(MapJoinMemoryExhaustionHandler.java:init(72)) - JVM Max Heap Size: 2027290624 2014-08-07 16:46:59,222 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(324)) - Hive Runtime Error: Map local work failed org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'xxx' is not present in the class path at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:142) at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:127) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:66) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:140) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:408) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:302) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:728) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) I ensure there is no authorization problem with it,and when the udf is not in the join-condition such as select udf(column_name) or where 
udf(column_name), it works fine. Has anyone else encountered this problem? was: console: hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 14/08/07 17:38:04 WARN conf.Configuration: file:/tmp/[username]hive_2014-08-07_17-38-01_048_6199454015323812186-1/-local-10005/jobconf.xml:an attempt to override final
[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
[ https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Wang updated HIVE-7645: -- Attachment: HIVE-7645.patch Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang Attachments: HIVE-7645.patch code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7644) hive custom udf cannot be used in the join_condition(on)
[ https://issues.apache.org/jira/browse/HIVE-7644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hayok updated HIVE-7644: Description: console: hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log 2014-08-07 05:38:05 Starting to launch local task to process map join; maximum memory = 2027290624 Execution failed with exit status: 2 Obtaining error information Task failed! Task ID: Stage-4 Logs: /tmp/[username]/hive.log FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask -- Then I watch the log named /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log, it writes: 2014-08-07 16:46:59,105 INFO mr.MapredLocalTask (SessionState.java:printInfo(417)) - 2014-08-07 04:46:59 Starting to launch local task to process map join; maximum memory = 2027290624 2014-08-07 16:46:59,114 INFO mr.MapredLocalTask (MapredLocalTask.java:initializeOperators(389)) - fetchoperator for tmp_compete created 2014-08-07 16:46:59,196 INFO exec.TableScanOperator (Operator.java:initialize(338)) - Initializing Self 0 TS 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(403)) - Operator 0 TS initialized 2014-08-07 16:46:59,197 INFO exec.TableScanOperator (Operator.java:initializeChildren(407)) - Initializing children of 0 TS 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(442)) - Initializing child 1 HASHTABLESINK 2014-08-07 16:46:59,197 INFO exec.HashTableSinkOperator (Operator.java:initialize(338)) - Initializing Self 1 HASHTABLESINK 2014-08-07 16:46:59,198 INFO mapjoin.MapJoinMemoryExhaustionHandler 
(MapJoinMemoryExhaustionHandler.java:init(72)) - JVM Max Heap Size: 2027290624 2014-08-07 16:46:59,222 ERROR mr.MapredLocalTask (MapredLocalTask.java:executeFromChildJVM(324)) - Hive Runtime Error: Map local work failed org.apache.hadoop.hive.ql.exec.UDFArgumentException: The UDF implementation class 'xxx' is not present in the class path at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:142) at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:127) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:66) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:140) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:408) at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:302) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:728) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) I ensure there is no authorization problem with it,and when the udf is not in the join-condition such as 'select udf(column_name)' or 'where 
udf(column_name)', it works fine. Has anyone else encountered this problem? was: console: hive ADD JAR x; Added x to class path Added resource: x hive create temporary function func1 as 'xxx'; OK Time taken: 0.009 seconds hive list jars; xxx.jar hive select /*+ MAPJOIN(certain column1) */ * from tb1 join tb2 on tb1.column2 = func1(tb2.column3) ; Total MapReduce jobs = 1 Execution log at: /tmp/[username]/[username]_20140807173838_d673690f-c452-4ebb-bf53-9d663c49d04e.log 2014-08-07 05:38:05 Starting to launch local task to process
[jira] [Updated] (HIVE-7645) Hive CompactorMR job set NUM_BUCKETS mistake
[ https://issues.apache.org/jira/browse/HIVE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Wang updated HIVE-7645: -- Status: Patch Available (was: Open) Hive CompactorMR job set NUM_BUCKETS mistake Key: HIVE-7645 URL: https://issues.apache.org/jira/browse/HIVE-7645 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Xiaoyu Wang Attachments: HIVE-7645.patch code: job.setInt(NUM_BUCKETS, sd.getBucketColsSize()); should change to: job.setInt(NUM_BUCKETS, sd.getNumBuckets()); -- This message was sent by Atlassian JIRA (v6.2#6252)
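The mix-up here is between the number of bucketing columns and the number of buckets: a table clustered by one column into 16 buckets has getBucketColsSize() == 1 but getNumBuckets() == 16, so the compactor would configure far too few buckets. A minimal sketch with a hypothetical stand-in for the metastore StorageDescriptor (the stub class and its field values below are illustrative only):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.apache.hadoop.hive.metastore.api.StorageDescriptor,
// modeling a table declared CLUSTERED BY (user_id) INTO 16 BUCKETS.
class StorageDescriptorStub {
    private final List<String> bucketCols = Arrays.asList("user_id"); // one bucketing column
    private final int numBuckets = 16;                                // sixteen buckets

    int getBucketColsSize() { return bucketCols.size(); }
    int getNumBuckets()     { return numBuckets; }
}

public class NumBucketsDemo {
    public static void main(String[] args) {
        StorageDescriptorStub sd = new StorageDescriptorStub();
        System.out.println(sd.getBucketColsSize()); // 1  -- what the buggy code passed as NUM_BUCKETS
        System.out.println(sd.getNumBuckets());     // 16 -- what NUM_BUCKETS should actually be
    }
}
```

The two values coincide only by accident (e.g. a table bucketed by one column into one bucket), which is why the bug could go unnoticed on trivially bucketed tables.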
[jira] [Commented] (HIVE-7642) Set hive input format by configuration.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089095#comment-14089095 ] Hive QA commented on HIVE-7642: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660334/HIVE-7642.1-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5826 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/20/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/20/console Test logs: http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-20/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12660334 Set hive input format by configuration.[Spark Branch] - Key: HIVE-7642 URL: https://issues.apache.org/jira/browse/HIVE-7642 Project: Hive Issue Type: Bug Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Attachments: HIVE-7642.1-spark.patch Currently hive input format is hard coded as HiveInputFormat, we should set this parameter from configuration. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7260) between operator for vectorization should support non-constant expressions as inputs
[ https://issues.apache.org/jira/browse/HIVE-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089120#comment-14089120 ] Hive QA commented on HIVE-7260: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660295/HIVE-7260.1.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5889 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/205/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/205/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-205/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12660295 between operator for vectorization should support non-constant expressions as inputs Key: HIVE-7260 URL: https://issues.apache.org/jira/browse/HIVE-7260 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-7260.1.patch Follow-up jira for HIVE-7166. Eg query where vectorization is disabled: select x from T where T.y between T.a and T.d; -- This message was sent by Atlassian JIRA (v6.2#6252)
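When the BETWEEN bounds are columns rather than constants, the predicate can still be evaluated column-wise as two element-wise comparisons over the batch. A minimal sketch of the idea (plain arrays, not Hive's VectorizedRowBatch API):

```java
// Minimal sketch of column-wise BETWEEN with column-valued bounds:
// row i is selected when low[i] <= val[i] <= high[i], which is the
// shape of the example query's predicate T.y BETWEEN T.a AND T.d.
class VectorBetween {
    static boolean[] between(long[] val, long[] low, long[] high) {
        boolean[] selected = new boolean[val.length];
        for (int i = 0; i < val.length; i++) {
            selected[i] = low[i] <= val[i] && val[i] <= high[i];
        }
        return selected;
    }
}
```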
Re: Review requests JIRA process
Hey, just wanted to bump this in case anyone has any opinions, and to ask what the procedure is for patches that don't get a review for a longer period of time. I'd like to avoid a lot of unnecessary work on my end :) Cheers, Lars On Wed, Aug 6, 2014 at 1:15 AM, Lars Francke lars.fran...@gmail.com wrote: Hi everyone, I have a couple of review requests that I'd love for someone to look at. I'll list them below. However, I have two more questions. Two of my issues are clean-ups of existing code (HIVE-7622, HIVE-7543). I realize that they don't bring immediate benefit, and I had planned to fix some more of the issues Checkstyle, my IDE and SonarQube[1] complain about. Is this okay with you all, or would you rather I stop? I ask because they take a significant amount of time, not only for me but also for a reviewer, and they go stale fast. For what it's worth, I think it helps to have a clean codebase. The second question is about the JIRA process: what's the best way to get someone to review patches? I currently always create a review, attach the patch to the issue and set it to PATCH AVAILABLE. The documentation is not quite clear about the process[2]. These are the issues in need of reviews: * https://issues.apache.org/jira/browse/HIVE-7622 (huge, but I'd appreciate a fast answer to avoid having to rebase it often) * https://issues.apache.org/jira/browse/HIVE-7543 * https://issues.apache.org/jira/browse/HIVE-6123 * https://issues.apache.org/jira/browse/HIVE-7107 Thanks! Cheers, Lars [1] http://www.sonarqube.org/ I have a publicly accessible server set up with Hive analyzed; happy to send the link to anyone interested http://i.imgur.com/e3KjR26.png [2] https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-MakingChanges
Mail bounces from ebuddy.com
Hi, every time I send a mail to dev@ I get two bounce mails from two people at ebuddy.com. I don't want to post the email addresses publicly, but I can send them on if needed (and it can be triggered easily by just replying to this mail, I guess). Could we maybe remove them from the list? Cheers, Lars
[jira] [Commented] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089155#comment-14089155 ] Hive QA commented on HIVE-5718: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660286/HIVE-5718.7.patch.txt {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5883 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/206/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/206/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-206/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12660286 Support direct fetch for lateral views, sub queries, etc. 
- Key: HIVE-5718 URL: https://issues.apache.org/jira/browse/HIVE-5718 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, HIVE-5718.4.patch.txt, HIVE-5718.5.patch.txt, HIVE-5718.6.patch.txt, HIVE-5718.7.patch.txt Extend HIVE-2925 with LV and SubQ. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-7624: - Attachment: HIVE-7624.patch Reduce operator initialization failed when running multiple MR query on spark - Key: HIVE-7624 URL: https://issues.apache.org/jira/browse/HIVE-7624 Project: Hive Issue Type: Bug Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-7624.patch The following error occurs when I try to run a query with multiple reduce works (M-R-R): {quote} 14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1) java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from 
[0:_col0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147) … {quote} I suspect we're applying the reduce function in the wrong order. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089181#comment-14089181 ] Rui Li commented on HIVE-7624: -- This patch solves the reducesinkkey0 problem. Map work and reduce work finish successfully. However, no result is returned. I checked the log and found that the second reduce work got nothing to process. I'm not sure what is missing here... I took a quick look at the Tez code and found that it sets an output collector for each reduce sink (OperatorUtils.setChildrenCollector). I don't know whether this is related, though. Reduce operator initialization failed when running multiple MR query on spark - Key: HIVE-7624 URL: https://issues.apache.org/jira/browse/HIVE-7624 Project: Hive Issue Type: Bug Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-7624.patch The following error occurs when I try to run a query with multiple reduce works (M-R-R): {quote} 14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1) java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) 
at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from [0:_col0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147) … {quote} I suspect we're applying the reduce function in the wrong order. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
[ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089205#comment-14089205 ] Hive QA commented on HIVE-7405: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660303/HIVE-7405.7.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5883 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/207/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/207/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-207/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12660303 Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic) -- Key: HIVE-7405 URL: https://issues.apache.org/jira/browse/HIVE-7405 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch, HIVE-7405.3.patch, HIVE-7405.4.patch, HIVE-7405.5.patch, HIVE-7405.6.patch, HIVE-7405.7.patch Vectorize the basic case that does not have any count distinct aggregation. Add a 4th processing mode in VectorGroupByOperator for reduce where each input VectorizedRowBatch has only values for one key at a time. Thus, the values in the batch can be aggregated quickly. -- This message was sent by Atlassian JIRA (v6.2#6252)
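The reduce-side invariant described above — each incoming VectorizedRowBatch carries values for a single key — is what makes the new processing mode fast: aggregation becomes a tight loop with no per-row hash-table probe. A rough sketch under that assumption (a plain sum, not Hive's VectorGroupByOperator):

```java
// Rough sketch of reduce-side vectorized aggregation: every value in the
// batch is assumed to belong to the same group key, so no per-row
// hash-table lookup is needed; the batch is aggregated in one tight loop.
class ReduceSideSum {
    static long aggregateBatch(long[] values, int size) {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            sum += values[i];
        }
        return sum;
    }
}
```

On the map side, by contrast, a batch can mix keys, which forces a hash lookup per row — that is the case the existing three modes already handle.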
[jira] [Commented] (HIVE-7634) Use Configuration.getPassword() if available to eliminate passwords from hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-7634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089271#comment-14089271 ] Hive QA commented on HIVE-7634: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660301/HIVE-7634.1.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5885 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/208/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/208/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-208/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12660301 Use Configuration.getPassword() if available to eliminate passwords from hive-site.xml -- Key: HIVE-7634 URL: https://issues.apache.org/jira/browse/HIVE-7634 Project: Hive Issue Type: Bug Components: Security Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-7634.1.patch HADOOP-10607 provides a Configuration.getPassword() API that allows passwords to be retrieved from a configured credential provider, while also being able to fall back to the HiveConf setting if no provider is set up. Hive should use this API for versions of Hadoop that support this API. This would give users the ability to remove the passwords from their Hive configuration files. -- This message was sent by Atlassian JIRA (v6.2#6252)
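The "if available" part of the proposal is typically handled by probing for the method reflectively, so the same code runs on Hadoop versions with and without HADOOP-10607. A hedged sketch of that pattern — the class and method names below, other than getPassword, are hypothetical and not Hive's actual shim code:

```java
import java.lang.reflect.Method;

// Hypothetical helper showing the "use Configuration.getPassword() if
// available" pattern: probe for the HADOOP-10607 method reflectively and
// fall back to the plain configuration value when it is absent.
class PasswordResolver {
    static String resolve(Object conf, String name, String plainValue) {
        try {
            Method m = conf.getClass().getMethod("getPassword", String.class);
            char[] pw = (char[]) m.invoke(conf, name);
            if (pw != null) {
                return new String(pw);
            }
        } catch (ReflectiveOperationException e) {
            // Method absent (older Hadoop) or call failed: use the in-file value.
        }
        return plainValue;
    }
}

// Stubs standing in for new/old Hadoop Configuration classes.
class NewHadoopConf {
    public char[] getPassword(String name) { return "from-provider".toCharArray(); }
}
class OldHadoopConf { } // pre-HADOOP-10607: no getPassword()
```

With this shape, users on newer Hadoop can move passwords into a credential provider while hive-site.xml keeps working unchanged everywhere else.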
[jira] [Updated] (HIVE-7630) DROP PARTITION does not recognize built-in function
[ https://issues.apache.org/jira/browse/HIVE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwenael Le Barzic updated HIVE-7630: Description: Hello ! We currently have the following problem with Hive 0.13 in the HDP 2.1. {code:shell}CREATE TABLE MyTable ( mystring STRING, mydate DATE ) PARTITIONED BY (DT_PARTITION DATE);{code} When I try to do this : ALTER TABLE MyTable DROP PARTITION (DT_PARTITION = DATE_SUB(‘2012-09-13’,1)); I get the following error message : NoViableAltException(26@[221:1: constant : ( Number | dateLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );]) at org.antlr.runtime.DFA.noViableAlt(DFA.java:158) at org.antlr.runtime.DFA.predict(DFA.java:116) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:6128) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionVal(HiveParser_IdentifiersParser.java:10819) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10664) at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:40160) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:9953) at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.java:6731) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:6552) at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2189) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:409) at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 1:51 cannot recognize input near 'DATE_SUB' '(' '.' in constant In fact, the problem is broader than that: you cannot get the result of a built-in function (for example DATE_SUB) into a variable in Hive and use it later in the HQL script. Best regards. Gwenael Le Barzic was: Hello ! We currently have the following problem with Hive 0.13 in the HDP 2.1. 
CREATE TABLE MyTable ( mystring STRING, mydate DATE ) PARTITIONED BY (DT_PARTITION DATE); When I try to do this : ALTER TABLE MyTable DROP PARTITION (DT_PARTITION = DATE_SUB(‘2012-09-13’,1)); I get the following error message : NoViableAltException(26@[221:1: constant : ( Number | dateLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );]) at org.antlr.runtime.DFA.noViableAlt(DFA.java:158) at org.antlr.runtime.DFA.predict(DFA.java:116) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:6128) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionVal(HiveParser_IdentifiersParser.java:10819) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10664) at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:40160) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:9953) at
[jira] [Updated] (HIVE-7630) DROP PARTITION does not recognize built-in function
[ https://issues.apache.org/jira/browse/HIVE-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gwenael Le Barzic updated HIVE-7630: Description: Hello ! We currently have the following problem with Hive 0.13 in the HDP 2.1. {code:none}CREATE TABLE MyTable ( mystring STRING, mydate DATE ) PARTITIONED BY (DT_PARTITION DATE);{code} When I try to do this : {code:none}ALTER TABLE MyTable DROP PARTITION (DT_PARTITION = DATE_SUB(‘2012-09-13’,1));{code} I get the following error message : {code:none}NoViableAltException(26@[221:1: constant : ( Number | dateLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );]) at org.antlr.runtime.DFA.noViableAlt(DFA.java:158) at org.antlr.runtime.DFA.predict(DFA.java:116) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:6128) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionVal(HiveParser_IdentifiersParser.java:10819) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10664) at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:40160) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:9953) at org.apache.hadoop.hive.ql.parse.HiveParser.alterTableStatementSuffix(HiveParser.java:6731) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatement(HiveParser.java:6552) at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2189) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:409) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 1:51 cannot recognize input near 'DATE_SUB' '(' '.' in constant{code} In fact, the problem is broader than that: you cannot get the result of a built-in function (for example DATE_SUB) into a variable in Hive and use it later in the HQL script. Best regards. Gwenael Le Barzic was: Hello ! We currently have the following problem with Hive 0.13 in the HDP 2.1. 
{code:shell}CREATE TABLE MyTable ( mystring STRING, mydate DATE ) PARTITIONED BY (DT_PARTITION DATE);{code} When I try to do this : ALTER TABLE MyTable DROP PARTITION (DT_PARTITION = DATE_SUB(‘2012-09-13’,1)); I get the following error message : NoViableAltException(26@[221:1: constant : ( Number | dateLiteral | StringLiteral | stringLiteralSequence | BigintLiteral | SmallintLiteral | TinyintLiteral | DecimalLiteral | charSetStringLiteral | booleanValue );]) at org.antlr.runtime.DFA.noViableAlt(DFA.java:158) at org.antlr.runtime.DFA.predict(DFA.java:116) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.constant(HiveParser_IdentifiersParser.java:6128) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionVal(HiveParser_IdentifiersParser.java:10819) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.dropPartitionSpec(HiveParser_IdentifiersParser.java:10664) at org.apache.hadoop.hive.ql.parse.HiveParser.dropPartitionSpec(HiveParser.java:40160) at org.apache.hadoop.hive.ql.parse.HiveParser.alterStatementSuffixDropPartitions(HiveParser.java:9953) at
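Since the DROP PARTITION grammar only accepts constant partition values, a common workaround for the problem reported above is to evaluate the date arithmetic on the client and splice the result into the DDL as a literal. A hypothetical helper sketching that approach (the table and column names come from the report; the helper itself is illustrative, not a Hive API):

```java
import java.time.LocalDate;

// Hypothetical client-side helper: compute DATE_SUB('2012-09-13', 1)
// outside Hive and emit a DDL statement containing just the resulting
// literal, which the DROP PARTITION parser accepts.
class DropPartitionDdl {
    static String build(String table, String isoDate, int daysBack) {
        String value = LocalDate.parse(isoDate).minusDays(daysBack).toString();
        return "ALTER TABLE " + table
             + " DROP PARTITION (DT_PARTITION = '" + value + "')";
    }
}
```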
[jira] [Updated] (HIVE-7635) Query having same aggregate functions but different case throws IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-7635: -- Attachment: HIVE-7635.1.patch Fixed the failed test (due to the missing update in having.q.out for Tez). Uploaded the new patch here and also to RB. Query having same aggregate functions but different case throws IndexOutOfBoundsException - Key: HIVE-7635 URL: https://issues.apache.org/jira/browse/HIVE-7635 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Fix For: 0.14.0 Attachments: HIVE-7635.1.patch, HIVE-7635.patch A query having same aggregate functions (e.g. count) but in different case does not work and throws IndexOutOfBoundsException. {code} Query: SELECT key, COUNT(value) FROM src GROUP BY key HAVING count(value) = 4 --- Error log: 14/08/06 11:00:45 ERROR ql.Driver: FAILED: IndexOutOfBoundsException Index: 2, Size: 2 java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.RangeCheck(ArrayList.java:547) at java.util.ArrayList.get(ArrayList.java:322) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanReduceSinkOperator(SemanticAnalyzer.java:4173) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5165) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8337) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9178) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9431) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:207) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:414) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1023) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088) at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:960) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:265) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:217) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:427) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:800) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
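The essence of the fix is to treat aggregate expressions that differ only in case as the same entry when building the group-by plan, so the HAVING clause's count(value) resolves to the already-planned COUNT(value). A much-simplified sketch of that normalization (plain strings, not SemanticAnalyzer's actual data structures):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Much-simplified sketch of the fix's core idea: key aggregate expressions
// case-insensitively so COUNT(value) and count(value) collapse to a single
// entry instead of producing two plan slots and an out-of-bounds index.
class AggDedup {
    static Map<String, String> dedup(String... exprs) {
        Map<String, String> unique = new LinkedHashMap<>();
        for (String e : exprs) {
            unique.putIfAbsent(e.toLowerCase(), e); // normalize only the lookup key
        }
        return unique;
    }
}
```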
Re: Review Request 24404: HIVE-7635: Query having same aggregate functions but different case throws IndexOutOfBoundsException
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24404/ --- (Updated Aug. 7, 2014, 3:40 p.m.) Review request for hive. Changes --- Fixed the failed test due to the missing update to having.q.out for Tez. Recreated the diff and uploaded here. Thanks for the review. Bugs: HIVE-7635 https://issues.apache.org/jira/browse/HIVE-7635 Repository: hive-git Description --- A query having the same aggregate functions but in different case (e.g. SELECT key, COUNT(value) FROM src GROUP BY key HAVING count(value) = 4) does not work and throws an IndexOutOfBoundsException. The cause is that Hive treats count(value) and COUNT(value) in this query as two different aggregate expressions when compiling the query and generating the plan; the comparison is case-sensitive. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 51838ae ql/src/test/queries/clientpositive/having.q 5b1aa69 ql/src/test/results/clientpositive/having.q.out d912001 ql/src/test/results/clientpositive/tez/having.q.out e96342d Diff: https://reviews.apache.org/r/24404/diff/ Testing --- 1. The fix addresses the query that previously failed because its aggregate function names differed only in case 2. New unit tests passed 3. The patch will be submitted for pre-commit tests Thanks, Chaoyu Tang
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/if/TCLIService.thrift, line 1043 https://reviews.apache.org/r/24293/diff/1/?file=651542#file651542line1043 I know that no one else does it yet in this file and I haven't gotten around to finishing my patch. But could you use this style of comments instead: /** Get the output result of a query. */ Thank you! That will be automatically moved into a comment section (python, javadoc etc.) by the Thrift compiler. Thanks for the reminder. This comment style makes the generated code look better. I am not sure whether you are working on changing all of the comments in the TCLIService.thrift file, so I just changed the 3 comments related to this fix. If not, I'm glad to change all of the comments in the Thrift file through this patch or a new JIRA. - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49573 --- On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client.
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
[jira] [Commented] (HIVE-7357) Add vectorized support for BINARY data type
[ https://issues.apache.org/jira/browse/HIVE-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089359#comment-14089359 ] Hive QA commented on HIVE-7357: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660307/HIVE-7357.7.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5885 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/209/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/209/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-209/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12660307 Add vectorized support for BINARY data type --- Key: HIVE-7357 URL: https://issues.apache.org/jira/browse/HIVE-7357 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7357.1.patch, HIVE-7357.2.patch, HIVE-7357.3.patch, HIVE-7357.4.patch, HIVE-7357.5.patch, HIVE-7357.6.patch, HIVE-7357.7.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5760) Add vectorized support for CHAR/VARCHAR data types
[ https://issues.apache.org/jira/browse/HIVE-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089360#comment-14089360 ] Hive QA commented on HIVE-5760: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12660304/HIVE-5760.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/210/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/210/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-210/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-210/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'itests/qtest/testconfiguration.properties' Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorizedRowBatchCtx.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedColumnarSerDe.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedBatchUtil.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/hcatalog-pig-adapter/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/test/results/clientpositive/vector_data_types.q.out ql/src/test/results/clientpositive/tez/vector_data_types.q.out ql/src/test/queries/clientpositive/vector_data_types.q + 
svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1616512. At revision 1616512. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12660304 Add vectorized support for CHAR/VARCHAR data types -- Key: HIVE-5760 URL: https://issues.apache.org/jira/browse/HIVE-5760 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Matt McCline Attachments:
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java, line 81 https://reviews.apache.org/r/24293/diff/1/?file=651562#file651562line81 I don't understand how log data ends up in the writer? I looked for accesses of it but it doesn't seem to be touched at all. What am I missing? Also, for a little boost, if the code stays like this you can move it after the null check to avoid the string conversion if the OperationLog is null. LogDivertAppender inherits from WriterAppender, and when its subAppend(event) method is invoked, the first line, super.subAppend(event), writes the log into the writer. No matter whether the OperationLog is null or not, the writer should be reset, since the log in it will not be used again by this appender. Otherwise, the remaining log in the writer might mix with the next log. So maybe we could keep the current order of the access and the null check. :) - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49573 --- On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client.
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
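The capture-then-reset pattern discussed in the LogDivertAppender review above can be sketched in isolation. This is a simplified illustration, not the actual log4j-based appender: a plain StringWriter stands in for the appender's writer, and divert() plays the role of subAppend(event), draining and resetting the writer on every call regardless of whether an OperationLog is present.

```java
import java.io.StringWriter;

// Simplified sketch of the capture-then-reset pattern: the parent
// appender writes the rendered event into the writer first (as
// super.subAppend(event) would), then we drain and reset the writer.
public class CaptureAndReset {
    private final StringWriter writer = new StringWriter();

    public String divert(String renderedEvent, boolean operationLogPresent) {
        writer.write(renderedEvent);       // what super.subAppend(event) does
        String captured = writer.toString();
        writer.getBuffer().setLength(0);   // always reset, even when the
                                           // OperationLog is absent, so stale
                                           // text cannot mix with the next event
        return operationLogPresent ? captured : null;
    }

    public static void main(String[] args) {
        CaptureAndReset app = new CaptureAndReset();
        app.divert("dropped line\n", false);       // no OperationLog: discarded
        String out = app.divert("kept line\n", true);
        System.out.print(out);                     // prints only "kept line"
    }
}
```

Resetting unconditionally is the point Dong makes above: if the reset were skipped on the null path, the next event's capture would contain the leftover text as well.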
[jira] [Commented] (HIVE-7553) avoid the scheduling maintenance window for every jar change
[ https://issues.apache.org/jira/browse/HIVE-7553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089396#comment-14089396 ] Brock Noland commented on HIVE-7553: Hi [~Ferd], Thank you so much for looking into this!! I think you have a good direction on some of the possible solutions. I do think this is a very big item with many different aspects. Would you be interested in creating a design document on this? There are many examples out there: https://cwiki.apache.org/confluence/display/Hive/DesignDocs e.g: https://cwiki.apache.org/confluence/display/Hive/Theta+Join If you create a design doc, I think the big aspect of this design doc would be to evaluate all the pros/cons of each possible solution. Thank you again for looking at this!! Cheers, Brock avoid the scheduling maintenance window for every jar change Key: HIVE-7553 URL: https://issues.apache.org/jira/browse/HIVE-7553 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Ferdinand Xu Assignee: Ferdinand Xu When a user needs to refresh an existing jar or add a new one to HS2, HS2 needs to be restarted. As HS2 is a service exposed to clients, this requires scheduling a maintenance window for every jar change. It would be great if we could avoid that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7492) Enhance SparkCollector
[ https://issues.apache.org/jira/browse/HIVE-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089400#comment-14089400 ] Brock Noland commented on HIVE-7492: +1 Enhance SparkCollector -- Key: HIVE-7492 URL: https://issues.apache.org/jira/browse/HIVE-7492 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Venki Korukanti Attachments: HIVE-7492-1-spark.patch, HIVE-7492.2-spark.patch SparkCollector is used to collect the rows generated by HiveMapFunction or HiveReduceFunction. It is currently backed by an ArrayList and thus has unbounded memory usage. Ideally, the collector should have bounded memory usage and be able to spill to disk when its quota is reached. -- This message was sent by Atlassian JIRA (v6.2#6252)
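The bounded-collector behavior described in the issue can be sketched as follows. This is a hypothetical class for illustration, not the actual SparkCollector or Hive's RowContainer: rows are buffered in memory up to a quota, and the buffer is spilled to a temporary file when the quota is reached.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of a collector with bounded memory: buffer rows up to a
// quota, then append the buffer to a spill file on disk and clear it.
public class BoundedCollector {
    private final int quota;
    private final List<String> buffer = new ArrayList<>();
    private final File spillFile;
    private int spilled = 0;

    public BoundedCollector(int quota) throws IOException {
        this.quota = quota;
        this.spillFile = File.createTempFile("collector", ".spill");
        this.spillFile.deleteOnExit();
    }

    public void collect(String row) throws IOException {
        buffer.add(row);
        if (buffer.size() >= quota) {
            spill();
        }
    }

    private void spill() throws IOException {
        // Append mode: earlier spills are preserved on disk.
        try (BufferedWriter w = new BufferedWriter(new FileWriter(spillFile, true))) {
            for (String row : buffer) {
                w.write(row);
                w.newLine();
            }
        }
        spilled += buffer.size();
        buffer.clear();   // in-memory usage stays bounded by the quota
    }

    public int inMemory() { return buffer.size(); }
    public int spilledRows() { return spilled; }

    public static void main(String[] args) throws IOException {
        BoundedCollector c = new BoundedCollector(2);
        c.collect("r1"); c.collect("r2"); c.collect("r3");
        System.out.println(c.inMemory() + " " + c.spilledRows()); // prints "1 2"
    }
}
```

Note the follow-up concern raised in the resolution below: a real implementation should spill to DFS rather than a possibly tiny local /tmp.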
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/src/java/org/apache/hive/service/cli/operation/OperationLog.java, line 58 https://reviews.apache.org/r/24293/diff/1/?file=651565#file651565line58 can be final and then renamed Thank you! Good point; I made it final. But I am a little confused about the rename. Do you mean the variable name threadLocalOperationLog? - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49573 --- On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client.
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
[jira] [Updated] (HIVE-7492) Enhance SparkCollector
[ https://issues.apache.org/jira/browse/HIVE-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7492: --- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Thank you very much [~vkorukanti] for your contribution! Would you mind opening another JIRA to allow RowContainer to write to the DFS as opposed to /tmp? I don't think this work should be done on the Spark branch and I don't think it's urgent. However, since many users have an extremely small /tmp, I don't think we should be writing unbounded amounts of data there. Committed to spark! Enhance SparkCollector -- Key: HIVE-7492 URL: https://issues.apache.org/jira/browse/HIVE-7492 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Venki Korukanti Fix For: spark-branch Attachments: HIVE-7492-1-spark.patch, HIVE-7492.2-spark.patch SparkCollector is used to collect the rows generated by HiveMapFunction or HiveReduceFunction. It is currently backed by an ArrayList and thus has unbounded memory usage. Ideally, the collector should have bounded memory usage and be able to spill to disk when its quota is reached. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated HIVE-7629: --- Attachment: parquet_smb_join.patch Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted Parquet tables with different numbers of columns are involved in the join. The following exception is seen: Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
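The stack trace above shows ArrayList.get being called with an index equal to the list's size inside DataWritableReadSupport.init, i.e. a column index derived from one table's (wider) schema applied to the other table's (shorter) column list. The failure mode can be illustrated with a small sketch; the helper name and the defensive null return are illustrative only, and the real fix is in the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the HIVE-7629 failure mode: a column index
// taken from the wider table's schema is used against the narrower
// table's column list. A bare columns.get(index) throws
// IndexOutOfBoundsException; guarding the lookup does not.
public class ColumnLookup {
    static String columnAt(List<String> columns, int index) {
        return index >= 0 && index < columns.size() ? columns.get(index) : null;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("key", "value"); // Size: 2
        // Index: 2 is exactly the case in the stack trace above.
        System.out.println(columnAt(columns, 2)); // prints null instead of throwing
    }
}
```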
[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated HIVE-7629: --- Attachment: HIVE-7629.patch Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted Parquet tables with different numbers of columns are involved in the join. The following exception is seen: Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated HIVE-7629: --- Fix Version/s: 0.14.0 Status: Patch Available (was: Open) Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Fix For: 0.14.0 Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted Parquet tables with different numbers of columns are involved in the join. The following exception is seen: Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated HIVE-7629: --- Labels: Parquet (was: ) Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Labels: Parquet Fix For: 0.14.0 Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted Parquet tables with different numbers of columns are involved in the join. The following exception is seen: Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6959) Remove vectorization related constant expression folding code once Constant propagation optimizer for Hive is committed
[ https://issues.apache.org/jira/browse/HIVE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6959: --- Status: Open (was: Patch Available) Seems like vectorization_14.q and vector_coalesce.q failed to vectorize, and vector_cast_constant.q failed altogether. Remove vectorization related constant expression folding code once Constant propagation optimizer for Hive is committed --- Key: HIVE-6959 URL: https://issues.apache.org/jira/browse/HIVE-6959 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-6959.1.patch, HIVE-6959.2.patch, HIVE-6959.3.patch, HIVE-6959.4.patch HIVE-5771 covers the Constant propagation optimizer for Hive. Now that HIVE-5771 is committed, we should remove any vectorization related code which duplicates this feature. For example, a function to be cleaned up is VectorizationContext::foldConstantsForUnaryExprs(). In addition to this change, constant propagation should kick in when vectorization is enabled, i.e. we need to lift the HIVE_VECTORIZATION_ENABLED restriction inside ConstantPropagate::transform(). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suma Shivaprasad updated HIVE-7629: --- Attachment: (was: parquet_smb_join.patch) Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted Parquet tables with different numbers of columns are involved in the join. The following exception is seen: Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49911 --- service/if/TCLIService.thrift https://reviews.apache.org/r/24293/#comment87340 I have a partial patch that changes all of them and I planned on submitting it when I'm back from holiday. - Lars Francke On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d 
service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
[jira] [Commented] (HIVE-7629) Problem in SMB Joins between two Parquet tables
[ https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089437#comment-14089437 ] Brock Noland commented on HIVE-7629: [~suma.shivaprasad] can you add a review board item? FYI [~szehon] Problem in SMB Joins between two Parquet tables --- Key: HIVE-7629 URL: https://issues.apache.org/jira/browse/HIVE-7629 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Suma Shivaprasad Labels: Parquet Fix For: 0.14.0 Attachments: HIVE-7629.patch The issue is clearly seen when two bucketed and sorted parquet tables with different number of columns are involved in the join . The following exception is seen Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79) at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66) at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51) at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65) -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/src/java/org/apache/hive/service/cli/operation/OperationLog.java, line 58 https://reviews.apache.org/r/24293/diff/1/?file=651565#file651565line58 can be final and then renamed Dong Chen wrote: Thank you! I made it final and it is a good point. But I'm a little confused about the rename. Do you mean the variable name threadLocalOperationLog? static finals have the naming convention of being all upper case with underscores in between. So it should be THREAD_LOCAL_OPERATION_LOG On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java, line 81 https://reviews.apache.org/r/24293/diff/1/?file=651562#file651562line81 I don't understand how log data ends up in the writer? I looked for accesses of it but it doesn't seem to be touched at all. What am I missing? Also for a little boost if the code stays like this you can move it after the null check to avoid string conversion if the OperationLog is null Dong Chen wrote: This LogDivertAppender inherits from WriterAppender, and when its method subAppend(event) is invoked, the first line super.subAppend(event) will write the log into the writer. No matter whether the OperationLog is null or not, the writer should be reset, since the log in it will not be used any more in this Appender. Otherwise, the remaining log in the writer might mix with the next log. So maybe we could keep the access and null check order. :) Ahh thanks for the explanation. I missed the setWriter bit. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49573 --- On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. 
Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc 
service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95
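The LogDivertAppender discussion above (divert each log line to the operation that produced it, then reset the buffer so stale output never bleeds into the next event) can be sketched with plain JDK classes. This is an illustrative pattern only, assuming hypothetical names; it is not Hive's actual OperationLog or log4j's WriterAppender API.

```java
import java.io.StringWriter;

// Hypothetical sketch of per-thread log diversion; not Hive's OperationLog.
final class OperationLogSink {
    private static final ThreadLocal<StringWriter> THREAD_LOCAL_OPERATION_LOG =
            new ThreadLocal<>();

    static void register()   { THREAD_LOCAL_OPERATION_LOG.set(new StringWriter()); }
    static void unregister() { THREAD_LOCAL_OPERATION_LOG.remove(); }

    // Called from the logging hook: divert the line to the sink registered by
    // the thread that produced it. If no operation is registered, the line is
    // simply dropped (the "null check" case in the review discussion).
    static void append(String line) {
        StringWriter sink = THREAD_LOCAL_OPERATION_LOG.get();
        if (sink != null) {
            sink.write(line);
            sink.write(System.lineSeparator());
        }
    }

    static String drain() {
        StringWriter sink = THREAD_LOCAL_OPERATION_LOG.get();
        return sink == null ? "" : sink.toString();
    }
}
```

In the real patch the divert happens inside the appender's subAppend path, which is why the writer must be reset on every event regardless of whether an OperationLog is attached.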
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 7, 2014, 4:57 p.m., Lars Francke wrote: service/if/TCLIService.thrift, line 1043 https://reviews.apache.org/r/24293/diff/1/?file=651542#file651542line1043 I have a partial patch that changes all of them and I planned on submitting it when I'm back from holiday. Sorry I messed up RB. This was meant as a reply to your Thrift comment answer and not a new issue. - Lars --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49911 --- On Aug. 5, 2014, 3:47 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 5, 2014, 3:47 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089455#comment-14089455 ] Chao commented on HIVE-7624: Great! Thanks [~ruili]. I'll try this patch. Reduce operator initialization failed when running multiple MR query on spark - Key: HIVE-7624 URL: https://issues.apache.org/jira/browse/HIVE-7624 Project: Hive Issue Type: Bug Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-7624.patch The following error occurs when I try to run a query with multiple reduce works (M-R-R): {quote} 14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1) java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: 
java.lang.RuntimeException: cannot find field reducesinkkey0 from [0:_col0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147) … {quote} I suspect we're applying the reduce function in the wrong order. -- This message was sent by Atlassian JIRA (v6.2#6252)
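The "cannot find field reducesinkkey0 from [0:_col0]" message can be reproduced in miniature. The sketch below is a hypothetical, simplified stand-in for a struct field lookup by name (the class name is illustrative, not Hive's ObjectInspectorUtils): the reducer asks the incoming rows' inspector for the reduce-sink key column, but the rows it actually received only expose `_col0`, which is the symptom you would expect if the reduce stages were wired in the wrong order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of struct field resolution by name; not Hive's
// ObjectInspectorUtils.getStandardStructFieldRef.
final class StructFieldLookup {
    static int fieldRef(List<String> fieldNames, String wanted) {
        for (int i = 0; i < fieldNames.size(); i++) {
            if (fieldNames.get(i).equalsIgnoreCase(wanted)) {
                return i;
            }
        }
        // Mirrors the "cannot find field ... from ..." failure above.
        throw new RuntimeException("cannot find field " + wanted + " from " + fieldNames);
    }
}
```

When the field names match the lookup succeeds; when the reducer is handed the wrong stage's output schema, the lookup fails at initialization time, before any row is processed.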
[jira] [Commented] (HIVE-7624) Reduce operator initialization failed when running multiple MR query on spark
[ https://issues.apache.org/jira/browse/HIVE-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089459#comment-14089459 ] Brock Noland commented on HIVE-7624: During debugging I have used the code below {noformat} System.err.println("JoinOperator " + alias + " row = " + SerDeUtils.getJSONString(row, inputObjInspectors[tag])); {noformat} I wonder if we should not commit that to each operator for debugging since it's much easier to see how the rows are filtered, modified... Reduce operator initialization failed when running multiple MR query on spark - Key: HIVE-7624 URL: https://issues.apache.org/jira/browse/HIVE-7624 Project: Hive Issue Type: Bug Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-7624.patch The following error occurs when I try to run a query with multiple reduce works (M-R-R): {quote} 14/08/05 12:17:07 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 1) java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:170) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:53) at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:31) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:164) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:596) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:54) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: cannot find field reducesinkkey0 from [0:_col0] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:147) … {quote} I suspect we're applying the reduce function in the wrong order. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 7, 2014, 5:06 p.m.) Review request for hive. Changes --- Updated patch HIVE_4629.5.patch. 1. address the review comments. 2. fix the failed case in HIVE QA. (org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync) Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d 
service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 7, 2014, 4:57 p.m., Lars Francke wrote: service/if/TCLIService.thrift, line 1043 https://reviews.apache.org/r/24293/diff/1/?file=651542#file651542line1043 I have a partial patch that changes all of them and I planned on submitting it when I'm back from holiday. Lars Francke wrote: Sorry I messed up RB. This was meant as a reply to your Thrift comment answer and not a new issue. That's OK. :) Got it. Thanks. So I will just change the 3 Thrift comments related with this patch. - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49911 --- On Aug. 7, 2014, 5:06 p.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 7, 2014, 5:06 p.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
[jira] [Updated] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-4629: Attachment: HIVE-4629.5.patch HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, HIVE-4629.5.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 7, 2014, 5:37 p.m.) Review request for hive. Changes --- A little change: rename the static final variable threadLocalOperationLog Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 
service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
Re: Review Request 24293: HIVE-4629: HS2 should support an API to retrieve query logs
On Aug. 5, 2014, 8:56 a.m., Lars Francke wrote: service/src/java/org/apache/hive/service/cli/operation/OperationLog.java, line 58 https://reviews.apache.org/r/24293/diff/1/?file=651565#file651565line58 can be final and then renamed Dong Chen wrote: Thank you! I made it final and it is a good point. But a little confused about the renamed? Do you mean the variable name threadLocalOperationLog? Lars Francke wrote: static finals have the naming convention of being all upper case with underscores in between. So it should be THREAD_LOCAL_OPERATION_LOG Oh, right! I should keep this in mind. :) - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/#review49573 --- On Aug. 7, 2014, 5:37 p.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24293/ --- (Updated Aug. 7, 2014, 5:37 p.m.) Review request for hive. Repository: hive-git Description --- HIVE-4629: HS2 should support an API to retrieve query logs HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfc681 service/if/TCLIService.thrift 80086b4 service/src/gen/thrift/gen-cpp/TCLIService_types.h 1b37fb5 service/src/gen/thrift/gen-cpp/TCLIService_types.cpp d5f98a8 service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java 808b73f service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchType.java PRE-CREATION service/src/gen/thrift/gen-py/TCLIService/ttypes.py 2cbbdd8 service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb 93f9a81 service/src/java/org/apache/hive/service/cli/CLIService.java add37a1 service/src/java/org/apache/hive/service/cli/CLIServiceClient.java 87c10b9 service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java f665146 service/src/java/org/apache/hive/service/cli/FetchType.java PRE-CREATION service/src/java/org/apache/hive/service/cli/ICLIService.java c569796 service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java c9fd5f9 service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java caf413d service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java fd4e94d service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java ebca996 service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java 05991e0 service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 315dbea service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 0ec2543 service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java 3d3fddc service/src/java/org/apache/hive/service/cli/operation/LogDivertAppender.java PRE-CREATION service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java e0d17a1 service/src/java/org/apache/hive/service/cli/operation/Operation.java 45fbd61 service/src/java/org/apache/hive/service/cli/operation/OperationLog.java PRE-CREATION 
service/src/java/org/apache/hive/service/cli/operation/OperationManager.java 21c33bc service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java de54ca1 service/src/java/org/apache/hive/service/cli/session/HiveSession.java 9785e95 service/src/java/org/apache/hive/service/cli/session/HiveSessionBase.java 4c3164e service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java b39d64d service/src/java/org/apache/hive/service/cli/session/SessionManager.java 816bea4 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 5c87bcb service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java e3384d3 service/src/test/org/apache/hive/service/cli/operation/TestOperationLoggingAPI.java PRE-CREATION Diff: https://reviews.apache.org/r/24293/diff/ Testing --- UT passed. Thanks, Dong Chen
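The naming point raised in the review above (static finals use UPPER_SNAKE_CASE) can be sketched with a simplified stand-in for OperationLog. This is an illustrative mock, not Hive's actual class: the real org.apache.hive.service.cli.operation.OperationLog carries more state, and the accessor names here are assumptions for the example.

```java
// Simplified sketch of a per-thread operation log holder, showing the
// convention the review settled on: a static final field is named
// THREAD_LOCAL_OPERATION_LOG, not threadLocalOperationLog.
public class OperationLogSketch {
    // static final => all upper case with underscores, per Java convention
    private static final ThreadLocal<OperationLogSketch> THREAD_LOCAL_OPERATION_LOG =
        new ThreadLocal<OperationLogSketch>();

    public static void setCurrentOperationLog(OperationLogSketch log) {
        THREAD_LOCAL_OPERATION_LOG.set(log);
    }

    public static OperationLogSketch getCurrentOperationLog() {
        return THREAD_LOCAL_OPERATION_LOG.get();
    }

    public static void removeCurrentOperationLog() {
        // remove() avoids leaking the reference on pooled HS2 worker threads
        THREAD_LOCAL_OPERATION_LOG.remove();
    }
}
```

A ThreadLocal fits here because HiveServer2 runs each operation on a worker thread, so the "current" log can be looked up without threading it through every call.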
[jira] [Commented] (HIVE-7637) Change throws clause for Hadoop23Shims.ProxyFileSystem23.access()
[ https://issues.apache.org/jira/browse/HIVE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14089511#comment-14089511 ] Jason Dere commented on HIVE-7637: -- Test failures do not appear to be related. Change throws clause for Hadoop23Shims.ProxyFileSystem23.access() - Key: HIVE-7637 URL: https://issues.apache.org/jira/browse/HIVE-7637 Project: Hive Issue Type: Bug Components: Shims Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-7637.1.patch Looks like the changes from HIVE-7583 don't build correctly with Hadoop-2.6.0 because the ProxyFileSystem23 version of access() throws Exception, which is not one of the exceptions listed in the throws clause of FileSystem.access(). The method in ProxyFileSystem23 should have its throws clause modified to match FileSystem's. -- This message was sent by Atlassian JIRA (v6.2#6252)
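The build break described in HIVE-7637 follows from a Java language rule: an overriding method may not declare broader checked exceptions than the method it overrides. The classes below are stand-ins for FileSystem and ProxyFileSystem23, not Hadoop's real API, to show why narrowing the throws clause fixes the compile error.

```java
import java.io.IOException;

// Stand-in for FileSystem: the parent declares only IOException
// (plus AccessControlException, a subclass of IOException, in the
// real FileSystem.access() signature).
class BaseFileSystemSketch {
    public void access(String path) throws IOException {
        // no-op; real implementation checks permissions on the path
    }
}

// Stand-in for ProxyFileSystem23: the override must match or narrow
// the parent's throws clause. Declaring "throws Exception" here, as
// the pre-patch code did, would not compile because Exception is
// broader than the parent's IOException.
class ProxyFileSystemSketch extends BaseFileSystemSketch {
    @Override
    public void access(String path) throws IOException {
        super.access(path);
    }
}
```

This is why the fix is purely a signature change: no behavior differs, the override simply declares the same checked exceptions as its parent.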
[jira] [Updated] (HIVE-4629) HS2 should support an API to retrieve query logs
[ https://issues.apache.org/jira/browse/HIVE-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-4629: Attachment: HIVE-4629.6.patch Updated the patch (v6) to trigger tests. It addresses the review comments and fixes one HIVE QA test failure related to this patch. HS2 should support an API to retrieve query logs Key: HIVE-4629 URL: https://issues.apache.org/jira/browse/HIVE-4629 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4629-no_thrift.1.patch, HIVE-4629.1.patch, HIVE-4629.2.patch, HIVE-4629.3.patch.txt, HIVE-4629.4.patch, HIVE-4629.5.patch, HIVE-4629.6.patch HiveServer2 should support an API to retrieve query logs. This is particularly relevant because HiveServer2 supports async execution but doesn't provide a way to report progress. Providing an API to retrieve query logs will help report progress to the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
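The shape of the API in this patch, judging from the diff (a new FetchType and an extended TFetchResultsReq), is a single FetchResults call that dispatches on a fetch type: query output or accumulated logs. The sketch below models that idea with illustrative stand-in names; it is not Hive's code, and the enum constants and method bodies are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an operation that buffers both result rows and log lines,
// exposing them through one fetch entry point keyed by a fetch type,
// the way the extended FetchResults request distinguishes results from
// logs. Useful for polling progress of an async HS2 query.
class QueryLogSketch {
    enum FetchType { QUERY_OUTPUT, LOG }

    private final List<String> logLines = new ArrayList<String>();
    private final List<String> resultRows = new ArrayList<String>();

    void appendLog(String line) { logLines.add(line); }
    void appendRow(String row) { resultRows.add(row); }

    // one call, two payloads: the client chooses via the fetch type
    List<String> fetchResults(FetchType type) {
        List<String> source = (type == FetchType.LOG) ? logLines : resultRows;
        return new ArrayList<String>(source); // defensive copy for the caller
    }
}
```

A client polling for progress would call the LOG variant repeatedly while the query runs, then switch to QUERY_OUTPUT once execution completes.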