[jira] [Commented] (HIVE-9474) truncate table changes permissions on the target
[ https://issues.apache.org/jira/browse/HIVE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295202#comment-14295202 ] Aihua Xu commented on HIVE-9474: The test failures are unrelated to the change. truncate table changes permissions on the target Key: HIVE-9474 URL: https://issues.apache.org/jira/browse/HIVE-9474 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9474.1.patch, HIVE-9474.2.patch, HIVE-9474.3.patch Original Estimate: 4h Remaining Estimate: 4h

Create a table:

hive> create table test(key string);

Change the /user/hive/warehouse/test permission to something other than the default, e.g. 777:

hive> dfs -chmod 777 /user/hive/warehouse/test;
hive> dfs -ls -d /user/hive/warehouse/test;
drwxrwxrwx - axu wheel 68 2015-01-26 18:45 /user/hive/warehouse/test

Then truncate the table:

hive> truncate table test;

The permission goes back to the default:

hive> dfs -ls -d /user/hive/warehouse/test;
drwxr-xr-x - axu wheel 68 2015-01-27 10:09 /user/hive/warehouse/test

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
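The repro suggests the truncate path deletes and recreates the table directory without restoring the old mode bits. A minimal JDK-only sketch of the save-and-restore idea (plain java.nio here; class and method names are illustrative, and the actual fix would go through the Hadoop FileSystem API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TruncatePreservePerms {
    // Recreate a directory while keeping its permission bits, instead of
    // letting the recreate fall back to the process defaults.
    static String recreatePreserving(Path dir) throws IOException {
        // Capture permissions before the delete-and-recreate that truncate performs
        Set<PosixFilePermission> saved = Files.getPosixFilePermissions(dir);
        Files.delete(dir);           // truncate drops the table directory...
        Files.createDirectory(dir);  // ...and recreates it with default perms
        Files.setPosixFilePermissions(dir, saved); // restore the saved mode
        return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("warehouse_test");
        Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxrwxrwx"));
        System.out.println(recreatePreserving(dir)); // rwxrwxrwx
        Files.delete(dir);
    }
}
```

Without the restore step, the recreated directory ends up with the default drwxr-xr-x, which is exactly the symptom reported above.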
[jira] [Commented] (HIVE-7387) Guava version conflict between hadoop and spark [Spark-Branch]
[ https://issues.apache.org/jira/browse/HIVE-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295212#comment-14295212 ] Tim Robertson commented on HIVE-7387: - This also affects anyone trying to use a custom UDF from the Hive CLI when the UDF depends on later Guava methods. Suggest reopening this as a valid issue. Guava version conflict between hadoop and spark [Spark-Branch] -- Key: HIVE-7387 URL: https://issues.apache.org/jira/browse/HIVE-7387 Project: Hive Issue Type: Bug Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Attachments: HIVE-7387-spark.patch The Guava conflict happens in the Hive driver compile stage, as shown in the following exception stack trace. The conflict occurs while initiating a Spark RDD in SparkClient: the Hive driver has both Guava 11 (from the Hadoop classpath) and the Spark assembly jar (which bundles Guava 14 classes) on its classpath. Spark invokes HashFunction.hashInt, a method that does not exist in Guava 11; evidently the Guava 11 version of HashFunction was loaded into the JVM, which leads to a NoSuchMethodError when initiating the Spark RDD.
{code}
java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
	at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
	at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
	at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
	at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
	at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
	at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
	at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
	at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:75)
	at org.apache.spark.storage.MemoryStore.putValues(MemoryStore.scala:92)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:661)
	at org.apache.spark.storage.BlockManager.put(BlockManager.scala:546)
	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:812)
	at org.apache.spark.broadcast.HttpBroadcast.<init>(HttpBroadcast.scala:52)
	at org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:35)
	at org.apache.spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcastFactory.scala:29)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:776)
	at org.apache.spark.rdd.HadoopRDD.<init>(HadoopRDD.scala:112)
	at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:527)
	at org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:307)
	at org.apache.hadoop.hive.ql.exec.spark.SparkClient.createRDD(SparkClient.java:204)
	at org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:167)
	at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:32)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:159)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
{code}
NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
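One way to confirm which Guava version actually won on the classpath is to probe, via reflection, whether the loaded class declares the missing method. A small standalone sketch (java.lang.String and isEmpty stand in for com.google.common.hash.HashFunction and hashInt, since Guava itself may not be on the classpath):

```java
import java.lang.reflect.Method;

public class ClasspathProbe {
    // Returns true if the class the JVM actually loaded declares the given
    // method -- a quick probe for "which library version won" conflicts.
    static boolean declaresMethod(Class<?> cls, String methodName) {
        for (Method m : cls.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // In the real diagnosis you would pass
        // com.google.common.hash.HashFunction.class and "hashInt";
        // a false result would mean Guava 11 was loaded.
        System.out.println(declaresMethod(String.class, "isEmpty")); // true
    }
}
```

Printing `cls.getProtectionDomain().getCodeSource().getLocation()` alongside this would also reveal which jar the conflicting class came from.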
[jira] [Created] (HIVE-9495) Map Side aggregation affecting map performance
Anand Sridharan created HIVE-9495: - Summary: Map Side aggregation affecting map performance Key: HIVE-9495 URL: https://issues.apache.org/jira/browse/HIVE-9495 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: RHEL 6.4 Hortonworks Hadoop 2.2 Reporter: Anand Sridharan

When running a simple aggregation query with hive.map.aggr=true, map tasks take far longer in Hive 0.14 than with hive.map.aggr=false. For example, consider the query:

INSERT OVERWRITE TABLE lineitem_tgt_agg
SELECT alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4
FROM (
  SELECT alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4
  FROM (
    SELECT lineitem_sf500.l_orderkey as a0,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) AS DOUBLE) as a1,
           lineitem_sf500.l_quantity as a2,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount AS DOUBLE) as a3,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax AS DOUBLE) as a4
    FROM lineitem_sf500
  ) alias
  GROUP BY alias.a0
) alias;

The query was run with ~376 GB of data (~3 billion records) in the source. It takes ~10 minutes with hive.map.aggr=false; with map-side aggregation set to true, the map tasks do not complete even after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
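For context, hive.map.aggr=true makes each mapper keep partial aggregates in an in-memory hash table keyed by the GROUP BY columns, instead of emitting every row to the shuffle. A rough sketch of the technique, with illustrative names (not Hive's GroupByOperator code):

```java
import java.util.HashMap;
import java.util.Map;

public class MapSideAgg {
    // Hash-mode partial aggregation: one entry per distinct key, merged as
    // rows stream through the mapper, so the mapper emits one row per key
    // instead of one row per input record. With a high-cardinality key such
    // as l_orderkey the table grows very large, which is where a slowdown
    // like the one reported here would show up.
    static Map<Long, Double> partialSums(long[] keys, double[] values) {
        Map<Long, Double> partial = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            partial.merge(keys[i], values[i], Double::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        Map<Long, Double> out = partialSums(
            new long[]{1L, 2L, 1L}, new double[]{10.0, 5.0, 2.5});
        System.out.println(out.get(1L)); // 12.5
    }
}
```

When keys are mostly distinct, the hash table buys little (few merges happen) while costing memory and hashing work, which is the trade-off this report appears to be hitting.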
[jira] [Updated] (HIVE-9253) MetaStore server should support timeout for long running requests
[ https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-9253: Attachment: HIVE-9253.4.patch Updated to V4 to address RB comments. Thank you [~leftylev], [~brocknoland] for your review and feedback. Regarding the client setting the timeout value, I left some reply comments in RB. A {{SessionPropertiesListener}} is added to handle a client request to change the timeout. A client can use {{set metaconf:hive.metastore.server.running.method.timeout 500s}} to change the timeout. If this solution is OK, we may need to document it for users. MetaStore server should support timeout for long running requests - Key: HIVE-9253 URL: https://issues.apache.org/jira/browse/HIVE-9253 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.patch In the description of HIVE-7195, one issue is that the MetaStore client timeout is quite dumb: the client will time out, and the server has no idea the client is gone. The server should support a timeout when a request from the client runs for a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
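The server-side timeout described here can be sketched as a per-thread deadline that long-running metastore methods check between units of work. The names below are illustrative, not the patch's actual Deadline API:

```java
public class MethodDeadline {
    // One deadline per handler thread; a server thread serves one request
    // at a time, so a ThreadLocal is a natural place to keep it.
    private static final ThreadLocal<Long> DEADLINE = new ThreadLocal<>();

    // Arm the deadline at the start of a request.
    static void start(long timeoutMillis) {
        DEADLINE.set(System.currentTimeMillis() + timeoutMillis);
    }

    // Long-running code calls this periodically; true means the request has
    // exceeded its allotted time and should be aborted with an error.
    static boolean expired() {
        Long limit = DEADLINE.get();
        return limit != null && System.currentTimeMillis() > limit;
    }

    // Disarm when the request completes.
    static void clear() {
        DEADLINE.remove();
    }
}
```

The key property is that the server aborts its own work when the client has likely given up, rather than relying on the client-side socket timeout alone.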
[jira] [Updated] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9302: --- Attachment: HIVE-9302.2.patch Beeline add jar local to client --- Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. It might be useful to do this in the jdbc driver itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29807: HIVE-9253: MetaStore server should support timeout for long running requests
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29807/ --- (Updated Jan. 28, 2015, 8:58 a.m.) Review request for hive. Changes --- Address comments from Lefty and Brock. Repository: hive-git Description --- HIVE-9253: MetaStore server should support timeout for long running requests Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b metastore/src/java/org/apache/hadoop/hive/metastore/Deadline.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/DeadlineException.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java fc6f067 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 574141c metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 01ad36a metastore/src/java/org/apache/hadoop/hive/metastore/SessionPropertiesListener.java PRE-CREATION metastore/src/test/org/apache/hadoop/hive/metastore/TestDeadline.java PRE-CREATION metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTimeout.java PRE-CREATION Diff: https://reviews.apache.org/r/29807/diff/ Testing --- UT passed Thanks, Dong Chen
[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9477: - Status: Patch Available (was: Open) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch] -- Key: HIVE-9477 URL: https://issues.apache.org/jira/browse/HIVE-9477 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9477.1-spark.patch MR will throw an error in such a case and rerun the query with the optimization disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294952#comment-14294952 ] Hive QA commented on HIVE-9489: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694904/HIVE-9489.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7403 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2546/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2546/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2546/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12694904 - PreCommit-HIVE-TRUNK-Build add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Sridharan updated HIVE-9495: -- Attachment: profiler_screenshot.PNG Profiler screenshot showing GroupByOperator.processHashAggr as the hotspot. Map Side aggregation affecting map performance -- Key: HIVE-9495 URL: https://issues.apache.org/jira/browse/HIVE-9495 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: RHEL 6.4 Hortonworks Hadoop 2.2 Reporter: Anand Sridharan Attachments: profiler_screenshot.PNG

When running a simple aggregation query with hive.map.aggr=true, map tasks take far longer in Hive 0.14 than with hive.map.aggr=false. For example, consider the query:

INSERT OVERWRITE TABLE lineitem_tgt_agg
SELECT alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4
FROM (
  SELECT alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4
  FROM (
    SELECT lineitem_sf500.l_orderkey as a0,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) AS DOUBLE) as a1,
           lineitem_sf500.l_quantity as a2,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount AS DOUBLE) as a3,
           CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax AS DOUBLE) as a4
    FROM lineitem_sf500
  ) alias
  GROUP BY alias.a0
) alias;

The query was run with ~376 GB of data (~3 billion records) in the source. It takes ~10 minutes with hive.map.aggr=false; with map-side aggregation set to true, the map tasks do not complete even after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294894#comment-14294894 ] Lefty Leverenz commented on HIVE-9489: -- Hmph. Not many typos for me to find. ;) {{+ * Certain optimizations should not be applied if UDF is not deterministic}} ... needs a period at end of line. {{+ * don't apply for such UDFS, as they need to be invoked for each record.}} ... UDFs, not UDFS. {{+ * A UDF is considered distinctLike if the udf can be evaluated on just the}} ... udf should be UDF. add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294883#comment-14294883 ] Hive QA commented on HIVE-9486: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694894/HIVE-9486.1.patch.txt Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2545/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2545/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2545/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-2545/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ 
-d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'metastore/src/java/org/apache/hadoop/hive/metastore/events/InsertEvent.java' Reverted 'metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java' Reverted 'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java' Reverted 'metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java' Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ttypes.py' Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py' Reverted 'metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote' Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp' Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp' Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h' Reverted 'metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h' Reverted 'metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp' Reverted 'metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb' Reverted 'metastore/src/gen/thrift/gen-rb/hive_metastore_types.rb' Reverted 'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FireEventRequest.java' Reverted 'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedInfo.java' Reverted 'metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java' Reverted 'metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php' Reverted 'metastore/src/gen/thrift/gen-php/metastore/Types.php' Reverted 'metastore/if/hive_metastore.thrift' Reverted 'itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/TestDbNotificationListener.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java' Reverted 
'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageDeserializer.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONMessageDeserializer.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONInsertMessage.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/json/JSONMessageFactory.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/InsertMessage.java' Reverted 'hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/messaging/MessageFactory.java' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status
[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted
[ https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294911#comment-14294911 ] Lefty Leverenz commented on HIVE-8966: -- Any documentation needed? Delta files created by hive hcatalog streaming cannot be compacted -- Key: HIVE-8966 URL: https://issues.apache.org/jira/browse/HIVE-8966 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.14.0 Environment: hive Reporter: Jihong Liu Assignee: Alan Gates Priority: Critical Fix For: 1.0.0 Attachments: HIVE-8966-branch-1.patch, HIVE-8966.2.patch, HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.6.patch, HIVE-8966.patch Hive hcatalog streaming also creates a file named bucket_n_flush_length in each delta directory, where n is the bucket number. compactor.CompactorMR thinks this file also needs to be compacted; since it of course cannot be compacted, compactor.CompactorMR does not continue with the compaction. In a test, after the bucket_n_flush_length file was removed, the alter table partition compact finished successfully. If that file is not deleted, nothing is compacted. This is probably a very severe bug. Both 0.13 and 0.14 have this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
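The core of the fix can be sketched as filtering out the streaming side files before handing the delta contents to the compactor. The _flush_length suffix comes from the report above; the class and method names here are illustrative, not the actual CompactorMR code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DeltaFileFilter {
    // Keep only real bucket data files: bucket_n_flush_length side files
    // hold streaming length metadata, not row data, and must never be
    // passed to the compactor.
    static List<String> compactable(List<String> deltaFiles) {
        return deltaFiles.stream()
            .filter(f -> !f.endsWith("_flush_length"))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            "bucket_00000", "bucket_00000_flush_length", "bucket_00001");
        System.out.println(compactable(files)); // [bucket_00000, bucket_00001]
    }
}
```

With a filter like this in place, the presence of a side file no longer blocks compaction of the whole delta directory.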
[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9477: - Attachment: HIVE-9477.1-spark.patch Rerun query when global limit optimization fails. No error thrown when global limit optimization failed to find enough number of rows [Spark Branch] -- Key: HIVE-9477 URL: https://issues.apache.org/jira/browse/HIVE-9477 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9477.1-spark.patch MR will throw an error in such a case and rerun the query with the optimization disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9486) Use session classloader instead of application loader
[ https://issues.apache.org/jira/browse/HIVE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9486: Attachment: HIVE-9486.2.patch.txt Use session classloader instead of application loader - Key: HIVE-9486 URL: https://issues.apache.org/jira/browse/HIVE-9486 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-9486.1.patch.txt, HIVE-9486.2.patch.txt From http://www.mail-archive.com/dev@hive.apache.org/msg107615.html Looks reasonable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29898: HIVE-9298: Support reading alternate timestamp formats
On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: common/pom.xml, lines 59-63 https://reviews.apache.org/r/29898/diff/2/?file=825966#file825966line59 Since the joda jar will be shipped to task nodes, this needs to be added in the hive-exec jar. I think we keep that list in one of the pom files. We need to add this dep there. Do you mean the artifact set for the shaded JAR goal in ql/pom.xml? I'll take a look at doing this. On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: common/src/java/org/apache/hive/common/util/TimestampParser.java, line 76 https://reviews.apache.org/r/29898/diff/2/?file=825967#file825967line76 Name suggests this can be an instance object. If we do it that way, then we can avoid creating this object per invocation, which will be nice if possible. The way this is currently set up, the LazyTimestampObjectInspector (which I believe could be shared by different threads) points to a single TimestampParser. The Joda DateTimeFormatter is thread safe, so everything in parseTimestamp() should be thread safe except for mdt, which is why I was creating a new object. I guess mdt could be made thread safe by making it a thread-local instance. I'll make that change. On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: common/src/java/org/apache/hive/common/util/TimestampParser.java, line 127 https://reviews.apache.org/r/29898/diff/2/?file=825967#file825967line127 Can't we do Long.valueOf()? That will be faster than BD parsing, I presume. If we don't want to worry about fractional millisecond values, then we can do this. We're throwing away the fractional portion anyway since Joda does not have precision less than 1 ms. I'll change this. On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde/serdeConstants.java, line 114 https://reviews.apache.org/r/29898/diff/2/?file=825979#file825979line114 This is a thrift-generated file.
Instead of hand-modifying it, you need to put this in the thrift file and generate it via the thrift compiler. Whoops, missed that; thanks for pointing that out, will fix. On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java, lines 135-137 https://reviews.apache.org/r/29898/diff/2/?file=825983#file825983line135 I wonder why these and lastColtakeRest are not included in LazyOIParams. Seems to me, they should be included too. If you think otherwise, it will be good to add a comment here about what distinguishes these two sets of params. So I thought these params had more to do with the SerDe and handling of rows than they did with actual values and ObjectInspector-related handling, which is why I left those out of the lazy OI params. Admittedly it does look a bit odd to bundle some of the params together and leave others out. If you think I should just include them, I can do so. On Jan. 28, 2015, 1:22 a.m., Ashutosh Chauhan wrote: serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java, line 649 https://reviews.apache.org/r/29898/diff/2/?file=825983#file825983line649 I think there is a helper method in apache commons (or guava) which can let you do such parsing. Will be good to reuse that, if available. Not sure if the commons/guava libs have something to escape commas (please correct me if I am wrong). I see that Hive uses opencsv which handles CSV-style escaping; I will use this to parse the list. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29898/#review69930 --- On Jan. 20, 2015, 12:34 a.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29898/ --- (Updated Jan. 20, 2015, 12:34 a.m.) Review request for hive and Ashutosh Chauhan.
Bugs: HIVE-9298 https://issues.apache.org/jira/browse/HIVE-9298 Repository: hive-git Description --- Add new SerDe parameter timestamp.formats to specify alternate timestamp patterns Diffs - common/pom.xml ede8aea common/src/java/org/apache/hive/common/util/TimestampParser.java PRE-CREATION common/src/test/org/apache/hive/common/util/TestTimestampParser.java PRE-CREATION data/files/ts_formats.txt PRE-CREATION hbase-handler/src/java/org/apache/hadoop/hive/hbase/DefaultHBaseKeyFactory.java 98bc73f hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseLazyObjectFactory.java 78f23cb hbase-handler/src/java/org/apache/hadoop/hive/hbase/struct/AvroHBaseValueFactory.java a2ba827
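The thread-safety point discussed in the review above — share the stateless formatter freely, and make only the mutable scratch object a thread-local — can be sketched like this (SimpleDateFormat stands in for the mutable Joda MutableDateTime, since Joda may not be on the classpath; names are illustrative):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class ThreadLocalScratch {
    // SimpleDateFormat is mutable and not thread safe, so each thread gets
    // its own instance. Callers can then safely share the enclosing parser
    // object across threads, the way a shared ObjectInspector would share
    // a single TimestampParser.
    private static final ThreadLocal<SimpleDateFormat> FMT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    static long parseMillis(String s) throws ParseException {
        return FMT.get().parse(s).getTime();
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(parseMillis("1970-01-02")); // 86400000
    }
}
```

This avoids allocating a fresh mutable object per invocation while keeping parse calls safe under concurrent use.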
[jira] [Commented] (HIVE-9493) Failed job may not throw exceptions [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294876#comment-14294876 ] Hive QA commented on HIVE-9493: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694934/HIVE-9493.1-spark.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7357 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/688/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/688/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-688/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12694934 - PreCommit-HIVE-SPARK-Build Failed job may not throw exceptions [Spark Branch] -- Key: HIVE-9493 URL: https://issues.apache.org/jira/browse/HIVE-9493 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9493.1-spark.patch Currently the remote driver assumes an exception will be thrown when a job fails to run. This may not hold, since jobs are submitted asynchronously, so we have to check the futures before we decide the job is successful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
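The point about asynchronous submission can be illustrated with plain java.util.concurrent: submit() returns normally even when the task is doomed to fail, and the failure only surfaces once the Future is checked — which is why the driver must inspect the futures before declaring success.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncFailure {
    static String runAndCheck() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // submit() succeeds immediately -- no exception here, even though
        // the task is guaranteed to fail.
        Future<?> job = pool.submit((Runnable) () -> {
            throw new IllegalStateException("job failed");
        });
        try {
            job.get();           // the failure only shows up on get()
            return "no error";
        } catch (ExecutionException e) {
            return e.getCause().getMessage();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndCheck()); // job failed
    }
}
```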
Re: Review Request 29807: HIVE-9253: MetaStore server should support timeout for long running requests
On Jan. 21, 2015, 6:43 a.m., Lefty Leverenz wrote: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, lines 372-374 https://reviews.apache.org/r/29807/diff/2/?file=827704#file827704line372 Shouldn't long LONG be included in the names hive.metastore.server.running.method.timeout METASTORE_SERVER_RUNNING_METHOD_TIMEOUT? Also, please specify the JIRA number (HIVE-9253) in this review request, either under Bugs in the Information section or in the Summary, or both. Thanks for your review and suggestion! Lefty. I have renamed it in the new patch. - Dong --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29807/#review68878 --- On Jan. 22, 2015, 8:22 a.m., Dong Chen wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29807/ --- (Updated Jan. 22, 2015, 8:22 a.m.) Review request for hive. Repository: hive-git Description --- HIVE-9253: MetaStore server should support timeout for long running requests Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5e00575 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java caad948 metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 564ac8b metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java 01ad36a metastore/src/java/org/apache/hadoop/hive/metastore/RuntimeTimeout.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/RuntimeTimeoutException.java PRE-CREATION metastore/src/java/org/apache/hadoop/hive/metastore/SessionPropertiesListener.java PRE-CREATION metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTimeout.java PRE-CREATION metastore/src/test/org/apache/hadoop/hive/metastore/TestRuntimeTimeout.java PRE-CREATION Diff: https://reviews.apache.org/r/29807/diff/ Testing --- UT passed Thanks, Dong Chen
[jira] [Commented] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294853#comment-14294853 ] Ferdinand Xu commented on HIVE-9302: Sorry, I meant to. There are two kinds of use cases. One is to add an existing known driver like the mysql driver or the postgres driver. Currently supported drivers are postgres and mysql.
{noformat}
# beeline
beeline> !addlocaldriverjar /path/to/mysql-connector-java-5.1.27-bin.jar
beeline> !connect mysql://host:3306/testdb
{noformat}
And another is to add a customized driver.
{noformat}
# beeline
beeline> !addlocaldriverjar /path/to/DummyDriver-1.0-SNAPSHOT.jar
beeline> !addlocaldrivername org.apache.dummy.DummyDrive
beeline> !connect mysql://host:3306/testdb
{noformat}
Beeline add jar local to client --- Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. It might be useful to do this in the jdbc driver itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9460) LLAP: Fix some static vars in the operator pipeline
[ https://issues.apache.org/jira/browse/HIVE-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294854#comment-14294854 ] Lefty Leverenz commented on HIVE-9460: -- Doc note: This adds configuration parameter *hive.execution.mode* to HiveConf.java, so it will need to be documented in the wiki when the LLAP branch gets merged to trunk. Should we add a TODOC-LLAP label to keep track of these doc issues? LLAP: Fix some static vars in the operator pipeline --- Key: HIVE-9460 URL: https://issues.apache.org/jira/browse/HIVE-9460 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-9460.1.patch There are a few static vars left in the operator pipeline. Can't have those with multi-threaded execution... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9273) Add option to fire metastore event on insert
[ https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294880#comment-14294880 ] Hive QA commented on HIVE-9273: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694889/HIVE-9273.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7405 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2544/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2544/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2544/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12694889 - PreCommit-HIVE-TRUNK-Build Add option to fire metastore event on insert Key: HIVE-9273 URL: https://issues.apache.org/jira/browse/HIVE-9273 Project: Hive Issue Type: New Feature Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-9273.patch HIVE-9271 adds the ability for the client to request firing metastore events. This can be used in the MoveTask to fire events when an insert is done that does not add partitions to a table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294985#comment-14294985 ] Hive QA commented on HIVE-9477: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694949/HIVE-9477.1-spark.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7354 tests executed *Failed tests:* {noformat} TestSQLStdHiveAccessControllerHS2 - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_covar_samp org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/689/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/689/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-689/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12694949 - PreCommit-HIVE-SPARK-Build No error thrown when global limit optimization failed to find enough number of rows [Spark Branch] -- Key: HIVE-9477 URL: https://issues.apache.org/jira/browse/HIVE-9477 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9477.1-spark.patch MR will throw an error in such a case and rerun the query with the optimization disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9292) CBO (Calcite Return Path): Inline GroupBy, Properties
[ https://issues.apache.org/jira/browse/HIVE-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9292: -- Attachment: HIVE-9292.06.patch New patch; addressed [~jpullokkaran] comments to remove the usage of ParseContext in RewriteQueryUsingAggregateIndexCtx after HIVE-9327 went in. CBO (Calcite Return Path): Inline GroupBy, Properties - Key: HIVE-9292 URL: https://issues.apache.org/jira/browse/HIVE-9292 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9292.01.patch, HIVE-9292.02.patch, HIVE-9292.03.patch, HIVE-9292.04.patch, HIVE-9292.05.patch, HIVE-9292.06.patch, HIVE-9292.patch, HIVE-9292.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29763: HIVE-9292
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29763/ --- (Updated Jan. 28, 2015, 10:59 a.m.) Review request for hive and John Pullokkaran. Changes --- New patch; addressed John's comments to remove the usage of ParseContext in RewriteQueryUsingAggregateIndexCtx after HIVE-9327 went in. Bugs: HIVE-9292 https://issues.apache.org/jira/browse/HIVE-9292 Repository: hive-git Description --- CBO (Calcite Return Path): Inline GroupBy, Properties Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteParseContextGenerator.java 3097385b92d4398ee57d3544354b383fe24719dd ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java 69a5a4409164fc6cb725b315de08ec9d090b7f22 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java dda4f75592209d88f25b5ca09ea9f32c77ea4ac6 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c9a5ce53ffc3d5c791e0826be0cac771a4d20254 ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java 0116c85979f02ea0f88bbf8085a7590694eb2dfb Diff: https://reviews.apache.org/r/29763/diff/ Testing --- Existing tests. Thanks, Jesús Camacho Rodríguez
[jira] [Commented] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295029#comment-14295029 ] Hive QA commented on HIVE-9302: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694940/HIVE-9302.2.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7415 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0] org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[1] org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2548/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2548/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2548/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12694940 - PreCommit-HIVE-TRUNK-Build Beeline add jar local to client --- Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. It might be useful to do this in the jdbc driver itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9253) MetaStore server should support timeout for long running requests
[ https://issues.apache.org/jira/browse/HIVE-9253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295450#comment-14295450 ] Hive QA commented on HIVE-9253: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12694989/HIVE-9253.4.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7407 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2551/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2551/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2551/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12694989 - PreCommit-HIVE-TRUNK-Build MetaStore server should support timeout for long running requests - Key: HIVE-9253 URL: https://issues.apache.org/jira/browse/HIVE-9253 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9253.1.patch, HIVE-9253.2.patch, HIVE-9253.2.patch, HIVE-9253.3.patch, HIVE-9253.4.patch, HIVE-9253.patch In the description of HIVE-7195, one issue is that MetaStore client timeout is quite dumb. The client will timeout and the server has no idea the client is gone. The server should support timeout when the request from client runs a long time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
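The goal of HIVE-9253, a server-side timeout for long-running metastore requests, can be illustrated with a generic watchdog that abandons a call after a deadline. This is an illustrative Python sketch under assumed names (RequestTimeoutError, call_with_timeout), not the RetryingHMSHandler/RuntimeTimeout implementation in the actual patch:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

class RequestTimeoutError(Exception):
    """Raised when a request exceeds the server-side deadline (hypothetical)."""

def call_with_timeout(fn, timeout_s, *args):
    # Run the handler method in a worker thread and give up after timeout_s,
    # so a long-running request cannot pin the serving thread forever.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            raise RequestTimeoutError("request exceeded %.1fs" % timeout_s)

print(call_with_timeout(lambda: "done", 1.0))  # done
try:
    call_with_timeout(lambda: time.sleep(1.0), 0.2)
except RequestTimeoutError as e:
    print("timed out:", e)  # timed out: request exceeded 0.2s
```

Note this sketch only stops waiting; like the JIRA describes for the client side, the worker itself keeps running until it finishes, which is why a cooperative check inside the handler (as the patch's SessionPropertiesListener/RuntimeTimeout approach suggests) is also needed.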
[jira] [Updated] (HIVE-9317) move Microsoft copyright to NOTICE file
[ https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-9317: Resolution: Fixed Fix Version/s: 1.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed this. Thanks for the review, Alan. move Microsoft copyright to NOTICE file --- Key: HIVE-9317 URL: https://issues.apache.org/jira/browse/HIVE-9317 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.15.0, 1.0.0 Attachments: hive-9327.txt There are a set of files that still have the Microsoft copyright notices. Those notices need to be moved into NOTICES and replaced with the standard Apache headers. {code} ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations to all! --Xuefu On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
[jira] [Commented] (HIVE-9303) Parquet files are written with incorrect definition levels
[ https://issues.apache.org/jira/browse/HIVE-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295780#comment-14295780 ] Hive QA commented on HIVE-9303: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695040/HIVE-9303.1.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7400 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file TestPigHBaseStorageHandler - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2553/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2553/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2553/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695040 - PreCommit-HIVE-TRUNK-Build Parquet files are written with incorrect definition levels -- Key: HIVE-9303 URL: https://issues.apache.org/jira/browse/HIVE-9303 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Skye Wanderman-Milne Assignee: Sergio Peña Attachments: HIVE-9303.1.patch The definition level, which determines which level of nesting is NULL, appears to always be n or n-1, where n is the maximum definition level. This means that only the innermost level of nesting can be NULL. This is only relevant for Parquet files. For example:
{code:sql}
CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>) STORED AS TEXTFILE;
INSERT OVERWRITE TABLE text_tbl
SELECT IF(false, named_struct('b', named_struct('c', 1)), NULL) FROM tbl LIMIT 1;
CREATE TABLE parq_tbl STORED AS PARQUET AS SELECT * FROM text_tbl;
SELECT * FROM text_tbl; = NULL # right
SELECT * FROM parq_tbl; = {"b":{"c":null}} # wrong
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
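The bug in HIVE-9303 is easier to see in terms of the Parquet definition-level encoding itself: for each value, the writer records how many optional levels along its path are actually present. A correct writer for the schema a: STRUCT<b: STRUCT<c: INT>> would compute levels as in this illustrative Python sketch (not Hive's Parquet writer):

```python
def definition_level(a):
    # Input is the value of column a, represented as nested dicts:
    # None, {"b": None}, {"b": {"c": None}}, or {"b": {"c": 1}}.
    # All three levels (a, b, c) are optional, so max level is 3.
    if a is None:
        return 0            # outermost struct is NULL
    if a["b"] is None:
        return 1            # a present, b NULL
    if a["b"]["c"] is None:
        return 2            # a and b present, c NULL
    return 3                # fully defined value

# The reported bug: Hive appears to emit only n or n-1 (here 3 or 2),
# so a NULL outer struct (level 0) is read back as {"b": {"c": null}}.
print(definition_level(None))                 # 0 -> should round-trip as NULL
print(definition_level({"b": {"c": None}}))   # 2 -> {"b": {"c": null}}
print(definition_level({"b": {"c": 1}}))      # 3
```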
[jira] [Updated] (HIVE-9477) No error thrown when global limit optimization failed to find enough number of rows [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9477: -- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Test failures above don't seem related to this patch in any way. Patch committed to spark branch. Thanks, Rui. No error thrown when global limit optimization failed to find enough number of rows [Spark Branch] -- Key: HIVE-9477 URL: https://issues.apache.org/jira/browse/HIVE-9477 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Priority: Blocker Fix For: spark-branch Attachments: HIVE-9477.1-spark.patch MR will throw an error in such a case and rerun the query with the optimization disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Thanks and congrats to Vikram, Jason, Owen, and Prasanth ! On Wed, Jan 28, 2015 at 1:28 PM, Hari Subramaniyan hsubramani...@hortonworks.com wrote: Congrats everyone! Thanks Hari -- *From:* cwsteinb...@gmail.com cwsteinb...@gmail.com on behalf of Carl Steinbach c...@apache.org *Sent:* Wednesday, January 28, 2015 1:15 PM *To:* dev@hive.apache.org; u...@hive.apache.org *Cc:* sze...@apache.org; vik...@apache.org; jd...@apache.org; Owen O'Malley; prasan...@apache.org *Subject:* [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295829#comment-14295829 ] Vikram Dixit K commented on HIVE-8807: -- If I end up rolling out a new release and we have a patch for this by then, I will include this in the next roll-out. Thanks Vikram. Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Fix For: 0.14.1 The defaults for templeton.pig.path templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congratulations e’one! —Vaibhav On Jan 28, 2015, at 1:20 PM, Xuefu Zhang xzh...@cloudera.commailto:xzh...@cloudera.com wrote: Congratulations to all! --Xuefu On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.orgmailto:c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congrats everyone! Thanks Hari From: cwsteinb...@gmail.com cwsteinb...@gmail.com on behalf of Carl Steinbach c...@apache.org Sent: Wednesday, January 28, 2015 1:15 PM To: dev@hive.apache.org; u...@hive.apache.org Cc: sze...@apache.org; vik...@apache.org; jd...@apache.org; Owen O'Malley; prasan...@apache.org Subject: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295892#comment-14295892 ] Laljo John Pullokkaran commented on HIVE-9431: -- [~jcamachorodriguez] Could you rebase the patch and resubmit the patch? Build seems to be failing with the patch CBO (Calcite Return Path): Removing AST from ParseContext - Key: HIVE-9431 URL: https://issues.apache.org/jira/browse/HIVE-9431 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9431.01.patch, HIVE-9431.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295905#comment-14295905 ] Hive QA commented on HIVE-9482: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695072/HIVE-9482.2.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7406 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_external_time org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2554/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2554/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2554/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695072 - PreCommit-HIVE-TRUNK-Build Hive parquet timestamp compatibility Key: HIVE-9482 URL: https://issues.apache.org/jira/browse/HIVE-9482 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 0.15.0 Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq In current Hive implementation, timestamps are stored in UTC (converted from current timezone), based on original parquet timestamp spec. 
However, we find this is not compatible with other tools, and after some investigation it is not how other file formats, or even some databases, behave (Hive Timestamp is closer to a 'timestamp without timezone' datatype). This is the first part of the fix, which restores compatibility with parquet-timestamp files generated by external tools by skipping conversion on reading. A later fix will change the write path to not convert, and stop the read-conversion even for files written by Hive itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
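The compatibility problem in HIVE-9482 comes down to the conversion the old read path performed: Hive treated the stored value as UTC and shifted it into the local zone, while external writers stored wall-clock values that should be returned unchanged. An illustrative Python sketch with hypothetical helper names (not Hive's ParquetTimestampUtils code):

```python
from datetime import datetime, timedelta

def read_with_conversion(stored, utc_offset_hours):
    # Old Hive read path: assume the file stored UTC and shift to local time.
    return stored + timedelta(hours=utc_offset_hours)

def read_without_conversion(stored, utc_offset_hours):
    # HIVE-9482 behavior for externally written files: return the
    # wall-clock value exactly as written ('timestamp without timezone').
    return stored

wall_clock = datetime(2015, 1, 28, 12, 0, 0)  # written by an external tool
print(read_with_conversion(wall_clock, -8))     # 2015-01-28 04:00:00 (shifted; wrong here)
print(read_without_conversion(wall_clock, -8))  # 2015-01-28 12:00:00 (unchanged)
```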
[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9431: -- Attachment: HIVE-9431.02.patch Rebasing patch. CBO (Calcite Return Path): Removing AST from ParseContext - Key: HIVE-9431 URL: https://issues.apache.org/jira/browse/HIVE-9431 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9485) Update trunk to 1.2.0-SNAPSHOT
[ https://issues.apache.org/jira/browse/HIVE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295775#comment-14295775 ] Thejas M Nair commented on HIVE-9485: - +1 Update trunk to 1.2.0-SNAPSHOT -- Key: HIVE-9485 URL: https://issues.apache.org/jira/browse/HIVE-9485 Project: Hive Issue Type: Task Affects Versions: 1.2.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 1.2.0 Attachments: HIVE-9485.1.patch As discussed on list, 0.14.1 will be 1.0 and 0.15 will be 1.1. As such we should change trunk to 1.2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9485) Update trunk to 1.2.0-SNAPSHOT
[ https://issues.apache.org/jira/browse/HIVE-9485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295775#comment-14295775 ] Thejas M Nair edited comment on HIVE-9485 at 1/28/15 8:16 PM: -- +1 Thanks [~brocknoland]! was (Author: thejas): +1 Update trunk to 1.2.0-SNAPSHOT -- Key: HIVE-9485 URL: https://issues.apache.org/jira/browse/HIVE-9485 Project: Hive Issue Type: Task Affects Versions: 1.2.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 1.2.0 Attachments: HIVE-9485.1.patch As discussed on list, 0.14.1 will be 1.0 and 0.15 will be 1.1. As such we should change trunk to 1.2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
Congrats!!! On Wed, Jan 28, 2015 at 1:21 PM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congratulations e’one! —Vaibhav On Jan 28, 2015, at 1:20 PM, Xuefu Zhang xzh...@cloudera.commailto: xzh...@cloudera.com wrote: Congratulations to all! --Xuefu On Wed, Jan 28, 2015 at 1:15 PM, Carl Steinbach c...@apache.orgmailto: c...@apache.org wrote: I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating the these new PMC members! Thanks. - Carl -- Best, Chao
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/#review70062 --- common/src/java/org/apache/hadoop/hive/conf/HiveConf.java https://reviews.apache.org/r/29900/#comment115035 we should add a boolean false argument at the end here, so that it does not show up in the hive-default.xml.template file. See hive.in.test for example. There are other hive.test params to be fixed similarly as well, we can do it as part of this one or a separate jira. ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/29900/#comment115039 SessionState can be shared across multiple query executions, in hiveserver2. One case where this happens is when one user in Hue opens multiple tabs and runs queries from each of them simultaneously. This means that there can be race conditions where multiple get_timestamp invocations in a single query return different results because there was another query whose compilation started in between. (This will happen once the lock around compile is removed in HS2). We need to store this in a real query-specific variable. I am still thinking what the best place for that is .. ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/29900/#comment115041 I have seen at least one other place where we have a test timestamp getting injected. It might make sense to use some kind of getTimestamp class that can be customized to give a specific timestamp. But this does not have to be addressed in this jira. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java https://reviews.apache.org/r/29900/#comment115019 I think it would be good to clarify that for all calls within a query this returns the same value. For example, if the query lifetime crosses a date boundary, you would not see two different dates for different records. Maybe reword it something like this - Returns the current date as of the start of the query.
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java https://reviews.apache.org/r/29900/#comment115018 looks like we can consider this to be deterministic, since the value does not change within a query. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java https://reviews.apache.org/r/29900/#comment115021 we should update the description to clarify that this is the timestamp at the beginning of query evaluation/execution. Maybe evaluation is a better word. - Thejas Nair On Jan. 19, 2015, 10:01 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 19, 2015, 10:01 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 25cccd7 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 0226f28 ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7c4ca7 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java f45b20a ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 9ecb0a0 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
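The determinism point raised in the review above, that current_timestamp must return one value for the entire query, can be sketched as a per-query snapshot taken once at compile time. Illustrative Python only (hypothetical QueryState class, not the SessionState code under review):

```python
import datetime

class QueryState:
    """Hypothetical per-query state: snapshots the clock once at compile time."""
    def __init__(self, now=None):
        self.query_timestamp = now or datetime.datetime.utcnow()

def current_timestamp(query_state):
    # Every invocation within the same query sees the same snapshot,
    # even if evaluation crosses a second (or date) boundary.
    return query_state.query_timestamp

q = QueryState(datetime.datetime(2015, 1, 28, 23, 59, 59))
print(current_timestamp(q) == current_timestamp(q))  # True: stable within one query
```

Keeping the snapshot on a per-query object rather than on the shared session sidesteps the race the reviewer describes, where a concurrent query's compilation resets a session-level timestamp mid-execution.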
[jira] [Commented] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295911#comment-14295911 ] Szehon Ho commented on HIVE-9482: - Test failures don't look related (these spark tests also failed in other builds). parquet_external_time will fail until the attached parquet file is checked in (/data/files/parquet_external_time.parq). Hive parquet timestamp compatibility Key: HIVE-9482 URL: https://issues.apache.org/jira/browse/HIVE-9482 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 0.15.0 Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq In the current Hive implementation, timestamps are stored in UTC (converted from the current timezone), based on the original parquet timestamp spec. However, we find this is not compatible with other tools, and after some investigation it is not the way other file formats, or even some databases, behave (the Hive Timestamp is closer to a 'timestamp without timezone' datatype). This is the first part of the fix, which will restore compatibility with parquet-timestamp files generated by external tools by skipping conversion on reading. A later fix will change the write path to not convert, and stop the read-conversion even for files written by Hive itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
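The incompatibility described in HIVE-9482 above can be illustrated with plain java.time (an illustration of the semantics only, not Hive's Parquet reader code; the zone and date are arbitrary examples):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampSkew {
    public static void main(String[] args) {
        // A 'timestamp without time zone' value, as typed by the user.
        LocalDateTime written = LocalDateTime.of(2015, 1, 28, 12, 0);
        ZoneId writerZone = ZoneId.of("America/Los_Angeles");

        // A writer that converts local wall-clock time to UTC before storing:
        Instant stored = written.atZone(writerZone).toInstant();

        // A reader that does not undo that conversion sees a shifted
        // wall-clock time instead of what the user originally wrote:
        LocalDateTime readBack = LocalDateTime.ofInstant(stored, ZoneOffset.UTC);
        System.out.println(readBack); // 2015-01-28T20:00, eight hours off
    }
}
```

Skipping the conversion on read, as the patch does, means externally written files (which never applied the writer-side conversion) come back with the wall-clock values their writers intended.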
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Status: Patch Available (was: Open) Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive and Thejas Nair. Changes --- Update patch based on review comments Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
Review Request 30385: Use SASL to establish the remote context connection.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30385/ --- Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9487 https://issues.apache.org/jira/browse/HIVE-9487 Repository: hive-git Description --- Instead of the insecure, ad-hoc auth mechanism currently used, perform a SASL negotiation to establish trust. This requires the secret to be distributed through some secure channel (just like before). Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and probably wouldn't work well here) also allows us to add encryption without the need for SSL (yay?). Only DIGEST-MD5 has been really tested. Supporting other mechanisms will probably mean adding new callback handlers in the client and server portions, but shouldn't be hard if desired. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d4d98d7c0c28cdb1d19c700e20537ef405be2e01 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java ce2f9b6b132dc47f899798e47d18a1f6b0dd707f spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java 3a7149341bac086e5efe931595143d3bebbdb5db spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5f9be658a855cc15c576f1a98376fcd85475e3b7 spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java 0c29c9441fb3e9daf690510a2c9b5716671e2571 spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 2c858a121aaeca6af20f5e332de207694348a030 spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java fffe24b3cbe6a5d7387e751adbc65f5b140c9089 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java eff640f7b24348043dbce734510698d9294579c6 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java PRE-CREATION 
spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java af534375a3ed86a3a9ad57c2f21a9a8bf6113714 spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 Diff: https://reviews.apache.org/r/30385/diff/ Testing --- Unit tests. Thanks, Marcelo Vanzin
[jira] [Updated] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5472: - Attachment: HIVE-5472.3.patch Patch v3, incorporating review feedback. support a simple scalar which returns the current timestamp --- Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch ISO-SQL has two forms of these functions, local and current timestamp, where the former is a TIMESTAMP WITHOUT TIME ZONE and the latter WITH TIME ZONE. The current workaround is: select cast ( unix_timestamp() as timestamp ) from T. Implement a function which computes LOCAL TIMESTAMP, which would be the current timestamp for the user's session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9292) CBO (Calcite Return Path): Inline GroupBy, Properties
[ https://issues.apache.org/jira/browse/HIVE-9292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296082#comment-14296082 ] Jesus Camacho Rodriguez commented on HIVE-9292: --- Failures are not related to the patch (see HIVE-9498). CBO (Calcite Return Path): Inline GroupBy, Properties - Key: HIVE-9292 URL: https://issues.apache.org/jira/browse/HIVE-9292 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9292.01.patch, HIVE-9292.02.patch, HIVE-9292.03.patch, HIVE-9292.04.patch, HIVE-9292.05.patch, HIVE-9292.06.patch, HIVE-9292.07.patch, HIVE-9292.patch, HIVE-9292.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-9436: --- Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code: {code} } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) { caughtException = (MetaException) e.getCause(); } else { throw e.getCause(); } {code} The bug here is that Java's String.matches matches the entire string against the regex, and thus, that match will fail if the message contains anything before or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we should match (?s).\*JDO[a-zA-Z]\*Exception.\* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
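The full-string anchoring of String.matches that causes the HIVE-9436 bug can be demonstrated with a standalone snippet (illustrative only; the exception message here is invented):

```java
public class MatchesDemo {
    public static void main(String[] args) {
        // A typical cause message: the exception class name is embedded in text.
        String msg = "javax.jdo.JDODataStoreException: lock wait timeout";

        // String.matches() anchors the regex to the WHOLE string,
        // so the original check never matches:
        System.out.println(msg.matches("JDO[a-zA-Z]*Exception"));         // false

        // The fix: allow anything around it; (?s) lets '.' also match newlines:
        System.out.println(msg.matches("(?s).*JDO[a-zA-Z]*Exception.*")); // true
    }
}
```

The (?s) flag matters because JDO exception messages can span multiple lines, and without it the leading .* cannot cross a line break.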
[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan
[ https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-9504: --- Attachment: (was: HIVE-9504.00.patch) [beeline] ZipException when using !scan --- Key: HIVE-9504 URL: https://issues.apache.org/jira/browse/HIVE-9504 Project: Hive Issue Type: Bug Components: Beeline Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9504.00.patch Notice this while mucking around: {noformat} 0: jdbc:hive2://localhost:1/ !scan java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.init(ZipFile.java:220) at java.util.zip.ZipFile.init(ZipFile.java:150) at java.util.jar.JarFile.init(JarFile.java:166) at java.util.jar.JarFile.init(JarFile.java:130) at org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579) at org.apache.hive.beeline.Commands.scan(Commands.java:278) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan
[ https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-9504: --- Attachment: HIVE-9504.00.patch [beeline] ZipException when using !scan --- Key: HIVE-9504 URL: https://issues.apache.org/jira/browse/HIVE-9504 Project: Hive Issue Type: Bug Components: Beeline Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9504.00.patch Notice this while mucking around: {noformat} 0: jdbc:hive2://localhost:1/ !scan java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.init(ZipFile.java:220) at java.util.zip.ZipFile.init(ZipFile.java:150) at java.util.jar.JarFile.init(JarFile.java:166) at java.util.jar.JarFile.init(JarFile.java:130) at org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579) at org.apache.hive.beeline.Commands.scan(Commands.java:278) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan
[ https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-9504: --- Status: Patch Available (was: Open) [beeline] ZipException when using !scan --- Key: HIVE-9504 URL: https://issues.apache.org/jira/browse/HIVE-9504 Project: Hive Issue Type: Bug Components: Beeline Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9504.00.patch Notice this while mucking around: {noformat} 0: jdbc:hive2://localhost:1/ !scan java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.init(ZipFile.java:220) at java.util.zip.ZipFile.init(ZipFile.java:150) at java.util.jar.JarFile.init(JarFile.java:166) at java.util.jar.JarFile.init(JarFile.java:130) at org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579) at org.apache.hive.beeline.Commands.scan(Commands.java:278) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30385: Use SASL to establish the remote context connection.
On Jan. 29, 2015, 12:36 a.m., Xuefu Zhang wrote: spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java, line 20 https://reviews.apache.org/r/30385/diff/1/?file=839319#file839319line20 Nit: if you need to submit another patch, let's not auto reorg the imports. I changed this because someone broke it... now it's in line with the usual order you see in the rest of Hive code. - Marcelo --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30385/#review70119 --- On Jan. 28, 2015, 11:22 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30385/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9487 https://issues.apache.org/jira/browse/HIVE-9487 Repository: hive-git Description --- Instead of the insecure, ad-hoc auth mechanism currently used, perform a SASL negotiation to establish trust. This requires the secret to be distributed through some secure channel (just like before). Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and probably wouldn't work well here) also allows us to add encryption without the need for SSL (yay?). Only DIGEST-MD5 has been really tested. Supporting other mechanisms will probably mean adding new callback handlers in the client and server portions, but shouldn't be hard if desired. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d4d98d7c0c28cdb1d19c700e20537ef405be2e01 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java ce2f9b6b132dc47f899798e47d18a1f6b0dd707f spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java 3a7149341bac086e5efe931595143d3bebbdb5db spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5f9be658a855cc15c576f1a98376fcd85475e3b7 spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java 0c29c9441fb3e9daf690510a2c9b5716671e2571 spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 2c858a121aaeca6af20f5e332de207694348a030 spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java fffe24b3cbe6a5d7387e751adbc65f5b140c9089 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java eff640f7b24348043dbce734510698d9294579c6 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java af534375a3ed86a3a9ad57c2f21a9a8bf6113714 spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 Diff: https://reviews.apache.org/r/30385/diff/ Testing --- Unit tests. Thanks, Marcelo Vanzin
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296148#comment-14296148 ] Xuefu Zhang commented on HIVE-9487: --- Patch looks good to me. I left a minor comment on RB. [~chengxiang li] Could you also take a look? Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9103: --- Attachment: HIVE-9103-1.spark.patch Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch In MR, backup task can be executed if the original task, which probably contains certain (join) optimization fails. This JIRA is to track this topic for Spark. We need to determine if we need this and implement if necessary. This is a followup of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9103: --- Attachment: (was: HIVE-9103-1.spark.patch) Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch In MR, backup task can be executed if the original task, which probably contains certain (join) optimization fails. This JIRA is to track this topic for Spark. We need to determine if we need this and implement if necessary. This is a followup of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296187#comment-14296187 ] Xuefu Zhang commented on HIVE-9503: --- I see. I guess the overhead is bearable. It gives a much better user experience than if we auto-convert the task and the query fails, leaving the user in the dark. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor The 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, only to MR (which has the legacy conditional mapjoin). However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296178#comment-14296178 ] Xuefu Zhang edited comment on HIVE-9503 at 1/29/15 1:01 AM: Yeah, Hive has long reached to a point where the properties are confusing and sometimes contradicting and duplicating. These two properties plus hive.auto.convert.join are an example. The two properties are meant to be used together. Ignoring one while taking the other doesn't seem to be a clean solution. While it's already a legacy for MR and Tez, I'd like to have a cleaner solution for Spark since we still have the chance. If we want to be consistent across engines, I'd rather fix Spark to be consistent with MR. was (Author: xuefuz): Yeah, Hive has long reached to a point where the properties are confusing and sometimes contradicting and duplicating. These two properties plus hive.auto.convert.join are an example. The two properties are meant to be used together. Ignoring one while taking the other doesn't seem to be a clean solution. While it's already a legacy for MR and Tez, I'd like to have a cleaner solution for Spark since we still have the chance. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, and only to MR (which has the legacy conditional mapjoin) However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9399) ppd_multi_insert.q generate same output in different order, when mapred.reduce.tasks is set to larger than 1
[ https://issues.apache.org/jira/browse/HIVE-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao reassigned HIVE-9399: -- Assignee: Chao ppd_multi_insert.q generate same output in different order, when mapred.reduce.tasks is set to larger than 1 Key: HIVE-9399 URL: https://issues.apache.org/jira/browse/HIVE-9399 Project: Hive Issue Type: Test Reporter: Chao Assignee: Chao If running ppd_multi_insert.q with {{set mapred.reduce.tasks=3}}, the output order is different, even with {{SORT_QUERY_RESULTS}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9470) Use a generic writable object to run ColumnarStorageBench write/read tests
[ https://issues.apache.org/jira/browse/HIVE-9470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296201#comment-14296201 ] Ferdinand Xu commented on HIVE-9470: Thank you for your update. +1 Use a generic writable object to run ColumnarStorageBench write/read tests -- Key: HIVE-9470 URL: https://issues.apache.org/jira/browse/HIVE-9470 Project: Hive Issue Type: Improvement Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9470.1.patch, HIVE-9470.2.patch The ColumnarStorageBench benchmark class is using a Parquet writable object to run all write/read/serialize/deserialize tests. It would be better to use a more generic writable object (like text writables) to get better benchmark results between storage formats. Using Parquet writables may give an advantage when writing Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296198#comment-14296198 ] Hive QA commented on HIVE-9487: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695117/HIVE-9487.1-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7359 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/690/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/690/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-690/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695117 - PreCommit-HIVE-SPARK-Build Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9317) move Microsoft copyright to NOTICE file
[ https://issues.apache.org/jira/browse/HIVE-9317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295858#comment-14295858 ] Thejas M Nair commented on HIVE-9317: - This is not in 1.0 RC1, do we need another RC for this ? Looks like we need one. cc [~vikram.dixit] move Microsoft copyright to NOTICE file --- Key: HIVE-9317 URL: https://issues.apache.org/jira/browse/HIVE-9317 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.15.0, 1.0.0 Attachments: hive-9327.txt There are a set of files that still have the Microsoft copyright notices. Those notices need to be moved into NOTICES and replaced with the standard Apache headers. {code} ./common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java ./common/src/java/org/apache/hadoop/hive/common/type/SignedInt128.java ./common/src/java/org/apache/hadoop/hive/common/type/SqlMathUtil.java ./common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestDecimal128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSignedInt128.java ./common/src/test/org/apache/hadoop/hive/common/type/TestSqlMathUtil.java ./common/src/test/org/apache/hadoop/hive/common/type/TestUnsignedInt128.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/#review70083 --- ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java https://reviews.apache.org/r/29900/#comment115045 //you can use substring(0, 10) to get date part of the timestamp String dtStr = SessionState.get().getQueryCurrentTimestamp().toString().substring(0,10); dateVal = Date.valueOf(dtStr); - Alexander Pivovarov On Jan. 19, 2015, 10:01 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 19, 2015, 10:01 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 25cccd7 ql/src/java/org/apache/hadoop/hive/ql/Driver.java 0226f28 ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7c4ca7 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java f45b20a ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 9ecb0a0 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
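The reviewer's substring(0, 10) suggestion above works because java.sql.Timestamp.toString() always renders as yyyy-mm-dd hh:mm:ss[.fff...], so the first ten characters are exactly the yyyy-mm-dd string that java.sql.Date.valueOf accepts (a standalone illustration of that suggestion, outside any Hive classes; the sample timestamp is arbitrary):

```java
import java.sql.Date;
import java.sql.Timestamp;

public class DateFromTimestamp {
    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2015-01-28 23:22:05.123");

        // Timestamp.toString() is "2015-01-28 23:22:05.123";
        // the first 10 characters are the yyyy-mm-dd date part.
        String dtStr = ts.toString().substring(0, 10);
        Date dateVal = Date.valueOf(dtStr);

        System.out.println(dateVal); // 2015-01-28
    }
}
```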
[ANNOUNCE] New Hive PMC Members - Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran
I am pleased to announce that Szehon Ho, Vikram Dixit, Jason Dere, Owen O'Malley and Prasanth Jayachandran have been elected to the Hive Project Management Committee. Please join me in congratulating these new PMC members! Thanks. - Carl
[jira] [Created] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
Szehon Ho created HIVE-9503: --- Summary: Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor The 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, only to MR (which has the legacy conditional mapjoin). However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though its description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
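As a rough illustration of what the `.size` flag gates on all three engines: the combined size of every small-table side of the join must fit under the configured byte threshold for the mapjoin conversion to fire. This is a hypothetical sketch of that check, not Hive's actual conversion code:

```java
public class MapJoinSizeCheck {
    // Hypothetical sketch of the gate behind
    // hive.auto.convert.join.noconditionaltask.size: a join is auto converted
    // to a mapjoin only when all small-table sides together fit the threshold.
    public static boolean canConvert(long[] smallTableSizes, long noConditionalTaskSize) {
        long total = 0;
        for (long size : smallTableSizes) {
            total += size;
        }
        return total <= noConditionalTaskSize;
    }

    public static void main(String[] args) {
        // With a 10 MB threshold, two 4 MB tables fit; three do not.
        System.out.println(canConvert(new long[]{4_000_000L, 4_000_000L}, 10_000_000L));
        System.out.println(canConvert(new long[]{4_000_000L, 4_000_000L, 4_000_000L}, 10_000_000L));
    }
}
```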
[jira] [Updated] (HIVE-9188) BloomFilter in ORC row group index
[ https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9188: Attachment: HIVE-9188.5.patch [~owen.omalley] Updated patch creates separate streams for the bloom filter. It only has row group level bloom filters; stripe level bloom filters are dropped. The disk IO is merged while reading the row index. [~gopalv] Addressed all your review comments. Additionally FileDump will aggregate the bloom filters to stripe level and will print the stats. You might want to use {code} hive --orcfiledump --rowindex=column_index_csv_list file_path {code} BloomFilter in ORC row group index -- Key: HIVE-9188 URL: https://issues.apache.org/jira/browse/HIVE-9188 Project: Hive Issue Type: New Feature Components: File Formats Affects Versions: 0.15.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Labels: orcfile Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, HIVE-9188.4.patch, HIVE-9188.5.patch BloomFilters are a well-known probabilistic data structure for set-membership checking. We can use bloom filters in the ORC index for better row group pruning. Currently, the ORC row group index uses min/max statistics to eliminate row groups (and stripes as well) that do not satisfy the predicate condition specified in the query. But in some cases, the efficiency of min/max based elimination is not optimal (unsorted columns with a wide range of entries). Bloom filters can be an effective and efficient alternative for row group/split elimination for point queries or queries with an IN clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
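For readers unfamiliar with the data structure, here is a toy bloom filter showing why it permits safe row-group skipping: a negative answer is definitive, while a positive answer only means "might be present." This is a simplified sketch; ORC's actual filter uses different hashing and sizing.

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    // Toy bloom filter to illustrate row-group pruning; not ORC's implementation.
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public SimpleBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position for a value from its hashCode.
    private int index(Object value, int i) {
        int h = value.hashCode() * 31 + i * 0x9E3779B9;
        return Math.floorMod(h, numBits);
    }

    public void add(Object value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    // false => definitely absent (safe to skip the row group);
    // true  => possibly present (the row group must be read).
    public boolean mightContain(Object value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter bf = new SimpleBloomFilter(1024, 3);
        bf.add("alice");
        bf.add("bob");
        System.out.println(bf.mightContain("alice")); // true -- no false negatives
    }
}
```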
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296112#comment-14296112 ] Szehon Ho commented on HIVE-9503: - Well its not overriding the meaning, the value means the same thing (size of small-tables), except for the clause that it depends on the first property. The name also makes sense as we don't use a conditional task in Spark. So I think having a Spark-only property for size of small-tables in mapjoin might be more confusing, as users will need to set both properties to get the same behavior in different execution engines. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, and only to MR (which has the legacy conditional mapjoin) However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296140#comment-14296140 ] Hive QA commented on HIVE-9431: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695096/HIVE-9431.02.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2557/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2557/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2557/ Messages: {noformat} This message was trimmed, see log for full details As a result, alternative(s) 9 were disabled for that input warning(200): IdentifiersParser.g:526:5: Decision can match input such as {AMPERSAND..BITWISEXOR, DIV..DIVIDE, EQUAL..EQUAL_NS, GREATERTHAN..GREATERTHANOREQUALTO, KW_AND, KW_ARRAY, KW_BETWEEN..KW_BOOLEAN, KW_CASE, KW_DOUBLE, KW_FLOAT, KW_IF, KW_IN, KW_INT, KW_LIKE, KW_MAP, KW_NOT, KW_OR, KW_REGEXP, KW_RLIKE, KW_SMALLINT, KW_STRING..KW_STRUCT, KW_TINYINT, KW_UNIONTYPE, KW_WHEN, LESSTHAN..LESSTHANOREQUALTO, MINUS..NOTEQUAL, PLUS, STAR, TILDE} using multiple alternatives: 1, 3 As a result, alternative(s) 3 were disabled for that input [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-exec --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-exec --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] Copying 2 resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-exec --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-exec --- [INFO] Compiling 2087 source files to /data/hive-ptest/working/apache-svn-trunk-source/ql/target/classes [INFO] - [ERROR] COMPILATION ERROR : [INFO] - [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,1] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,3] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10160,5] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,5] expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,17] ';' expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,23] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10161,24] ';' expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,1] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,3] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,5] illegal start of expression [ERROR] 
/data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10163,7] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,17] ')' expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,23] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10164,24] ';' expected [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,1] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,4] illegal start of expression [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:[10166,7] illegal start of expression [ERROR]
Re: Review Request 30385: Use SASL to establish the remote context connection.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30385/#review70119 --- spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java https://reviews.apache.org/r/30385/#comment115111 Nit: if you need to submit another patch, let's not auto reorg the imports. - Xuefu Zhang On Jan. 28, 2015, 11:22 p.m., Marcelo Vanzin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30385/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive, Brock Noland, chengxiang li, and Xuefu Zhang. Bugs: HIVE-9487 https://issues.apache.org/jira/browse/HIVE-9487 Repository: hive-git Description --- Instead of the insecure, ad-hoc auth mechanism currently used, perform a SASL negotiation to establish trust. This requires the secret to be distributed through some secure channel (just like before). Using SASL with DIGEST-MD5 (or GSSAPI, which hasn't been tested and probably wouldn't work well here) also allows us to add encryption without the need for SSL (yay?). Only DIGEST-MD5 has been really tested. Supporting other mechanisms will probably mean adding new callback handlers in the client and server portions, but shouldn't be hard if desired. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d4d98d7c0c28cdb1d19c700e20537ef405be2e01 spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java ce2f9b6b132dc47f899798e47d18a1f6b0dd707f spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java 3a7149341bac086e5efe931595143d3bebbdb5db spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 5f9be658a855cc15c576f1a98376fcd85475e3b7 spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java 0c29c9441fb3e9daf690510a2c9b5716671e2571 spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md 2c858a121aaeca6af20f5e332de207694348a030 spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java fffe24b3cbe6a5d7387e751adbc65f5b140c9089 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java eff640f7b24348043dbce734510698d9294579c6 spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 5e18a3c0b5ea4f1b9c83f78faa3408e2dd479c2c spark-client/src/main/java/org/apache/hive/spark/client/rpc/SaslHandler.java PRE-CREATION spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java af534375a3ed86a3a9ad57c2f21a9a8bf6113714 spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java ec7842398d3c4112f83f00e8cd3e5d4f9fdf8ca9 Diff: https://reviews.apache.org/r/30385/diff/ Testing --- Unit tests. Thanks, Marcelo Vanzin
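The negotiation Marcelo describes builds on the JDK's `javax.security.sasl` API. A minimal, hypothetical sketch of the client side, feeding the shared secret through a `CallbackHandler` (the protocol name "spark-rsc" and the client id are made-up placeholders, not Hive's actual values):

```java
import java.util.HashMap;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class SaslClientSketch {
    // Create a DIGEST-MD5 SaslClient the way a remote-context client might.
    // The secret must have been distributed over a secure channel beforehand.
    public static SaslClient create(final String clientId, final char[] secret) {
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(clientId);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(secret);
                } else if (cb instanceof RealmCallback) {
                    RealmCallback rc = (RealmCallback) cb;
                    rc.setText(rc.getDefaultText());
                }
            }
        };
        try {
            return Sasl.createSaslClient(new String[]{"DIGEST-MD5"}, null,
                    "spark-rsc", "localhost", new HashMap<String, String>(), handler);
        } catch (SaslException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The handler is invoked later, during `evaluateChallenge`, when the server's DIGEST-MD5 challenge arrives; client creation itself succeeds without any network traffic.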
[jira] [Created] (HIVE-9504) [beeline] ZipException when using !scan
Nick Dimiduk created HIVE-9504: -- Summary: [beeline] ZipException when using !scan Key: HIVE-9504 URL: https://issues.apache.org/jira/browse/HIVE-9504 Project: Hive Issue Type: Bug Components: Beeline Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 0.15.0 Notice this while mucking around: {noformat} 0: jdbc:hive2://localhost:1/ !scan java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.init(ZipFile.java:220) at java.util.zip.ZipFile.init(ZipFile.java:150) at java.util.jar.JarFile.init(JarFile.java:166) at java.util.jar.JarFile.init(JarFile.java:130) at org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579) at org.apache.hive.beeline.Commands.scan(Commands.java:278) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was 
sent by Atlassian JIRA (v6.3.4#6332)
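The stack trace shows `!scan` dying because `ClassNameCompleter` opens every classpath entry as a jar, and one non-zip entry makes `JarFile` throw and abort the whole scan. A sketch of the defensive pattern a fix would plausibly take (illustrative only, not the actual patch):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarFile;

public class SkipBadJars {
    // Try to open a classpath entry as a jar; report failure instead of
    // letting ZipException abort the whole driver scan.
    public static boolean tryOpenJar(Path path) {
        try (JarFile jar = new JarFile(path.toFile())) {
            return true;
        } catch (IOException notAJar) { // ZipException is an IOException
            return false;
        }
    }

    // Helper: produce a file that looks like a jar but is not a zip archive.
    public static Path makeBogusJar() {
        try {
            Path p = Files.createTempFile("not-a-jar", ".jar");
            Files.write(p, "plain text, not a zip archive".getBytes());
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryOpenJar(makeBogusJar())); // false, but no crash
    }
}
```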
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
On Jan. 28, 2015, 8:40 p.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java, line 1404 https://reviews.apache.org/r/29900/diff/2/?file=825431#file825431line1404 I have seen at least one other place where we have a test timestamp getting injected. It might make sense to use some kind of getTimestamp class that can be customized to give a specific timestamp. But this does not have to be addressed in this jira. Jason Dere wrote: Where does this occur? in 'show grants', it shows the timestamp when the grant was made; there is code to return -1 for the timestamp in test mode. On Jan. 28, 2015, 8:40 p.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java, line 32 https://reviews.apache.org/r/29900/diff/2/?file=825432#file825432line32 looks like we can consider this to be deterministic, since the value does not change within a query. Jason Dere wrote: ok, will change. If we ever have a new descriptor to specify deterministic within the same query but different across queries, this would fit the description. It would be needed for stuff like determining the suitability of queries for materialized views. good point about materialized views - Thejas --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/#review70062 --- On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState.
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
[jira] [Updated] (HIVE-9504) [beeline] ZipException when using !scan
[ https://issues.apache.org/jira/browse/HIVE-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-9504: --- Attachment: HIVE-9504.00.patch [beeline] ZipException when using !scan --- Key: HIVE-9504 URL: https://issues.apache.org/jira/browse/HIVE-9504 Project: Hive Issue Type: Bug Components: Beeline Reporter: Nick Dimiduk Assignee: Nick Dimiduk Priority: Minor Fix For: 0.15.0 Attachments: HIVE-9504.00.patch Notice this while mucking around: {noformat} 0: jdbc:hive2://localhost:1/ !scan java.util.zip.ZipException: error in opening zip file at java.util.zip.ZipFile.open(Native Method) at java.util.zip.ZipFile.init(ZipFile.java:220) at java.util.zip.ZipFile.init(ZipFile.java:150) at java.util.jar.JarFile.init(JarFile.java:166) at java.util.jar.JarFile.init(JarFile.java:130) at org.apache.hive.beeline.ClassNameCompleter.getClassNames(ClassNameCompleter.java:128) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1589) at org.apache.hive.beeline.BeeLine.scanDrivers(BeeLine.java:1579) at org.apache.hive.beeline.Commands.scan(Commands.java:278) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:935) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:778) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:740) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:453) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9431: -- Attachment: HIVE-9431.03.patch CBO (Calcite Return Path): Removing AST from ParseContext - Key: HIVE-9431 URL: https://issues.apache.org/jira/browse/HIVE-9431 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.03.patch, HIVE-9431.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296154#comment-14296154 ] Xuefu Zhang commented on HIVE-9503: --- I'm not sure of the difference between backup task and the conditional task that these two properties are referring to, but I don't feel we need a property to control whether to have a backup task. As long as we auto converted a join, we should have a backup task. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, and only to MR (which has the legacy conditional mapjoin) However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9431: -- Status: Open (was: Patch Available) CBO (Calcite Return Path): Removing AST from ParseContext - Key: HIVE-9431 URL: https://issues.apache.org/jira/browse/HIVE-9431 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.03.patch, HIVE-9431.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9431) CBO (Calcite Return Path): Removing AST from ParseContext
[ https://issues.apache.org/jira/browse/HIVE-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9431: -- Status: Patch Available (was: Open) CBO (Calcite Return Path): Removing AST from ParseContext - Key: HIVE-9431 URL: https://issues.apache.org/jira/browse/HIVE-9431 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 0.15.0 Attachments: HIVE-9431.01.patch, HIVE-9431.02.patch, HIVE-9431.03.patch, HIVE-9431.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296175#comment-14296175 ] Chao commented on HIVE-9503: In MR, both conditional task AND backup task are used, but for us, only backup task is needed, since no decision needs to be made (only one mapjoin task). If we always use backup task for auto converted join, it will add overhead to plan compilation, because to generate a backup task we need to clone the whole operator tree. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, and only to MR (which has the legacy conditional mapjoin) However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296193#comment-14296193 ] Thejas M Nair commented on HIVE-8807: - +1 Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Assignee: Eugene Koifman Fix For: 1.0.0 Attachments: HIVE8807.patch The defaults for templeton.pig.path and templeton.hive.path are 0.11 in webhcat-default.xml, but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). no precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HIVE-9487: - Attachment: HIVE-9487.1-spark.patch Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Attachments: HIVE-9487.1-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296057#comment-14296057 ] Szehon Ho commented on HIVE-9468: - One more: udaf_covar_pop.q {noformat} Running: diff -a /home/hiveptest/54.145.215.245-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_covar_pop.q.out /home/hiveptest/54.145.215.245-hiveptest-2/apache-svn-trunk-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_covar_pop.q.out 91c91 3.625 --- 3.624 {noformat} Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport: {code} Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out 162c162 130091.0260.182 256.10355987055016 98.00.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109 --- 130091.0260.182 256.10355987055016 98.00.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9474) truncate table changes permissions on the target
[ https://issues.apache.org/jira/browse/HIVE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9474: Resolution: Fixed Fix Version/s: (was: 0.15.0) 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk, thanks Aihua! truncate table changes permissions on the target Key: HIVE-9474 URL: https://issues.apache.org/jira/browse/HIVE-9474 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 1.2.0 Attachments: HIVE-9474.1.patch, HIVE-9474.2.patch, HIVE-9474.3.patch Original Estimate: 4h Remaining Estimate: 4h Create a table test(a string); hive> create table test(key string); Change the /user/hive/warehouse/test permission to something other than the default, like 777. hive> dfs -chmod 777 /user/hive/warehouse/test; hive> dfs -ls -d /user/hive/warehouse/test; drwxrwxrwx - axu wheel 68 2015-01-26 18:45 /user/hive/warehouse/test Then truncate table test; hive> truncate table test; The permission goes back to the default. hive> dfs -ls -d /user/hive/warehouse/test; drwxr-xr-x - axu wheel 68 2015-01-27 10:09 /user/hive/warehouse/test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
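The idea behind the fix, capture the directory's permissions before the delete-and-recreate and re-apply them afterwards, can be sketched with `java.nio` on a POSIX filesystem. Hive's actual patch works against Hadoop's FileSystem API on HDFS; this is only an analogy:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class TruncatePreservePerms {
    public static void recreateDir(Path dir) throws IOException {
        // 1. capture current permissions before wiping the directory
        Set<PosixFilePermission> saved = Files.getPosixFilePermissions(dir);
        // 2. delete + recreate (stand-in for what truncate does to the table dir)
        Files.delete(dir);          // assumes the directory is already empty
        Files.createDirectory(dir); // comes back with default (umask) permissions
        // 3. restore the saved permissions instead of leaving the defaults
        Files.setPosixFilePermissions(dir, saved);
    }

    public static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("truncate-test");
            Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxrwxrwx"));
            recreateDir(dir);
            return Files.getPosixFilePermissions(dir)
                    .equals(PosixFilePermissions.fromString("rwxrwxrwx"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true: the 777 permissions survive
    }
}
```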
[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296096#comment-14296096 ] Sushanth Sowmyan commented on HIVE-9436: The test failures now reported are down to 3, and they're unconnected to the issue being fixed here, so I will go ahead and commit this. RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code: {code} } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) { caughtException = (MetaException) e.getCause(); } else { throw e.getCause(); } {code} The bug here is that java String.matches matches the entire string against the regex, and thus the match will fail if the message contains anything before or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we should match (?s).\*JDO[a-zA-Z]\*Exception.\* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
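The full-string semantics of `String.matches` that cause this bug can be demonstrated directly (the exception message below is an invented example):

```java
public class JdoRegexMatch {
    static final String OLD_PATTERN = "JDO[a-zA-Z]*Exception";
    static final String FIXED_PATTERN = "(?s).*JDO[a-zA-Z]*Exception.*";

    public static boolean oldMatch(String msg) { return msg.matches(OLD_PATTERN); }
    public static boolean fixedMatch(String msg) { return msg.matches(FIXED_PATTERN); }

    public static void main(String[] args) {
        String msg = "Got a javax.jdo.JDODataStoreException: lock wait timeout";
        // String.matches anchors the pattern to the WHOLE string, so any
        // surrounding text defeats the bare pattern.
        System.out.println(oldMatch(msg));   // false
        // (?s) lets . cross newlines; .* on both sides allows a substring hit.
        System.out.println(fixedMatch(msg)); // true
    }
}
```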
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296098#comment-14296098 ] Xuefu Zhang commented on HIVE-9503: --- If hive.auto.convert.join.noconditionaltask.size is used by spark regardless of hive.auto.convert.join.noconditionaltask, we should probably have a different property. Reusing the same property while overwriting its meaning could cause confusion for either existing users or new users. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, and only to MR (which has the legacy conditional mapjoin) However, 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though the description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296104#comment-14296104 ] Sushanth Sowmyan commented on HIVE-9436: [~vikram.dixit] - if you do respin an rc for 1.0, it would be useful to have this fix in as well - it's a simple fix which fixes retries from the client, and is a robustness fix. It isn't, however, a breaking bug as I see it, since it is used only in the case of connection issues, where we give it a chance to retry instead of failing directly. RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code: {code} } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) { caughtException = (MetaException) e.getCause(); } else { throw e.getCause(); } {code} The bug here is that java String.matches matches the entire string against the regex, and thus the match will fail if the message contains anything before or after JDO[a-zA-Z]\*Exception. The solution, however, is very simple: we should match (?s).\*JDO[a-zA-Z]\*Exception.\* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296105#comment-14296105 ] Xuefu Zhang commented on HIVE-9468: --- Yet another one: udaf_covar_samp.q
{code}
Running: diff -a /home/hiveptest/50.18.32.237-hiveptest-0/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_covar_samp.q.out /home/hiveptest/50.18.32.237-hiveptest-0/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_covar_samp.q.out
91c91
< 4.833
---
> 4.832
{code}
Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0260.182 256.10355987055016 98.00.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0260.182 256.10355987055016 98.00.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
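Diffs in only the low-order digits, like 4.833 vs 4.832 above, are characteristic of floating-point aggregation: double addition is not associative, so combining partial aggregates in a different order (as different engines or plans may do) can change the last digits. A minimal standalone Java sketch with illustrative values, not Hive code:

```java
// Sketch: summing the same doubles in a different order yields a
// different result, because floating-point addition is not associative.
public class FpOrderDemo {
    public static void main(String[] args) {
        double[] v = {0.1, 0.2, 0.3, 1e16, -1e16};
        double fwd = 0, bwd = 0;
        for (int i = 0; i < v.length; i++) fwd += v[i];      // forward order
        for (int i = v.length - 1; i >= 0; i--) bwd += v[i]; // reverse order
        // fwd is 0.0 (the small values are absorbed by 1e16 before it is
        // cancelled); bwd is ~0.6 (cancellation happens first).
        System.out.println(fwd == bwd); // prints false
    }
}
```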
[jira] [Commented] (HIVE-9302) Beeline add commands to register local jdbc driver names and jars
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296117#comment-14296117 ] Ferdinand Xu commented on HIVE-9302: Thanks [~thejas] for your update! Beeline add commands to register local jdbc driver names and jars - Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, HIVE-9302.2.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jdbc driver jars and register custom jdbc driver names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9501) DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification
[ https://issues.apache.org/jira/browse/HIVE-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296133#comment-14296133 ] Hive QA commented on HIVE-9501: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695060/HIVE-9501.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7403 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join38 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2556/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2556/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2556/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695060 - PreCommit-HIVE-TRUNK-Build DbNotificationListener doesn't include dbname in create database notification and does not include tablename in create table notification - Key: HIVE-9501 URL: https://issues.apache.org/jira/browse/HIVE-9501 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-9501.patch This is a hold over from the JMS stuff, where create database is sent on the general topic and create table on the db topic. But since DbNotificationListener isn't for JMS, keeping this semantic doesn't make sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296141#comment-14296141 ] Chao commented on HIVE-9503: For the backup task (HIVE-9103), I'm thinking about reusing hive.auto.convert.join.noconditionaltask to specify whether a backup task is needed. This is slightly misleading, but we can add some description to the property. Thoughts? Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor The 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, only to MR (which has the legacy conditional mapjoin). However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though its description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/#review70128 --- Ship it! Ship It! - Thejas Nair On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
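The design point in the description above — the UDFs read the timestamp once from the SessionState rather than per row — matters because current_timestamp must return the same value for every row of a query. A hypothetical sketch of that pattern (names are illustrative, not the patch's code):

```java
// Sketch only (not Hive source): capture the query-start timestamp once,
// then have every per-row evaluation reuse the captured value, so the
// function is deterministic within a single query.
public class QueryTimestampDemo {
    public static void main(String[] args) {
        final long queryStart = System.currentTimeMillis(); // captured once per query
        long row1 = currentTimestamp(queryStart);
        long row2 = currentTimestamp(queryStart);
        System.out.println(row1 == row2); // prints true: same value for every "row"
    }

    // Per-row evaluation reuses the captured value instead of re-reading the clock.
    static long currentTimestamp(long captured) {
        return captured;
    }
}
```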
[jira] [Updated] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called
[ https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9473: Status: Patch Available (was: Open) sql std auth should disallow built-in udfs that allow any java methods to be called --- Key: HIVE-9473 URL: https://issues.apache.org/jira/browse/HIVE-9473 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-9473.1.patch As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java methods. This should be disallowed when sql standard authorization is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
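For context on why this matters: Hive's built-in reflect()/java_method() UDFs take a class and method name as strings and invoke them via Java reflection, so any reachable static method can be called from SQL. A standalone sketch of the underlying mechanism (invoking a harmless method for illustration):

```java
// Sketch of the reflection mechanism behind UDFs like reflect(): given
// only class/method names as strings, arbitrary Java methods can be
// invoked -- which is why authorization needs to disallow such UDFs.
import java.lang.reflect.Method;

public class ReflectDemo {
    public static void main(String[] args) throws Exception {
        // Equivalent in spirit to: SELECT reflect('java.lang.System',
        // 'getProperty', 'java.version')
        Class<?> cls = Class.forName("java.lang.System");
        Method m = cls.getMethod("getProperty", String.class);
        Object result = m.invoke(null, "java.version");
        System.out.println(result != null); // prints true
    }
}
```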
[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9103: --- Status: Patch Available (was: Open) Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch In MR, a backup task can be executed if the original task, which probably contains certain (join) optimizations, fails. This JIRA is to track this topic for Spark. We need to determine whether we need this and implement it if necessary. This is a followup of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9103) Support backup task for join related optimization [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9103: --- Attachment: HIVE-9103-1.spark.patch Support backup task for join related optimization [Spark Branch] Key: HIVE-9103 URL: https://issues.apache.org/jira/browse/HIVE-9103 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chao Priority: Blocker Attachments: HIVE-9103-1.spark.patch In MR, a backup task can be executed if the original task, which probably contains certain (join) optimizations, fails. This JIRA is to track this topic for Spark. We need to determine whether we need this and implement it if necessary. This is a followup of HIVE-9099. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296168#comment-14296168 ] Thejas M Nair commented on HIVE-5472: - +1 support a simple scalar which returns the current timestamp --- Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch ISO SQL has two forms of these functions, local and current timestamp, where the former is a TIMESTAMP WITHOUT TIME ZONE and the latter is WITH TIME ZONE. select cast ( unix_timestamp() as timestamp ) from T Implement a function which computes LOCAL TIMESTAMP, which would be the current timestamp for the user's session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9503) Update 'hive.auto.convert.join.noconditionaltask.*' descriptions
[ https://issues.apache.org/jira/browse/HIVE-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296178#comment-14296178 ] Xuefu Zhang commented on HIVE-9503: --- Yeah, Hive has long since reached a point where the properties are confusing and sometimes contradictory and duplicative. These two properties plus hive.auto.convert.join are an example. The two properties are meant to be used together. Ignoring one while honoring the other doesn't seem to be a clean solution. While it's already a legacy for MR and Tez, I'd like to have a cleaner solution for Spark since we still have the chance. Update 'hive.auto.convert.join.noconditionaltask.*' descriptions Key: HIVE-9503 URL: https://issues.apache.org/jira/browse/HIVE-9503 Project: Hive Issue Type: Bug Components: Configuration Reporter: Szehon Ho Priority: Minor The 'hive.auto.convert.join.noconditionaltask' flag does not apply to Spark or Tez, only to MR (which has the legacy conditional mapjoin). However, the 'hive.auto.convert.join.noconditionaltask.size' flag does apply to Spark, Tez, and MR, even though its description indicates it only applies if the above flag is on, which is true only for MR. These configs should be updated to reflect this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30388/ --- Review request for hive and Xuefu Zhang. Bugs: HIVE-9103 https://issues.apache.org/jira/browse/HIVE-9103 Repository: hive-git Description --- This patch adds a backup task to the map join task. The backup task, which uses common join, will be triggered in case the map join task fails. Note that, no matter how many map joins there are in the SparkTask, we will only generate one backup task. This means that if the original task fails at the very last map join, the whole task will be re-executed. The handling of the backup task is a little different from what MR does, mostly because we convert JOIN to MAPJOIN during the operator plan optimization phase, at which time no task/work exists yet. In the patch, we clone the whole operator tree before the JOIN operator is converted. The cloned operator tree is then processed to generate a separate work tree for a separate backup SparkTask. Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SparkMapJoinResolver.java 69004dc ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/StageIDsRearranger.java 79c3e02 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkJoinOptimizer.java d57ceff ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 9ff47c7 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkSortMergeJoinFactory.java 6e0ac38 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java b838bff ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 773cfbd ql/src/java/org/apache/hadoop/hive/ql/parse/spark/OptimizeSparkProcContext.java f7586a4 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 3a7477a ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 0e85990 ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a Diff: https://reviews.apache.org/r/30388/diff/ Testing --- auto_join25.q Thanks, Chao Sun
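The fallback flow described above — run the optimized map-join plan first, and on failure re-execute the conservative common-join plan — can be sketched generically (illustrative code, not Hive's Task API):

```java
// Sketch only: the general backup-task pattern. The optimized plan (e.g.
// a map join) is attempted first; if it throws, the backup plan (e.g. a
// common join) is executed instead.
import java.util.concurrent.Callable;

public class BackupTaskDemo {
    static <T> T runWithBackup(Callable<T> optimized, Callable<T> backup) throws Exception {
        try {
            return optimized.call();   // e.g. the map-join SparkTask
        } catch (Exception e) {
            return backup.call();      // e.g. the common-join backup SparkTask
        }
    }

    public static void main(String[] args) throws Exception {
        String result = runWithBackup(
            () -> { throw new RuntimeException("map join failed"); },
            () -> "common join result");
        System.out.println(result);    // prints "common join result"
    }
}
```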
Re: Review Request 29900: HIVE-5472 support a simple scalar which returns the current timestamp
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/#review70130 --- ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java https://reviews.apache.org/r/29900/#comment115137 you can remove = null. Class attributes in java are null by default - Alexander Pivovarov On Jan. 28, 2015, 11:22 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29900/ --- (Updated Jan. 28, 2015, 11:22 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-5472 https://issues.apache.org/jira/browse/HIVE-5472 Repository: hive-git Description --- Add current_date/current_timestamp. The UDFs get the current_date/timestamp from the SessionState. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 66f436b ql/src/java/org/apache/hadoop/hive/ql/Driver.java ef6db3a ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f412010 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g c960a6b ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java c315985 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentDate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentTimestamp.java PRE-CREATION ql/src/test/queries/clientpositive/current_date_timestamp.q PRE-CREATION ql/src/test/results/clientpositive/current_date_timestamp.q.out PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743 Diff: https://reviews.apache.org/r/29900/diff/ Testing --- qfile test added Thanks, Jason Dere
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Attachment: (was: HIVE-9471.1.patch) Bad seek in uncompressed ORC, at row-group boundary. Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.
{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
	...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}
I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9471) Bad seek in uncompressed ORC, at row-group boundary.
[ https://issues.apache.org/jira/browse/HIVE-9471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9471: --- Status: Open (was: Patch Available) Bad seek in uncompressed ORC, at row-group boundary. Key: HIVE-9471 URL: https://issues.apache.org/jira/browse/HIVE-9471 Project: Hive Issue Type: Bug Components: File Formats, Serializers/Deserializers Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9471.1.patch, data.txt, orc_bad_seek_failure_case.hive, orc_bad_seek_setup.hive Under at least one specific condition, using index-filters in ORC causes a bad seek into the ORC row-group.
{code:title=stacktrace}
java.io.IOException: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
	at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
	at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1655)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:305)
	...
Caused by: java.lang.IllegalArgumentException: Seek in Stream for column 2 kind DATA to 0 is outside of the data
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:112)
	at org.apache.hadoop.hive.ql.io.orc.InStream$UncompressedStream.seek(InStream.java:96)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.seek(RunLengthIntegerReaderV2.java:310)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.seek(RecordReaderImpl.java:1596)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.seek(RecordReaderImpl.java:1337)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.seek(RecordReaderImpl.java:1852)
{code}
I'll attach the script to reproduce the problem herewith. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9496) Sl4j warning in hive command
Philippe Kernevez created HIVE-9496: --- Summary: Sl4j warning in hive command Key: HIVE-9496 URL: https://issues.apache.org/jira/browse/HIVE-9496 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.14.0 Environment: HDP 2.2.0 on CentOS. With Horton Sand Box and my own cluster. Reporter: Philippe Kernevez Priority: Minor Each time the 'hive' command is run, we get an SLF4J warning about multiple jars containing SLF4J classes. This bug is similar to HIVE-6162, but doesn't seem to be solved.
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-1084/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-1084/hive/lib/hive-jdbc-0.14.0.2.2.0.0-1084-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6308) COLUMNS_V2 Metastore table not populated for tables created without an explicit column list.
[ https://issues.apache.org/jira/browse/HIVE-6308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14295305#comment-14295305 ] Yongzhi Chen commented on HIVE-6308: Thank you Szehon! This fix treats creating Avro tables without column definitions in Hive the same as creating a table with all column definitions. This fix does not address Avro tables of this kind created before the fix. Tested with the Hive command: analyze table compute statistics for column. COLUMNS_V2 Metastore table not populated for tables created without an explicit column list. Key: HIVE-6308 URL: https://issues.apache.org/jira/browse/HIVE-6308 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.10.0 Reporter: Alexander Behm Assignee: Yongzhi Chen Fix For: 1.2.0 Attachments: HIVE-6308.1.patch Consider this example table: CREATE TABLE avro_test ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED as INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='file:///path/to/the/schema/test_serializer.avsc'); When I try to run an ANALYZE TABLE for computing column stats on any of the columns, I get: org.apache.hadoop.hive.ql.metadata.HiveException: NoSuchObjectException(message:Column o_orderpriority for which stats gathering is requested doesn't exist.)
	at org.apache.hadoop.hive.ql.metadata.Hive.updateTableColumnStatistics(Hive.java:2280)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.persistTableStats(ColumnStatsTask.java:331)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsTask.execute(ColumnStatsTask.java:343)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
The root cause appears to be that the COLUMNS_V2 table in the Metastore isn't populated properly during the table creation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds
On Jan. 28, 2015, 5:23 a.m., cheng xu wrote: ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java, lines 218-225 https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line218 How about the following code snippet?
recordConsumer.startField(fieldName, i);
if (i % 2 == 0) {
  writeValue(keyElement, keyInspector, fieldType);
} else {
  writeValue(valueElement, valueInspector, fieldType);
}
recordConsumer.endField(fieldName, i);
The Parquet API does not accept NULL values inside startField/endField. This is why I had to check whether the key or value is null before starting the field. Or, in the change I did, we check for null values everywhere, and then call startField/endField in writePrimitive. You can see the TestDataWritableWriter.testMapType() method for how null values should work. This is how Parquet adds the map value 'key3 = null':
startGroup();
startField(key, 0);
addString(key3);
endField(key, 0);
endGroup();
On Jan. 28, 2015, 5:23 a.m., cheng xu wrote: ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java, line 76 https://reviews.apache.org/r/30281/diff/2-3/?file=835466#file835466line76 Hi Sergio, I am a little confused about the purpose of pushing startField/endField down. As the method name writeGroupFields indicates, it will write the fields of a group one by one. My suggestion is moving back these two lines. If I missed anything, please tell me your consideration about this change. See the comment regarding the writeMap() method. We can go back to the original implementation to make it look better, but writeMap() will not look very clean. The thing is that we cannot add null values inside startField/endField. - Sergio --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/#review69935 --- On Jan. 27, 2015, 6:47 p.m., Sergio Pena wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/ --- (Updated Jan. 27, 2015, 6:47 p.m.)
Review request for hive, Ryan Blue, cheng xu, and Dong Chen. Bugs: HIVE-9333 https://issues.apache.org/jira/browse/HIVE-9333 Repository: hive-git Description --- This patch moves the ParquetHiveSerDe.serialize() implementation to the DataWritableWriter class in order to save time in materializing data on serialize(). Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java ea4109d358f7c48d1e2042e5da299475de4a0a29 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 060b1b722d32f3b2f88304a1a73eb249e150294b ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c3b0ab43f734f8a211e3e03d5060c75434 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java a693aff18516d133abf0aae4847d3fe00b9f1c96 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java 667d3671547190d363107019cd9a2d105d26d336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 007a665529857bcec612f638a157aa5043562a15 serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java PRE-CREATION Diff: https://reviews.apache.org/r/30281/diff/ Testing --- The tests run were the following: 1. JMH (Java microbenchmark) This benchmark called parquet serialize/write methods using text writable objects.
Class.method                  Before Change (ops/s)   After Change (ops/s)
ParquetHiveSerDe.serialize    19,113                  249,528  - 19x speed increase
DataWritableWriter.write      5,033                   5,201    - 3.34% speed increase
2.
Write 20 million rows (~1GB file) from Text to Parquet I wrote a ~1GB file in Textfile format, then converted it to Parquet format using the following statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text; Time (s) it took to write the whole file BEFORE changes: 93.758 s Time (s) it took to write the whole file AFTER changes: 83.903 s That is about a 10% speed increase. Thanks, Sergio Pena
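The null-handling constraint discussed in the review above — Parquet rejects nulls between startField/endField, so the field must not be started at all for a null key or value — can be sketched with a minimal stand-in consumer. This is an illustration only; LoggingConsumer and writeMapSide are hypothetical names, not Hive or Parquet API:

```java
// Sketch: for a map entry with a null value (the 'key3 = null' case),
// the key side emits startField/add/endField, while the null value side
// emits nothing at all -- the field is simply never started.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullFieldDemo {
    // Minimal stand-in for Parquet's record consumer; just logs calls.
    static class LoggingConsumer {
        final List<String> calls = new ArrayList<>();
        void startField(String name, int index) { calls.add("startField(" + name + ")"); }
        void addString(String v) { calls.add("addString(" + v + ")"); }
        void endField(String name, int index) { calls.add("endField(" + name + ")"); }
    }

    // Write one side (key or value) of a map entry: on null, skip
    // startField/endField entirely instead of emitting a null inside them.
    static void writeMapSide(LoggingConsumer c, String name, int index, String value) {
        if (value == null) {
            return; // null => field is absent; never startField on null
        }
        c.startField(name, index);
        c.addString(value);
        c.endField(name, index);
    }

    public static void main(String[] args) {
        LoggingConsumer c = new LoggingConsumer();
        Map<String, String> m = new LinkedHashMap<>();
        m.put("key3", null);
        for (Map.Entry<String, String> e : m.entrySet()) {
            writeMapSide(c, "key", 0, e.getKey());
            writeMapSide(c, "value", 1, e.getValue());
        }
        // Key side is written; the null value side is skipped entirely.
        System.out.println(c.calls); // prints [startField(key), addString(key3), endField(key)]
    }
}
```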