[jira] [Commented] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268948#comment-14268948 ] Hive QA commented on HIVE-9293: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690718/HIVE-9293.1-spark.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7283 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/618/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/618/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-618/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690718 - PreCommit-HIVE-SPARK-Build > Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch] > --- > > Key: HIVE-9293 > URL: https://issues.apache.org/jira/browse/HIVE-9293 > Project: Hive > Issue Type: Task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Chao >Priority: Minor > Attachments: HIVE-9293.1-spark.patch > > > As we don't have UnionWork anymore, we can simplify the logic to get root > mapworks from the SparkWork. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9289) TODO : Store user name in session [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268928#comment-14268928 ] Chengxiang Li commented on HIVE-9289: - We do reuse the session for queries from the same client, [~xuefuz]. I verified this before and just did so again; it just may not work the way we thought. HiveServer2 introduces a new approach to managing sessions: every RPC call references a session ID which the server then maps to persistent session state. The linear mapping is: Hive Client->SessionHandler(session id inside)\->HiveSessionImpl->SessionState. For Hive on Spark, since we would like to share the singleton SparkContext within a user session, we can extend the mapping to: Hive Client->SessionHandler(session id inside)\-> HiveSessionImpl->SessionState->SparkSession->SparkClient->RemoteDriver->SparkContext, with one exception: create a new SparkSession when the Spark configuration is updated. The Hive Client->SessionHandler(session id inside)->HiveSessionImpl mapping should already determine whether a Hive session is reused, so I don't think we need to check again by user name in SparkSessionManager unless we have reasons other than a Spark configuration update to create a new SparkSession. [~chinnalalam], I only read the related code today and am not sure I fully understand it; do you have any thoughts? > TODO : Store user name in session [Spark Branch] > > > Key: HIVE-9289 > URL: https://issues.apache.org/jira/browse/HIVE-9289 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chinna Rao Lalam >Assignee: Chinna Rao Lalam > Attachments: HIVE-9289.1-spark.patch > > > TODO : this we need to store the session username somewhere else as > getUGIForConf never used the conf SparkSessionManagerImpl.java > /hive-exec/src/java/org/apache/hadoop/hive/ql/exec/spark/session line 145 > Java Task -- This message was sent by Atlassian JIRA (v6.3.4#6332)
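The linear mapping Chengxiang describes can be sketched as a small session manager. All class and method names below are hypothetical stand-ins, not Hive's actual SparkSessionManager API; the point is only the reuse rule: one SparkSession per Hive session, recreated when the Spark configuration changes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Minimal sketch of the reuse rule described above: one SparkSession per
// Hive session, recreated only when the Spark configuration changes.
// All names here are hypothetical, for illustration only.
public class SessionReuseSketch {
    static class SparkSession {
        final Map<String, String> sparkConf;
        SparkSession(Map<String, String> conf) { this.sparkConf = new HashMap<>(conf); }
    }

    private final Map<String, SparkSession> byHiveSessionId = new HashMap<>();

    // Reuse the cached SparkSession for this Hive session id unless the
    // Spark configuration differs from the one it was created with.
    public SparkSession getSession(String hiveSessionId, Map<String, String> sparkConf) {
        SparkSession cached = byHiveSessionId.get(hiveSessionId);
        if (cached != null && Objects.equals(cached.sparkConf, sparkConf)) {
            return cached;  // same Hive session, same conf: reuse
        }
        SparkSession fresh = new SparkSession(sparkConf);  // first call, or conf changed
        byHiveSessionId.put(hiveSessionId, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        SessionReuseSketch mgr = new SessionReuseSketch();
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.executor.memory", "1g");
        SparkSession a = mgr.getSession("s1", conf);
        SparkSession b = mgr.getSession("s1", conf);  // same conf: reused
        conf.put("spark.executor.memory", "2g");
        SparkSession c = mgr.getSession("s1", conf);  // conf changed: recreated
        if (a != b) throw new AssertionError("expected reuse");
        if (a == c) throw new AssertionError("expected new session after conf change");
        System.out.println("ok");
    }
}
```

Under this rule, a user-name check in SparkSessionManager would indeed be redundant, since the Hive session id already scopes the cache.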
[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index
[ https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268907#comment-14268907 ] Gopal V commented on HIVE-9188: --- Left some comments, particularly about the encoding of the bloom filter itself. The List is a bad idea as the 2nd long in the list is actually a double containing the fpp value. Otherwise the patch looks good. I've added it to the build right now, will ETL in a bunch of the NYC taxi data with this and run some point-scan queries. > BloomFilter in ORC row group index > -- > > Key: HIVE-9188 > URL: https://issues.apache.org/jira/browse/HIVE-9188 > Project: Hive > Issue Type: New Feature > Components: File Formats >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: orcfile > Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, > HIVE-9188.4.patch > > > BloomFilters are well known probabilistic data structure for set membership > checking. We can use bloom filters in ORC index for better row group pruning. > Currently, ORC row group index uses min/max statistics to eliminate row > groups (stripes as well) that do not satisfy predicate condition specified in > the query. But in some cases, the efficiency of min/max based elimination is > not optimal (unsorted columns with wide range of entries). Bloom filters can > be an effective and efficient alternative for row group/split elimination for > point queries or queries with IN clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
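The encoding hazard Gopal flags, a double (the fpp) carried inside a stream of longs, comes down to bit reinterpretation. A minimal illustration (the fpp value is an example, and this is not ORC's actual serialization code):

```java
// Illustration of the pitfall above: if a bloom filter's fpp (a double) is
// serialized into a list of longs, its raw bits look like a meaningless long
// and must be reinterpreted with Double.longBitsToDouble on the read side.
public class FppEncodingDemo {
    public static void main(String[] args) {
        double fpp = 0.05;                            // example fpp value
        long encoded = Double.doubleToLongBits(fpp);  // what lands in the long stream
        double decoded = Double.longBitsToDouble(encoded);

        // Treating `encoded` as an ordinary long would be wildly wrong:
        System.out.println("raw long bits: " + encoded);
        if (decoded != 0.05) throw new AssertionError("round-trip failed");
        System.out.println("decoded fpp:   " + decoded);
    }
}
```

A reader that consumes the list uniformly as longs silently misparses the fpp entry, which is why a dedicated field is the cleaner encoding.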
[jira] [Commented] (HIVE-9299) Reuse Configuration in AvroSerdeUtils
[ https://issues.apache.org/jira/browse/HIVE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268903#comment-14268903 ] Hive QA commented on HIVE-9299: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690643/HIVE-9299.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6732 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2285/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2285/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2285/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690643 - PreCommit-HIVE-TRUNK-Build > Reuse Configuration in AvroSerdeUtils > - > > Key: HIVE-9299 > URL: https://issues.apache.org/jira/browse/HIVE-9299 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 0.14.0, 0.13.1, 0.15.0 >Reporter: Nitay Joffe >Assignee: Nitay Joffe > Fix For: 0.15.0 > > Attachments: HIVE-9299.patch > > > I am getting an issue where the original Configuration has some parameters > needed to read the remote Avro schema (specifically S3 keys). > Doing new Configuration doesn't pick it up because the keys are not on the > classpath. > We should reuse the Configuration already present in callers. 
> I'm using Hive/Avro from Spark so it'd be nice if we could put this into Hive > 0.13 since that's what Spark's built against. > See also https://github.com/jghoman/haivvreo/pull/30 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
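The fix being discussed, reusing the caller's Configuration instead of constructing a fresh one, can be sketched with a plain Map standing in for Hadoop's Configuration (method names are illustrative, not AvroSerdeUtils' real signatures):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix discussed above, with a Map standing in for Hadoop's
// Configuration: take the caller's conf as a parameter rather than building
// a fresh one, so runtime-only settings (e.g. S3 keys) remain visible when
// fetching a remote Avro schema. Names are illustrative.
public class ReuseConfSketch {
    // Bad: a fresh conf only sees what is loadable from the classpath.
    static String accessKeyFresh() {
        Map<String, String> fresh = new HashMap<>();  // stands in for `new Configuration()`
        return fresh.get("fs.s3n.awsAccessKeyId");    // null: runtime setting lost
    }

    // Good: reuse the conf the caller already populated.
    static String accessKeyReused(Map<String, String> callerConf) {
        return callerConf.get("fs.s3n.awsAccessKeyId");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3n.awsAccessKeyId", "AKIA...");  // set at runtime, not on classpath
        if (accessKeyFresh() != null) throw new AssertionError("fresh conf should lose the key");
        if (!"AKIA...".equals(accessKeyReused(conf))) throw new AssertionError("reused conf should keep it");
        System.out.println("ok");
    }
}
```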
[jira] [Commented] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268896#comment-14268896 ] Hive QA commented on HIVE-9306: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690716/HIVE-9306.1-spark.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 7283 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/617/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/617/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-617/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690716 - PreCommit-HIVE-SPARK-Build > Let Context.isLocalOnlyExecutionMode() return false if execution engine is > Spark [Spark Branch] > --- > > Key: HIVE-9306 > URL: https://issues.apache.org/jira/browse/HIVE-9306 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-9306.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9305: Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Committed to spark-branch. Thanks Chengxiang! > Set default miniClusterType back to none in QTestUtil.[Spark branch] > > > Key: HIVE-9305 > URL: https://issues.apache.org/jira/browse/HIVE-9305 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li >Priority: Minor > Fix For: spark-branch > > Attachments: HIVE-9305.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4639) Add has null flag to ORC internal index
[ https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268884#comment-14268884 ] Gopal V commented on HIVE-4639: --- Added this patch to my daily TPC-H 1Tb ETL & reloaded lineitem with the new format. Testing {{select * from lineitem where l_shipdate is null;}}. Before: 66.728 seconds (208774320430 bytes read) After: 7.87 seconds (539046900 bytes read) LGTM - +1. > Add has null flag to ORC internal index > --- > > Key: HIVE-4639 > URL: https://issues.apache.org/jira/browse/HIVE-4639 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Owen O'Malley >Assignee: Prasanth Jayachandran > Attachments: HIVE-4639.1.patch, HIVE-4639.2.patch > > > It would enable more predicate pushdown if we added a flag to the index entry > recording if there were any null values in the column for the 10k rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
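The speedup above comes from skipping row groups whose index entries report no nulls. A simplified sketch of that elimination rule for an IS NULL predicate (not ORC's actual index classes):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of row-group elimination for an IS NULL predicate:
// with a hasNull flag in each row group's index entry, groups containing
// no nulls can be skipped without reading their data. Not ORC's real API.
public class HasNullPruningSketch {
    static class RowGroupStats {
        final boolean hasNull;  // the new flag proposed in this issue
        RowGroupStats(boolean hasNull) { this.hasNull = hasNull; }
    }

    // Returns the indexes of row groups that might satisfy "col IS NULL".
    static List<Integer> selectForIsNull(List<RowGroupStats> groups) {
        List<Integer> selected = new ArrayList<>();
        for (int i = 0; i < groups.size(); i++) {
            if (groups.get(i).hasNull) {  // only groups that contain nulls survive
                selected.add(i);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = List.of(
            new RowGroupStats(false), new RowGroupStats(true), new RowGroupStats(false));
        List<Integer> kept = selectForIsNull(groups);
        if (!kept.equals(List.of(1))) throw new AssertionError("pruning broken");
        System.out.println("row groups read: " + kept);
    }
}
```

For a mostly-NOT-NULL column like l_shipdate, nearly every row group fails the check, which matches the drop from ~200 GB read to ~0.5 GB in Gopal's test.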
[jira] [Commented] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268881#comment-14268881 ] Navis commented on HIVE-9038: - Unlike MR, MapJoin in Tez does not produce the filterTag in the previous vertex but still wants to read it. At first it sounded simple, but I couldn't follow all of the complex code in MapJoinTableContainer, ReusableGetAdaptor, KeyValueHelper, KvSource, etc. Maybe [~sershe] can fix this at a glance. > Join tests fail on Tez > -- > > Key: HIVE-9038 > URL: https://issues.apache.org/jira/browse/HIVE-9038 > Project: Hive > Issue Type: Bug > Components: Tests, Tez >Reporter: Ashutosh Chauhan >Assignee: Vikram Dixit K > > Tez doesn't run all tests. But, if you run them, the following tests fail with > run-time exceptions pointing to bugs. > {{auto_join21.q,auto_join29.q,auto_join30.q > ,auto_join_filters.q,auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268870#comment-14268870 ] Gopal V commented on HIVE-8327: --- I tested this now and findbugs adds about 4 mins to the build, but regular "mvn site" takes 40+ mins without this patch because it tries to print out a dependency report and throws up warnings like {code} [WARNING] The repository url 'http://s3.amazonaws.com/maven.springframework.org/milestone' is invalid - Repository 'spring-milestone' will be blacklisted. [WARNING] The repository url 'https://nexus.codehaus.org/content/repositories/snapshots/' is invalid - Repository 'codehaus-nexus-snapshots' will be blacklisted. {code} > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 0.15.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java
[ https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9308: --- Attachment: HIVE-9308-encryption.patch > Refine the logic for the isSub method to support local file in HIVE.java > > > Key: HIVE-9308 > URL: https://issues.apache.org/jira/browse/HIVE-9308 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > Attachments: HIVE-9308-encryption.patch > > > Refine the isSubDir method from Hive to support the local file instead of > using the method from FileUtil which only supports files in hdfs schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java
[ https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9308: --- Status: Patch Available (was: Open) > Refine the logic for the isSub method to support local file in HIVE.java > > > Key: HIVE-9308 > URL: https://issues.apache.org/jira/browse/HIVE-9308 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > Attachments: HIVE-9308-encryption.patch > > > Refine the isSubDir method from Hive to support the local file instead of > using the method from FileUtil which only supports files in hdfs schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java
[ https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-9308: --- Fix Version/s: encryption-branch > Refine the logic for the isSub method to support local file in HIVE.java > > > Key: HIVE-9308 > URL: https://issues.apache.org/jira/browse/HIVE-9308 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > Fix For: encryption-branch > > Attachments: HIVE-9308-encryption.patch > > > Refine the isSubDir method from Hive to support the local file instead of > using the method from FileUtil which only supports files in hdfs schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java
[ https://issues.apache.org/jira/browse/HIVE-9308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-9308: -- Assignee: Ferdinand Xu > Refine the logic for the isSub method to support local file in HIVE.java > > > Key: HIVE-9308 > URL: https://issues.apache.org/jira/browse/HIVE-9308 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Refine the isSubDir method from Hive to support the local file instead of > using the method from FileUtil which only supports files in hdfs schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9308) Refine the logic for the isSub method to support local file in HIVE.java
Ferdinand Xu created HIVE-9308: -- Summary: Refine the logic for the isSub method to support local file in HIVE.java Key: HIVE-9308 URL: https://issues.apache.org/jira/browse/HIVE-9308 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Refine the isSubDir method from Hive to support the local file instead of using the method from FileUtil which only supports files in hdfs schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
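A scheme-agnostic subdirectory check of the kind described might compare normalized path strings with a trailing separator, so that /a/bc is not mistaken for a child of /a/b. This is a sketch under that assumption, not the code in the attached patch:

```java
// Sketch of a subdirectory check that works for local paths as well as
// hdfs:// URIs, by comparing string prefixes with a trailing separator
// appended. Illustrative only; not the code in the attached patch.
public class IsSubDirSketch {
    static boolean isSubDir(String child, String parent) {
        String p = parent.endsWith("/") ? parent : parent + "/";
        String c = child.endsWith("/") ? child : child + "/";
        return c.startsWith(p);
    }

    public static void main(String[] args) {
        // Works for a plain local path...
        if (!isSubDir("/tmp/warehouse/t1", "/tmp/warehouse")) throw new AssertionError();
        // ...avoids the sibling-with-common-prefix trap...
        if (isSubDir("/tmp/warehouse2", "/tmp/warehouse")) throw new AssertionError();
        // ...and for a fully-qualified HDFS URI.
        if (!isSubDir("hdfs://nn:8020/user/hive/warehouse/t1", "hdfs://nn:8020/user/hive/warehouse"))
            throw new AssertionError();
        System.out.println("ok");
    }
}
```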
[jira] [Commented] (HIVE-9290) Make some test results deterministic
[ https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268843#comment-14268843 ] Rui Li commented on HIVE-9290: -- Thanks [~xuefuz] for the explanation! > Make some test results deterministic > > > Key: HIVE-9290 > URL: https://issues.apache.org/jira/browse/HIVE-9290 > Project: Hive > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9290.1.patch > > > {noformat} > limit_pushdown.q > optimize_nullscan.q > ppd_gby_join.q > vector_string_concat.q > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268841#comment-14268841 ] Xuefu Zhang commented on HIVE-9251: --- It should be okay. Limit is still pushed down in the extra stage introduced by order by. > SetSparkReducerParallelism is likely to set too small number of reducers > [Spark Branch] > --- > > Key: HIVE-9251 > URL: https://issues.apache.org/jira/browse/HIVE-9251 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, > HIVE-9251.3-spark.patch > > > This may hurt performance or even lead to task failures. For example, spark's > netty-based shuffle limits the max frame size to be 2G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
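The trade-off discussed in this thread, picking enough reducers that no partition exceeds a per-task ceiling such as the ~2G shuffle frame limit, can be sketched as follows; the constants and method name are illustrative, not SetSparkReducerParallelism's actual logic:

```java
// Sketch of the kind of estimate SetSparkReducerParallelism performs:
// derive a reducer count from the shuffled data size, then raise it so no
// single reducer's partition exceeds a per-task ceiling (e.g. the ~2G
// frame limit mentioned above). All constants are illustrative.
public class ReducerEstimateSketch {
    static int estimateReducers(long totalShuffleBytes, long bytesPerReducer,
                                long maxBytesPerReducer, int maxReducers) {
        // Base estimate: ceil(total / target-bytes-per-reducer), at least 1.
        int n = (int) Math.max(1, (totalShuffleBytes + bytesPerReducer - 1) / bytesPerReducer);
        // Floor: enough reducers that no partition exceeds the hard ceiling.
        int floor = (int) ((totalShuffleBytes + maxBytesPerReducer - 1) / maxBytesPerReducer);
        n = Math.max(n, Math.max(1, floor));
        return Math.min(n, maxReducers);
    }

    public static void main(String[] args) {
        long GB = 1L << 30;
        // 10 GB shuffled at a 1 GB target gives 10 reducers.
        int n = estimateReducers(10 * GB, GB, 2 * GB, 999);
        if (n != 10) throw new AssertionError("got " + n);
        // A 5 GB target would give 2, but the 2 GB ceiling raises it to 5.
        int m = estimateReducers(10 * GB, 5 * GB, 2 * GB, 999);
        if (m != 5) throw new AssertionError("got " + m);
        System.out.println("ok");
    }
}
```

An estimate without the floor step is exactly how a "too small" reducer count can produce partitions over the frame limit and fail tasks.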
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268837#comment-14268837 ] Hive QA commented on HIVE-8327: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690640/HIVE-8327.2.patch {color:green}SUCCESS:{color} +1 6732 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2284/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2284/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2284/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12690640 - PreCommit-HIVE-TRUNK-Build > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 0.15.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9283) Improve encryption related test cases
[ https://issues.apache.org/jira/browse/HIVE-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268836#comment-14268836 ] Dong Chen commented on HIVE-9283: - Good idea! Thanks for your review, [~spena], [~brocknoland]. I filed HIVE-9307 to track this. > Improve encryption related test cases > - > > Key: HIVE-9283 > URL: https://issues.apache.org/jira/browse/HIVE-9283 > Project: Hive > Issue Type: Sub-task >Reporter: Dong Chen >Assignee: Dong Chen > Fix For: encryption-branch > > Attachments: HIVE-9283.patch > > > NO PRECOMMIT TESTS > I found some test case .q files could be improved by: > 1. change the table location from {{/user/hive/warehouse...}} to > {{/build/ql/test/data/warehouse/...}}. > The reason is that the default warehouse dir defined in QTestUtil is the > latter one, and the partial mask is based on it. I think it is better to make > test cases consistent with code. Also the .hive_staging location we want in > .out will be shown then. > 2. add cleanup at the end. > drop the table and delete the key. Otherwise, some cases will fail because they cannot > create an already-existing key. (Put in HIVE-9286) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9290) Make some test results deterministic
[ https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268835#comment-14268835 ] Xuefu Zhang commented on HIVE-9290: --- It should be okay. Even though an extra stage is introduced, I see the limit is still pushed down in the second stage according to the plan. > Make some test results deterministic > > > Key: HIVE-9290 > URL: https://issues.apache.org/jira/browse/HIVE-9290 > Project: Hive > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9290.1.patch > > > {noformat} > limit_pushdown.q > optimize_nullscan.q > ppd_gby_join.q > vector_string_concat.q > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9307) Use MetaStore dir variable from conf instead of hard coded dir in encryption test
Dong Chen created HIVE-9307: --- Summary: Use MetaStore dir variable from conf instead of hard coded dir in encryption test Key: HIVE-9307 URL: https://issues.apache.org/jira/browse/HIVE-9307 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Use the following variable to get the metastore directory $\{hiveconf:hive.metastore.warehouse.dir\} in test cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268833#comment-14268833 ] Hive QA commented on HIVE-9305: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690713/HIVE-9305.1-spark.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7283 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/616/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/616/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-616/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690713 - PreCommit-HIVE-SPARK-Build > Set default miniClusterType back to none in QTestUtil.[Spark branch] > > > Key: HIVE-9305 > URL: https://issues.apache.org/jira/browse/HIVE-9305 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li >Priority: Minor > Attachments: HIVE-9305.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9194) Support select distinct *
[ https://issues.apache.org/jira/browse/HIVE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268820#comment-14268820 ] Pengcheng Xiong commented on HIVE-9194: --- [~jpullokkaran], could you please take a look? It is ready to go. Thanks. > Support select distinct * > - > > Key: HIVE-9194 > URL: https://issues.apache.org/jira/browse/HIVE-9194 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9194.00.patch > > > As per [~jpullokkaran]'s review comments, implement select distinct * -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9293: --- Status: Patch Available (was: Open) > Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch] > --- > > Key: HIVE-9293 > URL: https://issues.apache.org/jira/browse/HIVE-9293 > Project: Hive > Issue Type: Task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Chao >Priority: Minor > Attachments: HIVE-9293.1-spark.patch > > > As we don't have UnionWork anymore, we can simplify the logic to get root > mapworks from the SparkWork. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9293) Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9293: --- Attachment: HIVE-9293.1-spark.patch > Cleanup SparkTask getMapWork to skip UnionWork check [Spark Branch] > --- > > Key: HIVE-9293 > URL: https://issues.apache.org/jira/browse/HIVE-9293 > Project: Hive > Issue Type: Task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Chao >Priority: Minor > Attachments: HIVE-9293.1-spark.patch > > > As we don't have UnionWork anymore, we can simplify the logic to get root > mapworks from the SparkWork. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268805#comment-14268805 ] Szehon Ho commented on HIVE-9305: - +1 > Set default miniClusterType back to none in QTestUtil.[Spark branch] > > > Key: HIVE-9305 > URL: https://issues.apache.org/jira/browse/HIVE-9305 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li >Priority: Minor > Attachments: HIVE-9305.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268803#comment-14268803 ] Szehon Ho commented on HIVE-9306: - +1 > Let Context.isLocalOnlyExecutionMode() return false if execution engine is > Spark [Spark Branch] > --- > > Key: HIVE-9306 > URL: https://issues.apache.org/jira/browse/HIVE-9306 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-9306.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9306: -- Status: Patch Available (was: Open) > Let Context.isLocalOnlyExecutionMode() return false if execution engine is > Spark [Spark Branch] > --- > > Key: HIVE-9306 > URL: https://issues.apache.org/jira/browse/HIVE-9306 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-9306.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9306: -- Attachment: HIVE-9306.1-spark.patch > Let Context.isLocalOnlyExecutionMode() return false if execution engine is > Spark [Spark Branch] > --- > > Key: HIVE-9306 > URL: https://issues.apache.org/jira/browse/HIVE-9306 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-9306.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9194) Support select distinct *
[ https://issues.apache.org/jira/browse/HIVE-9194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268795#comment-14268795 ] Hive QA commented on HIVE-9194: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690634/HIVE-9194.00.patch {color:green}SUCCESS:{color} +1 6734 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2283/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2283/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2283/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12690634 - PreCommit-HIVE-TRUNK-Build > Support select distinct * > - > > Key: HIVE-9194 > URL: https://issues.apache.org/jira/browse/HIVE-9194 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9194.00.patch > > > As per [~jpullokkaran]'s review comments, implement select distinct * -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9306) Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch]
Xuefu Zhang created HIVE-9306: - Summary: Let Context.isLocalOnlyExecutionMode() return false if execution engine is Spark [Spark Branch] Key: HIVE-9306 URL: https://issues.apache.org/jira/browse/HIVE-9306 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9305: Status: Patch Available (was: Open) > Set default miniClusterType back to none in QTestUtil.[Spark branch] > > > Key: HIVE-9305 > URL: https://issues.apache.org/jira/browse/HIVE-9305 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li >Priority: Minor > Attachments: HIVE-9305.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-9305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9305: Attachment: HIVE-9305.1-spark.patch > Set default miniClusterType back to none in QTestUtil.[Spark branch] > > > Key: HIVE-9305 > URL: https://issues.apache.org/jira/browse/HIVE-9305 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chengxiang Li >Assignee: Chengxiang Li >Priority: Minor > Attachments: HIVE-9305.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9305) Set default miniClusterType back to none in QTestUtil.[Spark branch]
Chengxiang Li created HIVE-9305: --- Summary: Set default miniClusterType back to none in QTestUtil.[Spark branch] Key: HIVE-9305 URL: https://issues.apache.org/jira/browse/HIVE-9305 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9259) Fix ClassCastException when CBO is enabled for HOS [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268777#comment-14268777 ] Chao commented on HIVE-9259: After updating the cluster with latest jars, I cannot reproduce this exception anymore. The only exception I found in hiveserver2.log is: {noformat} 2015-01-07 22:01:22,421 ERROR [HiveServer2-Handler-Pool: Thread-29]: server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred during processing of message. java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status 71 at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.thrift.transport.TTransportException: Invalid status 71 at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184) at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ... 4 more {noformat} Not sure if it's related. 
> Fix ClassCastException when CBO is enabled for HOS [Spark Branch] > - > > Key: HIVE-9259 > URL: https://issues.apache.org/jira/browse/HIVE-9259 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Brock Noland >Assignee: Chao > > {noformat} > 2015-01-05 22:10:19,414 ERROR [HiveServer2-Handler-Pool: Thread-33]: > parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(10109)) - CBO > failed, skipping CBO. > java.lang.ClassCastException: > org.apache.hadoop.hive.ql.optimizer.calcite.HiveTypeSystemImpl cannot be cast > to org.eigenbase.reltype.RelDataTypeSystem > at > net.hydromatic.optiq.jdbc.OptiqConnectionImpl.(OptiqConnectionImpl.java:92) > at > net.hydromatic.optiq.jdbc.OptiqJdbc41Factory$OptiqJdbc41Connection.(OptiqJdbc41Factory.java:103) > at > net.hydromatic.optiq.jdbc.OptiqJdbc41Factory.newConnection(OptiqJdbc41Factory.java:49) > at > net.hydromatic.optiq.jdbc.OptiqJdbc41Factory.newConnection(OptiqJdbc41Factory.java:34) > at > net.hydromatic.optiq.jdbc.OptiqFactory.newConnection(OptiqFactory.java:52) > at > net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:135) > at java.sql.DriverManager.getConnection(DriverManager.java:571) > at java.sql.DriverManager.getConnection(DriverManager.java:187) > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:140) > at > org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$CalciteBasedPlanner.getOptimizedAST(SemanticAnalyzer.java:12560) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer$CalciteBasedPlanner.access$400(SemanticAnalyzer.java:12540) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10070) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > 
at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108) > at > org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1102) > at > org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101) > at > org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172) > at > org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) > at > org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:388) > at > org.apache.hive.service.
[jira] [Commented] (HIVE-9242) Many places in CBO code eat exceptions
[ https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268776#comment-14268776 ] Brock Noland commented on HIVE-9242: Thx Navis, +1 > Many places in CBO code eat exceptions > -- > > Key: HIVE-9242 > URL: https://issues.apache.org/jira/browse/HIVE-9242 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Priority: Blocker > Attachments: HIVE-9242.1.patch.txt > > > I've noticed that there are a number of places in the CBO code which eat > exceptions. This is not acceptable. Example: > https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9242) Many places in CBO code eat exceptions
[ https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9242: Status: Patch Available (was: Open) > Many places in CBO code eat exceptions > -- > > Key: HIVE-9242 > URL: https://issues.apache.org/jira/browse/HIVE-9242 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Priority: Blocker > Attachments: HIVE-9242.1.patch.txt > > > I've noticed that there are a number of places in the CBO code which eat > exceptions. This is not acceptable. Example: > https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9242) Many places in CBO code eat exceptions
[ https://issues.apache.org/jira/browse/HIVE-9242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9242: Attachment: HIVE-9242.1.patch.txt > Many places in CBO code eat exceptions > -- > > Key: HIVE-9242 > URL: https://issues.apache.org/jira/browse/HIVE-9242 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Priority: Blocker > Attachments: HIVE-9242.1.patch.txt > > > I've noticed that there are a number of places in the CBO code which eat > exceptions. This is not acceptable. Example: > https://github.com/apache/hive/blob/357b473a354aace3bd59b522ad7108be561e9d0f/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L274 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9272) Tests for utf-8 support
[ https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aswathy Chellammal Sreekumar updated HIVE-9272: --- Attachment: HIVE-9272.1.patch > Tests for utf-8 support > --- > > Key: HIVE-9272 > URL: https://issues.apache.org/jira/browse/HIVE-9272 > Project: Hive > Issue Type: Test > Components: Tests, WebHCat >Reporter: Aswathy Chellammal Sreekumar >Priority: Minor > Attachments: HIVE-9272.1.patch, HIVE-9272.patch > > > Including some test cases for utf8 support in webhcat. The first four tests > invoke hive, pig, mapred and streaming apis for testing the utf8 support for > data processed, file names and job name. The last test case tests the > filtering of job name with utf8 character -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9272) Tests for utf-8 support
[ https://issues.apache.org/jira/browse/HIVE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268764#comment-14268764 ] Aswathy Chellammal Sreekumar commented on HIVE-9272: Please find the updated patch for utf-8 tests for review. > Tests for utf-8 support > --- > > Key: HIVE-9272 > URL: https://issues.apache.org/jira/browse/HIVE-9272 > Project: Hive > Issue Type: Test > Components: Tests, WebHCat >Reporter: Aswathy Chellammal Sreekumar >Priority: Minor > Attachments: HIVE-9272.patch > > > Including some test cases for utf8 support in webhcat. The first four tests > invoke hive, pig, mapred and streaming apis for testing the utf8 support for > data processed, file names and job name. The last test case tests the > filtering of job name with utf8 character -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-9302: -- Assignee: Ferdinand Xu > Beeline add jar local to client > --- > > Key: HIVE-9302 > URL: https://issues.apache.org/jira/browse/HIVE-9302 > Project: Hive > Issue Type: New Feature >Reporter: Brock Noland >Assignee: Ferdinand Xu > > At present if a beeline user uses {{add jar}} the path they give is actually > on the HS2 server. It'd be great to allow beeline users to add local jars as > well. > It might be useful to do this in the jdbc driver itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9290) Make some test results deterministic
[ https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9290: - Status: Patch Available (was: Open) > Make some test results deterministic > > > Key: HIVE-9290 > URL: https://issues.apache.org/jira/browse/HIVE-9290 > Project: Hive > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9290.1.patch > > > {noformat} > limit_pushdown.q > optimize_nullscan.q > ppd_gby_join.q > vector_string_concat.q > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9290) Make some test results deterministic
[ https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9290: - Attachment: HIVE-9290.1.patch Not sure if it's correct to make limit_pushdown.q deterministic. cc [~xuefuz] > Make some test results deterministic > > > Key: HIVE-9290 > URL: https://issues.apache.org/jira/browse/HIVE-9290 > Project: Hive > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9290.1.patch > > > {noformat} > limit_pushdown.q > optimize_nullscan.q > ppd_gby_join.q > vector_string_concat.q > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268728#comment-14268728 ] Rui Li commented on HIVE-9251: -- [~xuefuz] - you're right. I think we should fix HIVE-9290 first and merge it to spark. One thing I'm not sure about is limit_pushdown.q. To make it deterministic, I have to add order by to the query. Will that somehow make the limit pushdown not work? > SetSparkReducerParallelism is likely to set too small number of reducers > [Spark Branch] > --- > > Key: HIVE-9251 > URL: https://issues.apache.org/jira/browse/HIVE-9251 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, > HIVE-9251.3-spark.patch > > > This may hurt performance or even lead to task failures. For example, spark's > netty-based shuffle limits the max frame size to 2G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9281: Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Committed to spark. Thanks Xuefu for the heavy reading. > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Fix For: spark-branch > > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to clean up the code. > This includes removing TODOs, fixing checkstyle violations, removing commented-out or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268689#comment-14268689 ] Hive QA commented on HIVE-9281: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690647/HIVE-9281.2-spark.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7283 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/615/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/615/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-615/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690647 - PreCommit-HIVE-SPARK-Build > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to clean up the code. > This includes removing TODOs, fixing checkstyle violations, removing commented-out or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9304) [Refactor] remove unused method in SemAly
[ https://issues.apache.org/jira/browse/HIVE-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9304: --- Attachment: HIVE-9304.patch > [Refactor] remove unused method in SemAly > - > > Key: HIVE-9304 > URL: https://issues.apache.org/jira/browse/HIVE-9304 > Project: Hive > Issue Type: Task > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-9304.patch > > > Seems like method {{genConversionOps}} doesn't serve any purpose any longer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9304) [Refactor] remove unused method in SemAly
[ https://issues.apache.org/jira/browse/HIVE-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9304: --- Status: Patch Available (was: Open) > [Refactor] remove unused method in SemAly > - > > Key: HIVE-9304 > URL: https://issues.apache.org/jira/browse/HIVE-9304 > Project: Hive > Issue Type: Task > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-9304.patch > > > Seems like method {{genConversionOps}} doesn't serve any purpose any longer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9304) [Refactor] remove unused method in SemAly
Ashutosh Chauhan created HIVE-9304: -- Summary: [Refactor] remove unused method in SemAly Key: HIVE-9304 URL: https://issues.apache.org/jira/browse/HIVE-9304 Project: Hive Issue Type: Task Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Seems like method {{genConversionOps}} doesn't serve any purpose any longer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268680#comment-14268680 ] Ashutosh Chauhan commented on HIVE-9278: Aah.. I see. +1 > Cached expression feature broken in one case > > > Key: HIVE-9278 > URL: https://issues.apache.org/jira/browse/HIVE-9278 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Matt McCline >Assignee: Navis >Priority: Blocker > Attachments: HIVE-9278.1.patch.txt > > > Different query result depending on whether hive.cache.expr.evaluation is > true or false. When true, no query results are produced (this is wrong). > The q file: > {noformat} > set hive.cache.expr.evaluation=true; > CREATE TABLE cache_expr_repro (date_str STRING); > LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE > cache_expr_repro; > SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) > AS `quarter`, YEAR(date_str) AS `year` FROM cache_expr_repro WHERE > ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = > 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), > YEAR(date_str) ; > {noformat} > cache_expr_repro.txt > {noformat} > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4639) Add has null flag to ORC internal index
[ https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-4639: Attachment: HIVE-4639.2.patch Fixes test failures. All of them are file size diffs. > Add has null flag to ORC internal index > --- > > Key: HIVE-4639 > URL: https://issues.apache.org/jira/browse/HIVE-4639 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Owen O'Malley >Assignee: Prasanth Jayachandran > Attachments: HIVE-4639.1.patch, HIVE-4639.2.patch > > > It would enable more predicate pushdown if we added a flag to the index entry > recording if there were any null values in the column for the 10k rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
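The predicate-pushdown motivation behind HIVE-4639 can be made concrete with a small sketch. The Python below is illustrative only — the class and field names are invented stand-ins, not ORC's actual row-index structures: with only min/max statistics a reader can never skip a row group for an `IS NULL` predicate, but a has-null flag makes that decision possible.

```python
# Hedged sketch of why a per-column "has null" flag in a row-group index
# enables more predicate pushdown. Names are illustrative, not ORC's.
from dataclasses import dataclass

@dataclass
class ColumnIndexEntry:
    min_val: int
    max_val: int
    has_null: bool   # the flag proposed in this issue

def can_skip(entry, predicate):
    """Return True when no row in the group can satisfy the predicate."""
    if predicate == "IS NULL":
        # Without the flag this decision is impossible: min/max say nothing
        # about nulls, so every group must be read.
        return not entry.has_null
    if predicate == "IS NOT NULL":
        # Conservative: min/max plus a boolean flag cannot prove all-null.
        return False
    op, value = predicate
    if op == "=":
        # NULLs never match '=', so min/max alone decide.
        return value < entry.min_val or value > entry.max_val
    raise ValueError("unsupported predicate")

g = ColumnIndexEntry(min_val=10, max_val=20, has_null=False)
print(can_skip(g, "IS NULL"))      # True: flag proves no nulls, group skipped
print(can_skip(g, ("=", 5)))       # True: 5 is outside [10, 20]
print(can_skip(g, ("=", 15)))      # False: must read the group
```

In this model the flag pays off exactly for null-sensitive predicates, which matches the issue description: min/max pruning already worked, and the flag extends it to `IS NULL` filters.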
[jira] [Updated] (HIVE-9290) Make some test results deterministic
[ https://issues.apache.org/jira/browse/HIVE-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9290: - Description: {noformat} limit_pushdown.q optimize_nullscan.q ppd_gby_join.q vector_string_concat.q {noformat} was: {noformat} limit_pushdown.q ppd_gby_join.q vector_string_concat.q {noformat} > Make some test results deterministic > > > Key: HIVE-9290 > URL: https://issues.apache.org/jira/browse/HIVE-9290 > Project: Hive > Issue Type: Test >Reporter: Rui Li >Assignee: Rui Li > > {noformat} > limit_pushdown.q > optimize_nullscan.q > ppd_gby_join.q > vector_string_concat.q > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9303) Parquet files are written with incorrect definition levels
Skye Wanderman-Milne created HIVE-9303: -- Summary: Parquet files are written with incorrect definition levels Key: HIVE-9303 URL: https://issues.apache.org/jira/browse/HIVE-9303 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Skye Wanderman-Milne The definition level, which determines which level of nesting is NULL, appears to always be n or n-1, where n is the maximum definition level. This means that only the innermost level of nesting can be NULL. This is only relevant for Parquet files. For example: {code:sql} CREATE TABLE text_tbl (a STRUCT<b:STRUCT<c:INT>>) STORED AS TEXTFILE; INSERT OVERWRITE TABLE text_tbl SELECT IF(false, named_struct("b", named_struct("c", 1)), NULL) FROM tbl LIMIT 1; CREATE TABLE parq_tbl STORED AS PARQUET AS SELECT * FROM text_tbl; SELECT * FROM text_tbl; => NULL # right SELECT * FROM parq_tbl; => {"b":{"c":null}} # wrong {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
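The definition-level behavior described in HIVE-9303 can be sketched concretely. The Python below is an illustrative model of Dremel-style definition levels (not parquet-mr code): for a column path a.b.c where each level is optional, the definition level counts how many optional ancestors are non-null, which is what distinguishes "a is NULL" from "a.b.c is NULL".

```python
# Illustrative model of Parquet/Dremel definition levels for column a.b.c,
# all three levels optional (max definition level = 3).
def definition_level(record):
    if record is None:
        return 0                     # a itself is NULL
    if record.get("b") is None:
        return 1                     # a.b is NULL
    if record["b"].get("c") is None:
        return 2                     # a.b.c is NULL
    return 3                         # value fully present

print(definition_level(None))                 # 0: outer struct is NULL
print(definition_level({"b": {"c": None}}))   # 2: only innermost is NULL
print(definition_level({"b": {"c": 1}}))      # 3: value present

# The bug described above always writes n or n-1 (here 3 or 2), so a record
# whose outer struct is NULL is encoded as level 2, and readers reconstruct
# {"b": {"c": null}} instead of NULL -- exactly the wrong result shown in
# the SELECT example.
```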
[jira] [Commented] (HIVE-9224) CBO (Calcite Return Path): Inline Table, Properties
[ https://issues.apache.org/jira/browse/HIVE-9224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268676#comment-14268676 ] Hive QA commented on HIVE-9224: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12690584/HIVE-9224.2.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6732 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_qualified_types org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_acid org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_main org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2281/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2281/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2281/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12690584 - PreCommit-HIVE-TRUNK-Build > CBO (Calcite Return Path): Inline Table, Properties > --- > > Key: HIVE-9224 > URL: https://issues.apache.org/jira/browse/HIVE-9224 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Laljo John Pullokkaran >Assignee: Laljo John Pullokkaran > Fix For: 0.15.0 > > Attachments: HIVE-9224.1.patch, HIVE-9224.2.patch, HIVE-9224.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268669#comment-14268669 ] Navis commented on HIVE-9278: - Should be included in hive-0.14.1 > Cached expression feature broken in one case > > > Key: HIVE-9278 > URL: https://issues.apache.org/jira/browse/HIVE-9278 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Matt McCline >Assignee: Navis >Priority: Blocker > Attachments: HIVE-9278.1.patch.txt > > > Different query result depending on whether hive.cache.expr.evaluation is > true or false. When true, no query results are produced (this is wrong). > The q file: > {noformat} > set hive.cache.expr.evaluation=true; > CREATE TABLE cache_expr_repro (date_str STRING); > LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE > cache_expr_repro; > SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) > AS `quarter`, YEAR(date_str) AS `year` FROM cache_expr_repro WHERE > ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = > 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), > YEAR(date_str) ; > {noformat} > cache_expr_repro.txt > {noformat} > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9278: Priority: Blocker (was: Critical) > Cached expression feature broken in one case > > > Key: HIVE-9278 > URL: https://issues.apache.org/jira/browse/HIVE-9278 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Matt McCline >Assignee: Navis >Priority: Blocker > Attachments: HIVE-9278.1.patch.txt > > > Different query result depending on whether hive.cache.expr.evaluation is > true or false. When true, no query results are produced (this is wrong). > The q file: > {noformat} > set hive.cache.expr.evaluation=true; > CREATE TABLE cache_expr_repro (date_str STRING); > LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE > cache_expr_repro; > SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) > AS `quarter`, YEAR(date_str) AS `year` FROM cache_expr_repro WHERE > ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = > 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), > YEAR(date_str) ; > {noformat} > cache_expr_repro.txt > {noformat} > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7550) Extend cached evaluation to multiple expressions
[ https://issues.apache.org/jira/browse/HIVE-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268664#comment-14268664 ] Navis commented on HIVE-7550: - Sure, but HIVE-9278 should be included first. > Extend cached evaluation to multiple expressions > > > Key: HIVE-7550 > URL: https://issues.apache.org/jira/browse/HIVE-7550 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Trivial > Attachments: HIVE-7550.1.patch.txt > > > Currently, hive.cache.expr.evaluation caches per expression. But cache > context might be shared for multiple expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268662#comment-14268662 ] Navis commented on HIVE-9278: - [~ashutoshc] In caching, identity was checked by comparing toString(). But for UDFs (not GenericUDF), toString() always returns the same class name (org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge), making cache entries shared between different expressions. In the test case, "length(key)" and "reverse(key)" are both UDFs, so length(key) always evaluated to the same cached value as reverse(key). Now identity is checked correctly against the ExprNodeDesc itself (via its isSame() method). > Cached expression feature broken in one case > > > Key: HIVE-9278 > URL: https://issues.apache.org/jira/browse/HIVE-9278 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 >Reporter: Matt McCline >Assignee: Navis >Priority: Critical > Attachments: HIVE-9278.1.patch.txt > > > Different query result depending on whether hive.cache.expr.evaluation is > true or false. When true, no query results are produced (this is wrong). > The q file: > {noformat} > set hive.cache.expr.evaluation=true; > CREATE TABLE cache_expr_repro (date_str STRING); > LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE > cache_expr_repro; > SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) > AS `quarter`, YEAR(date_str) AS `year` FROM cache_expr_repro WHERE > ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = > 2015)) GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), > YEAR(date_str) ; > {noformat} > cache_expr_repro.txt > {noformat} > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > 2015-02-01 00:00:00 > 2015-02-01 00:00:00 > 2015-01-01 00:00:00 > 2015-01-01 00:00:00 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
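The collision Navis describes above can be sketched in a few lines. This is an illustrative model only — the class and method names are invented stand-ins, not Hive's actual implementation: keying the expression cache on toString() merges any two bridged UDFs into one entry, while a structural isSame()-style comparison keeps them apart.

```python
# Illustrative model of the HIVE-9278 bug: an expression cache keyed by
# str(expr) collides when every bridged UDF renders to the same class name.

class BridgedUDF:
    """Stand-in for a pre-GenericUDF function wrapped by a bridge class."""
    def __init__(self, fn_name, fn):
        self.fn_name = fn_name
        self.fn = fn

    def __str__(self):
        # Every bridged UDF prints the bridge's class name, not its own.
        return "org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge"

    def is_same(self, other):
        # Structural identity, analogous to the ExprNodeDesc.isSame() fix.
        return self.fn_name == other.fn_name

length_udf = BridgedUDF("length", len)
reverse_udf = BridgedUDF("reverse", lambda s: s[::-1])

# Buggy cache: keyed by str(expr), so both UDFs share one slot.
cache = {}
a = cache.setdefault(str(length_udf), length_udf.fn("abc"))
b = cache.setdefault(str(reverse_udf), reverse_udf.fn("abc"))
print(a, b)   # both come back as the length result, 3 -- reverse is lost

# Fixed approach: compare expressions structurally before sharing an entry.
print(length_udf.is_same(reverse_udf))   # False: no bogus sharing
```

This mirrors the reported symptom: with the string-keyed cache, reverse(key) silently returns the cached value of length(key), so predicates built on the second expression evaluate against the wrong data.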
[jira] [Updated] (HIVE-9296) Need to add schema upgrade changes for queueing events in the database
[ https://issues.apache.org/jira/browse/HIVE-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-9296: - Fix Version/s: 0.15.0 Status: Patch Available (was: Open) > Need to add schema upgrade changes for queueing events in the database > -- > > Key: HIVE-9296 > URL: https://issues.apache.org/jira/browse/HIVE-9296 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 0.15.0 > > Attachments: HIVE-9296.patch > > > HIVE-9174 added the ability to queue notification events in the database, but > did not include the schema upgrade scripts. > Also, in the thrift changes the convention was not followed properly in > naming the thrift methods. HIVE-9174 used camel case, whereas the thrift > methods use all lower case separated by underscores. > Both of these issues should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns
[ https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4790: Attachment: HIVE-4790.13.patch.txt > MapredLocalTask task does not make virtual columns > -- > > Key: HIVE-4790 > URL: https://issues.apache.org/jira/browse/HIVE-4790 > Project: Hive > Issue Type: Bug > Components: Query Processor >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.10.patch.txt, > HIVE-4790.11.patch.txt, HIVE-4790.12.patch.txt, HIVE-4790.13.patch.txt, > HIVE-4790.5.patch.txt, HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, > HIVE-4790.8.patch.txt, HIVE-4790.9.patch.txt, HIVE-4790.D11511.1.patch, > HIVE-4790.D11511.2.patch > > > From mailing list, > http://www.mail-archive.com/user@hive.apache.org/msg08264.html > {noformat} > SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON > b.rownumber = a.number; > fails with this error: > > > SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = > a.number; > Automatically selecting local only mode for query > Total MapReduce jobs = 1 > setting HADOOP_USER_NAMEpmarron > 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property > hive.metastore.local no longer has any effect. Make sure to provide a valid > value for hive.metastore.uris if you are connecting to a remote metastore. 
> Execution log at: /tmp/pmarron/.log > 2013-06-25 10:52:56 Starting to launch local task to process map join; > maximum memory = 932118528 > java.lang.RuntimeException: cannot find field block__offset__inside__file > from [0:rownumber, 1:offset] > at > org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366) > at > org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168) > at > org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74) > at > org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57) > at > org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68) > at > org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222) > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451) > at > org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186) > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) > at > org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394) > at > org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) > at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at 
org.apache.hadoop.util.RunJar.main(RunJar.java:156) > Execution failed with exit status: 2 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9296) Need to add schema upgrade changes for queueing events in the database
[ https://issues.apache.org/jira/browse/HIVE-9296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-9296: - Attachment: HIVE-9296.patch This patch fixes the thrift function name issues and adds the two new tables to the metastore creation and upgrade scripts. > Need to add schema upgrade changes for queueing events in the database > -- > > Key: HIVE-9296 > URL: https://issues.apache.org/jira/browse/HIVE-9296 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 0.15.0 > > Attachments: HIVE-9296.patch > > > HIVE-9174 added the ability to queue notification events in the database, but > did not include the schema upgrade scripts. > Also, in the thrift changes the convention was not followed properly in > naming the thrift methods. HIVE-9174 used camel case, where the thrift > methods use all lower case separated by underscores. > Both of these issues should be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9299) Reuse Configuration in AvroSerdeUtils
[ https://issues.apache.org/jira/browse/HIVE-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268646#comment-14268646 ] Ashutosh Chauhan commented on HIVE-9299: +1 > Reuse Configuration in AvroSerdeUtils > - > > Key: HIVE-9299 > URL: https://issues.apache.org/jira/browse/HIVE-9299 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 0.14.0, 0.13.1, 0.15.0 >Reporter: Nitay Joffe >Assignee: Nitay Joffe > Fix For: 0.15.0 > > Attachments: HIVE-9299.patch > > > I am getting an issue where the original Configuration has some parameters > needed to read the remote Avro schema (specifically S3 keys). > Doing new Configuration doesn't pick it up because the keys are not on the > classpath. > We should reuse the Configuration already present in callers. > I'm using Hive/Avro from Spark so it'd be nice if we could put this into Hive > 0.13 since that's what Spark's built against. > See also https://github.com/jghoman/haivvreo/pull/30 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8327) mvn site -Pfindbugs
[ https://issues.apache.org/jira/browse/HIVE-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268637#comment-14268637 ] Szehon Ho commented on HIVE-8327: - Sorry, I don't have the env right now to give it a try. If it doesn't increase build time too much, it should be possible for HiveQA to run it before and after, and parse the summary report to do a comparison. Or alternatively, it would be much simpler if HiveQA were to just print out the summary report. > mvn site -Pfindbugs > --- > > Key: HIVE-8327 > URL: https://issues.apache.org/jira/browse/HIVE-8327 > Project: Hive > Issue Type: Test > Components: Diagnosability >Reporter: Gopal V >Assignee: Gopal V > Fix For: 0.15.0 > > Attachments: HIVE-8327.1.patch, HIVE-8327.2.patch, ql-findbugs.html > > > HIVE-3099 originally added findbugs into the old ant build. > Get basic findbugs working for the maven build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index
[ https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268632#comment-14268632 ] Prasanth Jayachandran commented on HIVE-9188: - [~owen.omalley] Current patch has bloom filters at all 3 levels. The size is kept constant for all 3 levels. But fpp for stripe will be >0.05 (assuming >10k unique items) and for file it will be much worse. With this we will get good row group elimination and considerably good stripe elimination. I can drop the file level bloom filter which we don't use for any purpose. The merging of disk ranges happens after we pick the row groups that satisfy the SARG (readPartialDataStreams() happens after pickRowGroups()). But we need bloom filter before that for eliminating row groups. > BloomFilter in ORC row group index > -- > > Key: HIVE-9188 > URL: https://issues.apache.org/jira/browse/HIVE-9188 > Project: Hive > Issue Type: New Feature > Components: File Formats >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: orcfile > Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, > HIVE-9188.4.patch > > > BloomFilters are well known probabilistic data structure for set membership > checking. We can use bloom filters in ORC index for better row group pruning. > Currently, ORC row group index uses min/max statistics to eliminate row > groups (stripes as well) that do not satisfy predicate condition specified in > the query. But in some cases, the efficiency of min/max based elimination is > not optimal (unsorted columns with wide range of entries). Bloom filters can > be an effective and efficient alternative for row group/split elimination for > point queries or queries with IN clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
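Prasanth's point that fpp degrades at the stripe and file level follows from the standard bloom filter false-positive formula, fpp = (1 - e^(-kn/m))^k: a filter sized for one row group's worth of distinct values saturates when asked to cover a whole stripe. A rough illustration, using hypothetical sizes rather than ORC's actual defaults:

```java
public class BloomFppDemo {
    // Standard bloom filter false-positive rate for m bits, k hash
    // functions, and n inserted items: (1 - e^(-k*n/m))^k.
    static double fpp(long m, int k, long n) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        // Size the filter for a 10k-item row group at ~5% fpp using
        // the optimal-sizing formulas:
        //   m = -n*ln(p) / (ln 2)^2,  k = (m/n) * ln 2.
        long n = 10_000;
        double p = 0.05;
        long m = (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
        int k = (int) Math.round((double) m / n * Math.log(2));

        double rowGroup = fpp(m, k, 10_000);   // what it was sized for
        double stripe   = fpp(m, k, 100_000);  // ten row groups' uniques
        System.out.printf("row group fpp ~ %.3f, stripe fpp ~ %.3f%n",
                          rowGroup, stripe);
        // The same bits covering 10x the unique items give almost no
        // elimination, matching the "fpp will be much worse" observation.
    }
}
```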
[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268625#comment-14268625 ] Alexander Behm commented on HIVE-4022: -- Easier workaround: IF(false, named_struct("a", 1), NULL) > Structs and struct fields cannot be NULL in INSERT statements > - > > Key: HIVE-4022 > URL: https://issues.apache.org/jira/browse/HIVE-4022 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Michael Malak > > Originally thought to be Avro-specific, and first noted with respect to > HIVE-3528 "Avro SerDe doesn't handle serializing Nullable types that require > access to a Schema", it turns out even native Hive tables cannot store NULL > in a STRUCT field or for the entire STRUCT itself, at least when the NULL is > specified directly in the INSERT statement. > Again, this affects both Avro-backed tables and native Hive tables. > ***For native Hive tables: > The following: > echo 1,2 >twovalues.csv > hive > CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; > LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc; > CREATE TABLE oc (z STRUCT); > INSERT INTO TABLE oc SELECT null FROM tc; > produces the error > FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target > table because column number/types are different 'oc': Cannot convert column 0 > from void to struct. > The following: > INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc; > produces the error: > FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target > table because column number/types are different 'oc': Cannot convert column 0 > from struct to struct. 
> ***For Avro: > In HIVE-3528, there is in fact a null-struct test case in line 14 of > https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt > The test script at > https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q > does indeed work. But in that test, the query gets all of its data from a > test table verbatim: > INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer; > If instead we stick in a hard-coded null for the struct directly into the > query, it fails: > INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, > bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, > bytes1, fixed1 FROM test_serializer; > with the following error: > FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target > table because column number/types are different 'as_avro': Cannot convert > column 10 from void to struct. > Note, though, that substituting a hard-coded null for string1 (and restoring > struct1 into the query) does work: > INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, > bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, > bytes1, fixed1 FROM test_serializer; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Attachment: HIVE-9039.10.patch > Support Union Distinct > -- > > Key: HIVE-9039 > URL: https://issues.apache.org/jira/browse/HIVE-9039 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, > HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, > HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, > HIVE-9039.09.patch, HIVE-9039.10.patch > > > Current version (Hive 0.14) does not support union (or union distinct). It > only supports union all. In this patch, we try to add this new feature by > rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Patch Available (was: Open) > Support Union Distinct > -- > > Key: HIVE-9039 > URL: https://issues.apache.org/jira/browse/HIVE-9039 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, > HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, > HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, > HIVE-9039.09.patch, HIVE-9039.10.patch > > > Current version (Hive 0.14) does not support union (or union distinct). It > only supports union all. In this patch, we try to add this new feature by > rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Attachment: HIVE-9039.10.patch After discussing with [~jpullokkaran], we are going to separate the union distinct with union order by/limit bug. This patch is for union distinct implementation only and should be committed after select distinct * > Support Union Distinct > -- > > Key: HIVE-9039 > URL: https://issues.apache.org/jira/browse/HIVE-9039 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, > HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, > HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch > > > Current version (Hive 0.14) does not support union (or union distinct). It > only supports union all. In this patch, we try to add this new feature by > rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Attachment: (was: HIVE-9039.10.patch) > Support Union Distinct > -- > > Key: HIVE-9039 > URL: https://issues.apache.org/jira/browse/HIVE-9039 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, > HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, > HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch > > > Current version (Hive 0.14) does not support union (or union distinct). It > only supports union all. In this patch, we try to add this new feature by > rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Open (was: Patch Available) > Support Union Distinct > -- > > Key: HIVE-9039 > URL: https://issues.apache.org/jira/browse/HIVE-9039 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, > HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, > HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch > > > Current version (Hive 0.14) does not support union (or union distinct). It > only supports union all. In this patch, we try to add this new feature by > rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8814) Support custom virtual columns from serde implementation
[ https://issues.apache.org/jira/browse/HIVE-8814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8814: Attachment: HIVE-8814.7.patch.txt > Support custom virtual columns from serde implementation > > > Key: HIVE-8814 > URL: https://issues.apache.org/jira/browse/HIVE-8814 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Reporter: Navis >Assignee: Navis >Priority: Minor > Attachments: HIVE-8814.1.patch.txt, HIVE-8814.2.patch.txt, > HIVE-8814.3.patch.txt, HIVE-8814.4.patch.txt, HIVE-8814.5.patch.txt, > HIVE-8814.6.patch.txt, HIVE-8814.7.patch.txt > > > Currently, virtual columns are fixed in hive. But some serdes can provide > more virtual columns if needed. Idea from > https://issues.apache.org/jira/browse/HIVE-7513?focusedCommentId=14073912&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14073912 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268593#comment-14268593 ] Brock Noland commented on HIVE-9300: +1 pending tests Thank you Prasanth! > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch, HIVE-9300.2.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one
[ https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9104: --- Attachment: HIVE-9104.patch The issue seems to be that, in {{FirstValStreamingFixedWindow::terminate}}, it doesn't expect {{fb.skipNulls}} to be false AND {{s.valueChain.size() == 0}}, but this could happen in case there are multiple reduce tasks, some of which will get 0 rows. I changed the code to set ValIndexPair to null in such case. Also, I added SORT_QUERY_RESULTS to the qfile, since I get some ordering problem after regenerating the golden file. [~rhbutani] Can you take a look at this patch, and give some suggestions? Thanks. > windowing.q failed when mapred.reduce.tasks is set to larger than one > - > > Key: HIVE-9104 > URL: https://issues.apache.org/jira/browse/HIVE-9104 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chao >Assignee: Chao > Attachments: HIVE-9104.patch > > > Test {{windowing.q}} is actually not enabled in Spark branch - in test > configurations it is {{windowing.q.q}}. > I just run this test, and query > {code} > -- 12. 
testFirstLastWithWhere > select p_mfgr,p_name, p_size, > rank() over(distribute by p_mfgr sort by p_name) as r, > sum(p_size) over (distribute by p_mfgr sort by p_name rows between current > row and current row) as s2, > first_value(p_size) over w1 as f, > last_value(p_size, false) over w1 as l > from part > where p_mfgr = 'Manufacturer#3' > window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding > and 2 following); > {code} > failed with the following exception: > {noformat} > java.lang.RuntimeException: Hive Runtime Error while closing operators: null > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.NoSuchElementException > at 
java.util.ArrayDeque.getFirst(ArrayDeque.java:318) > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290) > at > org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413) > at > org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337) > at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431) > ... 15 more > {noformat} > We need to find out: > - Since which commit this test started failing, and > - Why it fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
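The NoSuchElementException at the top of the "Caused by" chain comes from calling ArrayDeque.getFirst() on an empty deque, which is exactly what happens when a reduce task receives zero rows. The contract is easy to reproduce with plain java.util, nothing Hive-specific; the guarded access at the end mirrors the shape of the fix (handling the empty partition explicitly), not Hive's actual code:

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;

public class EmptyDequeDemo {
    public static void main(String[] args) {
        // Stand-in for the per-partition value chain that ends up
        // empty when a reduce task gets zero rows.
        ArrayDeque<Integer> valueChain = new ArrayDeque<>();

        // getFirst() throws on an empty deque -- the failure seen in
        // FirstValStreamingFixedWindow.terminate().
        try {
            valueChain.getFirst();
        } catch (NoSuchElementException e) {
            System.out.println("getFirst() threw on empty deque");
        }

        // Guarding the empty case avoids the exception; peekFirst()
        // returning null is the non-throwing alternative.
        Integer first = valueChain.isEmpty() ? null : valueChain.getFirst();
        System.out.println("guarded first = " + first);
    }
}
```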
[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one
[ https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9104: --- Status: Patch Available (was: Open) > windowing.q failed when mapred.reduce.tasks is set to larger than one > - > > Key: HIVE-9104 > URL: https://issues.apache.org/jira/browse/HIVE-9104 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chao >Assignee: Chao > Attachments: HIVE-9104.patch > > > Test {{windowing.q}} is actually not enabled in Spark branch - in test > configurations it is {{windowing.q.q}}. > I just run this test, and query > {code} > -- 12. testFirstLastWithWhere > select p_mfgr,p_name, p_size, > rank() over(distribute by p_mfgr sort by p_name) as r, > sum(p_size) over (distribute by p_mfgr sort by p_name rows between current > row and current row) as s2, > first_value(p_size) over w1 as f, > last_value(p_size, false) over w1 as l > from part > where p_mfgr = 'Manufacturer#3' > window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding > and 2 following); > {code} > failed with the following exception: > {noformat} > java.lang.RuntimeException: Hive Runtime Error while closing operators: null > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > 
at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.NoSuchElementException > at java.util.ArrayDeque.getFirst(ArrayDeque.java:318) > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290) > at > org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413) > at > org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337) > at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431) > ... 15 more > {noformat} > We need to find out: > - Since which commit this test started failing, and > - Why it fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()
[ https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268580#comment-14268580 ] Ted Yu commented on HIVE-9301: -- I stepped through similar code in a debugger - the single ampersand prevents short-circuit evaluation of the expression, leading to the NPE. > Potential null dereference in MoveTask#createTargetPath() > - > > Key: HIVE-9301 > URL: https://issues.apache.org/jira/browse/HIVE-9301 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: HIVE-9301.patch > > > {code} > if (mkDirPath != null & !fs.exists(mkDirPath)) { > {code} > '&&' should be used instead of single ampersand. > If mkDirPath is null, fs.exists() would still be called - resulting in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
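The short-circuit behavior Ted describes can be shown in a few lines. The exists() method below is a hypothetical stand-in for fs.exists(), which likewise dereferences its argument:

```java
public class ShortCircuitDemo {
    // Hypothetical stand-in for fs.exists(); like the real call, it
    // dereferences its argument and throws NPE when given null.
    static boolean exists(String path) {
        return path.isEmpty(); // NPE if path is null
    }

    public static void main(String[] args) {
        String mkDirPath = null;

        // '&&' short-circuits: when the left operand is false, the
        // right operand is never evaluated, so exists(null) never runs.
        boolean safe = (mkDirPath != null && !exists(mkDirPath));
        System.out.println("safe = " + safe);

        // '&' on booleans evaluates both operands unconditionally,
        // reproducing the NPE from MoveTask#createTargetPath().
        try {
            boolean unsafe = (mkDirPath != null & !exists(mkDirPath));
            System.out.println("unsafe = " + unsafe);
        } catch (NullPointerException e) {
            System.out.println("single '&' threw NullPointerException");
        }
    }
}
```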
[jira] [Commented] (HIVE-9188) BloomFilter in ORC row group index
[ https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268573#comment-14268573 ] Owen O'Malley commented on HIVE-9188: - [~prasanth_j] Ok, I thought that you said that you were going to have bloom filters at row group, stripe, and file level. I agree completely that ORC should only have bloom filters at the row group level. Having the bloom filter as a separate stream means the reader does *far* less IO. It will still go through the code that merges adjacent ranges together into a single read. So if you need all of the indexes and bloom filters for all of the columns the reader should read them in a single IO operation. On the other hand, if it doesn't need any bloom filter it shouldn't have to load the extra mb of data it doesn't need. > BloomFilter in ORC row group index > -- > > Key: HIVE-9188 > URL: https://issues.apache.org/jira/browse/HIVE-9188 > Project: Hive > Issue Type: New Feature > Components: File Formats >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: orcfile > Attachments: HIVE-9188.1.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, > HIVE-9188.4.patch > > > BloomFilters are well known probabilistic data structure for set membership > checking. We can use bloom filters in ORC index for better row group pruning. > Currently, ORC row group index uses min/max statistics to eliminate row > groups (stripes as well) that do not satisfy predicate condition specified in > the query. But in some cases, the efficiency of min/max based elimination is > not optimal (unsorted columns with wide range of entries). Bloom filters can > be an effective and efficient alternative for row group/split elimination for > point queries or queries with IN clause. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()
[ https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268564#comment-14268564 ] Xuefu Zhang commented on HIVE-9301: --- +1 > Potential null dereference in MoveTask#createTargetPath() > - > > Key: HIVE-9301 > URL: https://issues.apache.org/jira/browse/HIVE-9301 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: HIVE-9301.patch > > > {code} > if (mkDirPath != null & !fs.exists(mkDirPath)) { > {code} > '&&' should be used instead of single ampersand. > If mkDirPath is null, fs.exists() would still be called - resulting in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9302: --- Description: At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. It might be useful to do this in the jdbc driver itself. was:At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. > Beeline add jar local to client > --- > > Key: HIVE-9302 > URL: https://issues.apache.org/jira/browse/HIVE-9302 > Project: Hive > Issue Type: New Feature >Reporter: Brock Noland > > At present if a beeline user uses {{add jar}} the path they give is actually > on the HS2 server. It'd be great to allow beeline users to add local jars as > well. > It might be useful to do this in the jdbc driver itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9302) Beeline add jar local to client
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9302: --- Summary: Beeline add jar local to client (was: Beeline add jar) > Beeline add jar local to client > --- > > Key: HIVE-9302 > URL: https://issues.apache.org/jira/browse/HIVE-9302 > Project: Hive > Issue Type: New Feature >Reporter: Brock Noland > > At present if a beeline user uses {{add jar}} the path they give is actually > on the HS2 server. It'd be great to allow beeline users to add local jars as > well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9302) Beeline add jar
Brock Noland created HIVE-9302: -- Summary: Beeline add jar Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jars as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-6179) OOM occurs when query spans to a large number of partitions
[ https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268550#comment-14268550 ] Brock Noland edited comment on HIVE-6179 at 1/8/15 12:09 AM: - I think this is actually about having an API so either (1) the client can get a remote iterator over the list of partitions (2) providing an API which gives only the information required by the client so as to remove bloat or (3) finding a more compact way to transfer this data. The iterator idea is discussed here: HIVE-7195 was (Author: brocknoland): I think this is actually about having an API so either (1) the client can get a remote iterator over the list of partitions (2) providing an API which gives only the information required by the client so as to remove bloat or (3) finding a more compact way to transfer this data. > OOM occurs when query spans to a large number of partitions > --- > > Key: HIVE-6179 > URL: https://issues.apache.org/jira/browse/HIVE-6179 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.12.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > When executing a query against a large number of partitions, such as "select > count(\*) from table", OOM error may occur because Hive fetches the metadata > for all partitions involved and tries to store it in memory. 
> {code} > 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap > space > at java.util.Arrays.copyOf(Arrays.java:2367) > at > java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130) > at > java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415) > at java.lang.StringBuffer.append(StringBuffer.java:237) > at > org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown > Source) > at > org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at > org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) > at > org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191) > at > org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at > org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122) > at com.sun.proxy.$Proxy7.getPartitions(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) > {code} > The above error happened when executing "select count(\*)" on a table with > 40K partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6179) OOM occurs when query spans to a large number of partitions
[ https://issues.apache.org/jira/browse/HIVE-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268550#comment-14268550 ] Brock Noland commented on HIVE-6179: I think this is actually about having an API so either (1) the client can get a remote iterator over the list of partitions (2) providing an API which gives only the information required by the client so as to remove bloat or (3) finding a more compact way to transfer this data. > OOM occurs when query spans to a large number of partitions > --- > > Key: HIVE-6179 > URL: https://issues.apache.org/jira/browse/HIVE-6179 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.12.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > > When executing a query against a large number of partitions, such as "select > count(\*) from table", OOM error may occur because Hive fetches the metadata > for all partitions involved and tries to store it in memory. > {code} > 2014-01-09 13:14:17,090 ERROR metastore.RetryingHMSHandler > (RetryingHMSHandler.java:invoke(141)) - java.lang.OutOfMemoryError: Java heap > space > at java.util.Arrays.copyOf(Arrays.java:2367) > at > java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130) > at > java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415) > at java.lang.StringBuffer.append(StringBuffer.java:237) > at > org.apache.derby.impl.sql.conn.GenericStatementContext.appendErrorInfo(Unknown > Source) > at > org.apache.derby.iapi.services.context.ContextManager.cleanupOnError(Unknown > Source) > at > org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown > Source) > at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown > Source) > at > 
org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source) > at > org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:191) > at > org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:379) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.loopJoinOrderedResult(MetaStoreDirectSql.java:641) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:410) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:205) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1433) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1420) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at > org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:122) > at com.sun.proxy.$Proxy7.getPartitions(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2128) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) > {code} > The above error happened when executing "select count(\*)" on a table 
with > 40K partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
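The comment above proposes a remote iterator over the partition list (HIVE-7195) as one way to avoid materializing metadata for all 40K partitions at once. A minimal sketch of that batching pattern, assuming a hypothetical RPC that returns at most one batch per call — the class and method names below are illustrative stand-ins, not Hive's actual metastore API:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Fetches partition names in bounded batches so client memory stays
// proportional to the batch size, not the total partition count.
public class BatchedPartitionIterator implements Iterator<String> {
    private final int batchSize;
    private final int total;            // stand-in for the server-side partition count
    private int fetched = 0;            // how many partitions we have pulled so far
    private List<String> buffer = new ArrayList<>();
    private int pos = 0;

    BatchedPartitionIterator(int total, int batchSize) {
        this.total = total;
        this.batchSize = batchSize;
    }

    // Stand-in for a metastore RPC returning at most 'limit' partition names.
    private List<String> fetchBatch(int offset, int limit) {
        List<String> out = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, total); i++) {
            out.add("part=" + i);
        }
        return out;
    }

    @Override
    public boolean hasNext() {
        if (pos < buffer.size()) {
            return true;                // still draining the current batch
        }
        if (fetched >= total) {
            return false;               // nothing left on the server
        }
        buffer = fetchBatch(fetched, batchSize);
        fetched += buffer.size();
        pos = 0;
        return !buffer.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return buffer.get(pos++);
    }

    public static void main(String[] args) {
        BatchedPartitionIterator it = new BatchedPartitionIterator(10, 3);
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        System.out.println(n);          // all 10 partitions seen, 3 at a time
    }
}
```

This is only the iteration shape; options (2) and (3) in the comment (projection, compact encoding) would change what each batch element carries, not the batching itself.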
[jira] [Updated] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()
[ https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-9301: - Assignee: Ted Yu Status: Patch Available (was: Open) > Potential null dereference in MoveTask#createTargetPath() > - > > Key: HIVE-9301 > URL: https://issues.apache.org/jira/browse/HIVE-9301 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: HIVE-9301.patch > > > {code} > if (mkDirPath != null & !fs.exists(mkDirPath)) { > {code} > '&&' should be used instead of single ampersand. > If mkDirPath is null, fs.exists() would still be called - resulting in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()
[ https://issues.apache.org/jira/browse/HIVE-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-9301: - Attachment: HIVE-9301.patch > Potential null dereference in MoveTask#createTargetPath() > - > > Key: HIVE-9301 > URL: https://issues.apache.org/jira/browse/HIVE-9301 > Project: Hive > Issue Type: Bug >Reporter: Ted Yu > Attachments: HIVE-9301.patch > > > {code} > if (mkDirPath != null & !fs.exists(mkDirPath)) { > {code} > '&&' should be used instead of single ampersand. > If mkDirPath is null, fs.exists() would still be called - resulting in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9300: Attachment: HIVE-9300.2.patch Updated to fix import ordering difference. > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch, HIVE-9300.2.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9301) Potential null dereference in MoveTask#createTargetPath()
Ted Yu created HIVE-9301: Summary: Potential null dereference in MoveTask#createTargetPath() Key: HIVE-9301 URL: https://issues.apache.org/jira/browse/HIVE-9301 Project: Hive Issue Type: Bug Reporter: Ted Yu {code} if (mkDirPath != null & !fs.exists(mkDirPath)) { {code} '&&' should be used instead of single ampersand. If mkDirPath is null, fs.exists() would still be called - resulting in NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
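The single-ampersand bug described in this issue is easy to demonstrate in isolation: unlike `&&`, the non-short-circuit `&` evaluates both operands, so the right-hand side runs even when the null check on the left fails. A self-contained sketch (the `exists`/`needsMkdir` names are illustrative, not Hive's actual `MoveTask` code):

```java
public class ShortCircuitDemo {
    // Stand-in for fs.exists(mkDirPath): dereferences its argument,
    // so it throws NullPointerException when path is null.
    static boolean exists(String path) {
        return path.length() > 0;
    }

    // Safe form: '&&' short-circuits, so exists() is never called on null.
    static boolean needsMkdir(String path) {
        return path != null && !exists(path);
    }

    public static void main(String[] args) {
        System.out.println(needsMkdir(null));  // false, no NPE
        System.out.println(needsMkdir(""));    // true

        try {
            // The buggy form from the issue: single '&' evaluates both sides.
            String path = null;
            boolean b = path != null & !exists(path);  // NPE raised here
            System.out.println(b);
        } catch (NullPointerException e) {
            System.out.println("NPE from non-short-circuit '&'");
        }
    }
}
```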
[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268528#comment-14268528 ] Prasanth Jayachandran commented on HIVE-9300: - That's an import order difference between IntelliJ and Eclipse. I will see if IntelliJ can be configured to follow Eclipse's import order. > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268527#comment-14268527 ] Brock Noland commented on HIVE-9300: Thank you [~prasanth_j]! LGTM but can we remove the re-ordering of the imports in {{HiveConf}}? I don't see a need for that? > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one
[ https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9104: -- Affects Version/s: (was: spark-branch) > windowing.q failed when mapred.reduce.tasks is set to larger than one > - > > Key: HIVE-9104 > URL: https://issues.apache.org/jira/browse/HIVE-9104 > Project: Hive > Issue Type: Sub-task > Components: Spark >Reporter: Chao >Assignee: Chao > > Test {{windowing.q}} is actually not enabled in Spark branch - in test > configurations it is {{windowing.q.q}}. > I just run this test, and query > {code} > -- 12. testFirstLastWithWhere > select p_mfgr,p_name, p_size, > rank() over(distribute by p_mfgr sort by p_name) as r, > sum(p_size) over (distribute by p_mfgr sort by p_name rows between current > row and current row) as s2, > first_value(p_size) over w1 as f, > last_value(p_size, false) over w1 as l > from part > where p_mfgr = 'Manufacturer#3' > window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding > and 2 following); > {code} > failed with the following exception: > {noformat} > java.lang.RuntimeException: Hive Runtime Error while closing operators: null > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at 
org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.NoSuchElementException > at java.util.ArrayDeque.getFirst(ArrayDeque.java:318) > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290) > at > org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413) > at > org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337) > at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431) > ... 15 more > {noformat} > We need to find out: > - Since which commit this test started failing, and > - Why it fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
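The root cause in the trace above is `ArrayDeque.getFirst()` being called on an empty deque inside `GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate`. A minimal illustration of that failure mode and the usual guard — this is not Hive's actual code, just the `java.util.Deque` behavior involved:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

public class DequeGuardDemo {
    public static void main(String[] args) {
        Deque<Integer> window = new ArrayDeque<>();

        try {
            window.getFirst();  // throws: getFirst() has no empty-deque fallback
        } catch (NoSuchElementException e) {
            System.out.println("getFirst() on empty deque throws NoSuchElementException");
        }

        // peekFirst() returns null instead of throwing, so callers can guard:
        Integer first = window.peekFirst();
        System.out.println(first == null ? "empty window" : first.toString());
    }
}
```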
[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268523#comment-14268523 ] Xuefu Zhang commented on HIVE-9281: --- +1 > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to cleanup the codes. > This includes removing TODO's, fixing checkstyles, removing commented or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9104) windowing.q failed when mapred.reduce.tasks is set to larger than one
[ https://issues.apache.org/jira/browse/HIVE-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao updated HIVE-9104: --- Summary: windowing.q failed when mapred.reduce.tasks is set to larger than one (was: windowing.q failed [Spark Branch]) > windowing.q failed when mapred.reduce.tasks is set to larger than one > - > > Key: HIVE-9104 > URL: https://issues.apache.org/jira/browse/HIVE-9104 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Chao >Assignee: Chao > > Test {{windowing.q}} is actually not enabled in Spark branch - in test > configurations it is {{windowing.q.q}}. > I just run this test, and query > {code} > -- 12. testFirstLastWithWhere > select p_mfgr,p_name, p_size, > rank() over(distribute by p_mfgr sort by p_name) as r, > sum(p_size) over (distribute by p_mfgr sort by p_name rows between current > row and current row) as s2, > first_value(p_size) over w1 as f, > last_value(p_size, false) over w1 as l > from part > where p_mfgr = 'Manufacturer#3' > window w1 as (distribute by p_mfgr sort by p_name rows between 2 preceding > and 2 following); > {code} > failed with the following exception: > {noformat} > java.lang.RuntimeException: Hive Runtime Error while closing operators: null > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:446) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:58) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:108) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at > org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at > 
org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$2.apply(AsyncRDDActions.scala:115) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.SparkContext$$anonfun$30.apply(SparkContext.scala:1390) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) > at org.apache.spark.scheduler.Task.run(Task.scala:56) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.util.NoSuchElementException > at java.util.ArrayDeque.getFirst(ArrayDeque.java:318) > at > org.apache.hadoop.hive.ql.udf.generic.GenericUDAFFirstValue$FirstValStreamingFixedWindow.terminate(GenericUDAFFirstValue.java:290) > at > org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:413) > at > org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337) > at org.apache.hadoop.hive.ql.exec.PTFOperator.closeOp(PTFOperator.java:95) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) > at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:431) > ... 15 more > {noformat} > We need to find out: > - Since which commit this test started failing, and > - Why it fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268517#comment-14268517 ] Szehon Ho commented on HIVE-9281: - Hope this works: [https://reviews.apache.org/r/29686/|https://reviews.apache.org/r/29686/] > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to cleanup the codes. > This includes removing TODO's, fixing checkstyles, removing commented or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9300: Status: Patch Available (was: Open) > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
[ https://issues.apache.org/jira/browse/HIVE-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9300: Attachment: HIVE-9300.1.patch [~brocknoland]/[~ashutoshc] can someone take a look? > Revert HIVE-9049 and make TCompactProtocol configurable > --- > > Key: HIVE-9300 > URL: https://issues.apache.org/jira/browse/HIVE-9300 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-9300.1.patch > > > Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol > configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9300) Revert HIVE-9049 and make TCompactProtocol configurable
Prasanth Jayachandran created HIVE-9300: --- Summary: Revert HIVE-9049 and make TCompactProtocol configurable Key: HIVE-9300 URL: https://issues.apache.org/jira/browse/HIVE-9300 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.15.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Revert HIVE-9049 as it breaks compatibility. Make TCompactProtocol configurable with default disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9049) Metastore should use TCompactProtocol as opposed to TBinaryProtocol
[ https://issues.apache.org/jira/browse/HIVE-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268499#comment-14268499 ] Prasanth Jayachandran commented on HIVE-9049: - Created HIVE-9300 to track the change. > Metastore should use TCompactProtocol as opposed to TBinaryProtocol > --- > > Key: HIVE-9049 > URL: https://issues.apache.org/jira/browse/HIVE-9049 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.15.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Fix For: 0.15.0 > > Attachments: HIVE-9049.1.patch > > > Hive metastore server/client uses TBinaryProtocol. Although binary protocol > is better than simple text/json protocol it is not as effective as > TCompactProtocol. TCompactProtocol is typically more efficient in terms of > space and processing (CPU). As seen from this benchmark TCompactProtocol is > better in almost all aspect when compared to TBinaryProtocol > https://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
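The fix shape described here ("make TCompactProtocol configurable with default disabled") is a simple flag-selected factory. A stdlib-only sketch of that pattern, with `java.util.Properties` standing in for `HiveConf` and strings standing in for Thrift's `TBinaryProtocol.Factory`/`TCompactProtocol.Factory`; the config key shown is an assumption, not necessarily the one the patch adds:

```java
import java.util.Properties;

public class ProtocolConfigDemo {
    static String chooseProtocolFactory(Properties conf) {
        // Default "false" keeps the old TBinaryProtocol wire format,
        // preserving compatibility with existing metastore clients.
        boolean useCompact = Boolean.parseBoolean(conf.getProperty(
            "hive.metastore.thrift.compact.protocol.enabled", "false"));
        return useCompact ? "TCompactProtocol.Factory" : "TBinaryProtocol.Factory";
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(chooseProtocolFactory(conf));  // binary by default

        conf.setProperty("hive.metastore.thrift.compact.protocol.enabled", "true");
        System.out.println(chooseProtocolFactory(conf));  // compact when opted in
    }
}
```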
[jira] [Updated] (HIVE-8485) HMS on Oracle incompatibility
[ https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8485: --- Status: Patch Available (was: Open) > HMS on Oracle incompatibility > - > > Key: HIVE-8485 > URL: https://issues.apache.org/jira/browse/HIVE-8485 > Project: Hive > Issue Type: Bug > Components: Metastore > Environment: Oracle as metastore DB >Reporter: Ryan Pridgeon >Assignee: Chaoyu Tang > Attachments: HIVE-8485.2.patch, HIVE-8485.patch > > > Oracle does not distinguish between empty strings and NULL, which proves > problematic for DataNucleus. > In the event a user creates a table with some property stored as an empty > string, the table will no longer be accessible. > i.e. TBLPROPERTIES ('serialization.null.format'='') > If they try to select, describe, drop, etc., the client prints the following > exception. > ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found > > The workaround for this was to go into the Hive metastore on the Oracle > database and replace NULL with some other string. Users could then drop the > tables or alter their data to use the new null format they just set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8485) HMS on Oracle incompatibility
[ https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-8485: --- Attachment: HIVE-8485.2.patch Updated patch. > HMS on Oracle incompatibility > - > > Key: HIVE-8485 > URL: https://issues.apache.org/jira/browse/HIVE-8485 > Project: Hive > Issue Type: Bug > Components: Metastore > Environment: Oracle as metastore DB >Reporter: Ryan Pridgeon >Assignee: Chaoyu Tang > Attachments: HIVE-8485.2.patch, HIVE-8485.patch > > > Oracle does not distinguish between empty strings and NULL, which proves > problematic for DataNucleus. > In the event a user creates a table with some property stored as an empty > string, the table will no longer be accessible. > i.e. TBLPROPERTIES ('serialization.null.format'='') > If they try to select, describe, drop, etc., the client prints the following > exception. > ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found > > The workaround for this was to go into the Hive metastore on the Oracle > database and replace NULL with some other string. Users could then drop the > tables or alter their data to use the new null format they just set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268466#comment-14268466 ] Xuefu Zhang commented on HIVE-9281: --- Could you load both versions to RB so that I can just look at the diff between the versions? > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to cleanup the codes. > This includes removing TODO's, fixing checkstyles, removing commented or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9281) Code cleanup [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268461#comment-14268461 ] Szehon Ho commented on HIVE-9281: - It's mostly the same patch; you can quickly look through it, or I can commit once tests pass on the latest patch. > Code cleanup [Spark Branch] > --- > > Key: HIVE-9281 > URL: https://issues.apache.org/jira/browse/HIVE-9281 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: spark-branch >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-9281-spark.patch, HIVE-9281.2-spark.patch > > > In preparation for merge, we need to cleanup the codes. > This includes removing TODO's, fixing checkstyles, removing commented or > unused code, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9175) Add alters to list of events handled by NotificationListener
[ https://issues.apache.org/jira/browse/HIVE-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-9175: - Fix Version/s: 0.15.0 Status: Patch Available (was: Open) > Add alters to list of events handled by NotificationListener > > > Key: HIVE-9175 > URL: https://issues.apache.org/jira/browse/HIVE-9175 > Project: Hive > Issue Type: New Feature > Components: HCatalog >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 0.15.0 > > Attachments: HIVE-9175.patch > > > HCatalog currently doesn't implement onAlterTable and onAlterPartition. It > should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)