[jira] [Commented] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on
[ https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652008#comment-14652008 ] Hive QA commented on HIVE-11391: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748441/HIVE-11391.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4801/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4801/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4801/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4801/succeeded/TestVectorSerDeRow {noformat} This message is automatically generated. ATTACHMENT ID: 12748441 - PreCommit-HIVE-TRUNK-Build CBO (Calcite Return Path): Add CBO tests with return path on Key: HIVE-11391 URL: https://issues.apache.org/jira/browse/HIVE-11391 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HIVE-11250: -- Assignee: Jimmy Xiang Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch] Key: HIVE-11250 URL: https://issues.apache.org/jira/browse/HIVE-11250 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Jimmy Xiang Hive CLI works as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651921#comment-14651921 ] Hive QA commented on HIVE-11397: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748440/HIVE-11397.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4800/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4800/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4800/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4800/succeeded/TestJdbcWithMiniHS2, remoteFile=/home/hiveptest/54.92.254.244-hiveptest-2/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.92.254.244, getInstance()=2]: 'Address 54.92.254.244 maps to ec2-54-92-254-244.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list ./ TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml hive.log [rsync transfer-progress counters omitted] rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4800/succeeded/TestJdbcWithMiniHS2/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (198 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
[jira] [Commented] (HIVE-11426) lineage3.q fails with -Phadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652062#comment-14652062 ] Sergio Peña commented on HIVE-11426: Thanks [~jxiang] +1 lineage3.q fails with -Phadoop-1 Key: HIVE-11426 URL: https://issues.apache.org/jira/browse/HIVE-11426 Project: Hive Issue Type: Bug Components: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11426.1.patch Some queries in lineage3.q emit different results with -Phadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors
[ https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652881#comment-14652881 ] Hive QA commented on HIVE-11434: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748493/HIVE-11434.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4807/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4807/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4807/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748493 - PreCommit-HIVE-TRUNK-Build Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors - Key: HIVE-11434 URL: https://issues.apache.org/jira/browse/HIVE-11434 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11434.1.patch, HIVE-11434.patch It appears that a patch other than the latest one from HIVE-11363 was committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11295) LLAP: clean up ORC dependencies on object pools
[ https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11295: Attachment: (was: HIVE-11259.patch) LLAP: clean up ORC dependencies on object pools --- Key: HIVE-11295 URL: https://issues.apache.org/jira/browse/HIVE-11295 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Before there's storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors
[ https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652930#comment-14652930 ] Chao Sun commented on HIVE-11434: - +1 Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors - Key: HIVE-11434 URL: https://issues.apache.org/jira/browse/HIVE-11434 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11434.1.patch, HIVE-11434.patch It appears that a patch other than the latest one from HIVE-11363 was committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to balanced structures or an ANY/ALL list
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11398: --- Attachment: HIVE-11398.patch Parse wide OR and wide AND trees to balanced structures or an ANY/ALL list - Key: HIVE-11398 URL: https://issues.apache.org/jira/browse/HIVE-11398 Project: Hive Issue Type: New Feature Components: Logical Optimizer, UDF Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11398.patch Deep trees of AND/OR are hard to traverse, particularly when they are merely the same structure in nested form as a version of the operator that takes an arbitrary number of args. One potential way to convert the DFS searches into a simpler BFS search is to introduce a new Operator pair named ALL and ANY. ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C), B), A) The SemanticAnalyser would be responsible for generating these operators, and this would mean that the depth and complexity of traversals for the simplest case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
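The nested-to-flat rewrite described above can be sketched with a small model. Note that `Expr`, `flatten`, and the string labels below are hypothetical stand-ins for Hive's expression classes, not the code attached to this issue; an explicit stack replaces recursion so the traversal depth stays constant regardless of how wide the OR/AND chain is.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FlattenBoolTree {
    // Hypothetical expression node: "op" is "AND", "OR", or a leaf label.
    static final class Expr {
        final String op;
        final List<Expr> children;
        Expr(String op, List<Expr> children) { this.op = op; this.children = children; }
        static Expr leaf(String label) { return new Expr(label, List.of()); }
        static Expr node(String op, Expr l, Expr r) { return new Expr(op, List.of(l, r)); }
    }

    /** Collects the operands of a nested chain of {@code op} into one flat list
     *  (ANY for OR, ALL for AND), using an explicit stack instead of recursion. */
    static List<String> flatten(Expr root, String op) {
        List<String> operands = new ArrayList<>();
        Deque<Expr> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Expr e = stack.pop();
            if (e.op.equals(op)) {
                // Push children in reverse so left operands come out first.
                for (int i = e.children.size() - 1; i >= 0; i--) stack.push(e.children.get(i));
            } else {
                operands.add(e.op);
            }
        }
        return operands;
    }

    public static void main(String[] args) {
        // OR(OR(OR(OR(E, D), C), B), A) flattens to one ANY list of five operands.
        Expr tree = Expr.node("OR",
            Expr.node("OR",
                Expr.node("OR",
                    Expr.node("OR", Expr.leaf("E"), Expr.leaf("D")),
                    Expr.leaf("C")),
                Expr.leaf("B")),
            Expr.leaf("A"));
        System.out.println("ANY" + flatten(tree, "OR")); // prints ANY[E, D, C, B, A]
    }
}
```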
[jira] [Updated] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11405: - Attachment: HIVE-11405.2.patch Reuploading to trigger precommit QA. Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.1.patch, HIVE-11405.2.patch, HIVE-11405.2.patch, HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him: The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...), since the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in numRows - branch1Rows for branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
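The proposed fix can be illustrated with a toy model, assuming hypothetical names (`estimateOrRows`, per-branch selectivities); this is not the actual StatsRulesProcFactory code. Each OR branch is evaluated against only the rows not matched by earlier branches, so the estimate shrinks monotonically and the recursion can stop as soon as no rows remain:

```java
public class OrStatsShortCircuit {
    /** selectivities[i] is the estimated fraction of remaining rows matching branch i. */
    static long estimateOrRows(long numRows, double[] selectivities) {
        long remaining = numRows;
        long matched = 0;
        for (double sel : selectivities) {
            if (remaining <= 0) break;          // early termination: nothing left to match
            long branchRows = (long) (remaining * sel);
            matched += branchRows;
            remaining -= branchRows;            // branch i+1 sees numRows - branchRows
        }
        return matched;
    }

    public static void main(String[] args) {
        // 1000 rows, three OR branches each matching ~50% of what remains:
        // 500 + 250 + 125 = 875 estimated matching rows.
        System.out.println(estimateOrRows(1000, new double[]{0.5, 0.5, 0.5})); // prints 875
    }
}
```

Without the row reduction, each branch would be evaluated against the full `numRows`, which is the unbounded-recursion behavior the issue describes.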
[jira] [Comment Edited] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors
[ https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652964#comment-14652964 ] Xuefu Zhang edited comment on HIVE-11434 at 8/4/15 2:15 AM: Committed to master and branch-1. Thanks for the review, Chao and Lefty. was (Author: xuefuz): Committed to master and branch-1. Thanks for the review, Chao. Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors - Key: HIVE-11434 URL: https://issues.apache.org/jira/browse/HIVE-11434 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11434.1.patch, HIVE-11434.patch It appears that a patch other than the latest one from HIVE-11363 was committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY
[ https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652782#comment-14652782 ] Pengcheng Xiong commented on HIVE-11416: The test case failures are not related; they also failed in the previous precommit build. [~jcamachorodriguez], could you please take a look? Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY -- Key: HIVE-11416 URL: https://issues.apache.org/jira/browse/HIVE-11416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, HIVE-11416.03.patch, HIVE-11416.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652869#comment-14652869 ] Eugene Koifman commented on HIVE-11087: --- [~alangates], could you review? It's all logging improvements except the change in TxnHandler.abortTxns(); this makes the behavior match a commit of an already-committed txn, which makes bugs in clients more obvious. DbTxnManager exceptions should include txnid Key: HIVE-11087 URL: https://issues.apache.org/jira/browse/HIVE-11087 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11087.2.patch, HIVE-11087.patch Must include the txnid in the exception so that a user-visible error can be correlated with log file info. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY
[ https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652772#comment-14652772 ] Hive QA commented on HIVE-11416: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748486/HIVE-11416.04.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4806/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4806/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4806/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748486 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY -- Key: HIVE-11416 URL: https://issues.apache.org/jira/browse/HIVE-11416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, HIVE-11416.03.patch, HIVE-11416.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11415: Attachment: (was: HIVE-11415.01.patch) Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw StackOverflowError during vectorization {code} Exception in thread main java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... .. {code} repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11415: Attachment: HIVE-11415.01.patch Vectorized support for Multi-OR and Multi-AND. Specifically, the FilterExprOrExpr and FilterExprAndExpr. Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw StackOverflowError during vectorization {code} Exception in thread main java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... .. {code} repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
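A minimal generator for the reproduction query described above might look like the following; the class and method names (`DeepFilterRepro`, `buildQuery`) and the disjunct count are illustrative, while the table and column names come from the sample query in the issue.

```java
public class DeepFilterRepro {
    /** Builds the sample query with the (t=N and si=N+1) disjunct repeated. */
    static String buildQuery(int disjuncts) {
        StringBuilder sb = new StringBuilder("explain select count(*) from over1k where (");
        for (int i = 1; i <= disjuncts; i++) {
            if (i > 1) sb.append(" or ");
            sb.append("(t=").append(i).append(" and si=").append(i + 1).append(')');
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // A few thousand disjuncts is enough to exhaust the default JVM stack
        // when the left-deep OR tree is vectorized recursively.
        String q = buildQuery(3000);
        System.out.println(q.substring(0, 80) + " ...");
    }
}
```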
[jira] [Resolved] (HIVE-10692) LLAP: DAGs get stuck at start with no tasks executing
[ https://issues.apache.org/jira/browse/HIVE-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10692. - Resolution: Cannot Reproduce LLAP: DAGs get stuck at start with no tasks executing - Key: HIVE-10692 URL: https://issues.apache.org/jira/browse/HIVE-10692 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Siddharth Seth Internal app ID application_1429683757595_0914, LLAP application_1429683757595_0913. If someone without access wants to investigate I'll get the logs. 2nd dag failed to start executing: See syslog_dag_1429683757595_0914_2 log file. This happened to me a couple of times today, didn't see it before. After many S_TA_LAUNCH_REQUEST-s, the following is logged and after that there's no more logging aside from refreshes until I killed the DAG. LLAP daemons were idling meanwhile. I don't see any errors (aside from ATS) before this happened {noformat} 2015-05-12 13:52:08,997 INFO [TaskSchedulerEventHandlerThread] rm.TaskSchedulerEventHandler: Processing the event EventType: S_TA_LAUNCH_REQUEST 2015-05-12 13:52:18,507 INFO [LlapSchedulerNodeEnabler] impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 556007888 2015-05-12 13:52:25,315 INFO [HistoryEventHandlingThread] ats.ATSHistoryLoggingService: Event queue stats, eventsProcessedSinceLastUpdate=407, eventQueueSize=614 2015-05-12 13:52:28,507 INFO [LlapSchedulerNodeEnabler] impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 556007888 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10743) LLAP: rare NPE in IO
[ https://issues.apache.org/jira/browse/HIVE-10743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10743. - Resolution: Cannot Reproduce LLAP: rare NPE in IO Key: HIVE-10743 URL: https://issues.apache.org/jira/browse/HIVE-10743 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin {noformat} 2015-05-18 15:37:33,702 [TezTaskRunner_attempt_1431919257083_0116_1_00_09_0(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map 1_9_0)] INFO org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader: Processing file hdfs://cn041-10.l42scl.hortonworks.com:8020/apps/hive/warehouse/tpch_orc_snappy_1000.db/lineitem/93_0 2015-05-18 15:37:33,743 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map 1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Resulting disk ranges to read (file 7895017): [{range start: 28153685 end: 70814209}] 2015-05-18 15:37:33,743 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map 1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Disk ranges after cache (file 7895017, base offset 3): [{range start: 28153685 end: 70814209}] 2015-05-18 15:37:33,791 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map 1_9_0)] INFO org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl: Disk ranges after disk read (file 7895017, base offset 3): [{data range [28153685, 70814209), size: 42660524 type: direct}] 2015-05-18 15:37:33,804 [IO-Elevator-Thread-9(container_1_0116_01_10_sershe_20150518153700_b3649675-c035-4d9a-8dfb-2818b0173022:1_Map 1_9_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: setError called; closed false, done false, err null, pending 0 ... 
Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.orc.InStream.readEncodedStream(InStream.java:763) at org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:445) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:294) at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:56) at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) ... 4 more {noformat} Not sure yet how this happened. May add some logging or look more if I see it again -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11295) LLAP: clean up ORC dependencies on object pools
[ https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11295: Description: Before there's storage API module, we can clean some things up (was: Before there's storage handler module, we can clean some things up) LLAP: clean up ORC dependencies on object pools --- Key: HIVE-11295 URL: https://issues.apache.org/jira/browse/HIVE-11295 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Before there's storage API module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652828#comment-14652828 ] Xuefu Zhang commented on HIVE-11430: I pinpointed that HIVE-11333 changed the behavior. Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch, HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND
[ https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11448: Attachment: HIVE-11448.01.patch Support vectorization of Multi-OR and Multi-AND --- Key: HIVE-11448 URL: https://issues.apache.org/jira/browse/HIVE-11448 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11448.01.patch Support more than 2 children for OR and AND when all children are expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
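One way to evaluate an N-ary AND without a binary tree of depth N-1 is to let each child filter narrow a shared selection vector in one flat loop, which is roughly the idea behind a multi-child FilterExprAndExpr. The sketch below is a hedged illustration, not the patch's code: it uses `java.util.function.IntPredicate` children in place of Hive's VectorExpression, and the selection-vector handling is simplified.

```java
import java.util.function.IntPredicate;

public class MultiAndFilter {
    /** Applies each child predicate to the surviving row ids in sel[0..size);
     *  compacts survivors to the front of sel and returns the new count. */
    static int filter(int[] sel, int size, IntPredicate[] children) {
        for (IntPredicate child : children) {
            int newSize = 0;
            for (int i = 0; i < size; i++) {
                if (child.test(sel[i])) sel[newSize++] = sel[i];
            }
            size = newSize;
            if (size == 0) break;   // short-circuit: no rows survive
        }
        return size;
    }

    public static void main(String[] args) {
        int[] sel = {0, 1, 2, 3, 4, 5, 6, 7};
        // Two children: keep even rows, then keep rows > 2; rows 4 and 6 survive.
        int n = filter(sel, 8, new IntPredicate[]{r -> r % 2 == 0, r -> r > 2});
        System.out.println(n); // prints 2
    }
}
```

A multi-child OR is the dual case: each child is evaluated against the rows *not* yet selected, and the per-child results are unioned rather than intersected.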
[jira] [Commented] (HIVE-11295) LLAP: clean up ORC dependencies on object pools
[ https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652934#comment-14652934 ] Sergey Shelukhin commented on HIVE-11295: - this was actually some bogus unrelated patch LLAP: clean up ORC dependencies on object pools --- Key: HIVE-11295 URL: https://issues.apache.org/jira/browse/HIVE-11295 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Before there's storage API module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char
[ https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652580#comment-14652580 ] Hive QA commented on HIVE-11436: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748487/HIVE-11436.02.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 9319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_char_length_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_varchar_length_3 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4805/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4805/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4805/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748487 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char -- Key: HIVE-11436 URL: https://issues.apache.org/jira/browse/HIVE-11436 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch BaseCharUtils checks whether the length of a char is between [1,255]. This causes the return path to throw an error when the length of a char is 0. Proposing to change the range to [0,255]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652807#comment-14652807 ] Jason Dere commented on HIVE-11430: --- Looks like HIVE-11223 has similar changes in its diff, though those are in the Map join, not the Reduce job. Probably safe to consider this just a golden file update. +1 Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch, HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652754#comment-14652754 ] Chao Sun commented on HIVE-11430: - I'm not sure either. Seems like an extra SELECT op is generated after merging from master. Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch, HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner resolved HIVE-11415. --- Resolution: Won't Fix Will look into balancing the tree instead. Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw a StackOverflowError during vectorization {code} Exception in thread "main" java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... .. {code} Repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
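The "balancing tree" idea from the resolution can be sketched: a parser produces a left-deep chain for a wide OR/AND, so recursive traversal depth grows linearly with the number of predicates, while a balanced binary tree over the same leaves keeps the depth logarithmic. The `Node` type below is hypothetical, not Hive's ExprNodeDesc.

```java
import java.util.List;

// Hypothetical expression node: "OR" internal nodes, named leaves.
class Node {
    final String op;          // "OR" for internal nodes, null for leaves
    final Node left, right;
    final String leaf;

    Node(String leaf) { this.op = null; this.left = null; this.right = null; this.leaf = leaf; }
    Node(Node l, Node r) { this.op = "OR"; this.left = l; this.right = r; this.leaf = null; }

    // Tree depth; recursive traversals (like vectorization) recurse this deep.
    static int depth(Node n) {
        if (n.op == null) return 0;
        return 1 + Math.max(depth(n.left), depth(n.right));
    }

    // Left-deep chain, as produced for "p0 or p1 or p2 or ...": depth n-1.
    static Node leftDeep(List<Node> leaves) {
        Node acc = leaves.get(0);
        for (int i = 1; i < leaves.size(); i++) acc = new Node(acc, leaves.get(i));
        return acc;
    }

    // Balanced OR tree over leaves[lo, hi): depth ceil(log2(n)).
    static Node balanced(List<Node> leaves, int lo, int hi) {
        if (hi - lo == 1) return leaves.get(lo);
        int mid = (lo + hi) / 2;
        return new Node(balanced(leaves, lo, mid), balanced(leaves, mid, hi));
    }
}
```

For the few-thousand-predicate repro above, the left-deep form forces a few thousand stack frames, while the balanced form needs only a dozen or so.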
[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652649#comment-14652649 ] Xuefu Zhang commented on HIVE-11430: Good question. I don't know. The test case was added in the Spark branch, so the test output was generated there. After merging from master to the Spark branch, the test case output should have been updated but wasn't, because the test didn't run due to a testing environment issue. Nevertheless, this doesn't answer the question. The difference seems to be the order of the SELECT and FILTER operators. The output for Spark was updated via HIVE-11296, which changed the order, while the test output for MR was missed because of the above-mentioned test environment issue. [~csun], do you have any idea of what made the diff when we merged master to the Spark branch in HIVE-11296? Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch, HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables
[ https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-11449: -- Attachment: HIVE-11449.1.patch HybridHashTableContainer should throw exception if not enough memory to create the hash tables -- Key: HIVE-11449 URL: https://issues.apache.org/jira/browse/HIVE-11449 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-11449.1.patch Currently it only logs a warning message: {code} public static int calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize, HybridHashTableConf nwayConf) throws IOException { int numPartitions = minNumParts; if (memoryThreshold < minNumParts * minWbSize) { LOG.warn("Available memory is not enough to create a HybridHashTableContainer!"); } {code} Because we only log a warning, processing continues and hits a hard-to-diagnose error (log below also includes extra logging I added to help track this down). We should probably just fail the query with a useful error message instead. {noformat} 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: Available memory is not enough to create HybridHashTableContainers consistently! 
2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 1: 10 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 2: 131072 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** maxCapacity: 0 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 3: 0 2015-07-30 18:49:29,699 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async initialization failed at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: Capacity must be a power of two at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:409) ... 20 more Caused by: java.lang.AssertionError: Capacity must be a power of two at
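The proposed fix can be sketched as turning the warning into a thrown exception, so the query fails at the point where memory is known to be insufficient rather than later with the obscure "Capacity must be a power of two" assertion above. The following is a hypothetical simplification, not the actual HIVE-11449.1.patch; the parameter names mirror the snippet in the description, and the partition-growth policy at the end is purely illustrative.

```java
import java.io.IOException;

class HashTableSizing {
    // Simplified from the calcNumPartitions snippet above: fail fast instead
    // of warning and continuing into a hard-to-diagnose capacity assertion.
    static int calcNumPartitions(long memoryThreshold, long dataSize,
                                 int minNumParts, int minWbSize) throws IOException {
        long minimumNeeded = (long) minNumParts * minWbSize;
        if (memoryThreshold < minimumNeeded) {
            throw new IOException("Available memory (" + memoryThreshold
                + " bytes) is not enough to create a HybridHashTableContainer;"
                + " need at least " + minimumNeeded + " bytes");
        }
        // Illustrative policy only: double the partition count until each
        // partition's share of the data fits in one write buffer, while
        // staying under the memory threshold.
        int numPartitions = minNumParts;
        while (dataSize / numPartitions > minWbSize
            && (long) numPartitions * 2 * minWbSize <= memoryThreshold) {
            numPartitions *= 2;
        }
        return numPartitions;
    }
}
```

With this shape the caller sees a clear IOException with the actual and required byte counts, instead of the AssertionError deep inside BytesBytesMultiHashMap initialization.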
[jira] [Commented] (HIVE-11433) NPE for a multiple inner join query
[ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653037#comment-14653037 ] Hive QA commented on HIVE-11433: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748499/HIVE-11433.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4809/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4809/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4809/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748499 - PreCommit-HIVE-TRUNK-Build NPE for a multiple inner join query --- Key: HIVE-11433 URL: https://issues.apache.org/jira/browse/HIVE-11433 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.1.0, 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11433.patch, HIVE-11433.patch NullPointerException is thrown for a query that has multiple (greater than 3) inner joins. 
Stacktrace for 1.1.0 {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271) 
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) at
[jira] [Resolved] (HIVE-11246) Problem encountered after upgrading Hive
[ https://issues.apache.org/jira/browse/HIVE-11246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng resolved HIVE-11246. -- Resolution: Fixed Please reopen the issue if it happens again. Problem encountered after upgrading Hive -- Key: HIVE-11246 URL: https://issues.apache.org/jira/browse/HIVE-11246 Project: Hive Issue Type: Bug Reporter: hongyan Assignee: Wei Zheng Priority: Critical We are currently running Hive 0.12.0 on Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail with the error below. The official site says Hadoop 1.x.y is supported, but online sources suggest a Hive/Hadoop version incompatibility. Please advise. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
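The NoSuchMethodError above means the Hadoop jars on the classpath predate `JobConf.unset(String)`, which Hive 1.2.1 calls but Hadoop 1.1.2 does not provide. A small reflection probe can confirm this kind of mismatch before a runtime failure; the class below is a hypothetical helper, not part of Hive, and with the real Hadoop jars on the classpath you would probe `org.apache.hadoop.mapred.JobConf` for `unset`.

```java
// Hypothetical pre-flight probe: check that a class on the classpath exposes
// a method before relying on it, instead of failing later at runtime with
// NoSuchMethodError.
class MethodProbe {
    static boolean hasMethod(String className, String method, Class<?>... params) {
        try {
            Class.forName(className).getMethod(method, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }
}
```

For example, `MethodProbe.hasMethod("org.apache.hadoop.mapred.JobConf", "unset", String.class)` would return false under Hadoop 1.1.2, flagging the incompatibility before any query is run.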
[jira] [Updated] (HIVE-11450) HiveConnection doesn't cleanup properly
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Attachment: HIVE-11450.patch HiveConnection doesn't cleanup properly --- Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.patch the {{getSchema()}} method doesn't clean up the resources properly on exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text
[ https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653079#comment-14653079 ] Matt McCline commented on HIVE-11371: - The interesting thing in the call stack is this is occurring during closeOp (HybridGrace). See the *reProcessBigTable* in the call stack. Oh boy. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:508) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.closeOp(VectorMapJoinGenerateResultOperator.java:635) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:630) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:324) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:572) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.continueProcess(MapJoinOperator.java:567) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:506) ... 18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:444) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.reProcessBigTable(VectorMapJoinGenerateResultOperator.java:565) ... 
20 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 21 more {code} Null pointer exception for nested table query when using ORC versus text Key: HIVE-11371 URL: https://issues.apache.org/jira/browse/HIVE-11371 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: N Campbell Assignee: Matt McCline Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4 The following query will fail if the file format is ORC select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1 from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj left outer join tjoin3 on tj2c1 = tjoin3.c1 Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 
22 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 SQLState: 08S01 ErrorCode: 2 getDatabaseProductNameApache Hive getDatabaseProductVersion 1.2.1.2.3.0.0-2557 getDriverName Hive JDBC getDriverVersion 1.2.1.2.3.0.0-2557 getDriverMajorVersion 1 getDriverMinorVersion 2 create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc;
[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text
[ https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653066#comment-14653066 ] Matt McCline commented on HIVE-11371: - (Didn't refresh before I added my comment and didn't see Gopal's comment) Null pointer exception for nested table query when using ORC versus text Key: HIVE-11371 URL: https://issues.apache.org/jira/browse/HIVE-11371 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: N Campbell Assignee: Matt McCline Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4 The following query will fail if the file format is ORC select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1 from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj left outer join tjoin3 on tj2c1 = tjoin3.c1 Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 22 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. 
failedVertices:1 killedVertices:0 SQLState: 08S01 ErrorCode: 2 getDatabaseProductNameApache Hive getDatabaseProductVersion 1.2.1.2.3.0.0-2557 getDriverName Hive JDBC getDriverVersion 1.2.1.2.3.0.0-2557 getDriverMajorVersion 1 getDriverMinorVersion 2 create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc; create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN3 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN4 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables
[ https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653046#comment-14653046 ] Wei Zheng commented on HIVE-11449: -- Previously we didn't want to fail the query as long as we could proceed, even if the memory was not enough. We want to force the mapjoin to move on - the worst case would be the same as regular mapjoin - OOM. But we still have a chance to finish. Now that we have a memory manager, and if we want to abide by the allocation faithfully, we may need to change the warning to a failure. [~mmokhtar] [~vikram.dixit] HybridHashTableContainer should throw exception if not enough memory to create the hash tables -- Key: HIVE-11449 URL: https://issues.apache.org/jira/browse/HIVE-11449 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-11449.1.patch Currently it only logs a warning message: {code} public static int calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize, HybridHashTableConf nwayConf) throws IOException { int numPartitions = minNumParts; if (memoryThreshold < minNumParts * minWbSize) { LOG.warn("Available memory is not enough to create a HybridHashTableContainer!"); } {code} Because we only log a warning, processing continues and hits a hard-to-diagnose error (log below also includes extra logging I added to help track this down). We should probably just fail the query with a useful error message instead. {noformat} 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: Available memory is not enough to create HybridHashTableContainers consistently! 
2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 1: 10 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 2: 131072 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** maxCapacity: 0 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 3: 0 2015-07-30 18:49:29,699 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async initialization failed at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError:
[jira] [Commented] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND
[ https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653052#comment-14653052 ] Gopal V commented on HIVE-11448: Added to my build for the night, thanks [~mmccline]. Support vectorization of Multi-OR and Multi-AND --- Key: HIVE-11448 URL: https://issues.apache.org/jira/browse/HIVE-11448 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11448.01.patch, HIVE-11448.02.patch Support more than 2 children for OR and AND when all children are expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text
[ https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653075#comment-14653075 ] Matt McCline commented on HIVE-11371: - Never mind. The repro in the JIRA does work. Null pointer exception for nested table query when using ORC versus text Key: HIVE-11371 URL: https://issues.apache.org/jira/browse/HIVE-11371 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: N Campbell Assignee: Matt McCline Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4 The following query will fail if the file format is ORC select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1 from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj left outer join tjoin3 on tj2c1 = tjoin3.c1 Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 22 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. 
failedVertices:1 killedVertices:0 SQLState: 08S01 ErrorCode: 2 getDatabaseProductNameApache Hive getDatabaseProductVersion 1.2.1.2.3.0.0-2557 getDriverName Hive JDBC getDriverVersion 1.2.1.2.3.0.0-2557 getDriverMajorVersion 1 getDriverMinorVersion 2 create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc; create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN3 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN4 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Summary: Resources are not cleaned up properly at multiple places (was: HiveConnection doesn't cleanup resources properly) Resources are not cleaned up properly at multiple places Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.patch the {{getSchema()}} method doesn't cleanup the resources properly on exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653054#comment-14653054 ] Gopal V commented on HIVE-11398: Added to my nightly performance tests. Parse wide OR and wide AND trees to flat OR/AND trees - Key: HIVE-11398 URL: https://issues.apache.org/jira/browse/HIVE-11398 Project: Hive Issue Type: New Feature Components: Logical Optimizer, UDF Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11398.patch Deep trees of AND/OR are hard to traverse, particularly when they are merely the same structure in nested form as a version of the operator that takes an arbitrary number of args. One potential way to convert the DFS searches into a simpler BFS search is to introduce a new Operator pair named ALL and ANY. ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C), B), A) The SemanticAnalyser would be responsible for generating these operators and this would mean that the depth and complexity of traversals for the simplest case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
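The flattening the description proposes can be sketched in a few lines. This is an illustrative model, not Hive's SemanticAnalyzer: Node/Op are invented stand-ins for Hive's expression nodes, and the recursion collects every leaf reachable through an unbroken chain of the same operator, turning the nested binary form into the flat ANY/ALL child list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed flattening: collapse a nested
// binary tree such as OR(OR(OR(OR(E, D), C), B), A) into one flat
// ANY child list. Node and Op are illustrative, not Hive's classes.
class FlattenSketch {
    enum Op { AND, OR, LEAF }

    static class Node {
        final Op op;
        final String name;       // set only for LEAF
        final Node left, right;  // set only for AND/OR
        Node(String name) { this.op = Op.LEAF; this.name = name; this.left = this.right = null; }
        Node(Op op, Node left, Node right) { this.op = op; this.name = null; this.left = left; this.right = right; }
    }

    // Descend through same-operator children only; anything else is a
    // child of the flat ALL/ANY node.
    static void flatten(Node n, Op op, List<String> out) {
        if (n.op == op) {
            flatten(n.left, op, out);
            flatten(n.right, op, out);
        } else {
            out.add(n.name);
        }
    }
}
```

Applied to the description's example tree OR(OR(OR(OR(E, D), C), B), A), the collected child list is the flat ANY(E, D, C, B, A), and traversal depth for a wide OR becomes constant instead of proportional to the number of disjuncts.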
[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Attachment: HIVE-11450.2.patch Updating the patch to fix other resource cleanup issues. Resources are not cleaned up properly at multiple places Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.2.patch, HIVE-11450.patch I noticed that various resources aren't properly cleaned in various classes. To be specific, * Some streams aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}} * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}} * {{Statement}} and {{ResultSet}} aren't properly cleaned up in {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652979#comment-14652979 ] Hive QA commented on HIVE-11437: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748494/HIVE-11437.02.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4808/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4808/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4808/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748494 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables
[ https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652989#comment-14652989 ] Sergey Shelukhin commented on HIVE-11449: - What is passed as memUsage to the hashtable? Perhaps that logic should also be fixed. Also, [~wzheng] might comment on why this happens in the first place... perhaps all the hashtables should be created smaller? HybridHashTableContainer should throw exception if not enough memory to create the hash tables -- Key: HIVE-11449 URL: https://issues.apache.org/jira/browse/HIVE-11449 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-11449.1.patch Currently it only logs a warning message: {code} public static int calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize, HybridHashTableConf nwayConf) throws IOException { int numPartitions = minNumParts; if (memoryThreshold < minNumParts * minWbSize) { LOG.warn("Available memory is not enough to create a HybridHashTableContainer!"); } {code} Because we only log a warning, processing continues and hits a hard-to-diagnose error (log below also includes extra logging I added to help track this down). We should probably just fail the query with a useful logging message instead. {noformat} 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: Available memory is not enough to create HybridHashTableContainers consistently! 
2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 1: 10 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 2: 131072 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** maxCapacity: 0 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 3: 0 2015-07-30 18:49:29,699 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async initialization failed at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: Capacity must be a power of two at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at
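The fix the report argues for can be sketched as follows. This is an illustrative version under stated assumptions, not the actual HIVE-11449 patch: the parameter names mirror the quoted snippet, but the method body is simplified to just the guard, and throwing at the point the invariant breaks is what makes the query fail with a clear message instead of the later "Capacity must be a power of two" assertion.

```java
import java.io.IOException;

// Hedged sketch of failing fast when the memory check trips, rather than
// logging a warning and continuing into a hard-to-diagnose error later.
class PartitionCalcSketch {
    static int calcNumPartitions(long memoryThreshold, int minNumParts, int minWbSize)
            throws IOException {
        if (memoryThreshold < (long) minNumParts * minWbSize) {
            // Surface the real cause at the point of failure.
            throw new IOException("Available memory (" + memoryThreshold
                + " bytes) is not enough to create a HybridHashTableContainer; need at least "
                + ((long) minNumParts * minWbSize));
        }
        return minNumParts;  // real partition sizing logic elided
    }
}
```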
[jira] [Comment Edited] (HIVE-11295) LLAP: clean up ORC dependencies on object pools
[ https://issues.apache.org/jira/browse/HIVE-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653006#comment-14653006 ] Sergey Shelukhin edited comment on HIVE-11295 at 8/4/15 3:11 AM: - Actual patch; it also needed some rebasing. I also removed 2 pools that are probably not very useful, and fixed an issue with the first patch (the one that moves ORC/LLAP stuff around) [~prasanth_j] can you review? was (Author: sershe): Actual patch; it also needed some rebasing. [~prasanth_j] can you review? LLAP: clean up ORC dependencies on object pools --- Key: HIVE-11295 URL: https://issues.apache.org/jira/browse/HIVE-11295 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11295.patch Before there's storage API module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong
[ https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653041#comment-14653041 ] Hive QA commented on HIVE-11441: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748503/HIVE-11441.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4810/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4810/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4810/ Messages: {noformat} This message was trimmed, see log for full details main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- 
build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added. [INFO] [INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec --- [INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/ql/parse/HiveLexer.g org/apache/hadoop/hive/ql/parse/HiveParser.g warning(200): IdentifiersParser.g:455:5: Decision can match input such as {KW_REGEXP,
[jira] [Updated] (HIVE-11450) HiveConnection doesn't cleanup resources properly
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Summary: HiveConnection doesn't cleanup resources properly (was: HiveConnection doesn't cleanup properly) HiveConnection doesn't cleanup resources properly - Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.patch the {{getSchema()}} method doesn't cleanup the resources properly on exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Description: I noticed that various resources aren't properly cleaned in various classes. To be specific, * Some streams aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}} * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}} * {{Statement}} and {{ResultSet}} aren't properly cleaned up in {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}} was:I noticed that various resources aren't properly cleaned in various classes. Resources are not cleaned up properly at multiple places Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.2.patch, HIVE-11450.patch I noticed that various resources aren't properly cleaned in various classes. To be specific, * Some streams aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}} * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}} * {{Statement}} and {{ResultSet}} aren't properly cleaned up in {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
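The cleanup pattern the HIVE-11450 patches move toward is try-with-resources, which closes a Statement and ResultSet even when an exception escapes the body; that is the guarantee the quoted getSchema() was missing. The sketch below models the two JDBC resources with a tiny stand-in class so it runs without a live connection; the names "stmt" and "rs" are illustrative only.

```java
// Hedged sketch of the try-with-resources cleanup guarantee: resources
// are closed in reverse declaration order even when the body throws.
class CleanupSketch {
    static final StringBuilder closed = new StringBuilder();

    // Stand-in for java.sql.Statement / ResultSet.
    static class Res implements AutoCloseable {
        final String name;
        Res(String name) { this.name = name; }
        @Override public void close() { closed.append(name).append(';'); }
    }

    static void runQuery(boolean fail) {
        try (Res stmt = new Res("stmt"); Res rs = new Res("rs")) {
            if (fail) throw new RuntimeException("query failed");
        } catch (RuntimeException e) {
            // rs and stmt are already closed by the time we get here
        }
    }
}
```

With real JDBC the same shape reads `try (Statement stmt = conn.createStatement(); ResultSet rs = stmt.executeQuery(sql)) { ... }`, replacing manual finally-block cleanup that is easy to get wrong on the exception path.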
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651723#comment-14651723 ] Hive QA commented on HIVE-10975: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748415/HIVE-10975.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4799/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4799/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4799/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver, remoteFile=/home/hiveptest/54.204.186.94-hiveptest-0/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.204.186.94, getInstance()=0]: 'Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list TEST-TestHBaseMinimrCliDriver-TEST-org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.xml 0 0%0.00kB/s0:00:00 4846 100%4.62MB/s0:00:00 (xfer#1, to-check=3/5) hive.log 0 0%0.00kB/s0:00:00 45613056 22% 43.50MB/s0:00:03 94601216 46% 45.09MB/s0:00:02 rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (213 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6] Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! receiving incremental file list ./ hive.log 0 0%0.00kB/s0:00:00 rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (213 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6] Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list ./ hive.log 0 0%0.00kB/s0:00:00 rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (213 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6] Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! receiving incremental file list ./ hive.log 0 0%0.00kB/s0:00:00 rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (213 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6] Address 54.204.186.94 maps to ec2-54-204-186-94.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! receiving incremental file list ./ hive.log 0 0%0.00kB/s0:00:00 rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4799/succeeded/TestHBaseMinimrCliDriver/hive.log: No space left on device (28) rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6] rsync: connection unexpectedly closed (213 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6] ' {noformat} This message is automatically generated. ATTACHMENT ID: 12748415 - PreCommit-HIVE-TRUNK-Build Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL:
[jira] [Commented] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text
[ https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651482#comment-14651482 ] Gopal V commented on HIVE-11371: Comment looks identical to an issue I'm currently hitting. From the repro, try removing the tjoin3.rnum, because projecting out of a no-match might be the issue. Null pointer exception for nested table query when using ORC versus text Key: HIVE-11371 URL: https://issues.apache.org/jira/browse/HIVE-11371 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: N Campbell Assignee: Matt McCline Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4 The following query will fail if the file format is ORC: select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1 from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj left outer join tjoin3 on tj2c1 = tjoin3.c1 Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 22 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. 
failedVertices:1 killedVertices:0 SQLState: 08S01 ErrorCode: 2 getDatabaseProductName Apache Hive getDatabaseProductVersion 1.2.1.2.3.0.0-2557 getDriverName Hive JDBC getDriverVersion 1.2.1.2.3.0.0-2557 getDriverMajorVersion 1 getDriverMinorVersion 2 create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc; create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN3 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN4 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11438) Join a ACID table with non-ACID table fail with MR on 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651515#comment-14651515 ] Hive QA commented on HIVE-11438: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748398/HIVE-11438.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4797/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4797/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4797/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4797/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 8b2cd2a HIVE-11380: NPE when FileSinkOperator is not initialized (Yongzhi Chen, reviewed by Sergio Pena) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 8b2cd2a HIVE-11380: NPE when FileSinkOperator is not initialized (Yongzhi Chen, reviewed by Sergio Pena) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748398 - PreCommit-HIVE-TRUNK-Build Join a ACID table with non-ACID table fail with MR on 1.0.0 --- Key: HIVE-11438 URL: https://issues.apache.org/jira/browse/HIVE-11438 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 1.0.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.0.1 Attachments: HIVE-11438.1.patch The following script fails in MR mode: Preparation: {code} CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) CLUSTERED BY (k1) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES('transactional'='true'); INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I'); CREATE TABLE orc_table (k1 INT, f1 STRING) CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE orc_table VALUES (1, 'x'); {code} Then run the following script: {code} SET hive.execution.engine=mr; SET hive.auto.convert.join=false; SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; SELECT t1.*, t2.* FROM orc_table t1 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1; {code} Stack: {code} java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:272) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:509) at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616) at
[jira] [Assigned] (HIVE-11371) Null pointer exception for nested table query when using ORC versus text
[ https://issues.apache.org/jira/browse/HIVE-11371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline reassigned HIVE-11371: --- Assignee: Matt McCline Null pointer exception for nested table query when using ORC versus text Key: HIVE-11371 URL: https://issues.apache.org/jira/browse/HIVE-11371 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: N Campbell Assignee: Matt McCline Attachments: TJOIN1, TJOIN2, TJOIN3, TJOIN4 Following query will fail if the file format is ORC select tj1rnum, tj2rnum, tjoin3.rnum as rnumt3 from (select tjoin1.rnum tj1rnum, tjoin2.rnum tj2rnum, tjoin2.c1 tj2c1 from tjoin1 left outer join tjoin2 on tjoin1.c1 = tjoin2.c1 ) tj left outer join tjoin3 on tj2c1 = tjoin3.c1 aused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow$LongCopyRow.copy(VectorCopyRow.java:60) at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.copyByReference(VectorCopyRow.java:260) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultMultiValue(VectorMapJoinGenerateResultOperator.java:238) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterGenerateResultOperator.finishOuter(VectorMapJoinOuterGenerateResultOperator.java:495) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinOuterLongOperator.process(VectorMapJoinOuterLongOperator.java:430) ... 22 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1437788144883_0004_2_02 [Map 1] killed/failed due to:null]DAG did not succeed due to VERTEX_FAILURE. 
failedVertices:1 killedVertices:0 SQLState: 08S01 ErrorCode: 2 getDatabaseProductName: Apache Hive getDatabaseProductVersion: 1.2.1.2.3.0.0-2557 getDriverName: Hive JDBC getDriverVersion: 1.2.1.2.3.0.0-2557 getDriverMajorVersion: 1 getDriverMinorVersion: 2 create table if not exists TJOIN1 (RNUM int , C1 int, C2 int) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc; create table if not exists TJOIN2 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN3 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; create table if not exists TJOIN4 (RNUM int , C1 int, C2 char(2)) -- ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS orc ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files
[ https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651480#comment-14651480 ] Amareshwari Sriramadasu commented on HIVE-11376: +1 for the patch, pending test results. CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files - Key: HIVE-11376 URL: https://issues.apache.org/jira/browse/HIVE-11376 Project: Hive Issue Type: Bug Reporter: Rajat Khandelwal Assignee: Rajat Khandelwal Attachments: HIVE-11376_02.patch https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379 This is the exact code snippet: {noformat} // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not, // we use a configuration variable for the same if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) { // The following code should be removed, once // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed. // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat, // so don't use CombineFileInputFormat for non-splittable files // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
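The guard in that snippet can be exercised in isolation. The sketch below is hypothetical (the class and method names are illustrative, not Hive's); it only mirrors the null-and-flag condition on mrwork and getHadoopSupportsSplittable that decides whether the non-splittable workaround applies:

```java
public class CombineFallbackSketch {
    // Mirrors the quoted guard: the workaround only applies when the plan
    // exists and the config says Hadoop lacks the MAPREDUCE-1597 fix.
    static boolean needsWorkaround(Object mrwork, boolean hadoopSupportsSplittable) {
        return mrwork != null && !hadoopSupportsSplittable;
    }

    public static void main(String[] args) {
        System.out.println(needsWorkaround(new Object(), false)); // workaround applies
        System.out.println(needsWorkaround(new Object(), true));  // splittable handling is trusted
        System.out.println(needsWorkaround(null, false));         // no plan, nothing to do
    }
}
```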
[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651593#comment-14651593 ] Hive QA commented on HIVE-11437: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748397/HIVE-11437.01.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_join0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4798/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4798/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4798/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748397 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11182) Enable optimized hash tables for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651508#comment-14651508 ] Rui Li commented on HIVE-11182: --- Hi [~leftylev], not for this one. I'll take care of documentation in HIVE-11180. Enable optimized hash tables for spark [Spark Branch] - Key: HIVE-11182 URL: https://issues.apache.org/jira/browse/HIVE-11182 Project: Hive Issue Type: Improvement Components: Spark Reporter: Rui Li Assignee: Rui Li Fix For: spark-branch, 1.3.0, 2.0.0 Attachments: HIVE-11182.1-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks
[ https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-11413: --- Attachment: HIVE-11413.2.patch Error in detecting availability of HiveSemanticAnalyzerHooks Key: HIVE-11413 URL: https://issues.apache.org/jira/browse/HIVE-11413 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 2.0.0 Reporter: Raajay Viswanathan Assignee: Raajay Viswanathan Priority: Trivial Labels: newbie Fix For: 2.0.0 Attachments: HIVE-11413.2.patch, HIVE-11413.2.patch, HIVE-11413.patch In the {{compile(String, Boolean)}} function in {{Driver.java}}, the list of available {{HiveSemanticAnalyzerHook}}s (_saHooks_) is obtained using the {{getHooks}} method. This method always returns a {{List}} of hooks. However, while checking for availability of hooks, the current version of the code compares _saHooks_ with NULL. This is incorrect, as the segment of code designed to call the pre- and post-analyze functions gets executed even when the list is empty. The comparison should be changed to {{saHooks.size() > 0}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
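A minimal sketch of the check described above (the class and method names are illustrative, not Hive's actual Driver code; only the {{saHooks.size() > 0}} condition comes from the issue):

```java
import java.util.Collections;
import java.util.List;

public class HookCheckSketch {
    // getHooks always returns a List (possibly empty), never null,
    // so a null comparison can never skip the hook-invocation block.
    static List<String> getHooks() {
        return Collections.emptyList();
    }

    // Buggy check: true even for an empty list.
    static boolean buggyHasHooks(List<String> saHooks) {
        return saHooks != null;
    }

    // Corrected check: true only when at least one hook is registered.
    static boolean fixedHasHooks(List<String> saHooks) {
        return saHooks.size() > 0;
    }

    public static void main(String[] args) {
        List<String> empty = getHooks();
        System.out.println(buggyHasHooks(empty)); // pre/postAnalyze would run needlessly
        System.out.println(fixedHasHooks(empty)); // hook block correctly skipped
    }
}
```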
[jira] [Commented] (HIVE-11387) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization
[ https://issues.apache.org/jira/browse/HIVE-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652101#comment-14652101 ] Pengcheng Xiong commented on HIVE-11387: [~jcamachorodriguez], sure, please do so. Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization -- Key: HIVE-11387 URL: https://issues.apache.org/jira/browse/HIVE-11387 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11387.01.patch, HIVE-11387.02.patch, HIVE-11387.03.patch, HIVE-11387.04.patch {noformat} The main problem is that, due to the return path, we may now have (RS1-GBY2)-(RS3-GBY4) when map.aggr=false, i.e., no map aggregation. However, in the non-return path, it will be treated as (RS1)-(GBY2-RS3-GBY4). The problem is that the optimization does not take the setting into account. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char
[ https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11436: --- Attachment: HIVE-11436.02.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char -- Key: HIVE-11436 URL: https://issues.apache.org/jira/browse/HIVE-11436 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch BaseCharUtils checks whether the length of a char is in the range [1,255]. This causes the return path to throw an error when the length of a char is 0. Proposing to change the range to [0,255]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
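The proposed change amounts to widening the lower bound of the length validation. Below is a sketch under the assumption that the check is a simple range test (the names are illustrative, not the actual BaseCharUtils code):

```java
public class CharLengthCheckSketch {
    static final int MAX_CHAR_LENGTH = 255;

    // Proposed check: accept zero-length char types so the return path
    // does not throw on an empty char.
    static boolean isValidCharLength(int length) {
        return length >= 0 && length <= MAX_CHAR_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(isValidCharLength(0));   // accepted under the proposed [0,255] range
        System.out.println(isValidCharLength(256)); // still rejected
    }
}
```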
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652125#comment-14652125 ] Matt McCline commented on HIVE-11410: - No problem -- thank you for your response. Join with subquery containing a group by incorrectly returns no results --- Key: HIVE-11410 URL: https://issues.apache.org/jira/browse/HIVE-11410 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Nicholas Brenwald Assignee: Matt McCline Priority: Minor Attachments: hive-site.xml Start by creating a table *t* with columns *c1* and *c2* and populate with 1 row of data. For example create table *t* from an existing table which contains at least 1 row of data by running: {code} create table t as select 'abc' as c1, 0 as c2 from Y limit 1; {code} Table *t* looks like the following: ||c1||c2|| |abc|0| Running the following query then returns zero results. {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2 {code} However, we expected to see the following: ||c1|| |abc| The problem seems to relate to the fact that in the subquery, we group by column *c1*, but this is not subsequently used in the join condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11443) remove HiveServer1 C++ client library
[ https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-11443: - Labels: newbie newdev (was: ) remove HiveServer1 C++ client library - Key: HIVE-11443 URL: https://issues.apache.org/jira/browse/HIVE-11443 Project: Hive Issue Type: Bug Components: ODBC Reporter: Thejas M Nair Labels: newbie, newdev HiveServer1 has been removed as part of HIVE-6977. There is still C++ hive client code used by the old ODBC driver that works against HiveServer1. We should remove that unusable code from the code base. This is the whole odbc dir. There are also maven pom.xml entries at the top level that would be candidates for removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Brnach]
[ https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652213#comment-14652213 ] Xuefu Zhang commented on HIVE-11250: In addition to the problem, I think changing the value of hive.execution.engine from spark to others (say, mr) should result in destroying the spark session. While this is probably not common in a production environment, I ran into a problem in testing where I switched to MR and found that the MR job didn't make any progress because all containers were still held by the spark session. Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Brnach] Key: HIVE-11250 URL: https://issues.apache.org/jira/browse/HIVE-11250 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Jimmy Xiang Hive CLI works as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652216#comment-14652216 ] Hive QA commented on HIVE-10975: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748453/HIVE-10975.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4803/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4803/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4803/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Error writing to /data/hive-ptest/working/scratch/hiveptest-TestPutResultWritable.sh {noformat} This message is automatically generated. ATTACHMENT ID: 12748453 - PreCommit-HIVE-TRUNK-Build Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.1.patch, HIVE-10975.1.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library
[ https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652315#comment-14652315 ] Thejas M Nair commented on HIVE-11443: -- [~vgumashta] Good point, the HS1 thrift IDL should also be removed. We also need to update https://cwiki.apache.org/confluence/display/Hive/HiveODBC when this change is done. remove HiveServer1 C++ client library - Key: HIVE-11443 URL: https://issues.apache.org/jira/browse/HIVE-11443 Project: Hive Issue Type: Bug Components: ODBC Reporter: Thejas M Nair Labels: newbie, newdev HiveServer1 has been removed as part of HIVE-6977. There is still C++ hive client code used by the old ODBC driver that works against HiveServer1. We should remove that unusable code from the code base. This is the whole odbc dir. There are also maven pom.xml entries at the top level that would be candidates for removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library
[ https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652218#comment-14652218 ] Vaibhav Gumashta commented on HIVE-11443: - The thrift IDL as well (hive_service.thrift). remove HiveServer1 C++ client library - Key: HIVE-11443 URL: https://issues.apache.org/jira/browse/HIVE-11443 Project: Hive Issue Type: Bug Components: ODBC Reporter: Thejas M Nair Labels: newbie, newdev HiveServer1 has been removed as part of HIVE-6977. There is still C++ hive client code used by the old ODBC driver that works against HiveServer1. We should remove that unusable code from the code base. This is the whole odbc dir. There are also maven pom.xml entries at the top level that would be candidates for removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11442) Remove commons-configuration.jar from Hive distribution
[ https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652322#comment-14652322 ] Thejas M Nair commented on HIVE-11442: -- +1 Remove commons-configuration.jar from Hive distribution --- Key: HIVE-11442 URL: https://issues.apache.org/jira/browse/HIVE-11442 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11442.1.patch Some customers reported version conflicts with the Hive-bundled commons-configuration.jar. Actually, commons-configuration.jar is not needed by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be able to pick up those jars from Hadoop at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11443) remove HiveServer1 C++ client library
[ https://issues.apache.org/jira/browse/HIVE-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652324#comment-14652324 ] Lefty Leverenz commented on HIVE-11443: --- ... remembering to keep the old information for people using previous releases. remove HiveServer1 C++ client library - Key: HIVE-11443 URL: https://issues.apache.org/jira/browse/HIVE-11443 Project: Hive Issue Type: Bug Components: ODBC Reporter: Thejas M Nair Labels: newbie, newdev HiveServer1 has been removed as part of HIVE-6977. There is still C++ hive client code used by the old ODBC driver that works against HiveServer1. We should remove that unusable code from the code base. This is the whole odbc dir. There are also maven pom.xml entries at the top level that would be candidates for removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11434) Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors
[ https://issues.apache.org/jira/browse/HIVE-11434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11434: --- Attachment: HIVE-11434.1.patch Followup for HIVE-10166: reuse existing configurations for prewarming Spark executors - Key: HIVE-11434 URL: https://issues.apache.org/jira/browse/HIVE-11434 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11434.1.patch, HIVE-11434.patch It appears that a patch other than the latest one from HIVE-11363 was committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11437: --- Attachment: HIVE-11437.02.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11433) NPE for a multiple inner join query
[ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-11433: -- Assignee: Xuefu Zhang NPE for a multiple inner join query --- Key: HIVE-11433 URL: https://issues.apache.org/jira/browse/HIVE-11433 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.1.0, 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11433.patch NullPointerException is thrown for a query that has multiple (greater than 3) inner joins. Stacktrace for 1.1.0: {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110) at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code}. However, the problem can also be reproduced in latest master branch. 
Further investigation shows that the following code (in ParseUtils.java) is problematic: {code} static int getIndex(String[] list, String elem) { for (int i = 0; i < list.length; i++) { if (list[i].toLowerCase().equals(elem)) { return i; } } return -1; } {code} The code assumes that every element in the list is not null, which isn't true because of the following code in SemanticAnalyzer.java (method genJoinTree()): {code} if ((right.getToken().getType() == HiveParser.TOK_TABREF) || (right.getToken().getType() == HiveParser.TOK_SUBQUERY) || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) { String tableName = getUnescapedUnqualifiedTableName((ASTNode) right.getChild(0)) .toLowerCase(); String alias = extractJoinAlias(right, tableName); String[] rightAliases = new String[1]; rightAliases[0] = alias; {code}
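A null-safe variant of getIndex is one possible fix for the assumption described above; this is a sketch, not necessarily the committed patch:

```java
public class ParseUtilsSketch {
    // Null-safe version of getIndex: skips null aliases instead of
    // dereferencing them, returning -1 when no match is found.
    static int getIndex(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] != null && list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A null element, as genJoinTree can produce, no longer causes an NPE.
        String[] aliases = {null, "t1", "t2"};
        System.out.println(getIndex(aliases, "t2")); // 2
        System.out.println(getIndex(aliases, "t3")); // -1
    }
}
```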
[jira] [Commented] (HIVE-11312) ORC format: where clause with CHAR data type not returning any rows
[ https://issues.apache.org/jira/browse/HIVE-11312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652203#comment-14652203 ] Thomas Friedrich commented on HIVE-11312: - Thanks for looking at this, [~prasanth_j]. Feel free to assign the JIRA to yourself. ORC format: where clause with CHAR data type not returning any rows --- Key: HIVE-11312 URL: https://issues.apache.org/jira/browse/HIVE-11312 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.2.1 Reporter: Thomas Friedrich Assignee: Thomas Friedrich Labels: orc Attachments: HIVE-11312.1.patch, HIVE-11312.2.patch Test case: Setup: create table orc_test( col1 string, col2 char(10)) stored as orc tblproperties ('orc.compress'='NONE'); insert into orc_test values ('val1', '1'); Query: select * from orc_test where col2='1'; The query returns no row. The problem was introduced by HIVE-10286, class RecordReaderImpl.java, method evaluatePredicateRange. Old code: - Object baseObj = predicate.getLiteral(PredicateLeaf.FileFormat.ORC); - Object minValue = getConvertedStatsObj(min, baseObj); - Object maxValue = getConvertedStatsObj(max, baseObj); - Object predObj = getBaseObjectForComparison(baseObj, minValue); New code: + Object baseObj = predicate.getLiteral(); + Object minValue = getBaseObjectForComparison(predicate.getType(), min); + Object maxValue = getBaseObjectForComparison(predicate.getType(), max); + Object predObj = getBaseObjectForComparison(predicate.getType(), baseObj); The values for min and max are of type String and contain as many characters as the CHAR column indicated. For example, if the type is CHAR(10) and the row has value 1, the value of String min is 1 padded with trailing spaces to 10 characters. Before Hive 1.2, the method getConvertedStatsObj would call StringUtils.stripEnd(statsObj.toString(), null); which would remove the trailing spaces from min and max. Later, in the compareToRange method, it was able to compare 1 with 1. 
In Hive 1.2, with the use of the getBaseObjectForComparison method, the object is simply returned as a String if the data type is String, which means minValue and maxValue still carry the trailing spaces. As a result, the compareToRange method returns a wrong value: comparing 1 against the padded stats value yields -9 instead of 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
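The comparison described above can be reproduced standalone. In this sketch, String.stripTrailing (Java 11+) stands in for the commons-lang StringUtils.stripEnd call the pre-1.2 code used:

```java
public class CharPaddingSketch {
    public static void main(String[] args) {
        String predicate = "1";
        String min = "1         "; // CHAR(10) stats value: "1" padded to 10 chars

        // Hive 1.2 behavior: compare against the padded value directly.
        // The mismatch makes predicate pushdown skip the row group.
        System.out.println(predicate.compareTo(min)); // -9

        // Pre-1.2 behavior: trailing spaces stripped before comparing.
        System.out.println(predicate.compareTo(min.stripTrailing())); // 0, a match
    }
}
```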
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652129#comment-14652129 ] Hive QA commented on HIVE-11319: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748452/HIVE-11319.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4802/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4802/succeeded/TestHCatHiveCompatibility {noformat} This message is automatically generated. ATTACHMENT ID: 12748452 - PreCommit-HIVE-TRUNK-Build CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and it caused some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed, but the value in table ctas1 will be replaced by ctas2 accidentally. 
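One way to implement the proposed ban is to test the target directory for contents before the CTAS runs. The sketch below uses the local filesystem as a stand-in for the HDFS check the fix would actually need; class and method names are illustrative:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class CtasLocationCheckSketch {
    // Proposed guard: a CTAS target location is acceptable only when it
    // is missing or empty, so existing data can never be clobbered.
    static boolean isEmptyOrMissing(Path dir) {
        if (!Files.exists(dir)) {
            return true;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        } catch (IOException e) {
            return false; // unreadable: treat as an unsafe target
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("ctas");
        System.out.println(isEmptyOrMissing(tmp)); // empty target: CTAS allowed
        Files.createFile(tmp.resolve("data"));
        System.out.println(isEmptyOrMissing(tmp)); // non-empty: would be banned
    }
}
```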
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-11410. - Resolution: Cannot Reproduce Join with subquery containing a group by incorrectly returns no results --- Key: HIVE-11410 URL: https://issues.apache.org/jira/browse/HIVE-11410 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Nicholas Brenwald Assignee: Matt McCline Priority: Minor Attachments: hive-site.xml Start by creating a table *t* with columns *c1* and *c2* and populate with 1 row of data. For example create table *t* from an existing table which contains at least 1 row of data by running: {code} create table t as select 'abc' as c1, 0 as c2 from Y limit 1; {code} Table *t* looks like the following: ||c1||c2|| |abc|0| Running the following query then returns zero results. {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2 {code} However, we expected to see the following: ||c1|| |abc| The problem seems to relate to the fact that in the subquery, we group by column *c1*, but this is not subsequently used in the join condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11438) Join a ACID table with non-ACID table fail with MR on 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-11438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-11438: -- Attachment: HIVE-11438.1-branch-1.0.patch Renamed the patch for the precommit test. Join a ACID table with non-ACID table fail with MR on 1.0.0 --- Key: HIVE-11438 URL: https://issues.apache.org/jira/browse/HIVE-11438 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 1.0.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.0.1 Attachments: HIVE-11438.1-branch-1.0.patch, HIVE-11438.1.patch The following script fails in MR mode: Preparation: {code} CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) CLUSTERED BY (k1) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES('transactional'='true'); INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I'); CREATE TABLE orc_table (k1 INT, f1 STRING) CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE orc_table VALUES (1, 'x'); {code} Then run the following script: {code} SET hive.execution.engine=mr; SET hive.auto.convert.join=false; SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; SELECT t1.*, t2.* FROM orc_table t1 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1; {code} Stack: {code} java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:272) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:509) at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:624) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:616) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:492) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:585) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:580) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:580) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:571) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:429) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1606) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1367) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1179) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Job Submission failed with exception 'java.lang.NullPointerException(null)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask {code} Note the query is the same as in HIVE-11422, but in 1.0.0 for this Jira, it throws a different exception. -- This message
[jira] [Issue Comment Deleted] (HIVE-11433) NPE for a multiple inner join query
[ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11433: --- Comment: was deleted (was: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748296/HIVE-11433.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4784/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4784/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4784/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4784/succeeded/TestCompactor {noformat} This message is automatically generated. ATTACHMENT ID: 12748296 - PreCommit-HIVE-TRUNK-Build) NPE for a multiple inner join query --- Key: HIVE-11433 URL: https://issues.apache.org/jira/browse/HIVE-11433 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.1.0, 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11433.patch, HIVE-11433.patch NullPointException is thrown for query that has multiple (greater than 3) inner joins. 
Stacktrace for 1.1.0 {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271) 
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code}. However, the problem can also be reproduced in
[jira] [Updated] (HIVE-11433) NPE for a multiple inner join query
[ https://issues.apache.org/jira/browse/HIVE-11433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11433: --- Attachment: HIVE-11433.patch NPE for a multiple inner join query --- Key: HIVE-11433 URL: https://issues.apache.org/jira/browse/HIVE-11433 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0, 1.1.0, 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11433.patch, HIVE-11433.patch NullPointException is thrown for query that has multiple (greater than 3) inner joins. Stacktrace for 1.1.0 {code} NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.ParseUtils.getIndex(ParseUtils.java:149) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:166) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.ParseUtils.checkJoinFilterRefersOneAlias(ParseUtils.java:185) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoins(SemanticAnalyzer.java:8257) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.mergeJoinTree(SemanticAnalyzer.java:8422) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9805) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9714) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10150) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10078) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1110) at 
org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1104) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:101) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:172) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:386) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:373) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:692) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code}. However, the problem can also be reproduced in latest master branch. 
Further investigation shows that the following code (in ParseUtils.java) is problematic:
{code}
static int getIndex(String[] list, String elem) {
  for (int i = 0; i < list.length; i++) {
    if (list[i].toLowerCase().equals(elem)) {
      return i;
    }
  }
  return -1;
}
{code}
The code assumes that every element in the list is non-null, which isn't true because of the following code in SemanticAnalyzer.java (method genJoinTree()):
{code}
if ((right.getToken().getType() == HiveParser.TOK_TABREF)
    || (right.getToken().getType() == HiveParser.TOK_SUBQUERY)
    || (right.getToken().getType() == HiveParser.TOK_PTBLFUNCTION)) {
  String tableName = getUnescapedUnqualifiedTableName((ASTNode) right.getChild(0)).toLowerCase();
  String alias = extractJoinAlias(right, tableName);
  String[] rightAliases = new String[1];
  rightAliases[0] = alias;
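A null-safe variant of the lookup above can be sketched as follows. This is a minimal standalone reconstruction that skips null entries instead of dereferencing them, not the actual HIVE-11433 patch:

```java
// Hypothetical, self-contained sketch of a null-safe getIndex.
// It guards against the null aliases that genJoinTree() can leave
// in the array; not the actual committed fix for HIVE-11433.
public class GetIndexSketch {
    static int getIndex(String[] list, String elem) {
        for (int i = 0; i < list.length; i++) {
            // Skip null entries instead of calling toLowerCase() on them.
            if (list[i] != null && list[i].toLowerCase().equals(elem)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // An array containing a null element no longer throws an NPE.
        String[] aliases = { "t1", null, "t3" };
        System.out.println(getIndex(aliases, "t3")); // prints 2
        System.out.println(getIndex(aliases, "t2")); // prints -1
    }
}
```

Whether skipping nulls (rather than preventing them from entering the array in genJoinTree()) is the right repair is a design choice the patch itself would settle.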
[jira] [Updated] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong
[ https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-11441: -- Attachment: HIVE-11441.1.patch Provided a patch which throws an exception if the user tries to alter the location to a non-existent host/port. No DDL allowed on table if user accidentally set table location wrong - Key: HIVE-11441 URL: https://issues.apache.org/jira/browse/HIVE-11441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11441.1.patch If the user makes a mistake, hive should either correct it in the first place or allow the user a chance to correct it. STEPS TO REPRODUCE: create table testwrongloc(id int); alter table testwrongloc set location hdfs://a-valid-hostname/tmp/testwrongloc; --at this time, hive should throw an error, as hdfs://a-valid-hostname is not a valid path; it needs to be either hdfs://namenode-hostname:8020/ or hdfs://hdfs-nameservice for HA alter table testwrongloc set location hdfs://correct-host:8020/tmp/testwrongloc or drop table testwrongloc; upon this, hive throws an error that host 'a-valid-hostname' is not reachable {code} 2015-07-30 12:19:43,573 DEBUG [main]: transport.TSaslTransport (TSaslTransport.java:readFrame(429)) - CLIENT: reading data length: 293 2015-07-30 12:19:43,720 ERROR [main]: ql.Driver (SessionState.java:printError(833)) - FAILED: SemanticException Unable to fetch table testloc. java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused org.apache.hadoop.hive.ql.parse.SemanticException: Unable to fetch table testloc. 
java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1323) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1309) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addInputsOutputsAlterTable(DDLSemanticAnalyzer.java:1387) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableLocation(DDLSemanticAnalyzer.java:1452) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:295) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table testloc. java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1072) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1316) ... 23
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652089#comment-14652089 ] Nicholas Brenwald commented on HIVE-11410: -- [~mmccline] I have done some further testing today compiling from source various branches. The issue only seems to be present in release-1.1.0 (which is part of the Cloudera distribution we use). The issue cannot be reproduced in branch-1.1 or branch-1.2 (even when using our environment variables/hive-site.xml etc). As such I think this can be marked as resolved. Thanks for looking into this and sorry for the false alarm. Join with subquery containing a group by incorrectly returns no results --- Key: HIVE-11410 URL: https://issues.apache.org/jira/browse/HIVE-11410 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Nicholas Brenwald Assignee: Matt McCline Priority: Minor Attachments: hive-site.xml Start by creating a table *t* with columns *c1* and *c2* and populate with 1 row of data. For example create table *t* from an existing table which contains at least 1 row of data by running: {code} create table t as select 'abc' as c1, 0 as c2 from Y limit 1; {code} Table *t* looks like the following: ||c1||c2|| |abc|0| Running the following query then returns zero results. {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2 {code} However, we expected to see the following: ||c1|| |abc| The problem seems to relate to the fact that in the subquery, we group by column *c1*, but this is not subsequently used in the join condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652170#comment-14652170 ] Chao Sun commented on HIVE-9152: Thanks [~leftylev], I've added descriptions to the wiki. Dynamic Partition Pruning [Spark Branch] Key: HIVE-9152 URL: https://issues.apache.org/jira/browse/HIVE-9152 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Chao Sun Labels: TODOC-SPARK, TODOC1.3 Fix For: spark-branch, 1.3.0, 2.0.0 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch Tez implemented dynamic partition pruning in HIVE-7826. This is a nice optimization and we should implement the same in HOS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11432) Hive macro give same result for different arguments
[ https://issues.apache.org/jira/browse/HIVE-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652187#comment-14652187 ] Pengcheng Xiong commented on HIVE-11432: [~mendax], it is not committed yet. [~hsubramaniyan], could you please review it? Thanks! Hive macro give same result for different arguments --- Key: HIVE-11432 URL: https://issues.apache.org/jira/browse/HIVE-11432 Project: Hive Issue Type: Bug Reporter: Jay Pandya Assignee: Pengcheng Xiong Attachments: HIVE-11432.01.patch If you use a hive macro more than once while processing the same row, hive returns the same result for all invocations even if the arguments are different. Example:
{code}
CREATE TABLE macro_testing(a int, b int, c int);

select * from macro_testing;
1 2 3
4 5 6
7 8 9
10 11 12

create temporary macro math_square(x int) x*x;

select math_square(a), b, math_square(c) from macro_testing;
9 2 9
36 5 36
81 8 81
144 11 144
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11442) Remove commons-configuration.jar from Hive distribution
[ https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-11442: -- Attachment: HIVE-11442.1.patch Remove commons-configuration.jar from Hive distribution --- Key: HIVE-11442 URL: https://issues.apache.org/jira/browse/HIVE-11442 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11442.1.patch Some customers reported version conflicts with the Hive-bundled commons-configuration.jar. Actually, commons-configuration.jar is not needed by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be able to pick up those jars from Hadoop at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11316) Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree()
[ https://issues.apache.org/jira/browse/HIVE-11316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651792#comment-14651792 ] Jesus Camacho Rodriguez commented on HIVE-11316: +1 Use datastructure that doesnt duplicate any part of string for ASTNode::toStringTree() -- Key: HIVE-11316 URL: https://issues.apache.org/jira/browse/HIVE-11316 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11316-branch-1.0.patch, HIVE-11316-branch-1.2.patch, HIVE-11316.1.patch, HIVE-11316.2.patch, HIVE-11316.3.patch, HIVE-11316.4.patch, HIVE-11316.5.patch, HIVE-11316.6.patch, HIVE-11316.7.patch HIVE-11281 uses an approach to memoize toStringTree() for ASTNode. This jira is suppose to alter the string memoization to use a different data structure that doesn't duplicate any part of the string so that we do not run into OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on
[ https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11391: --- Attachment: HIVE-11391.patch CBO (Calcite Return Path): Add CBO tests with return path on Key: HIVE-11391 URL: https://issues.apache.org/jira/browse/HIVE-11391 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651876#comment-14651876 ] Yongzhi Chen commented on HIVE-11319: - Build machine out of disk space? Reattaching the second patch. CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and can cause some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed, but the data in table ctas1 will be accidentally replaced by that of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11319: Attachment: HIVE-11319.2.patch CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and can cause some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed, but the data in table ctas1 will be accidentally replaced by that of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
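The guard HIVE-11319 proposes, rejecting a CTAS target location unless the directory is absent or empty, can be sketched roughly as below. This is a hypothetical illustration using plain java.nio.file in place of Hive's FileSystem API; the class and method names are invented, not the actual patch:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the "ban CTAS into a non-empty directory" check.
// Uses java.nio.file as a stand-in for Hive's FileSystem layer.
public class CtasLocationGuard {
    static boolean isUsableCtasLocation(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return true;   // location will be created fresh
        }
        if (!Files.isDirectory(dir)) {
            return false;  // an existing plain file is never a valid target
        }
        // An existing directory is acceptable only if it is empty,
        // so a second CTAS cannot silently overwrite earlier data.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("ctas-guard");
        System.out.println(isUsableCtasLocation(tmp));  // empty dir: usable
        Files.createFile(tmp.resolve("data"));
        System.out.println(isUsableCtasLocation(tmp));  // non-empty: rejected
    }
}
```

With such a check in the analyzer, the second `create table ctas2 location '/Users/ychen/tmp' ...` in the reproduction above would fail instead of clobbering ctas1's data.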
[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10975: Attachment: HIVE-10975.1.patch Failed to download some external deps. Attach it to trigger again. Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.1.patch, HIVE-10975.1.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11319: Attachment: (was: HIVE-11319.2.patch) CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and can cause some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed, but the data in table ctas1 will be accidentally replaced by that of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks
[ https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652396#comment-14652396 ] Hive QA commented on HIVE-11413: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748467/HIVE-11413.2.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9319 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_handler_bulk org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4804/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4804/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4804/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748467 - PreCommit-HIVE-TRUNK-Build Error in detecting availability of HiveSemanticAnalyzerHooks Key: HIVE-11413 URL: https://issues.apache.org/jira/browse/HIVE-11413 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 2.0.0 Reporter: Raajay Viswanathan Assignee: Raajay Viswanathan Priority: Trivial Labels: newbie Fix For: 2.0.0 Attachments: HIVE-11413.2.patch, HIVE-11413.2.patch, HIVE-11413.patch In the {{compile(String, Boolean)}} function in {{Driver.java}}, the list of available {{HiveSemanticAnalyzerHook}}s (_saHooks_) is obtained using the {{getHooks}} method. This method always returns a {{List}} of hooks. However, while checking for availability of hooks, the current version of the code compares _saHooks_ with NULL. This is incorrect, as the segment of code designed to call pre and post Analyze functions gets executed even when the list is empty. The comparison should be changed to {{saHooks.size() > 0}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
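The distinction the report draws, checking emptiness rather than nullity of a returned list, can be illustrated with a minimal sketch. The class and method names here are invented stand-ins, not Hive's actual Driver code:

```java
import java.util.Collections;
import java.util.List;

// Minimal illustration of the bug pattern described in HIVE-11413:
// when a getter always returns a (possibly empty) list, a null check
// is useless as a "hooks are configured" guard.
public class HookCheckSketch {
    // Stand-in for getHooks(): never returns null, may return empty.
    static List<String> getHooks() {
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<String> saHooks = getHooks();
        // Buggy guard: always true, so the hook-invocation path runs
        // even when no hooks are configured.
        boolean buggyGuard = saHooks != null;
        // Corrected guard: only run the hook path when hooks exist.
        boolean fixedGuard = saHooks != null && saHooks.size() > 0;
        System.out.println(buggyGuard);  // true
        System.out.println(fixedGuard);  // false
    }
}
```

The fix is cheap precisely because the contract of the getter (never null) makes `size() > 0` (or `!isEmpty()`) the only check that carries information.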
[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652398#comment-14652398 ] Yongzhi Chen commented on HIVE-10880: - The patch fixes the following issue: in local mode, when hive.enforce.bucketing is true, an insert overwrite into a bucketed table or a static partition does not respect the bucket number. Since only the dynamic partition case works correctly, this fix uses the same approach as the handling of the dynamic partition scenario. Attaching patch 4 after rebase. The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Critical Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. Reproduce: {code:sql} CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; {code} Then I inserted the following data into the buckettestinput table: {noformat} firstinsert1 firstinsert2 firstinsert3 firstinsert4 firstinsert5 firstinsert6 firstinsert7 firstinsert8 secondinsert1 secondinsert2 secondinsert3 secondinsert4 secondinsert5 secondinsert6 secondinsert7 secondinsert8 {noformat} {code:sql} set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'; set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set 
hive.optimize.bucketmapjoin.sortedmerge = true; select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); {code} {noformat} Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141) {noformat} The debug information related to the insert overwrite: {noformat} 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'insert overwrite table buckettestoutput1 0: jdbc:hive2://localhost:1 ; select * from buckettestinput where data like ' first%'; INFO : Number of reduce tasks determined at compile time: 2 INFO : In order to change the average load for a reducer (in bytes): INFO : set hive.exec.reducers.bytes.per.reducer=number INFO : In order to limit the maximum number of reducers: INFO : set hive.exec.reducers.max=number INFO : In order to set a constant number of reducers: INFO : set mapred.reduce.tasks=number INFO : Job running in-process (local Hadoop) INFO : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100% INFO : Ended Job = job_local107155352_0001 INFO : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1 INFO : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48] No rows affected (1.692 seconds) {noformat} An insert using dynamic partitions does not have this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11319: Attachment: (was: HIVE-11319.2.patch) CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and can cause some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed, but the data in table ctas1 will be accidentally replaced by that of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected
[ https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni reassigned HIVE-5277: -- Assignee: Swarnim Kulkarni (was: Teddy Choi) HBase handler skips rows with null valued first cells when only row key is selected --- Key: HIVE-5277 URL: https://issues.apache.org/jira/browse/HIVE-5277 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0 Reporter: Teddy Choi Assignee: Swarnim Kulkarni Priority: Critical Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt HBaseStorageHandler skips rows with null valued first cells when only the row key is selected. {noformat} SELECT key, col1, col2 FROM hbase_table; key1 cell1 cell2 key2 NULL cell3 SELECT COUNT(key) FROM hbase_table; 1 {noformat} HiveHBaseTableInputFormat.getRecordReader selects the first cell to avoid skipping rows. But when the first cell is null, HBase skips that row. http://hbase.apache.org/book/perf.reading.html 12.9.6. Optimal Loading of Row Keys describes how to deal with this problem. I tried to find an existing issue, but I couldn't. If you find the same issue, please mark this one as a duplicate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
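The undercount in the report above can be modeled with a toy scan: if emitting a row depends on its first cell being non-null, any row whose first cell is null vanishes from the key count. A small Python sketch (hypothetical names; a stand-in for an HBase scan, where the real fix uses the "optimal loading of row keys" approach from the HBase book, i.e. key-only scanning):

```python
def row_keys(rows):
    """Yield the key of every row, even when its first cell is null.

    `rows` is a list of (key, [cell values]) pairs.  Iterating keys
    directly never touches cell values, analogous to an HBase scan that
    loads only row keys, so null first cells cannot drop rows.
    """
    for key, cells in rows:
        yield key

def buggy_row_keys(rows):
    """Model of the bug: a row is emitted only if its first cell is non-null."""
    for key, cells in rows:
        if cells and cells[0] is not None:
            yield key
```

On the table from the {noformat} block, the buggy path counts one key while the key-only path counts both, matching COUNT(key) returning 1 instead of 2.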
[jira] [Commented] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected
[ https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652419#comment-14652419 ] Swarnim Kulkarni commented on HIVE-5277: Seems like this patch would need more work with all the updates on the master that have happened since this was logged. I can take the task of making this update. HBase handler skips rows with null valued first cells when only row key is selected --- Key: HIVE-5277 URL: https://issues.apache.org/jira/browse/HIVE-5277 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0 Reporter: Teddy Choi Assignee: Teddy Choi Priority: Critical Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt HBaseStorageHandler skips rows with null valued first cells when only the row key is selected. {noformat} SELECT key, col1, col2 FROM hbase_table; key1 cell1 cell2 key2 NULL cell3 SELECT COUNT(key) FROM hbase_table; 1 {noformat} HiveHBaseTableInputFormat.getRecordReader selects the first cell to avoid skipping rows. But when the first cell is null, HBase skips that row. http://hbase.apache.org/book/perf.reading.html 12.9.6. Optimal Loading of Row Keys describes how to deal with this problem. I tried to find an existing issue, but I couldn't. If you find the same issue, please mark this one as a duplicate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652446#comment-14652446 ] Jason Dere commented on HIVE-11430: --- Agree about convert_enum_to_string, also looked into this [here|https://issues.apache.org/jira/browse/HIVE-10319?focusedCommentId=14647008page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14647008] What change in master is responsible for dynamic_rdd_cache.q? Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-11445: -- Assignee: Pengcheng Xiong CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work - Key: HIVE-11445 URL: https://issues.apache.org/jira/browse/HIVE-11445 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652522#comment-14652522 ] Nezih Yigitbasi commented on HIVE-10319: Rebased to latest master and re-generated the thrift source with 0.9.2, can you please try merging again [~jdere]? Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, HIVE-10319.6.patch, HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
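The startup pattern described in the report (one metastore RPC per database) versus a single batched call can be sketched with a toy client that counts round trips. All names here are hypothetical illustrations, not Hive's actual metastore API:

```python
class Metastore:
    """Toy metastore client that counts RPC round trips."""

    def __init__(self, functions_by_db):
        self.functions_by_db = functions_by_db
        self.rpc_calls = 0

    def get_databases(self):
        self.rpc_calls += 1
        return list(self.functions_by_db)

    def get_functions(self, db):
        self.rpc_calls += 1
        return self.functions_by_db[db]

    def get_all_functions(self):
        # Hypothetical batched API in the spirit of the patch: one round
        # trip returns every permanent function regardless of database.
        self.rpc_calls += 1
        return [f for fs in self.functions_by_db.values() for f in fs]

def load_functions_per_db(ms):
    """The slow startup pattern: 1 + N round trips for N databases."""
    funcs = []
    for db in ms.get_databases():
        funcs.extend(ms.get_functions(db))
    return funcs
```

With several hundred databases, the per-database loop makes hundreds of round trips while the batched variant makes one, which is why batching shaves the reported 30+ seconds off startup.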
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: HIVE-10319.6.patch Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, HIVE-10319.6.patch, HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11445: --- Attachment: HIVE-11445.01.patch a temporary patch. may need more work. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work - Key: HIVE-11445 URL: https://issues.apache.org/jira/browse/HIVE-11445 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11445.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected
[ https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-5277: --- Priority: Critical (was: Major) HBase handler skips rows with null valued first cells when only row key is selected --- Key: HIVE-5277 URL: https://issues.apache.org/jira/browse/HIVE-5277 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0 Reporter: Teddy Choi Assignee: Teddy Choi Priority: Critical Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt HBaseStorageHandler skips rows with null valued first cells when only the row key is selected. {noformat} SELECT key, col1, col2 FROM hbase_table; key1 cell1 cell2 key2 NULL cell3 SELECT COUNT(key) FROM hbase_table; 1 {noformat} HiveHBaseTableInputFormat.getRecordReader selects the first cell to avoid skipping rows. But when the first cell is null, HBase skips that row. http://hbase.apache.org/book/perf.reading.html 12.9.6. Optimal Loading of Row Keys describes how to deal with this problem. I tried to find an existing issue, but I couldn't. If you find the same issue, please mark this one as a duplicate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-10880: Attachment: HIVE-10880.4.patch The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Critical Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch, HIVE-10880.4.patch When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. Reproduce: {code:sql} CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput1( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; CREATE TABLE IF NOT EXISTS buckettestoutput2( data string )CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; {code} Then I inserted the following data into the buckettestinput table: {noformat} firstinsert1 firstinsert2 firstinsert3 firstinsert4 firstinsert5 firstinsert6 firstinsert7 firstinsert8 secondinsert1 secondinsert2 secondinsert3 secondinsert4 secondinsert5 secondinsert6 secondinsert7 secondinsert8 {noformat} {code:sql} set hive.enforce.bucketing = true; set hive.enforce.sorting=true; insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'; set hive.auto.convert.sortmerge.join=true; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data); {code} {noformat} Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. 
The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141) {noformat} The debug information related to the insert overwrite: {noformat} 0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%'insert overwrite table buckettestoutput1 0: jdbc:hive2://localhost:1 ; select * from buckettestinput where data like ' first%'; INFO : Number of reduce tasks determined at compile time: 2 INFO : In order to change the average load for a reducer (in bytes): INFO : set hive.exec.reducers.bytes.per.reducer=number INFO : In order to limit the maximum number of reducers: INFO : set hive.exec.reducers.max=number INFO : In order to set a constant number of reducers: INFO : set mapred.reduce.tasks=number INFO : Job running in-process (local Hadoop) INFO : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100% INFO : Ended Job = job_local107155352_0001 INFO : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1 INFO : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48] No rows affected (1.692 seconds) {noformat} Insert using dynamic partitions does not have the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
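The invariant the repro above violates is simple: a table declared CLUSTERED BY (data) INTO 2 BUCKETS must route each row to bucket hash(data) mod 2, producing one file per bucket, which is exactly what the bucket map join's metadata check later verifies (and what fails when numFiles=1). A toy Python sketch of that routing, using CRC32 as a deterministic stand-in for Hive's bucketing hash:

```python
import zlib

def assign_buckets(values, num_buckets):
    """Distribute rows into the declared number of bucket files.

    Simplified model of hive.enforce.bucketing: each row goes to
    bucket hash(value) % num_buckets.  CRC32 stands in for Hive's
    actual hash function, which this sketch does not reproduce.
    """
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        buckets[zlib.crc32(v.encode("utf-8")) % num_buckets].append(v)
    return buckets
```

When enforcement is honored, the insert of the eight 'first%' rows yields two bucket files whose row counts sum to eight; the bug produces a single file, so the SemanticException above fires as soon as a bucket map join consults the metadata.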
[jira] [Commented] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652450#comment-14652450 ] Matt McCline commented on HIVE-11415: - [~jvaria] FYI. Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw a StackOverflowError during vectorization {code} Exception in thread main java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... .. {code} Repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
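The failure mode in the trace above is generic: walking a left-deep expression tree recursively uses stack depth proportional to the number of OR terms, so a few thousand terms overflow the stack. A toy Python sketch of the standard remedy, replacing recursion with an explicit worklist (this models the idea only; Hive's actual fix is inside VectorizationContext, not a helper like this):

```python
def flatten_or(node):
    """Flatten a left-deep OR tree into a flat list of leaf predicates.

    `node` is either a leaf (any non-tuple value) or a tuple
    ("or", left, right).  A naive recursive walk of a tree with
    thousands of OR levels overflows the call stack; an explicit
    stack makes the input depth irrelevant.
    """
    leaves, stack = [], [node]
    while stack:
        n = stack.pop()
        if isinstance(n, tuple) and n[0] == "or":
            stack.extend((n[2], n[1]))  # push right first so left pops first
        else:
            leaves.append(n)
    return leaves
```

A chain of 5000 nested ORs, far past Python's default recursion limit, flattens without issue, which is the same property an early-termination or iterative rewrite buys the vectorizer.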
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652423#comment-14652423 ] Yongzhi Chen commented on HIVE-11319: - Tests did not run. Re-attach. CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and has caused some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed. But the values in table ctas1 will be accidentally replaced by those of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11319: Attachment: HIVE-11319.2.patch CTAS with location qualifier overwrites directories --- Key: HIVE-11319 URL: https://issues.apache.org/jira/browse/HIVE-11319 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch CTAS with a location clause acts as an insert overwrite. This can cause problems when there are subdirectories within a directory, and has caused some users to accidentally wipe out directories with very important data. We should ban CTAS with a location pointing to a non-empty directory. Reproduce: create table ctas1 location '/Users/ychen/tmp' as select * from jsmall limit 10; create table ctas2 location '/Users/ychen/tmp' as select * from jsmall limit 5; Both creates will succeed. But the values in table ctas1 will be accidentally replaced by those of ctas2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8678) Pig fails to correctly load DATE fields using HCatalog
[ https://issues.apache.org/jira/browse/HIVE-8678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652435#comment-14652435 ] Sushanth Sowmyan commented on HIVE-8678: On digging further, the issue I saw in the 0.13.1 VM was different from the one reported here; it was caused by Pig's Joda-Time library being older than needed. It was solved by adding a joda-time-2.1.jar to PIG_CLASSPATH, and setting PIG_USER_CLASSPATH_FIRST so that it picked it up first. At this point, I am not able to reproduce this issue with 0.13.1 either. Pig fails to correctly load DATE fields using HCatalog -- Key: HIVE-8678 URL: https://issues.apache.org/jira/browse/HIVE-8678 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1 Reporter: Michael McLellan Assignee: Sushanth Sowmyan Using: Hadoop 2.5.0-cdh5.2.0 Pig 0.12.0-cdh5.2.0 Hive 0.13.1-cdh5.2.0 When using pig -useHCatalog to load a Hive table that has a DATE field, when trying to DUMP the field, the following error occurs: {code} 2014-10-30 22:58:05,469 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.sql.Date at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:420) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:457) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:375) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) 2014-10-30 22:58:05,469 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 6018: Error converting read value to tuple {code} It seems to be occurring here: https://github.com/apache/hive/blob/trunk/hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java#L433 and that it should be: {code}Date d = Date.valueOf(o);{code} instead of {code}Date d = (Date) o;{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
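The suggested fix above swaps a blind cast for a parse: `(Date) o` throws ClassCastException when the SerDe hands back a String, while `Date.valueOf(o)` parses the string form. The same defensive-conversion idea sketched in Python (an analogy only, using `date.fromisoformat` as the stand-in for Java's `Date.valueOf`):

```python
from datetime import date

def to_date(o):
    """Convert a value read from storage into a date, whatever its type.

    Models the fix: instead of assuming the reader already produced a
    date object (the cast that fails above), accept either a date or
    its ISO string form and parse the latter.
    """
    if isinstance(o, date):
        return o
    return date.fromisoformat(o)
```

Either input shape now yields the same date, which is the property the `Date.valueOf` change restores in PigHCatUtil.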
[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652440#comment-14652440 ] Jason Dere commented on HIVE-10319: --- Just tried to apply the patch to master, but getting a ton of conflicts. It looks like HIVE-9152 (brought in by the merge from Spark branch) has switched to using Thrift 0.9.2 to generate the thrift files. Can you regenerate the changes using Thrift-0.9.2 again? You don't have to fix convert_enum_to_string.q, it looks like [~xuefuz] is trying to fix that in HIVE-11430. Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-10319.1.patch, HIVE-10319.2.patch, HIVE-10319.3.patch, HIVE-10319.4.patch, HIVE-10319.5.patch, HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11430: --- Attachment: HIVE-11430.patch Followup HIVE-10166: investigate and fix the two test failures -- Key: HIVE-11430 URL: https://issues.apache.org/jira/browse/HIVE-11430 Project: Hive Issue Type: Bug Components: Test Affects Versions: 2.0.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-11430.patch, HIVE-11430.patch {code} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {code} As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066. -- This message was sent by Atlassian JIRA (v6.3.4#6332)