[jira] [Commented] (HIVE-10047) LLAP: VectorMapJoinOperator gets an over-flow on batchSize of 1024
[ https://issues.apache.org/jira/browse/HIVE-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372576#comment-14372576 ] Matt McCline commented on HIVE-10047: -
I can tell from the stack trace that this is not the HIVE-9937 patch, because it is using VectorColumnAssignFactory (unless the stack trace is from using the wrong binary). Very strange problem. The code is correct. It loops over the columns and fills the next row in the outputBatch, using outputBatch.size as the index. After the row is filled, it increments outputBatch.size and checks whether the outputBatch is full. So, what could be going wrong here? Pure speculation:
1) Someone increased DEFAULT_SIZE to more than 1024, but VectorizedRowBatchCtx.createVectorizedRowBatch() is creating 1024 sized batches?
2) A VectorMapJoinOperator is being concurrently executed by 2 threads. Ah, but we are single threaded. Oh.
LLAP: VectorMapJoinOperator gets an over-flow on batchSize of 1024 -- Key: HIVE-10047 URL: https://issues.apache.org/jira/browse/HIVE-10047 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Gopal V Assignee: Matt McCline Fix For: llap
Simple LLAP queries on constrained resources run into an exception which suggests that the
{code}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$VectorLongColumnAssign.assignLong(VectorColumnAssignFactory.java:113)
 at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$9.assignObjectValue(VectorColumnAssignFactory.java:293)
 at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:196)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:653)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656)
 at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:752)
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:316)
 ... 22 more
{code}
The relevant line is due to the check for a full output-batch being outside of the loop here - looks like it can be triggered during MxN joins where there are more values than there were input rows in the input batch.
{code}
for (int i = 0; i < values.length; ++i) {
  vcas[i].assignObjectValue(values[i], outputBatch.size);
}
++outputBatch.size;
if (outputBatch.size == VectorizedRowBatch.DEFAULT_SIZE) {
  flushOutput();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
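The fill-then-flush logic discussed in HIVE-10047 can be tried in isolation. The following is only a minimal sketch, not Hive code: `OutputBatchSketch`, its fields, and `overflowIndex` are hypothetical names, and plain `long[]` columns stand in for Hive's column vectors. It shows that the loop stays in bounds when the flush threshold equals the allocated capacity, and that a threshold larger than the capacity (speculation 1 in the comment) fails at exactly index 1024.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a vectorized output batch; this is not Hive's VectorizedRowBatch.
class OutputBatchSketch {
    final long[][] columns;          // columns[c][row]
    final int flushThreshold;        // plays the role of DEFAULT_SIZE in the quoted snippet
    int size = 0;                    // number of rows currently filled
    final List<Integer> flushedSizes = new ArrayList<>();

    OutputBatchSketch(int numColumns, int capacity, int flushThreshold) {
        this.columns = new long[numColumns][capacity];
        this.flushThreshold = flushThreshold;
    }

    // Writes one row at index `size`, then flushes if the batch has reached the threshold.
    void forward(long[] values) {
        for (int i = 0; i < values.length; ++i) {
            columns[i][size] = values[i];  // out of bounds if size >= allocated capacity
        }
        ++size;
        if (size == flushThreshold) {
            flush();
        }
    }

    void flush() {
        flushedSizes.add(size);
        size = 0;
    }

    // Returns the row index at which the batch overflowed, or -1 if all rows fit.
    static int overflowIndex(int capacity, int flushThreshold, int rows) {
        OutputBatchSketch batch = new OutputBatchSketch(2, capacity, flushThreshold);
        try {
            for (int r = 0; r < rows; r++) {
                batch.forward(new long[] {r, r * 2L});
            }
            return -1;
        } catch (ArrayIndexOutOfBoundsException e) {
            return batch.size;
        }
    }

    public static void main(String[] args) {
        // Threshold matches capacity: every full batch is flushed before it can overflow.
        System.out.println(overflowIndex(1024, 1024, 3000));
        // Threshold larger than capacity: the write at row index 1024 fails.
        System.out.println(overflowIndex(1024, 2048, 3000));
    }
}
```

In this sketch the number of rows forwarded does not matter as long as every row goes through `forward`, which is consistent with the comment that the quoted loop is correct on its own.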
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372897#comment-14372897 ] Hive QA commented on HIVE-3454: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12706162/HIVE-3454.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7819 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorTypeCasts.testCastLongToTimestamp org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3105/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3105/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12706162 - PreCommit-HIVE-TRUNK-Build Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. 
CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
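One plausible reading of the HIVE-3454 report, consistent with the 1970-01-16 result, is that a value in seconds is being interpreted as milliseconds. The sketch below only illustrates that unit mix-up with `java.time.Instant`. The class name `EpochUnitsSketch` and the sample value 1347000000 are assumptions chosen for illustration; they are not taken from the issue or from Hive's code.

```java
import java.time.Instant;

// Illustrates how one epoch value maps to very different instants depending on the assumed unit.
class EpochUnitsSketch {
    // Treats the value as milliseconds since the epoch.
    static Instant asMillis(long value) {
        return Instant.ofEpochMilli(value);
    }

    // Treats the value as seconds since the epoch, as unix_timestamp() returns.
    static Instant asSeconds(long value) {
        return Instant.ofEpochSecond(value);
    }

    public static void main(String[] args) {
        long unixSeconds = 1_347_000_000L;  // hypothetical sample value, in seconds
        System.out.println(asMillis(unixSeconds));   // 1970-01-16T14:10:00Z
        System.out.println(asSeconds(unixSeconds));  // 2012-09-07T06:40:00Z
    }
}
```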
[jira] [Commented] (HIVE-9859) Create bitwise left/right shift UDFs
[ https://issues.apache.org/jira/browse/HIVE-9859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372931#comment-14372931 ] Pengcheng Xiong commented on HIVE-9859: ---
[~jdere], I took a look last night and I think [~apivovarov] was right. There is no good solution to avoid the conflict with the array type declaration. I am sorry about that.
Create bitwise left/right shift UDFs Key: HIVE-9859 URL: https://issues.apache.org/jira/browse/HIVE-9859 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-9859.1.patch, HIVE-9859.2.patch, HIVE-9859.3.patch
Signature: a << b, a >> b, a >>> b
For example:
{code}
select 1 << 4, 8 >> 2, 8 >>> 2;
OK
16 2 2
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
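For reference, the three shift operations named in the HIVE-9859 title behave as follows in Java, whose `<<`, `>>` and `>>>` operators have the usual left, signed right, and unsigned right shift semantics. This is a small illustration only; `ShiftSketch` and its method names are hypothetical and are not the UDF names from the patch.

```java
// Illustrates left shift, signed right shift, and unsigned right shift on 32-bit ints.
class ShiftSketch {
    static int shiftLeft(int a, int b) {
        return a << b;
    }

    // Signed (arithmetic) right shift: the sign bit is copied into the vacated bits.
    static int shiftRight(int a, int b) {
        return a >> b;
    }

    // Unsigned (logical) right shift: the vacated bits are filled with zeros.
    static int shiftRightUnsigned(int a, int b) {
        return a >>> b;
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 4));            // 16
        System.out.println(shiftRight(8, 2));           // 2
        System.out.println(shiftRightUnsigned(8, 2));   // 2
        // The two right shifts differ only for negative inputs.
        System.out.println(shiftRight(-8, 2));          // -2
        System.out.println(shiftRightUnsigned(-8, 2));  // 1073741822
    }
}
```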
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372840#comment-14372840 ] Aihua Xu commented on HIVE-3454: Attached should be the right patch now. Maybe I uploaded the right patch first but I worried test run was not started (but seems the test was run against the right patch), so I uploaded again but the wrong one. Thanks for catching it. Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10040) CBO (Calcite Return Path): Pluggable cost modules [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10040: --- Attachment: HIVE-10040.cbo.patch [~jpullokkaran], can you review it? Thanks CBO (Calcite Return Path): Pluggable cost modules [CBO branch] -- Key: HIVE-10040 URL: https://issues.apache.org/jira/browse/HIVE-10040 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10040.cbo.patch We should be able to deal with cost models in a modular way. Thus, the cost model should be integrated within a Calcite MD provider that is pluggable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10006) RSC has memory leak while execute multi queries.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14372803#comment-14372803 ] Xuefu Zhang commented on HIVE-10006: +1 for patch #8. One nit, It would be great if we can put a similar comment on changes in SparkPlanGenerator.java. Also, we can create a JIRA for HiveInputFormat to track the issue, but no fix is necessary at the moment. RSC has memory leak while execute multi queries.[Spark Branch] -- Key: HIVE-10006 URL: https://issues.apache.org/jira/browse/HIVE-10006 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Chengxiang Li Assignee: Chengxiang Li Priority: Critical Labels: Spark-M5 Attachments: HIVE-10006.1-spark.patch, HIVE-10006.2-spark.patch, HIVE-10006.2-spark.patch, HIVE-10006.3-spark.patch, HIVE-10006.4-spark.patch, HIVE-10006.5-spark.patch, HIVE-10006.6-spark.patch, HIVE-10006.7-spark.patch, HIVE-10006.8-spark.patch While execute query with RSC, MapWork/ReduceWork number is increased all the time, and lead to OOM at the end. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9693) Introduce a stats cache for HBase metastore [hbase-metastore branch]
[ https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14373017#comment-14373017 ] Mostafa Mokhtar commented on HIVE-9693: --- [~vgumashta] Can this bloom filter implementation be used? java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java Introduce a stats cache for HBase metastore [hbase-metastore branch] - Key: HIVE-9693 URL: https://issues.apache.org/jira/browse/HIVE-9693 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-9693.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10001) SMB join in reduce side
[ https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10001: -- Attachment: HIVE-10001.1.patch SMB join in reduce side --- Key: HIVE-10001 URL: https://issues.apache.org/jira/browse/HIVE-10001 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10001.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10001) SMB join in reduce side
[ https://issues.apache.org/jira/browse/HIVE-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14373080#comment-14373080 ] Hive QA commented on HIVE-10001: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12706189/HIVE-10001.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7818 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3106/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3106/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3106/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12706189 - PreCommit-HIVE-TRUNK-Build SMB join in reduce side --- Key: HIVE-10001 URL: https://issues.apache.org/jira/browse/HIVE-10001 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10001.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-3454: --- Attachment: (was: HIVE-3454.3.patch) Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14374774#comment-14374774 ] Hive QA commented on HIVE-3454: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12706362/HIVE-3454.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7819 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3107/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3107/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12706362 - PreCommit-HIVE-TRUNK-Build Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.3.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9839) HiveServer2 leaks OperationHandle on async queries which fail at compile phase
[ https://issues.apache.org/jira/browse/HIVE-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14374764#comment-14374764 ] Nemon Lou commented on HIVE-9839: -
Referring to SQLOperation.java, there is no chance that a HiveSQLException is thrown and the async background operation is submitted to the thread pool successfully at the same time. Only line 181 and line 244 can cause a HiveSQLException. So the answer to "If OperationHandle has to be returned to report the error in Async mode" is no.
{code}
    @Override
178 public void runInternal() throws HiveSQLException {
179   setState(OperationState.PENDING);
180   final HiveConf opConfig = getConfigForOperation();
181   prepare(opConfig);
182   if (!shouldRunAsync()) {
183     runQuery(opConfig);
184   } else {
185     // We'll pass ThreadLocals in the background thread from the foreground (handler) thread
186     final SessionState parentSessionState = SessionState.get();
187     // ThreadLocal Hive object needs to be set in background thread.
188     // The metastore client in Hive is associated with right user.
189     final Hive parentHive = getSessionHive();
190     // Current UGI will get used by metastore when metastore is in embedded mode
191     // So this needs to get passed to the new background thread
192     final UserGroupInformation currentUGI = getCurrentUGI(opConfig);
193     // Runnable impl to call runInternal asynchronously,
194     // from a different thread
195     Runnable backgroundOperation = new Runnable() {
196       @Override
197       public void run() {
198         PrivilegedExceptionAction<Object> doAsAction = new PrivilegedExceptionAction<Object>() {
199           @Override
200           public Object run() throws HiveSQLException {
201             Hive.set(parentHive);
202             SessionState.setCurrentSessionState(parentSessionState);
203             // Set current OperationLog in this async thread for keeping on saving query log.
204             registerCurrentOperationLog();
205             try {
206               runQuery(opConfig);
207             } catch (HiveSQLException e) {
208               setOperationException(e);
209               LOG.error("Error running hive query: ", e);
210             } finally {
211               unregisterOperationLog();
212             }
213             return null;
214           }
215         };
216
217         try {
218           currentUGI.doAs(doAsAction);
219         } catch (Exception e) {
220           setOperationException(new HiveSQLException(e));
221           LOG.error("Error running hive query as user : " + currentUGI.getShortUserName(), e);
222         }
223         finally {
224           /**
225            * We'll cache the ThreadLocal RawStore object for this background thread for an orderly cleanup
226            * when this thread is garbage collected later.
227            * @see org.apache.hive.service.server.ThreadWithGarbageCleanup#finalize()
228            */
229           if (ThreadWithGarbageCleanup.currentThread() instanceof ThreadWithGarbageCleanup) {
230             ThreadWithGarbageCleanup currentThread =
231                 (ThreadWithGarbageCleanup) ThreadWithGarbageCleanup.currentThread();
232             currentThread.cacheThreadLocalRawStore();
233           }
234         }
235       }
236     };
237     try {
238       // This submit blocks if no background threads are available to run this operation
239       Future<?> backgroundHandle =
240           getParentSession().getSessionManager().submitBackgroundOperation(backgroundOperation);
241       setBackgroundHandle(backgroundHandle);
242     } catch (RejectedExecutionException rejected) {
243       setState(OperationState.ERROR);
244       throw new HiveSQLException("The background threadpool cannot accept" +
245           " new task for execution, please retry the operation", rejected);
246     }
247   }
248 }
{code}
HiveServer2 leaks OperationHandle on async queries which fail at compile phase -- Key: HIVE-9839 URL: https://issues.apache.org/jira/browse/HIVE-9839 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0, 0.13.1, 1.0.0, 1.1.0 Reporter: Nemon Lou Priority: Critical Attachments: OperationHandleMonitor.java, hive-9839.patch
Using beeline to connect to HiveServer2. And type the following: drop
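The point made in the HIVE-9839 comment, that a submission either succeeds or throws but never both, can be reproduced with a plain `java.util.concurrent` pool. This is a minimal sketch under stated assumptions: `AsyncSubmitSketch`, `OperationException`, and the `State` enum are hypothetical stand-ins, not HiveServer2 classes, and a one-thread pool with a `SynchronousQueue` is used only to force a rejection deterministically.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an async operation; not HiveServer2's SQLOperation.
class AsyncSubmitSketch {
    enum State { PENDING, ERROR }

    // Hypothetical checked exception playing the role of HiveSQLException.
    static class OperationException extends Exception {
        OperationException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    State state = State.PENDING;
    Future<?> backgroundHandle;  // set only when the pool accepted the task

    // Either stores a handle and returns, or marks ERROR and throws; never both.
    void runAsync(ExecutorService pool, Runnable operation) throws OperationException {
        try {
            backgroundHandle = pool.submit(operation);
        } catch (RejectedExecutionException rejected) {
            state = State.ERROR;
            throw new OperationException("The background pool cannot accept a new task", rejected);
        }
    }

    // Submits two operations to a saturated pool and reports their final states.
    static String demo() {
        // One worker and no queue capacity: a second task is rejected while the first runs.
        ExecutorService pool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AsyncSubmitSketch first = new AsyncSubmitSketch();
        AsyncSubmitSketch second = new AsyncSubmitSketch();
        try {
            first.runAsync(pool, blocker);
            second.runAsync(pool, blocker);
        } catch (OperationException expected) {
            // The rejected operation has no background handle to track.
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return first.state + "/" + second.state + "/" + (second.backgroundHandle == null);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // PENDING/ERROR/true
    }
}
```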
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14374768#comment-14374768 ] Matt McCline commented on HIVE-9937: ---
Added more detail to the exception. Still don't understand the issue...
{noformat}
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:404)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:246)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:183)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
 ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:395)
 ... 16 more
Caused by: java.io.EOFException: Detail: java.io.EOFException: Buffer range start 0 current offset 36 range end 36 (total buffer length 66) occured for field 11 of 13 fields (LONG, INT, DOUBLE, SHORT, SHORT, SHORT, DOUBLE, DOUBLE, FLOAT, DOUBLE, DOUBLE, BYTE, DOUBLE)
 at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.throwMoreDetailedException(VectorDeserializeRow.java:668)
 at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:640)
 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:438)
 ... 17 more
{noformat}
LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join -- Key: HIVE-9937 URL: https://issues.apache.org/jira/browse/HIVE-9937 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)