[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491441#comment-14491441 ]

Matt McCline commented on HIVE-9937:

The "did not produce a TEST-*.xml file" failures for TestMinimrCliDriver are occurring in other submissions as well, so they are unrelated to this patch.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, HIVE-9937.06.patch, HIVE-9937.07.patch, HIVE-9937.08.patch, HIVE-9937.09.patch, HIVE-9937.91.patch, HIVE-9937.92.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491429#comment-14491429 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12724707/HIVE-9937.92.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8686 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3393/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3393/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3393/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12724707 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484929#comment-14484929 ]

Gopal V commented on HIVE-9937:

[~mmccline]: LGTM - +1.

Good test coverage. This is just a new fast SerDe plus test cases, without any deviation from the main codepath until the new operators are introduced.

Reading a Decimal from the Key instead of the Value might be a corner case.

Before commit, can you verify the behavior of BinarySortableSerDe on Decimal (trailing zeros/precision)? Something like vector_decimal_round.q should do as a validity test.
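The trailing-zeros concern raised above can be illustrated with plain java.math.BigDecimal. This is only a sketch of why such a check is worth running: it does not use Hive's own decimal type or BinarySortableSerDe, and the class and method names (DecimalTrailingZerosDemo, normalize) are made up for the example.

```java
import java.math.BigDecimal;

final class DecimalTrailingZerosDemo {
    /** Drops trailing zeros so that 1.10 and 1.1 share one representation. */
    static BigDecimal normalize(BigDecimal d) {
        return d.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.10"); // unscaled value 110, scale 2
        BigDecimal b = new BigDecimal("1.1");  // unscaled value 11, scale 1

        // equals() compares value and scale, so it distinguishes the two.
        System.out.println(a.equals(b));                       // false
        // compareTo() compares numeric value only.
        System.out.println(a.compareTo(b) == 0);               // true
        // After normalization both have the same value and scale.
        System.out.println(normalize(a).equals(normalize(b))); // true
    }
}
```

A serializer that writes the unscaled digits and scale verbatim would produce different bytes for the two values even though they compare as equal, which is the kind of mismatch a test like vector_decimal_round.q could surface.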
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482413#comment-14482413 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12723438/HIVE-9937.91.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8715 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3300/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3300/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3300/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12723438 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481958#comment-14481958 ]

Matt McCline commented on HIVE-9937:

Rebased onto recent check-ins and resubmitted.
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396032#comment-14396032 ]

Gopal V commented on HIVE-9937:

[~mmccline]: Watch out for HIVE-10128, which will break this patch merge (WriteBuffers readPosition changes).
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394538#comment-14394538 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12709123/HIVE-9937.08.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8704 tests executed

*Failed tests:*
{noformat}
TestHs2Hooks - did not produce a TEST-*.xml file
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3270/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3270/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3270/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12709123 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384635#comment-14384635 ]

Matt McCline commented on HIVE-9937:

Thank you for the review comments.

I did have trouble with one of the old VectorSerDe classes buffering up 1024 Object[] rows, which caused Writable overwrite problems. But I stopped using that SerDe.

The singleRow trick has been used by VectorReduceSinkOperator and VectorFileSinkOperator for a while with no problems.
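The overwrite hazard described above (a single reused row that something downstream buffers) can be reproduced in a few lines. The sketch below is an illustration only: RowReuseDemo and its methods are hypothetical names, it uses String values rather than Hadoop Writables, and it is not the code of any Hive operator.

```java
import java.util.ArrayList;
import java.util.List;

final class RowReuseDemo {
    /** Buffers the reused row itself, so every buffered entry aliases one array. */
    static List<Object[]> bufferByReference(String[] values) {
        Object[] singleRow = new Object[1];
        List<Object[]> buffered = new ArrayList<>();
        for (String value : values) {
            singleRow[0] = value;    // overwrite the row in place
            buffered.add(singleRow); // the same array is added every time
        }
        return buffered;
    }

    /** Buffers a copy of the row, so later overwrites are not visible. */
    static List<Object[]> bufferByCopy(String[] values) {
        Object[] singleRow = new Object[1];
        List<Object[]> buffered = new ArrayList<>();
        for (String value : values) {
            singleRow[0] = value;
            // Shallow copy: enough here because String is immutable. Reused
            // mutable objects (such as Writables) would need a deep copy.
            buffered.add(singleRow.clone());
        }
        return buffered;
    }

    public static void main(String[] args) {
        String[] values = {"a", "b", "c"};
        System.out.println(bufferByReference(values).get(0)[0]); // c
        System.out.println(bufferByCopy(values).get(0)[0]);      // a
    }
}
```

Reusing one row is safe as long as each consumer finishes with the row before the next one is written, which matches the statement that the singleRow approach works for operators that do not hold on to rows.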
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383465#comment-14383465 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707584/HIVE-9937.07.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8682 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3177/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707584 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382857#comment-14382857 ]

Gopal V commented on HIVE-9937:

[~mmccline]: Ran a few scale tests last night and there seem to be no visible issues with the patch.

General comment about asserts: the regular runtime turns off asserts, so you should be using Preconditions.check operations, particularly when the check is outside the core loop (like the futures.size one).

Need to re-verify the TODO in VectorAppMasterEventOperator: make sure nothing in super.process actually buffers the Object[] row, since the data is now modified in place, while earlier a new array was generated for each row.

This has no safety switch other than "turn off vectorization", so I'd like to see if [~mmokhtar] can get a full TPC-DS run for this.

With this epic patch, the slowest part of a group-by is now the full-sort, which gives me something else to fix :)
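The point about asserts can be shown with a short sketch. Java skips assert statements unless the JVM is started with -ea, whereas an explicit check always runs. To stay self-contained, the example defines its own checkState helper as a hypothetical stand-in for Guava's Preconditions.checkState; the class name and the "futures" wording in the message are illustrative and not taken from the patch.

```java
final class CheckVersusAssertDemo {
    /** Hypothetical stand-in for Guava's Preconditions.checkState(boolean, Object). */
    static void checkState(boolean expression, String message) {
        if (!expression) {
            throw new IllegalStateException(message);
        }
    }

    /** Returns true only when the JVM runs with assertions enabled (-ea). */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // the assignment happens only under -ea
        return enabled;
    }

    /** Validates a count outside any hot loop, where the check is cheap. */
    static int requireNonEmpty(int size) {
        checkState(size > 0, "expected at least one future, got " + size);
        return size;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        try {
            requireNonEmpty(0);
        } catch (IllegalStateException e) {
            System.out.println("check fired: " + e.getMessage());
        }
    }
}
```

Inside a tight per-row loop an assert can still be a reasonable choice because it costs nothing in production; the explicit check is better suited to one-off validations like the one sketched here.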
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382585#comment-14382585 ]

Matt McCline commented on HIVE-9937:

Rebased onto recent check-ins by Wei and Jason.
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380246#comment-14380246 ]

Hive QA commented on HIVE-9937:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707114/HIVE-9937.06.patch

{color:green}SUCCESS:{color} +1 8342 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3149/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707114 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379168#comment-14379168 ]

Matt McCline commented on HIVE-9937:

Made the new LazyBinaryDeserializeRead behave like LazyBinarySerDe: it now just warns when reading beyond the buffer instead of throwing an EOFException.

So, for a test like vectorization_short_regress.q, we will get a warning in vectorized reduce for not reading all the bytes in the key (1 extra byte, which I think is historical), and a warning when reading beyond the buffer for the value.

These warnings variously appear in all cases {MR | Tez} x {NonVec | Vec}.
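The "warn instead of throw" behavior described above might look roughly like the following. This is a sketch under assumptions, not the LazyBinaryDeserializeRead implementation: LenientRangeReader, its fields, and the finish method are invented for illustration, and warnings are collected in a list rather than sent to a logger.

```java
import java.io.EOFException;
import java.util.ArrayList;
import java.util.List;

final class LenientRangeReader {
    private final byte[] bytes;
    private final int end; // exclusive end of the declared range
    private int offset;
    final List<String> warnings = new ArrayList<>();

    LenientRangeReader(byte[] bytes, int start, int length) {
        this.bytes = bytes;
        this.offset = start;
        this.end = start + length;
    }

    /** Reads one byte; warns past the declared range, fails only past the real buffer. */
    byte readByte() throws EOFException {
        if (offset >= bytes.length) {
            throw new EOFException("offset " + offset + " is past the physical buffer");
        }
        if (offset >= end) {
            warnings.add("read beyond range end " + end + " at offset " + offset);
        }
        return bytes[offset++];
    }

    /** Warns if the caller stopped before consuming the whole declared range. */
    void finish() {
        if (offset < end) {
            warnings.add((end - offset) + " unread byte(s) in range");
        }
    }

    public static void main(String[] args) throws EOFException {
        byte[] buffer = {1, 2, 3, 4};
        LenientRangeReader reader = new LenientRangeReader(buffer, 0, 2);
        reader.readByte();
        reader.readByte();
        reader.readByte(); // beyond the declared range, still inside the buffer
        reader.finish();
        System.out.println(reader.warnings);
    }
}
```

The two warning paths correspond to the two cases mentioned in the comment: bytes left unread in the key, and reads that run past the declared length of the value.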
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377857#comment-14377857 ]

Hive QA commented on HIVE-9937:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706874/HIVE-9937.05.patch

{color:green}SUCCESS:{color} +1 7829 tests passed

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3132/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3132/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3132/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706874 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377591#comment-14377591 ]

Matt McCline commented on HIVE-9937:

Rounded out LazySimple support. Added the remaining data types to TestVectorSerDeRow.
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375613#comment-14375613 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706454/HIVE-9937.03.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7822 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3114/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3114/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3114/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706454 - PreCommit-HIVE-TRUNK-Build
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375432#comment-14375432 ]

Matt McCline commented on HIVE-9937:

It is looking more and more like LazyBinarySerDe must tolerate reading beyond the specified byte range, and that some bug is causing a short length to be returned for the value buffer.

LazyBinarySerDe finds the data; LazyBinaryDeserializeRead finds the data when told to ignore the limit.

For the moment I have suppressed the beyond-buffer range checking and will resubmit.
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374821#comment-14374821 ]

Gopal V commented on HIVE-9937:

The EXPLAIN output would be more useful for figuring out what the expected schema of the reducer is.

Is it possible that some of the needed columns are being read from reducesinkkey0?
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374768#comment-14374768 ]

Matt McCline commented on HIVE-9937:

Added more detail to the exception. Still don't understand the issue...

{noformat}
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:404)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:246)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:183)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:395)
    ... 16 more
Caused by: java.io.EOFException: Detail: "java.io.EOFException: Buffer range start 0 current offset 36 range end 36 (total buffer length 66)" occured for field 11 of 13 fields (LONG, INT, DOUBLE, SHORT, SHORT, SHORT, DOUBLE, DOUBLE, FLOAT, DOUBLE, DOUBLE, BYTE, DOUBLE)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.throwMoreDetailedException(VectorDeserializeRow.java:668)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:640)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:438)
    ... 17 more
{noformat}
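The extra detail in the trace above (which field failed, out of how many, and the row's types) can be attached to a low-level exception with a small wrapper. The sketch below only illustrates that idea: DetailedEofDemo and withFieldDetail are hypothetical names, and this is not the body of VectorDeserializeRow.throwMoreDetailedException.

```java
import java.io.EOFException;

final class DetailedEofDemo {
    /** Wraps a low-level EOFException with the failing field's position and the row's types. */
    static EOFException withFieldDetail(EOFException cause, int fieldIndex, String[] fieldTypes) {
        String message = "Detail: \"" + cause + "\" occurred for field " + fieldIndex
                + " of " + fieldTypes.length + " fields ("
                + String.join(", ", fieldTypes) + ")";
        EOFException detailed = new EOFException(message);
        detailed.initCause(cause); // keep the original stack trace reachable
        return detailed;
    }

    public static void main(String[] args) {
        EOFException lowLevel = new EOFException("Buffer range start 0 current offset 36 range end 36");
        String[] types = {"LONG", "INT", "BYTE"};
        System.out.println(withFieldDetail(lowLevel, 2, types).getMessage());
    }
}
```

Carrying the field index and type list in the message makes it possible to tell from the log alone which column ran off the end of the declared range.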
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374754#comment-14374754 ]

Matt McCline commented on HIVE-9937:

{noformat}
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:404)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:246)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:183)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:395)
    ... 16 more
Caused by: java.io.EOFException
    at org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinaryDeserializeRead.readByte(LazyBinaryDeserializeRead.java:247)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$ByteReader.apply(VectorDeserializeRow.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:628)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:438)
    ... 17 more
{noformat}
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372624#comment-14372624 ]

Hive QA commented on HIVE-9937:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706136/HIVE-9937.02.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7822 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_short_regress
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3104/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3104/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3104/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706136 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363496#comment-14363496 ]

Gopal V commented on HIVE-9937:
---

{code}
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow$StringExtractorByValue.extract(VectorExtractRow.java:427)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:675)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:93)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:835)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:135)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:835)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:160)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
    ... 18 more
{code}

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
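An editorial aside on the trace in the comment above: the top two frames show the NullPointerException being raised inside System.arraycopy, called from Text.set. System.arraycopy throws NullPointerException when either array argument is null, so one possible reading (not confirmed anywhere in this thread) is that a null byte[] reached Text.set. The sketch below reproduces only that JDK behavior. NullSourceCopyDemo, copyInto, and throwsNpe are hypothetical names invented for illustration, and nothing here is Hive or Hadoop code.

```java
import java.nio.charset.StandardCharsets;

public class NullSourceCopyDemo {

    // Hypothetical stand-in for a setter that copies a slice of a byte
    // buffer via System.arraycopy, as the trace shows Text.set doing.
    public static byte[] copyInto(byte[] src, int start, int len) {
        byte[] dest = new byte[len];
        // Documented JDK behavior: throws NullPointerException if src is null.
        System.arraycopy(src, start, dest, 0, len);
        return dest;
    }

    // Returns true when copying from the given source raises an NPE.
    public static boolean throwsNpe(byte[] src, int start, int len) {
        try {
            copyInto(src, start, len);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] value = "abc".getBytes(StandardCharsets.UTF_8);
        // Non-null source: the copy succeeds.
        System.out.println("non-null source throws NPE: " + throwsNpe(value, 0, value.length));
        // Null source: the copy fails inside System.arraycopy.
        System.out.println("null source throws NPE: " + throwsNpe(null, 0, 3));
    }
}
```

If this reading applies, a caller would need to check for a null value (or the column's null flag) before copying; whether that matches the actual cause in VectorExtractRow is an assumption, not something the trace alone establishes.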
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363494#comment-14363494 ]

Gopal V commented on HIVE-9937:
---

[~mmccline]: Pretty impressive performance difference: a shuffle-heavy group-by shows almost 3x CPU savings.

But there seem to be some off-by-one errors somewhere; the results for a few keys look incorrect in the smaller test runs. I'm trying to produce a narrower test case.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359153#comment-14359153 ]

Matt McCline commented on HIVE-9937:

The udaf_percentile_approx_23 test failure is a known issue. See HIVE-9833: udaf_percentile_approx_23.q fails intermittently.

All other tests passed.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join
[ https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358565#comment-14358565 ]

Hive QA commented on HIVE-9937:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12704103/HIVE-9937.01.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7766 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3015/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3015/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3015/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12704103 - PreCommit-HIVE-TRUNK-Build

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)