[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-12 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491441#comment-14491441
 ] 

Matt McCline commented on HIVE-9937:


The "did not produce a TEST-*.xml file" failures for TestMinimrCliDriver are 
occurring in other submissions as well, so they are unrelated to this patch.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Matt McCline
>Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch, 
> HIVE-9937.03.patch, HIVE-9937.04.patch, HIVE-9937.05.patch, 
> HIVE-9937.06.patch, HIVE-9937.07.patch, HIVE-9937.08.patch, 
> HIVE-9937.09.patch, HIVE-9937.91.patch, HIVE-9937.92.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491429#comment-14491429
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12724707/HIVE-9937.92.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8686 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3393/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3393/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3393/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12724707 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-08 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14484929#comment-14484929
 ] 

Gopal V commented on HIVE-9937:
---

[~mmccline]: LGTM - +1.

Good test coverage - this is just a new fast SerDe plus test cases, with no 
change to the main code path until the new operators are introduced.

Reading a Decimal from the Key instead of the Value might be a corner case.

Before commit, can you verify the behavior of BinarySortableSerDe on Decimal 
(trailing zeros/precision)? Something like vector_decimal_round.q should do as 
a validity test.
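
To make the trailing-zeros concern concrete, here is a minimal sketch using plain java.math.BigDecimal. It is not Hive's decimal type or the SerDe itself; the class name and the "lossy round trip" are assumptions made only for illustration. It shows why a round-trip check should compare scale as well as numeric value.

```java
import java.math.BigDecimal;

// Illustrative only: plain java.math.BigDecimal, not Hive's decimal type or any SerDe.
class DecimalScaleSketch {
    // True when a value survives a round trip with both its numeric value and its scale intact.
    static boolean sameValueAndScale(BigDecimal before, BigDecimal after) {
        return before.compareTo(after) == 0 && before.scale() == after.scale();
    }

    public static void main(String[] args) {
        BigDecimal written = new BigDecimal("1.50");
        // Hypothetical lossy round trip that drops trailing zeros.
        BigDecimal readBack = written.stripTrailingZeros();

        System.out.println(written.compareTo(readBack) == 0);      // true: numerically equal
        System.out.println(written.equals(readBack));              // false: equals() also compares scale
        System.out.println(sameValueAndScale(written, readBack));  // false: scale 2 vs scale 1
    }
}
```

A test that only compares values numerically would pass here even though the precision information changed, which is the kind of case a .q test over rounded decimals can catch.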



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482413#comment-14482413
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12723438/HIVE-9937.91.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8715 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3300/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3300/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3300/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12723438 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-06 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481958#comment-14481958
 ] 

Matt McCline commented on HIVE-9937:


Rebased onto recent check-ins and resubmitted.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-04 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396032#comment-14396032
 ] 

Gopal V commented on HIVE-9937:
---

[~mmccline]: watch out for HIVE-10128, which will break this patch merge 
(WriteBuffers readPosition changes)



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-04-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394538#comment-14394538
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12709123/HIVE-9937.08.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8704 tests executed
*Failed tests:*
{noformat}
TestHs2Hooks - did not produce a TEST-*.xml file
TestMinimrCliDriver-smb_mapjoin_8.q - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3270/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3270/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3270/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12709123 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384635#comment-14384635
 ] 

Matt McCline commented on HIVE-9937:


Thank you for the review comments.

I did have trouble with one of the old VectorSerDe classes buffering up 1024 
Object[] rows, which caused Writable overwrite problems.  But I stopped using 
that SerDe.  The singleRow trick has been used by VectorReduceSinkOperator and 
VectorFileSinkOperator for a while with no problems.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383465#comment-14383465
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707584/HIVE-9937.07.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8682 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3177/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3177/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707584 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382857#comment-14382857
 ] 

Gopal V commented on HIVE-9937:
---

[~mmccline]: Ran a few scale tests last night and there seem to be no visible 
issues with the patch from last night.

General comment about asserts - the regular runtime turns off asserts, so you 
should be using Preconditions.check operations particularly if it is outside 
the core loop (like the futures.size).
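
A minimal sketch of the difference, assuming a hypothetical setup-time check on a list of futures. The checkState helper below is a local stand-in with the same intent as Guava's Preconditions.checkState, written out so the example has no external dependency; the class and method names are illustrative, not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: checkState is a local stand-in for a Guava-style precondition check.
class PreconditionSketch {
    static void checkState(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }

    // Hypothetical setup-time validation, outside any per-row loop.
    static void validateFutures(List<?> futures, int expected) {
        // An `assert futures.size() == expected;` here is skipped unless the JVM runs with -ea.
        checkState(futures.size() == expected,
            "expected " + expected + " futures but found " + futures.size());
    }

    public static void main(String[] args) {
        List<Object> futures = new ArrayList<>();
        futures.add(new Object());
        validateFutures(futures, 1); // passes silently
        try {
            validateFutures(futures, 2);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The explicit check costs one comparison, which is negligible outside the inner loop, and it fires regardless of how the JVM was started.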

Need to re-verify the TODO in VectorAppMasterEventOperator - make sure nothing 
in the super.process actually buffers the Object[] row, since now the data is 
modified in-place, while earlier it was generating a new array for each row.
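
The hazard can be sketched as follows. This is a hypothetical consumer, not a Hive operator: one method buffers the row reference it is handed while the caller keeps rewriting the same array, the other buffers a copy.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: hypothetical buffering consumers, not Hive code.
class RowReuseSketch {
    // Buffers the reference it is handed; unsafe when the producer reuses one array.
    static List<Object[]> bufferByReference(int rows) {
        List<Object[]> buffered = new ArrayList<>();
        Object[] singleRow = new Object[1];
        for (int i = 0; i < rows; i++) {
            singleRow[0] = i;         // modified in place for each row
            buffered.add(singleRow);  // every entry aliases the same array
        }
        return buffered;
    }

    // Buffers a shallow copy, so later in-place writes to the array do not affect it.
    // Note: if the elements themselves were mutable and reused, a deeper copy would be needed.
    static List<Object[]> bufferByCopy(int rows) {
        List<Object[]> buffered = new ArrayList<>();
        Object[] singleRow = new Object[1];
        for (int i = 0; i < rows; i++) {
            singleRow[0] = i;
            buffered.add(singleRow.clone());
        }
        return buffered;
    }

    public static void main(String[] args) {
        System.out.println(bufferByReference(3).get(0)[0]); // 2: the first row was overwritten
        System.out.println(bufferByCopy(3).get(0)[0]);      // 0: the first row was preserved
    }
}
```

With a fresh array per row, buffering by reference happened to be safe; once the row is reused, any code path that holds on to it past the call needs to copy.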

This has no safety switch other than "turn off vectorization", so I'd 
like to see if [~mmokhtar] can get a full TPC-DS run for this.

With this epic patch, the slowest part of a group-by is now the full-sort, 
which gives me something else to fix :)



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382585#comment-14382585
 ] 

Matt McCline commented on HIVE-9937:


Rebased onto recent check-ins by Wei and Jason.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380246#comment-14380246
 ] 

Hive QA commented on HIVE-9937:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12707114/HIVE-9937.06.patch

{color:green}SUCCESS:{color} +1 8342 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3149/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3149/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12707114 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379168#comment-14379168
 ] 

Matt McCline commented on HIVE-9937:


Made the new LazyBinaryDeserializeRead behave like LazyBinarySerDe and just warn 
when reading beyond the buffer instead of throwing an EOFException.
So, for a test like vectorization_short_regress.q, we will get a warning in 
vectorized reduce for not reading all the bytes in the key (1 extra byte, which 
I think is historical), and a warning when reading beyond the buffer for the 
value.
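
A rough sketch of the "warn instead of throw" behavior, under stated assumptions: the reader class, its fields, and its method names are hypothetical and are not the actual LazyBinaryDeserializeRead API. It warns both when a read goes past the declared range and when the declared range is not fully consumed.

```java
import java.util.logging.Logger;

// Sketch only: a hypothetical lenient reader, not the real deserializer.
class LenientRangeReaderSketch {
    private static final Logger LOG = Logger.getLogger("LenientRangeReaderSketch");

    private final byte[] bytes;
    private final int end;   // exclusive end of the declared range
    private int offset;
    private int warnings;

    LenientRangeReaderSketch(byte[] bytes, int start, int length) {
        this.bytes = bytes;
        this.offset = start;
        this.end = start + length;
    }

    // Reads one byte. Past the declared range it warns and keeps going,
    // which only works while the backing array still has data.
    byte readByte() {
        if (offset >= end) {
            warnings++;
            LOG.warning("reading beyond declared range: offset " + offset + ", range end " + end);
        }
        return bytes[offset++];
    }

    // Call after the last field: warns if some declared bytes were never read.
    void finish() {
        if (offset < end) {
            warnings++;
            LOG.warning("did not read all bytes: " + (end - offset) + " left over");
        }
    }

    int warningCount() {
        return warnings;
    }
}
```

A strict variant would throw java.io.EOFException in readByte instead of logging; the lenient form trades that safety for compatibility with callers that pass a range length that does not match the data.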

These warnings variously appear in all cases {MR | Tez} x {NonVec | Vec}.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377857#comment-14377857
 ] 

Hive QA commented on HIVE-9937:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706874/HIVE-9937.05.patch

{color:green}SUCCESS:{color} +1 7829 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3132/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3132/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3132/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706874 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14377591#comment-14377591
 ] 

Matt McCline commented on HIVE-9937:


Rounded out LazySimple support.  Added remaining data types to TestVectorSerDeRow.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375613#comment-14375613
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706454/HIVE-9937.03.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7822 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3114/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3114/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3114/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706454 - PreCommit-HIVE-TRUNK-Build



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-22 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375432#comment-14375432
 ] 

Matt McCline commented on HIVE-9937:


It is looking more and more like LazyBinarySerDe must tolerate reading beyond 
the specified byte range, and that there is some bug causing a short length 
to be returned for the value buffer.  LazyBinarySerDe finds the data; 
LazyBinaryDeserializeRead finds the data when told to ignore the limit.

For the moment I have suppressed the beyond-buffer range check and will resubmit.



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-22 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374821#comment-14374821
 ] 

Gopal V commented on HIVE-9937:
---

The EXPLAIN output would be more useful for figuring out what the expected 
schema of the reducer is.

Is it possible that some of the columns needed are being read from 
reducesinkkey0?



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374768#comment-14374768
 ] 

Matt McCline commented on HIVE-9937:


Added more detail to the exception.  Still don't understand the issue...

{noformat}
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:404)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:246)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:183)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:395)
    ... 16 more
Caused by: java.io.EOFException: Detail: "java.io.EOFException: Buffer range start 0 current offset 36 range end 36 (total buffer length 66)" occured for field 11 of 13 fields (LONG, INT, DOUBLE, SHORT, SHORT, SHORT, DOUBLE, DOUBLE, FLOAT, DOUBLE, DOUBLE, BYTE, DOUBLE)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.throwMoreDetailedException(VectorDeserializeRow.java:668)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:640)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:438)
    ... 17 more
{noformat}
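
For readers unfamiliar with the pattern, here is a minimal sketch of wrapping a bare EOFException with field-level context, similar in spirit to the "Detail:" message in the trace above. The helper name, its parameters, and the exact message layout are assumptions for illustration; this is not the code in VectorDeserializeRow.

```java
import java.io.EOFException;
import java.util.Arrays;

// Sketch only: a hypothetical helper that adds field context to an EOFException.
class DetailedEofSketch {
    static EOFException withFieldDetail(EOFException cause, int fieldIndex, String[] fieldTypes) {
        EOFException detailed = new EOFException(
            "Detail: \"" + cause + "\" occurred for field " + fieldIndex + " of "
                + fieldTypes.length + " fields " + Arrays.toString(fieldTypes));
        // EOFException has no (String, Throwable) constructor, so attach the cause separately.
        detailed.initCause(cause);
        return detailed;
    }

    public static void main(String[] args) {
        EOFException raw = new EOFException("Buffer range start 0 current offset 36 range end 36");
        EOFException detailed = withFieldDetail(raw, 11, new String[] {"LONG", "INT", "BYTE"});
        System.out.println(detailed.getMessage());
    }
}
```

Keeping the original exception as the cause preserves the low-level stack trace while the message says which field and schema were being read.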



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14374754#comment-14374754
 ] 

Matt McCline commented on HIVE-9937:


{noformat}
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:404)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:246)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:183)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
    ... 13 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:470)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector(ReduceRecordSource.java:395)
    ... 16 more
Caused by: java.io.EOFException
    at org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinaryDeserializeRead.readByte(LazyBinaryDeserializeRead.java:247)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$ByteReader.apply(VectorDeserializeRow.java:121)
    at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:628)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(ReduceRecordSource.java:438)
    ... 17 more
{noformat}



[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372624#comment-14372624
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12706136/HIVE-9937.02.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7822 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_short_regress
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3104/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3104/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3104/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12706136 - PreCommit-HIVE-TRUNK-Build

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch, HIVE-9937.02.patch
>
>






[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363496#comment-14363496
 ] 

Gopal V commented on HIVE-9937:
---

{code}
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.io.Text.set(Text.java:225)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow$StringExtractorByValue.extract(VectorExtractRow.java:427)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:675)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:93)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:835)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:135)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:835)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:160)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
    ... 18 more
{code}

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>






[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-16 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363494#comment-14363494
 ] 

Gopal V commented on HIVE-9937:
---

[~mmccline]: Pretty impressive performance difference: a shuffle-heavy 
group-by shows almost 3x CPU savings.

But there are some off-by-one errors somewhere: the results for a few keys 
seem incorrect in the smaller test runs. I am trying to produce a narrower 
test case.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-12 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359153#comment-14359153
 ] 

Matt McCline commented on HIVE-9937:


The udaf_percentile_approx_23 test failure is a known issue. See HIVE-9833: 
udaf_percentile_approx_23.q fails intermittently.

All other tests passed.

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>






[jira] [Commented] (HIVE-9937) LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new Vectorized Map Join

2015-03-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358565#comment-14358565
 ] 

Hive QA commented on HIVE-9937:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12704103/HIVE-9937.01.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7766 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3015/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3015/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3015/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12704103 - PreCommit-HIVE-TRUNK-Build

> LLAP: Vectorized Field-By-Field Serialize / Deserialize to support new 
> Vectorized Map Join
> --
>
> Key: HIVE-9937
> URL: https://issues.apache.org/jira/browse/HIVE-9937
> Project: Hive
> Issue Type: Sub-task
> Reporter: Matt McCline
> Assignee: Matt McCline
> Attachments: HIVE-9937.01.patch
>
>



