[jira] [Commented] (HIVE-7105) Enable ReduceRecordProcessor to generate VectorizedRowBatches

Remus Rusanu (JIRA) Wed, 21 May 2014 03:57:13 -0700

    [ https://issues.apache.org/jira/browse/HIVE-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004572#comment-14004572 ]


Remus Rusanu commented on HIVE-7105:
------------------------------------

Extending vectorized processing to the reduce side is a complex 
undertaking. None of the vector mode operators are implemented on the reduce 
side. The thinking is that the bulk of the CPU-intensive processing occurs on 
the map side, and our goal was to provide maximum feature coverage (i.e., 
implement as many operators as needed to cover the most queries). At the 
moment, though, vectorization only works for the map side of the first stage. 
I'm not sure whether we can yet call the map-side effort stable/mature/complete 
enough to warrant a shift of focus to the reduce side.

> Enable ReduceRecordProcessor to generate VectorizedRowBatches
> -------------------------------------------------------------
>
>                 Key: HIVE-7105
>                 URL: https://issues.apache.org/jira/browse/HIVE-7105
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Rajesh Balamohan
>            Assignee: Jitendra Nath Pandey
>
> Currently, ReduceRecordProcessor sends one key/value pair at a time to its 
> operator pipeline. It would be beneficial to send a VectorizedRowBatch to 
> downstream operators.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
