[ 
https://issues.apache.org/jira/browse/HIVE-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891107#comment-15891107
 ] 

Ashutosh Chauhan commented on HIVE-15844:
-----------------------------------------

Thanks for taking this up, [~ekoifman]. This kind of refactoring is very much 
needed to keep devs sane and the code readable.

I am still troubled by SPDO inserting a "constant" bucket_number column and the 
RS magically computing and replacing that constant at runtime. Ideally it 
should be created as a column expression that is evaluated like any other 
expression in the RS (or perhaps in the SEL prior to it). I am hopeful that 
refactoring will happen someday as well :)
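
To illustrate what I mean, here is a toy sketch. It is not Hive code: the class, 
the Row type, and both methods are hypothetical, and the hash-mod formula is only 
an assumption about how a bucket number could be derived. The point is just that 
the bucket number is built once as an ordinary per-row expression, instead of a 
placeholder constant that a later operator overwrites.

```java
import java.util.function.ToIntFunction;

public class BucketExprSketch {
    /** Toy row carrying only the hash of its bucketing key (hypothetical type). */
    public static final class Row {
        final int keyHash;

        public Row(int keyHash) {
            this.keyHash = keyHash;
        }
    }

    /** Maps a hash to a bucket in [0, numBuckets); masking the sign bit keeps it non-negative. */
    public static int bucketNumber(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    /** Returns the bucket number as a regular expression evaluated for each row. */
    public static ToIntFunction<Row> bucketExpr(int numBuckets) {
        return row -> bucketNumber(row.keyHash, numBuckets);
    }

    public static void main(String[] args) {
        ToIntFunction<Row> expr = bucketExpr(4);
        // Evaluate the expression for a positive and a negative hash.
        System.out.println(expr.applyAsInt(new Row(7)));
        System.out.println(expr.applyAsInt(new Row(-1)));
    }
}
```

With something along these lines, the operator that consumes the column would 
need no special case: it evaluates the expression like any other.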

> Make ReduceSinkOperator independent of Acid
> -------------------------------------------
>
>                 Key: HIVE-15844
>                 URL: https://issues.apache.org/jira/browse/HIVE-15844
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15844.01.patch, HIVE-15844.02.patch, 
> HIVE-15844.03.patch, HIVE-15844.04.patch, HIVE-15844.05.patch, 
> HIVE-15844.06.patch, HIVE-15844.07.patch, HIVE-15844.08.patch
>
>
> # Both FileSinkDesc and ReduceSinkDesc have a special code path for 
> Update/Delete operations. It is not always set correctly for ReduceSink. 
> ReduceSinkDeDuplication is one place where it gets lost. Even when it isn't 
> set correctly, elsewhere we set ROW_ID to be the partition column of the 
> ReduceSinkOperator and UDFToInteger special-cases it to extract bucketId from 
> ROW_ID. We need to modify Explain Plan to record Write Type (i.e. 
> insert/update/delete) to make sure we have tests that can catch errors here.
> # Add some validation at the end of the plan to make sure that any RSO/FSO that 
> represents the end of the pipeline and writes to an acid table has WriteType 
> set (to something other than the default).
> # We don't seem to have any tests where the number of buckets is greater than 
> the number of reducers. Add those.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
