[jira] [Updated] (HIVE-19967) SMB Join : ReduceSink should use correct keys in optraits

Deepak Jaiswal (JIRA) Thu, 21 Jun 2018 21:20:57 -0700


     [ 
https://issues.apache.org/jira/browse/HIVE-19967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Deepak Jaiswal updated HIVE-19967:
----------------------------------
    Status: Patch Available  (was: In Progress)

> SMB Join : ReduceSink should use correct keys in optraits
> ---------------------------------------------------------
>
>                 Key: HIVE-19967
>                 URL: https://issues.apache.org/jira/browse/HIVE-19967
>             Project: Hive
>          Issue Type: Task
>            Reporter: Deepak Jaiswal
>            Assignee: Deepak Jaiswal
>            Priority: Major
>
> The optraits for ReduceSinkOp used to use the key columns as the bucket and 
> sort columns, which worked fine for SMB. However, to enable prefix in Bucket 
> Map Join, this logic was changed to use the bucket columns from the parent 
> operators. That change may break reduce-side SMB in a scenario like this:
>  
> Task1 (TS bucketed by col0): the TS passes col0 down to the RS, which ignores 
> its own key columns and uses col0 as the bucket key.
> Task2 (a set of ops that leave the data sorted by a set of columns): with the 
> current logic, the bucketing column set in Task1 keeps getting pushed through 
> the optraits, so the real data flow is lost.
> Task3 (Join op): the physical optimizer looks at the parent RS ops, which 
> incidentally are sorted by the same column as Task1's original bucket column, 
> even though that column has lost its meaning in the meantime.
>  
>  
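
To make the scenario above concrete, here is a minimal Java sketch. It is not
Hive code: the Traits class, both method names, and the RS key column col1 are
hypothetical, chosen only to model the two behaviours the description
contrasts (inheriting the parent's bucket columns versus deriving bucket and
sort columns from the ReduceSink keys).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OpTraitsSketch {
    /** Hypothetical stand-in for an operator's traits (not Hive's own class). */
    static final class Traits {
        final List<String> bucketCols;
        final List<String> sortCols;

        Traits(List<String> bucketCols, List<String> sortCols) {
            this.bucketCols = bucketCols;
            this.sortCols = sortCols;
        }
    }

    /** Behaviour described as problematic: the RS keeps the parent's columns. */
    static Traits inheritFromParent(Traits parent, List<String> rsKeyCols) {
        // rsKeyCols is deliberately ignored, mirroring "ignores the key columns".
        return new Traits(parent.bucketCols, parent.sortCols);
    }

    /** Earlier behaviour: the RS key columns become bucket and sort columns. */
    static Traits fromKeyColumns(List<String> rsKeyCols) {
        return new Traits(rsKeyCols, rsKeyCols);
    }

    public static void main(String[] args) {
        // Task1: a table scan bucketed by col0, with no sort order assumed.
        Traits tableScan =
            new Traits(Arrays.asList("col0"), Collections.<String>emptyList());
        // Assumption: the ReduceSink actually shuffles on col1.
        List<String> rsKeys = Arrays.asList("col1");

        System.out.println(inheritFromParent(tableScan, rsKeys).bucketCols); // [col0]
        System.out.println(fromKeyColumns(rsKeys).bucketCols);               // [col1]
    }
}
```

In this toy model the inherited traits still report col0 after a shuffle on
col1, which is the stale information a downstream join check could misread.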



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
