[ https://issues.apache.org/jira/browse/HIVE-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated HIVE-8597:
---------------------------------
    Attachment: HIVE-8597.1.patch

This patch creates one set of serialized splits for each bucket and reuses it 
across all tasks processing that bucket. It also removes some unused variables 
and cleans up variables so that the objects they hold can be garbage collected.
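
For illustration only, the reuse idea can be sketched roughly as below. This is 
not code from the patch and does not use the Hive or Tez APIs: the class 
SharedSplitPayloads, the method payloadsFor, and the use of plain strings as 
stand-ins for splits are all hypothetical names chosen for the sketch. It only 
shows the general pattern of serializing once per bucket and handing the same 
instances to every task for that bucket.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: caches one list of serialized payloads per bucket.
class SharedSplitPayloads {
    // bucket id -> serialized splits, built at most once per bucket
    private final Map<Integer, List<ByteBuffer>> payloadsByBucket = new HashMap<>();

    // Returns the shared payload list for a bucket, serializing on first use.
    // Strings stand in for real split objects (an assumption of this sketch).
    List<ByteBuffer> payloadsFor(int bucketId, List<String> splitDescriptions) {
        return payloadsByBucket.computeIfAbsent(bucketId, id -> {
            List<ByteBuffer> serialized = new ArrayList<>();
            for (String split : splitDescriptions) {
                byte[] bytes = split.getBytes(StandardCharsets.UTF_8);
                // Read-only so that sharing across tasks cannot corrupt the bytes.
                serialized.add(ByteBuffer.wrap(bytes).asReadOnlyBuffer());
            }
            return Collections.unmodifiableList(serialized);
        });
    }

    public static void main(String[] args) {
        SharedSplitPayloads cache = new SharedSplitPayloads();
        List<String> splits = Arrays.asList("split-0", "split-1");
        // Two tasks working on bucket 3 receive the same list instance.
        List<ByteBuffer> task1 = cache.payloadsFor(3, splits);
        List<ByteBuffer> task2 = cache.payloadsFor(3, splits);
        System.out.println(task1 == task2); // prints true
    }
}
```

A reader of a shared ByteBuffer would typically call duplicate() first, since 
position and limit are per-buffer state even when the contents are read-only.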

[~vikram.dixit] - please review.

> SMB join small table side should use the same set of serialized payloads 
> across tasks
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-8597
>                 URL: https://issues.apache.org/jira/browse/HIVE-8597
>             Project: Hive
>          Issue Type: Improvement
>          Components: Tez
>    Affects Versions: 0.14.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8597.1.patch
>
>
> Each task sees all splits belonging to the bucket it processes. At the 
> moment, tasks end up using different instances of the same serialized 
> split, which adds unnecessary memory pressure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
