[ 
https://issues.apache.org/jira/browse/TEZ-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-1499:
----------------------------
    Summary: Add SortMergeJoinExample to tez-examples  (was: Add 
OrderedJoinExample to tez-examples)

> Add SortMergeJoinExample to tez-examples
> ----------------------------------------
>
>                 Key: TEZ-1499
>                 URL: https://issues.apache.org/jira/browse/TEZ-1499
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: Tez-1499-2.patch, Tez-1499.patch
>
>
> In the current join example, the inputs of JoinProcessor are unordered, so it 
> always has to load one input into memory and stream the other. This only 
> works when one dataset is small enough to fit into memory (even without 
> broadcast, memory may not be enough). So I'd like to add another join example 
> in which the inputs of JoinProcessor are ordered (using 
> OrderedPartitionedKVEdgeConfig). This kind of join could be used when both 
> datasets are large.
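
For illustration only, the merge step that ordered inputs make possible could be sketched roughly as below. This is plain Java and not the attached patch or the Tez API: the class name SortMergeJoinSketch, the join method, and the use of String keys are assumptions made for the example. It assumes both inputs arrive sorted in ascending order with non-null keys, and it emits each key that appears on both sides once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of a sort-merge join over two key-sorted inputs.
// Not Tez code: in Tez the ordered edges would supply the sorted inputs.
final class SortMergeJoinSketch {

    // Returns the next key, or null when the input is exhausted.
    // Assumes keys themselves are never null.
    private static String advance(Iterator<String> it) {
        return it.hasNext() ? it.next() : null;
    }

    // Streams both inputs once; neither side is buffered in memory.
    // The result is collected in a list only to keep the example small.
    static List<String> join(Iterator<String> left, Iterator<String> right) {
        List<String> out = new ArrayList<>();
        String l = advance(left);
        String r = advance(right);
        while (l != null && r != null) {
            int cmp = l.compareTo(r);
            if (cmp < 0) {
                l = advance(left);      // left key is smaller, move left forward
            } else if (cmp > 0) {
                r = advance(right);     // right key is smaller, move right forward
            } else {
                // Keys match: emit once, skipping repeats of the same key.
                if (out.isEmpty() || !out.get(out.size() - 1).equals(l)) {
                    out.add(l);
                }
                l = advance(left);
                r = advance(right);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "c", "e", "g");
        List<String> b = Arrays.asList("b", "c", "d", "g", "h");
        System.out.println(join(a.iterator(), b.iterator()));
    }
}
```

Because each side is only ever advanced forward, memory use stays constant regardless of input size, which is the property the unordered join lacks.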



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
