I think the outputOrdering would be that of the big table (if it has one), and
it wouldn't matter whether it involves the join keys or not. Am I wrong?
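
The intuition behind this can be sketched with a toy model. This is not Spark's BroadcastHashJoinExec: it uses plain Scala collections, and `Row` and `broadcastHashJoin` are hypothetical names invented for the illustration. It assumes an inner join in which the small side is hashed once and the big (streamed) side is iterated in its existing order, so the output keeps the streamed side's order even when that order is on a non-key column.

```scala
// Toy model only: plain Scala collections, not Spark's physical operators.
object BroadcastJoinOrderingSketch {
  // Hypothetical row type: a join key plus one payload column.
  final case class Row(key: Int, value: String)

  // Hypothetical helper: hash the small (broadcast) side once, then
  // iterate the big (streamed) side in its existing order and probe the map.
  def broadcastHashJoin(streamed: Seq[Row], broadcast: Seq[Row]): Seq[(Row, Row)] = {
    val hashed: Map[Int, Seq[Row]] = broadcast.groupBy(_.key)
    for {
      s <- streamed                            // order of the streamed side drives the output
      b <- hashed.getOrElse(s.key, Seq.empty)  // all matches for one streamed row stay adjacent
    } yield (s, b)
  }

  def main(args: Array[String]): Unit = {
    // The big side is sorted by `value`, which is NOT the join key.
    val big = Seq(Row(3, "a"), Row(1, "b"), Row(2, "c"), Row(1, "d"))
    val small = Seq(Row(1, "x"), Row(2, "y"), Row(3, "z"))

    val joined = broadcastHashJoin(big, small)
    val streamedValues = joined.map(_._1.value)

    // The streamed side's ordering on `value` survives the join.
    assert(streamedValues == streamedValues.sorted)
    println(streamedValues.mkString(","))
  }
}
```

Whether the real operator can report this as its outputOrdering for every join type is exactly what the thread is discussing; the sketch only covers the inner-join case within a single partition.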

2018-06-28 17:01 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:

> Thanks for the reply.
> Looking at SortMergeJoinExec, I think we can follow what SortMergeJoin
> does: for some join types, if the children are ordered on the join keys,
> we can report the ordered join keys as the output ordering.
>
>
> Chrysan Wu
> 吴晓菊
> Phone:+86 17717640807
>
>
> 2018-06-28 22:53 GMT+08:00 Wenchen Fan <cloud0...@gmail.com>:
>
>> SortMergeJoin only reports ordering of the join keys, not the output
>> ordering of any child.
>>
>> It seems reasonable to me that broadcast join should respect the output
>> ordering of the children. Feel free to submit a PR to fix it, thanks!
>>
>> On Thu, Jun 28, 2018 at 10:07 PM 吴晓菊 <chrysan...@gmail.com> wrote:
>>
>>> Why can't we use the output ordering of the big table?
>>>
>>>
>>> Chrysan Wu
>>> Phone:+86 17717640807
>>>
>>>
>>> 2018-06-28 21:48 GMT+08:00 Marco Gaido <marcogaid...@gmail.com>:
>>>
>>>> The short answer is that SortMergeJoin guarantees an outputOrdering,
>>>> while BroadcastHashJoin doesn't, i.e. after running a BroadcastHashJoin you
>>>> don't know what the order of the output will be, since nothing
>>>> enforces it.
>>>>
>>>> Hope this helps.
>>>> Thanks.
>>>> Marco
>>>>
>>>> 2018-06-28 15:46 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:
>>>>
>>>>>
>>>>> We see that SortMergeJoinExec implements both outputPartitioning and
>>>>> outputOrdering, while BroadcastHashJoinExec implements only
>>>>> outputPartitioning. Why is it designed this way?
>>>>>
>>>>> Chrysan Wu
>>>>> Phone:+86 17717640807
>>>>>
>>>>>
>>>>
>>>
>
