I think the outputOrdering would be that of the big table (if it has one), and it wouldn't matter whether that ordering involves the join keys or not. Am I wrong?
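
To illustrate the intuition, here is a minimal sketch in plain Python. It is not Spark code: `broadcast_hash_join` and the sample rows are made up for illustration, and it assumes an inner join in which the big table is streamed and the small table is built into a hash table. Under those assumptions the output follows the row order of the big table, even when that order is on a non-key column.

```python
from collections import defaultdict


def broadcast_hash_join(streamed, broadcast, streamed_key, broadcast_key):
    """Inner hash join: build a hash table on the small side, stream the big side."""
    table = defaultdict(list)
    for row in broadcast:
        table[row[broadcast_key]].append(row)
    # Rows are emitted while iterating the streamed side in its existing order.
    for s in streamed:
        for b in table.get(s[streamed_key], []):
            yield s + b


# Hypothetical data: the big table is sorted on column 1, not on the join key (column 0).
big = [(3, "a"), (1, "b"), (2, "c"), (1, "d")]
small = [(1, "x"), (2, "y"), (3, "z")]

joined = list(broadcast_hash_join(big, small, 0, 0))
for row in joined:
    print(row)
```

The join keys in the result are not sorted, but the ordering on column 1 inherited from the big table is kept.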

2018-06-28 17:01 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:

> Thanks for the reply.
> Looking into SortMergeJoinExec, I think we can follow what
> SortMergeJoin does: for some types of join, if the children are ordered on
> the join keys, we can report the ordered join keys as the output ordering.
>
>
> Chrysan Wu
> 吴晓菊
> Phone:+86 17717640807
>
>
> 2018-06-28 22:53 GMT+08:00 Wenchen Fan <cloud0...@gmail.com>:
>
>> SortMergeJoin only reports the ordering of the join keys, not the output
>> ordering of any child.
>>
>> It seems reasonable to me that broadcast join should respect the output
>> ordering of the children. Feel free to submit a PR to fix it, thanks!
>>
>> On Thu, Jun 28, 2018 at 10:07 PM 吴晓菊 <chrysan...@gmail.com> wrote:
>>
>>> Why can't we use the output ordering of the big table?
>>>
>>>
>>> Chrysan Wu
>>> Phone:+86 17717640807
>>>
>>>
>>> 2018-06-28 21:48 GMT+08:00 Marco Gaido <marcogaid...@gmail.com>:
>>>
>>>> The easy answer to this is that SortMergeJoin ensures an outputOrdering,
>>>> while BroadcastHashJoin doesn't, i.e. after running a BroadcastHashJoin you
>>>> don't know what the order of the output is going to be, since nothing
>>>> enforces it.
>>>>
>>>> Hope this helps.
>>>> Thanks.
>>>> Marco
>>>>
>>>> 2018-06-28 15:46 GMT+02:00 吴晓菊 <chrysan...@gmail.com>:
>>>>
>>>>>
>>>>> We see that SortMergeJoinExec implements both outputPartitioning and
>>>>> outputOrdering, while BroadcastHashJoinExec only implements
>>>>> outputPartitioning. Why is it designed this way?
>>>>>
>>>>> Chrysan Wu
>>>>> Phone:+86 17717640807
>>>>>
>>>>>
>>>>
>>>
>