hvanhovell commented on pull request #29089:
URL: https://github.com/apache/spark/pull/29089#issuecomment-658612372


   Ehh... AFAIK nested ordering can be ignored from a relational algebra point of 
view, so I am not sure this is a very solid argument. This feels a bit like an 
example of [Hyrum's law](https://www.hyrumslaw.com/). If you want sorted runs 
in ORC, then you ought to fix it there and not rely on some implicit system 
behavior.
   
   Regarding the shuffles: if the data is sorted before it goes into the 
shuffle, then the individual shuffle blocks are sorted. This is also the reason 
why doing a sort aggregate is not completely terrible (TimSort is good at 
identifying sorted runs).
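
   To illustrate the TimSort point, here is a rough sketch in Python rather than Spark code. It relies on CPython's built-in `sorted`, which is a TimSort-derived algorithm, and assumes that a concatenation of already-sorted blocks is a fair stand-in for fetched shuffle blocks. The `Counted` wrapper and the block sizes are made up for the illustration; it only counts comparisons and says nothing about Spark's actual sorter or its performance.

```python
import random


class Counted:
    """Hypothetical wrapper that counts every `<` comparison made while sorting."""

    calls = 0

    def __init__(self, key):
        self.key = key

    def __lt__(self, other):
        Counted.calls += 1
        return self.key < other.key


def comparisons_to_sort(keys):
    """Sort the keys with the built-in sort and return the number of comparisons used."""
    Counted.calls = 0
    sorted(Counted(k) for k in keys)
    return Counted.calls


rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for shuffle blocks: each block is sorted on its own,
# but the concatenation of the blocks is not globally sorted.
n_blocks, block_size = 8, 512
blocks = [sorted(rng.random() for _ in range(block_size)) for _ in range(n_blocks)]
runs_input = [k for block in blocks for k in block]

# Same keys with the per-block order destroyed.
shuffled_input = runs_input[:]
rng.shuffle(shuffled_input)

runs_cmp = comparisons_to_sort(runs_input)
shuffled_cmp = comparisons_to_sort(shuffled_input)
print(f"sorted runs: {runs_cmp} comparisons, shuffled: {shuffled_cmp} comparisons")
```

   The input made of sorted runs should need noticeably fewer comparisons, because the sort detects each run and only has to merge them.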
   

