GitHub user henryr commented on the issue:

    https://github.com/apache/spark/pull/21049
  
    In SQL, an `ORDER BY` inside a subquery has no defined meaning under the 
relational model - the output of a subquery is an unordered bag of tuples. Some 
engines still allow the sort, some silently drop it, and some throw an error.
    
    For example: 
    
    * MariaDB (silently ignores it): 
https://mariadb.com/kb/en/library/why-is-order-by-in-a-from-subquery-ignored/
    * SQL Server (raises an error): 
https://stackoverflow.com/questions/985921/sql-error-with-order-by-in-subquery
     
    Oracle and Postgres allow the `ORDER BY`.
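    
    A minimal sketch of the "unordered bag" point, using SQLite from Python's 
standard library purely as a stand-in engine (it is not Spark, and the table `t` 
and its rows are made up for illustration). It compares the subquery output 
with and without the inner `ORDER BY` as multisets, ignoring row order:

```python
import sqlite3
from collections import Counter

# In-memory SQLite database; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, v TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(3, "c"), (1, "a"), (2, "b"), (1, "a")],
)

# Same outer query, with and without ORDER BY in the FROM subquery.
with_sort = conn.execute(
    "SELECT id, v FROM (SELECT id, v FROM t ORDER BY id) AS s"
).fetchall()
without_sort = conn.execute(
    "SELECT id, v FROM (SELECT id, v FROM t) AS s"
).fetchall()

# Compare as bags (multisets): duplicates count, row order does not.
same_bag = Counter(with_sort) == Counter(without_sort)
print(same_bag)
```

    Under bag semantics the two results are the same relation, which is why an 
engine can drop the inner sort without changing the answer.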
    
    One issue is that the underlying dataframe model might not be 100% 
relational - maybe dataframes _are_ ordered lists of rows, in which case this 
optimization would only be valid when using the SQL interface. If so, it's 
probably not worth the effort to maintain. But if dataframes and SQL relations 
are supposed to be equivalent, we can drop the `ORDER BY`.
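    
    To make the distinction concrete, here is a toy model in plain Python (not 
Spark code; `subquery` and `keep_sort` are hypothetical names). The same rewrite 
is checked under the two candidate semantics, rows as a bag versus rows as an 
ordered list:

```python
from collections import Counter

# Hypothetical input rows, deliberately not in sorted order.
rows = [(3, "c"), (1, "a"), (2, "b")]

def subquery(data, keep_sort):
    """Toy subquery: sorts rows by tuple order unless the sort is dropped."""
    return sorted(data) if keep_sort else list(data)

original = subquery(rows, keep_sort=True)    # plan with the ORDER BY
optimized = subquery(rows, keep_sort=False)  # plan with the ORDER BY removed

# Relational view: results are bags, so only the contents matter.
equal_as_bags = Counter(original) == Counter(optimized)

# Ordered-list view: row position is part of the result.
equal_as_lists = original == optimized

print(equal_as_bags, equal_as_lists)
```

    The rewrite preserves the result only under the bag view, so whether it is 
safe depends on which of the two models dataframes are meant to follow.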
    
    We may also decide not to do this because it could surprise users who rely 
on the existing behavior.
    


