Now, if you were ruthless, it'd make sense to randomise the order of results whenever someone left out the ORDER BY, to stop complacency.
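A minimal sketch of that idea in plain Python, not Spark: the `run_query` helper, its `order_by` parameter and the sample rows are all made up for illustration. It shuffles the result whenever the caller gives no ordering, so any code that quietly depends on input order breaks early.

```python
import random


def run_query(rows, order_by=None, rng=random):
    """Return rows sorted by order_by, or deliberately shuffled if it is omitted.

    Hypothetical helper for illustration only; not a Spark or SQL API.
    """
    result = list(rows)  # copy, so the caller's data is left untouched
    if order_by is None:
        # No ordering requested: shuffle so nobody can rely on input order.
        rng.shuffle(result)
    else:
        result.sort(key=order_by)
    return result


# Illustrative data only.
people = [("Tom", 14), ("Alice", 23), ("Bob", 16)]

unordered = run_query(people)                            # any order may come back
ordered = run_query(people, order_by=lambda row: row[1])  # sorted by age
```

Only `ordered` can safely be indexed by position; `unordered[0]` may be any of the three rows from one call to the next.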
It would be like that time Sun changed the order in which methods were returned by a Class.getDeclaredMethods() call, and everyone's JUnit test cases failed if they had assumed the ordering was that of the source file (which it was until then, even though the language spec said "no guarantees"). People code for what works, not for what is documented in places they don't read.

(This is also why anyone writing network code should really have a flaky network connection to keep themselves honest.)

On Sat, 23 Sept 2023 at 11:00, beliefer <belie...@163.com> wrote:

> AFAIK, the order is free whether it's SQL without a specified ORDER BY clause
> or a DataFrame without a sort. The behavior is consistent between them.
>
> At 2023-09-18 23:47:40, "Nicholas Chammas" <nicholas.cham...@gmail.com>
> wrote:
>
> I’ve always considered DataFrames to be logically equivalent to SQL tables
> or queries.
>
> In SQL, the result order of any query is implementation-dependent without
> an explicit ORDER BY clause. Technically, you could run `SELECT * FROM
> table;` 10 times in a row and get 10 different orderings.
>
> I thought the same applied to DataFrames, but the docstring for the
> recently added method DataFrame.offset
> <https://github.com/apache/spark/pull/40873/files#diff-4ff57282598a3b9721b8d6f8c2fea23a62e4bc3c0f1aa5444527549d1daa38baR1293-R1301>
> implies otherwise.
>
> This example will work fine in practice, of course. But if DataFrames are
> technically unordered without an explicit ordering clause, then in theory a
> future implementation change may result in “Bob” being the “first” row in
> the DataFrame, rather than “Tom”. That would make the example incorrect.
>
> Is that not the case?
>
> Nick