On Wed, Oct 4, 2023 at 6:37 PM Erik Wienhold <e...@ewie.name> wrote:

> On 2023-10-04 16:24 +0200, PG Doc comments form write:
> > The following documentation comment has been logged on the website:
> >
> > Page: https://www.postgresql.org/docs/16/queries-order.html
> > Description:
> >
> > The document only says this about unsorted queries:
> >
> > > After a query has produced an output table (after the select list has
> been
> > > processed) it can optionally be sorted. If sorting is not chosen, the
> rows
> > > will be returned in an unspecified order. The actual order in that
> case will
> > > depend on the scan and join plan types and the order on disk, but it
> must
> > > not be relied on. A particular output ordering can only be guaranteed
> if the
> > > sort step is explicitly chosen.
> >
> > It mentions "If sorting is not chosen". This sort of implies that if you
> > pick a sort the output order is predictable. However I believe that the
> only
> > actual guarantee is if the sort columns selected produce a unique value.
> >
> > For example if you do `ORDER BY name` and have two rows with the same
> name I
> > don't think the order of those rows is predictable.
>
> "The relative ordering of two rows that are not distinct with respect to
>  the <sort specification> is implementation-dependent."
>

The OP is assuming a promise of a deterministic ordering of all output rows
and such a promise is only possible if the order by clause columns uniquely
identify every row in the output.  This is because all the order by
promises is that output ordering will conform to the order by
specification, and indeed if it is under-specified such that multiple rows
match a given bin, then there is no deterministic relative ordering among
those rows.

I don't feel that the wording makes any such inference regarding
determinism of row output due to the mere presence of an order by clause.
Nor doesn't such determinism in the face of an under-specific clause even
make logical sense.  I'm mostly inclined to leave the wording alone given
this single report.  My only complaints are style-istic at this point.

That said, maybe a final sentence:

Assuming every output row can be uniquely identified by some subset of the
output columns, that subset must all be listed within the order by clause
if you wish to ensure a fully deterministic ordering.

David J.

Reply via email to