[HACKERS] Reverse-sort indexes and NULLS FIRST/LAST sorting

Tom Lane Mon, 01 Jan 2007 14:53:55 -0800

The SQL2003 spec adds optional "NULLS FIRST" and "NULLS LAST" modifiers
for ORDER BY clauses.  Teodor proposed an implementation here:
http://archives.postgresql.org/pgsql-patches/2006-12/msg00019.php
which I didn't care for at all:
http://archives.postgresql.org/pgsql-hackers/2006-12/msg00133.php


Doing this right is going to require introducing the nulls-first-or-last
concept into all the system's handling of sort ordering.  Messy as that
sounds, I think it will end up logically cleaner than what we have now,
because it will let us fix some issues involving descending-order index
opclasses and backwards-sort mergejoins.  Neither of those can really work
correctly right now, the reason being exactly that we lack a framework for
dealing with variable sort positioning of NULLs.

I'm hoping to fix this as a consequence of the work I'm doing with
operator families for 8.3.  What I'd like to come out of it is support
for both NULLS FIRST/LAST and reverse-sort index columns.  Reverse-sort
indexes are already in the TODO list, the application being to create
an index whose sort order matches a query like "ORDER BY x ASC, y DESC".
There are some user-visible decisions to be made first, so this message
is to start a discussion about what we want.

One way we could handle this is to say that reverse-sort indexes are
implemented by adding explicit catalog entries for reverse-sort opclasses,
with no additions to the underlying btree index mechanisms.  So you
might make an index using a command like

        CREATE INDEX fooi ON foo (x, y reverse_int4_ops);

btree indexes would always sort by the given opclass with NULLS LAST.
So the two possible orderings that could be derived from this index
(using forward or backward scan respectively) are

        ORDER BY x ASC NULLS LAST, y DESC NULLS LAST
        ORDER BY x DESC NULLS FIRST, y ASC NULLS FIRST

The other way that seems like it could win acceptance is to make REVERSE
an explicit optional property of an index column; and if we do that we
might as well allow NULLS FIRST/LAST to be an optional property as well.
Then you could say something like

        CREATE INDEX fooi ON foo (x, y REVERSE NULLS FIRST);

(Or maybe use DESC instead of REVERSE as the keyword --- not very
important at this point.)  This index would support scans with these
two sort orderings:

        ORDER BY x ASC NULLS LAST, y DESC NULLS FIRST
        ORDER BY x DESC NULLS FIRST, y ASC NULLS LAST

This second way is more flexible in that it allows indexes to support
mixed null orderings; another attraction is that we'd not have to create
explicit reverse-sort opclasses, which would be a tedious bit of extra
work for every datatype.  On the other hand, adding these extra flag bits
to indexes seems like a horribly ugly wart, mainly because they're
irrelevant to anything except a btree index.  (Or at least irrelevant to
anything that doesn't support ordered scans, but in practice that's only
btree for the foreseeable future.)  Also, having to account for these
options in the btree code would make it more complex and perhaps slower.

Comments?  I've got mixed feelings about which way to jump myself.

                        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[HACKERS] Reverse-sort indexes and NULLS FIRST/LAST sorting

Reply via email to