I'd like to see this implemented with more general collation support in
In general, each index column can be ordered by one collation. A query
matching the index collation can use the index directly, a query asking
for another collation needs to convert. The trivial way to convert from
one collation to another is to sort according to the new collation. For
other conversions, there can be cheaper ways. I'd like to have explicit
support for collations and converting between them, perhaps by
introducing a new ConvertCollation node type. The default implementation
would be to sort, but for example to convert from NULLS FIRST to NULLS
LAST could be implemented by buffering the null columns to temporary
storage and returning all non-null rows first. There's similar tricks
that could be used to convert between many Western European collations,
without resorting to full sort.
The NULLS FIRST/LAST support, as well as ascending and descending
orderings would be special cases of the general collation and collation
I don't know how much work that would be to implement, compared to what
you're proposing. It would require adding an extra collation concept to
all the places that care about sort ordering, but you're proposing to
add nulls-first-or-last flag to all those places anyway.
Tom Lane wrote:
The SQL2003 spec adds optional "NULLS FIRST" and "NULLS LAST" modifiers
for ORDER BY clauses. Teodor proposed an implementation here:
which I didn't care for at all:
Doing this right is going to require introducing the nulls-first-or-last
concept into all the system's handling of sort ordering. Messy as that
sounds, I think it will end up logically cleaner than what we have now,
because it will let us fix some issues involving descending-order index
opclasses and backwards-sort mergejoins. Neither of those can really work
correctly right now, the reason being exactly that we lack a framework for
dealing with variable sort positioning of NULLs.
I'm hoping to fix this as a consequence of the work I'm doing with
operator families for 8.3. What I'd like to come out of it is support
for both NULLS FIRST/LAST and reverse-sort index columns. Reverse-sort
indexes are already in the TODO list, the application being to create
an index whose sort order matches a query like "ORDER BY x ASC, y DESC".
There are some user-visible decisions to be made first, so this message
is to start a discussion about what we want.
One way we could handle this is to say that reverse-sort indexes are
implemented by adding explicit catalog entries for reverse-sort opclasses,
with no additions to the underlying btree index mechanisms. So you
might make an index using a command like
CREATE INDEX fooi ON foo (x, y reverse_int4_ops);
btree indexes would always sort by the given opclass with NULLS LAST.
So the two possible orderings that could be derived from this index
(using forward or backward scan respectively) are
ORDER BY x ASC NULLS LAST, y DESC NULLS LAST
ORDER BY x DESC NULLS FIRST, y ASC NULLS FIRST
The other way that seems like it could win acceptance is to make REVERSE
an explicit optional property of an index column; and if we do that we
might as well allow NULLS FIRST/LAST to be an optional property as well.
Then you could say something like
CREATE INDEX fooi ON foo (x, y REVERSE NULLS FIRST);
(Or maybe use DESC instead of REVERSE as the keyword --- not very
important at this point.) This index would support scans with these
two sort orderings:
ORDER BY x ASC NULLS LAST, y DESC NULLS FIRST
ORDER BY x DESC NULLS FIRST, y ASC NULLS LAST
This second way is more flexible in that it allows indexes to support
mixed null orderings; another attraction is that we'd not have to create
explicit reverse-sort opclasses, which would be a tedious bit of extra
work for every datatype. On the other hand, adding these extra flag bits
to indexes seems like a horribly ugly wart, mainly because they're
irrelevant to anything except a btree index. (Or at least irrelevant to
anything that doesn't support ordered scans, but in practice that's only
btree for the foreseeable future.) Also, having to account for these
options in the btree code would make it more complex and perhaps slower.
Comments? I've got mixed feelings about which way to jump myself.
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster