On 17.10.2011 01:09, Jeff Davis wrote:
On Sat, 2011-10-15 at 01:46 +0300, Heikki Linnakangas wrote:
* The binary i/o format includes the length of the lower and upper
bounds twice, once explicitly in range_send, and second time within the
send-function of the subtype. Seems wasteful.

Any ideas how to fix that? How else do I know how much the underlying
send function will consume?

Oh, never mind. I was misreading the code, it's not sending the length twice.

* range_constructor_internal - I think it would be better to move logic
to figure out the the arguments into the callers.

Done.

The comment above range_constructor0() is now outdated.

* The gist support functions frequently call range_deserialize(), which
does catalog lookups. Isn't that horrendously expensive?

Yes, it was. I have introduced a cached structure that avoids syscache
lookups when it's the same range as the last lookup (the common case).

Hmm, I don't think that's safe. After Oid wraparound, a range type oid might get reused for some other range type, and the cache would return stale values. Extremely unlikely to happen by accident, but could be exploited by an attacker.

* What exactly is canonical function supposed to return? It's not clear
what format one should choose as the canonical format when writing a
custom range type. And even for the built-in types, it would be good to
explain which types use which canonical format (I saw the discussions on
that, so I understand that might still be subject to change).

The canonical function is just supposed to return a new range such that
two equal values will have equal representations. I have listed the
built-in discrete range types and their canonical form.

As far as describing what a custom range type is supposed to use for the
canonical form, I don't know which part is exactly unclear. There aren't
too many rules to defining one -- the only guideline is that ranges of
equal value going in should be put in one canonical representation.

Ok. The name "canonical" certainly hints at that, but it would be good to explicitly state that guideline. As the text stands, it would seem that a canonical function that maps "[1,7]" to "[1,8)", and also vice versa, "[1,8)" to "[1,7]", would be valid. That would be pretty silly, but it would be good to say something like "The canonical output for two values that are equal, like [1,7] and [1,8), must be equal. It doesn't matter which representation you choose to be the canonical one, as long as two equal values with different formattings are always mapped to the same value with same formatting"

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to