subject:"Re\: \[HACKERS\] Range Types, constructors, and the type system"

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Robert Haas

On Wed, Jul 6, 2011 at 1:19 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-07-05 at 13:06 -0400, Robert Haas wrote:
 On Tue, Jul 5, 2011 at 12:54 PM, Jeff Davis pg...@j-davis.com wrote:
  It would be something like: range_co(1,8)::int8range
 
  (just so we're comparing apples to apples)
 
  The intermediate type proposal doesn't require that we move the c and
  o into the parameter list.

 Well, you have to specify the bounds somewhere...

 That's true. In my example it's in the function name.

 OK, so let's pass the information on the bounds as a separate
 argument.  Like this:

 int8range(1,8,'co')

 That has a lot going for it, in the sense that it avoids dealing with
 the type problems.

 Then you can instead pass 'o' for open or 'i' for infinity (passing
 NULL for the corresponding argument position in that case).  The third
 argument can be optional and default to 'cc'.

 The fact that there can be a default for the third argument makes this
 quite a lot more appealing than I had originally thought (although I
 think 'co' is the generally-accepted default).

 There's some slight ugliness around the NULL/infinity business, but I
 think that I could be convinced. I'd like to avoid confusion between
 NULL and infinity if possible.

I was thinking that if you passed 'i' for one of the bounds, it would
ignore the supplied argument and substitute its special infinity
value.  But you'd still need to supply some argument in that position,
which could be NULL or anything else.  It doesn't really seem worth
having additional constructor functions to handle the case where one
or both arguments are infinite.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Jeff Davis

On Wed, 2011-07-06 at 09:10 -0400, Robert Haas wrote:
  There's some slight ugliness around the NULL/infinity business, but I
  think that I could be convinced. I'd like to avoid confusion between
  NULL and infinity if possible.
 
 I was thinking that if you passed 'i' for one of the bounds, it would
 ignore the supplied argument and substitute its special infinity
 value.  But you'd still need to supply some argument in that position,
 which could be NULL or anything else.  It doesn't really seem worth
 having additional constructor functions to handle the case where one
 or both arguments are infinite.

Right, that's what I assumed that you meant. I can't think of anything
better, either, because I like the fact that two arguments are there so
that you can visually see which sides are bounded/unbounded.

I suppose we could have constructors like:
  range(text, subtype)
and
  range(subtype, text)
where the text field is used to specify infinity. But that has the
obvious problem what if the subtype is text?. So, of course, we make a
special new pseudotype to represent infinity... ;)

But seriously, your idea is starting to look more appealing.

To get into some more details: how exactly would this constructor be
generated on the fly? Clearly we want only one underlying C function
that accepts something like:
  range_internal(lower, upper, flags, Oid rangetype)
So how do we get the rangetype in there? I suppose a default 4th
argument?

That would be kind of an interesting option, but what if someone
actually specified that 4th argument? We couldn't allow that.

Also, are default arguments always applied in all the contexts where
this function might be called?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Robert Haas

On Wed, Jul 6, 2011 at 12:22 PM, Jeff Davis pg...@j-davis.com wrote:
 To get into some more details: how exactly would this constructor be
 generated on the fly? Clearly we want only one underlying C function
 that accepts something like:
  range_internal(lower, upper, flags, Oid rangetype)
 So how do we get the rangetype in there?

I think that the C function could call get_call_result_type() and get
the return type OID back via the second argument.

 Also, are default arguments always applied in all the contexts where
 this function might be called?

Uh, I'm not sure.  But I don't see why it would need different
handling than any other function which takes default arguments.  It
shouldn't be needed during bootstrapping or anything funky like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Jeff Davis

On Wed, 2011-07-06 at 12:51 -0400, Robert Haas wrote:
 On Wed, Jul 6, 2011 at 12:22 PM, Jeff Davis pg...@j-davis.com wrote:
  To get into some more details: how exactly would this constructor be
  generated on the fly? Clearly we want only one underlying C function
  that accepts something like:
   range_internal(lower, upper, flags, Oid rangetype)
  So how do we get the rangetype in there?
 
 I think that the C function could call get_call_result_type() and get
 the return type OID back via the second argument.

I'm also a little unclear on the rules for when that might be set
properly or not.

I ran into problems with that before... I think with the I/O functions.
I don't think that's a problem here, but I thought I'd ask.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Tom Lane

Jeff Davis pg...@j-davis.com writes:
 On Wed, 2011-07-06 at 12:51 -0400, Robert Haas wrote:
 On Wed, Jul 6, 2011 at 12:22 PM, Jeff Davis pg...@j-davis.com wrote:
 To get into some more details: how exactly would this constructor be
 generated on the fly? Clearly we want only one underlying C function
 that accepts something like:
 range_internal(lower, upper, flags, Oid rangetype)
 So how do we get the rangetype in there?

 I think that the C function could call get_call_result_type() and get
 the return type OID back via the second argument.

 I'm also a little unclear on the rules for when that might be set
 properly or not.

 I ran into problems with that before... I think with the I/O functions.
 I don't think that's a problem here, but I thought I'd ask.

I think it'd probably be all right to do that.  The places where you
might find shortcuts being taken are where functions are called directly
by C code, such as I/O function calls --- but these constructors should
only ever get invoked from SQL queries, no?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-06 Thread Jeff Davis

On Wed, 2011-07-06 at 15:14 -0400, Tom Lane wrote:
  I ran into problems with that before... I think with the I/O functions.
  I don't think that's a problem here, but I thought I'd ask.
 
 I think it'd probably be all right to do that.  The places where you
 might find shortcuts being taken are where functions are called directly
 by C code, such as I/O function calls --- but these constructors should
 only ever get invoked from SQL queries, no?

Perhaps index expressions/predicates as well (which are also fine). I
was more worried about some case that I hadn't thought of.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Robert Haas

On Fri, Jul 1, 2011 at 2:09 AM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2011-06-30 at 12:28 +0200, Florian Pflug wrote:
 Well, arrays are containers, and we need two values to construct a range,

 What about empty ranges? What about infinite ranges?

 It seems quite a bit more awkward to shoehorn ranges into an array than
 to use a real type (even if it's intermediate and otherwise useless).

 Hm, I guess. I'm sill no huge fan of RANGEINPUT, but if we prevent
 it from being used as a column type and from being used as an argument
 type, then I guess it's workable...

 Btw, what happened to the idea of making RANGE(...) a special syntactic
 construct instead of a normal function call? Did we discard that for its
 intrusiveness, or were there other reasons?

 It has not been discarded; as far as I'm concerned it's still on the
 table. The main advantage is that it doesn't require an intermediate
 type, and that requiring a cast (or some specification of the range
 type) might be a little more natural. The downside is that, well, it's
 new syntax, and there's a little inertia there.

 But if it's actually better, we should do it. If an intermediate type
 seems to be problematic, or if people think it's strange to require
 casting, then I think this is reasonable.

I don't understand how the bespoke syntax avoids the need for a cast?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Jeff Davis

On Tue, 2011-07-05 at 10:06 -0400, Robert Haas wrote:
  But if it's actually better, we should do it. If an intermediate type
  seems to be problematic, or if people think it's strange to require
  casting, then I think this is reasonable.
 
 I don't understand how the bespoke syntax avoids the need for a cast?

It doesn't, it just avoids the need for an intermediate type.

What I meant was that it might be strange to require a cast on the
result of a function call, because we don't really do that anywhere
else. Florian pointed out that it's common to require casting the
ARRAY[] constructor, so that has more of a precedent. I'm not really
sure how much that matters.

I'm OK with the intermediate type, but Florian seems skeptical of that
idea.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Robert Haas

On Tue, Jul 5, 2011 at 11:11 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-07-05 at 10:06 -0400, Robert Haas wrote:
  But if it's actually better, we should do it. If an intermediate type
  seems to be problematic, or if people think it's strange to require
  casting, then I think this is reasonable.

 I don't understand how the bespoke syntax avoids the need for a cast?

 It doesn't, it just avoids the need for an intermediate type.

 What I meant was that it might be strange to require a cast on the
 result of a function call, because we don't really do that anywhere
 else. Florian pointed out that it's common to require casting the
 ARRAY[] constructor, so that has more of a precedent. I'm not really
 sure how much that matters.

 I'm OK with the intermediate type, but Florian seems skeptical of that
 idea.

How about the idea of creating a family of four constructor functions
for each new range type?  The functions would be named after the range
type, with _cc, _co, _oc, and _oo appended.  So, then, instead
of writing:

RANGE(1,8,'c','o')::int8range

...or somesuch, you could just say:

int8range_co(1,8)

...which is both more compact and less ugly, IMHO, and seems to
circumvent all the type system problems as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Florian Pflug

On Jul5, 2011, at 17:11 , Jeff Davis wrote:
 I'm OK with the intermediate type, but Florian seems skeptical of that
 idea.

I'm starting to get used to it, though ;-) I do now believe that it can
be made safe against accidental miss-use, it seem that I was overly
anxious there.

What I still don't like about it is that it feels like a workaround for
a feature missing in our type system - the possibility of having function
with a polymorphic return type, but no polymorphic arguments. I feel
somewhat strongly about this, because it bit me when I tried to implement
record_getfield() and record_setfield() to get and set a record's field
based on it's name.

However, placing the burden of solving that onto the range type patch
doesn't seem fair.

Plus, I've realized now that a RANGEINPUT type would allow us to easily
support some things that otherwise seem hard. We could, for example,
make the cast from RANGEINPUT to the individual range types an assignment
cast (or even implicit), thereby removing the need for an explicit
cast in a lot of common cases like insert into a table with a range column.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Merlin Moncure

On Tue, Jul 5, 2011 at 10:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 5, 2011 at 11:11 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-07-05 at 10:06 -0400, Robert Haas wrote:
  But if it's actually better, we should do it. If an intermediate type
  seems to be problematic, or if people think it's strange to require
  casting, then I think this is reasonable.

 I don't understand how the bespoke syntax avoids the need for a cast?

 It doesn't, it just avoids the need for an intermediate type.

 What I meant was that it might be strange to require a cast on the
 result of a function call, because we don't really do that anywhere
 else. Florian pointed out that it's common to require casting the
 ARRAY[] constructor, so that has more of a precedent. I'm not really
 sure how much that matters.

 I'm OK with the intermediate type, but Florian seems skeptical of that
 idea.

 How about the idea of creating a family of four constructor functions
 for each new range type?  The functions would be named after the range
 type, with _cc, _co, _oc, and _oo appended.  So, then, instead
 of writing:

 RANGE(1,8,'c','o')::int8range

 ...or somesuch, you could just say:

 int8range_co(1,8)

 ...which is both more compact and less ugly, IMHO, and seems to
 circumvent all the type system problems as well.

+1 on this (so you wouldn't even then directly cast to a range?)

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Robert Haas

On Tue, Jul 5, 2011 at 12:23 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 5, 2011 at 10:26 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 5, 2011 at 11:11 AM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-07-05 at 10:06 -0400, Robert Haas wrote:
  But if it's actually better, we should do it. If an intermediate type
  seems to be problematic, or if people think it's strange to require
  casting, then I think this is reasonable.

 I don't understand how the bespoke syntax avoids the need for a cast?

 It doesn't, it just avoids the need for an intermediate type.

 What I meant was that it might be strange to require a cast on the
 result of a function call, because we don't really do that anywhere
 else. Florian pointed out that it's common to require casting the
 ARRAY[] constructor, so that has more of a precedent. I'm not really
 sure how much that matters.

 I'm OK with the intermediate type, but Florian seems skeptical of that
 idea.

 How about the idea of creating a family of four constructor functions
 for each new range type?  The functions would be named after the range
 type, with _cc, _co, _oc, and _oo appended.  So, then, instead
 of writing:

 RANGE(1,8,'c','o')::int8range

 ...or somesuch, you could just say:

 int8range_co(1,8)

 ...which is both more compact and less ugly, IMHO, and seems to
 circumvent all the type system problems as well.

 +1 on this (so you wouldn't even then directly cast to a range?)

You wouldn't need to, because these functions would be declared to
return the range type.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Jeff Davis

On Tue, 2011-07-05 at 11:26 -0400, Robert Haas wrote:
 How about the idea of creating a family of four constructor functions
 for each new range type?  The functions would be named after the range
 type, with _cc, _co, _oc, and _oo appended.  So, then, instead
 of writing:
 
 RANGE(1,8,'c','o')::int8range

It would be something like: range_co(1,8)::int8range

(just so we're comparing apples to apples)

The intermediate type proposal doesn't require that we move the c and
o into the parameter list.

 int8range_co(1,8)
 
 ...which is both more compact and less ugly, IMHO, and seems to
 circumvent all the type system problems as well.

I brought that up before:
http://archives.postgresql.org/pgsql-hackers/2011-06/msg02046.php

It certainly circumvents the polymorphic type problems, but the problem
is that it adds up to quite a few permutations. Not only are there
cc/co/oc/oo, but there are also variations for infinite bounds and empty
ranges. So I think we're talking 10+ functions per range type rather
than 4.

Also, if someone has an idea for another constructor, like the one you
mention above:
  range(1,8,'c','o')
then they have to create it for every range type, and they can't
anticipate new range types that someone might create. In other words,
the constructors wouldn't benefit from the polymorphism. However, if we
used an intermediate type, then they could create the above constructor
and it would work for any range type automatically.

I don't object to this idea, but we'll need to come up with a pretty
exhaustive list of possibly-useful constructors.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Robert Haas

On Tue, Jul 5, 2011 at 12:54 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-07-05 at 11:26 -0400, Robert Haas wrote:
 How about the idea of creating a family of four constructor functions
 for each new range type?  The functions would be named after the range
 type, with _cc, _co, _oc, and _oo appended.  So, then, instead
 of writing:

 RANGE(1,8,'c','o')::int8range

 It would be something like: range_co(1,8)::int8range

 (just so we're comparing apples to apples)

 The intermediate type proposal doesn't require that we move the c and
 o into the parameter list.

Well, you have to specify the bounds somewhere...

 int8range_co(1,8)

 ...which is both more compact and less ugly, IMHO, and seems to
 circumvent all the type system problems as well.

 I brought that up before:
 http://archives.postgresql.org/pgsql-hackers/2011-06/msg02046.php

 It certainly circumvents the polymorphic type problems, but the problem
 is that it adds up to quite a few permutations. Not only are there
 cc/co/oc/oo, but there are also variations for infinite bounds and empty
 ranges. So I think we're talking 10+ functions per range type rather
 than 4.

OK, so let's pass the information on the bounds as a separate
argument.  Like this:

int8range(1,8,'co')

Then you can instead pass 'o' for open or 'i' for infinity (passing
NULL for the corresponding argument position in that case).  The third
argument can be optional and default to 'cc'.

For empty ranges I doubt we need a separate constructor function;
presumably the representation of an empty range is some fixed string
and users can just write 'empty'::int8range or similar.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-05 Thread Jeff Davis

On Tue, 2011-07-05 at 13:06 -0400, Robert Haas wrote:
 On Tue, Jul 5, 2011 at 12:54 PM, Jeff Davis pg...@j-davis.com wrote:
  It would be something like: range_co(1,8)::int8range
 
  (just so we're comparing apples to apples)
 
  The intermediate type proposal doesn't require that we move the c and
  o into the parameter list.
 
 Well, you have to specify the bounds somewhere...

That's true. In my example it's in the function name.

 OK, so let's pass the information on the bounds as a separate
 argument.  Like this:
 
 int8range(1,8,'co')

That has a lot going for it, in the sense that it avoids dealing with
the type problems.

 Then you can instead pass 'o' for open or 'i' for infinity (passing
 NULL for the corresponding argument position in that case).  The third
 argument can be optional and default to 'cc'.

The fact that there can be a default for the third argument makes this
quite a lot more appealing than I had originally thought (although I
think 'co' is the generally-accepted default).

There's some slight ugliness around the NULL/infinity business, but I
think that I could be convinced. I'd like to avoid confusion between
NULL and infinity if possible.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-01 Thread Jeff Davis

On Thu, 2011-06-30 at 09:59 -0700, David E. Wheeler wrote:
 On Jun 30, 2011, at 9:34 AM, Jeff Davis wrote:
 
  Then how do you get a text range that doesn't correspond to the
  LC_COLLATE setting?
 
 You cast it.

My original solution was something like this, except involving domains.
With a sufficient amount of casting of all arguments to anything
involving a range type, it works, but it's a little too un-SQL-like.
There was at least one fairly strong objection to my approach, but if
you have some further thoughts along that line, I'm open to suggestion.

Also, what if the LC_COLLATE is C, and you want to cast it to en_US
collation?
  range('a','Z')
would be invalid in the C locale, and it would fail before you had a
chance to cast it.

 Cast where you need it explicit, and have a reasonable default when
 it's not cast.

I thought about that, too, but it's not ideal, either. That means that
something might start out as the only range type for a given subtype,
and doesn't need explicit casts. Then you define another range type over
that subtype, and all the original queries break because they are now
ambiguous.

I think the fundamental differences with range types that we're dealing
with are:
 1. multiple range types might reasonbly exist for a single subtype
 2. the order is a fundamental part of the type definition, not just an
extra argument useful for operations on the range type

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-07-01 Thread Jeff Davis

On Thu, 2011-06-30 at 12:28 +0200, Florian Pflug wrote:
 Well, arrays are containers, and we need two values to construct a range,

What about empty ranges? What about infinite ranges?

It seems quite a bit more awkward to shoehorn ranges into an array than
to use a real type (even if it's intermediate and otherwise useless).

 Hm, I guess. I'm sill no huge fan of RANGEINPUT, but if we prevent
 it from being used as a column type and from being used as an argument
 type, then I guess it's workable...
 
 Btw, what happened to the idea of making RANGE(...) a special syntactic
 construct instead of a normal function call? Did we discard that for its
 intrusiveness, or were there other reasons?

It has not been discarded; as far as I'm concerned it's still on the
table. The main advantage is that it doesn't require an intermediate
type, and that requiring a cast (or some specification of the range
type) might be a little more natural. The downside is that, well, it's
new syntax, and there's a little inertia there.

But if it's actually better, we should do it. If an intermediate type
seems to be problematic, or if people think it's strange to require
casting, then I think this is reasonable.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Florian Pflug

On Jun29, 2011, at 23:44 , Peter Eisentraut wrote:
 On ons, 2011-06-29 at 10:15 -0700, David E. Wheeler wrote:
 On Jun 29, 2011, at 10:13 AM, Florian Pflug wrote:
 Because there might be more than one range type for a
 base type. Say there are two range types over text, one
 with collation 'de_DE' and one with collation 'en_US'.
 What would the type of
 range('foo', 'f')
 be?
 
 The one that corresponds to the current LC_COLLATE setting.
 
 Yes, or more generally, we have logic that determines, for example, what
 collation to use for
 
 'foo'  'f'
 
 The same logic can be used to determine what collation to use for
 
 range('foo', 'f')
 
 (In fact, if you implement range() as a user-space function, that will
 happen automatically.)

I don't think it will - as it stands, there isn't a single collatable
type RANGE but instead one *distinct* type per combination of base type,
btree opclass and collation. The reasons for that were discussed at length -
the basic argument for doing it that way was to make a range represent
a fixed set of values.

There's also no guarantee that a range type with collation LC_COLLATE
even exists.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Peter Eisentraut

On tor, 2011-06-30 at 08:45 +0200, Florian Pflug wrote:
 I don't think it will - as it stands, there isn't a single collatable
 type RANGE but instead one *distinct* type per combination of base
 type, btree opclass and collation. The reasons for that were discussed
 at length - the basic argument for doing it that way was to make a
 range represent a fixed set of values.

How would the system catalogs be initialized under that theory: surely
you're not going to seed (nr. of types) * (nr. of collations) * (nr. of
opclasses) range types in initdb?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Florian Pflug

On Jun30, 2011, at 09:05 , Peter Eisentraut wrote:
 On tor, 2011-06-30 at 08:45 +0200, Florian Pflug wrote:
 I don't think it will - as it stands, there isn't a single collatable
 type RANGE but instead one *distinct* type per combination of base
 type, btree opclass and collation. The reasons for that were discussed
 at length - the basic argument for doing it that way was to make a
 range represent a fixed set of values.
 
 How would the system catalogs be initialized under that theory: surely
 you're not going to seed (nr. of types) * (nr. of collations) * (nr. of
 opclasses) range types in initdb?

There's CREATE RANGE. By default, no range types would exists I believe.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Florian Pflug

On Jun29, 2011, at 17:41 , Jeff Davis wrote:
 Is it? That's actually too bad, since I kinda like it. But anyway,
 if that's a concern it could also be
  range_bounds(ARRAY[1,2]::int8range, '(]')
 
 What type would the result of that be? What value?

ARRAY[1,2]::int8range would return an int8range instance representing
the range [1,2] (i.e. the set of values {1,2}). range_bounds would then
modify the left bound to be exclusive, and thus return the int8range
(1,2] (i.e. the set of values {2}). range_bounds would have the signature
  range_bounds(anyrange) returns anyrange.

I do think we'll probably want to have functions to modify the boundary
type (open or closed) anyway, so it wouldn't be that huge of a deal if
the range constructor didn't let you specify them.

Empty ranges would be constructed by
  ARRAY[]::int8range
(Hm, ok, now I'm cheating... Currently you'd need to write
ARRAY[]::int8[]::int8range, but fixing that only needs a few lines
in the transformExpression* function that makes ARRAY[]::int8[] work).

 * It still suffers similar problems as casting back and forth to text:
 ANYARRAY is too general, doesn't really take advantage of the type
 system, and not a great fit anyway.
 
 I believe it alleviates the gravest problems of casting back and forth
 to text. It doesn't have quoting issues and it doesn't potentially lose
 information.
 
 I think it still circumvents the type system to a degree. We're just
 putting stuff in an array with no intention of really using it that way.

Well, arrays are containers, and we need two values to construct a range,
so putting them into a container first and then creating the range from that
doesn't seem so bad to me. We do use the full set of features that arrays
provide, since we only ever expect zero, one or two entries. But I don't
think this is different from functions who only support single-dimensional
arrays - they too choose to use only a subset of the features set of arrays.

 In any case, I wouldn't expect this to *stay* the only way to construct
 a range forever. But I does have it's virtues for a first incarnation of
 range type, I believe, mostly because it's completely unintrusive and
 won't cause any backwards-compatbility headaches in the future
 
 I'm not sure that your overloading of arrays is completely immune from
 backwards-compatibility problems, should we decide to change it later.
 
 But regardless, we have quite a lot of time to make a decision before
 9.2 is released; so let's do it once and do it right.
 
 I fear that the intermediate type will turn out to be quite intrusive,
 at least if we try to handle all the corner cases and loose ends. And if
 we don't, I'm concerned that we're painting ourselves into a corner here...
 
 Can you expand on some of the corner-cases and loose ends you're
 concerned about? Does marking it as a pseudotype and making the IO
 functions throw exceptions handle them?

Hm, I guess. I'm sill no huge fan of RANGEINPUT, but if we prevent
it from being used as a column type and from being used as an argument
type, then I guess it's workable...

Btw, what happened to the idea of making RANGE(...) a special syntactic
construct instead of a normal function call? Did we discard that for its
intrusiveness, or were there other reasons?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Jeff Davis

On Thu, 2011-06-30 at 09:11 +0200, Florian Pflug wrote:
  How would the system catalogs be initialized under that theory: surely
  you're not going to seed (nr. of types) * (nr. of collations) * (nr. of
  opclasses) range types in initdb?
 
 There's CREATE RANGE.

Right. In that respect, it's more like a record type: many possible
record types exist, but you only define the ones you want.

 By default, no range types would exists I believe.

I was planning to include _some_ by default. Probably not text ranges,
but integer and timestamp[tz] ranges. If nothing else, it makes it
easier to document.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Jeff Davis

On Wed, 2011-06-29 at 10:15 -0700, David E. Wheeler wrote:
 On Jun 29, 2011, at 10:13 AM, Florian Pflug wrote:
 
  Because there might be more than one range type for a
  base type. Say there are two range types over text, one
  with collation 'de_DE' and one with collation 'en_US'.
  What would the type of
   range('foo', 'f')
  be?
 
 The one that corresponds to the current LC_COLLATE setting.

Then how do you get a text range that doesn't correspond to the
LC_COLLATE setting? Does that mean you couldn't dump/reload from a
system with one collation and get the same values in a system with a
different collation? That would be very strange.

Or, what about other types that just happen to have multiple useful
total orders?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Jeff Davis

On Wed, 2011-06-29 at 12:34 -0400, Robert Haas wrote:
 But now that I'm thinking about this a little more, I'm worried about this 
 case:
 
 CREATE TABLE foo AS RANGE('something'::funkytype, 'somethingelse'::funktype);
 DROP TYPE funkytype;
 
 It seems to me that the first statement had better fail, or else the
 second one is going to create a hopeless mess (imagine that a new type
 comes along and gets the OID of funkytype).

Interesting point. I don't think it's a problem because pseudo-types
can't be used that way, so that provides us a mechanism to stop it. But
it means that we have to be a little more sure that such values can't
persist anywhere.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread David E. Wheeler

On Jun 30, 2011, at 9:29 AM, Jeff Davis wrote:

 Right. In that respect, it's more like a record type: many possible
 record types exist, but you only define the ones you want.

Well, okay. How is this same problem handled for RECORD types, then?

 By default, no range types would exists I believe.
 
 I was planning to include _some_ by default. Probably not text ranges,
 but integer and timestamp[tz] ranges. If nothing else, it makes it
 easier to document.

+1

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread David E. Wheeler

On Jun 30, 2011, at 9:34 AM, Jeff Davis wrote:

 Then how do you get a text range that doesn't correspond to the
 LC_COLLATE setting?

You cast it.

 Does that mean you couldn't dump/reload from a
 system with one collation and get the same values in a system with a
 different collation? That would be very strange.

No, pg_dump should always explicitly cast things. But there should be a 
reasonable default behavior if I'm in psql and don't cast.

 Or, what about other types that just happen to have multiple useful
 total orders?

Cast where you need it explicit, and have a reasonable default when it's not 
cast.

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-30 Thread Jeff Davis

On Thu, 2011-06-30 at 09:58 -0700, David E. Wheeler wrote:
 On Jun 30, 2011, at 9:29 AM, Jeff Davis wrote:
 
  Right. In that respect, it's more like a record type: many possible
  record types exist, but you only define the ones you want.
 
 Well, okay. How is this same problem handled for RECORD types, then?

What problem, exactly? For a given list of subtypes, there is only one
valid record type.

Also, record is not a great example. The implementation uses at least
one pretty horrible hack.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Florian Pflug

On Jun29, 2011, at 05:02 , Jeff Davis wrote:
 On Tue, 2011-06-28 at 22:20 +0200, Florian Pflug wrote:
 I believe if we go that route we should make RANGEINPUT a full-blown
 type, having pair of bound semantics. Adding a lobotomized version
 just for the sake of range input feels a bit like a kludge to me.
 
 It's not a pair, because it can be made up of 0, 1, or 2 scalar values
 (unless you count infinity as one of those values, in which case 0 or
 2). And without ordering, it's not clear that those values are really
 bounds.

Hm, yeah, the lack of an ordering operator is trouble. There also seem
to be more problems with that idea, see below for that. So scratch
the idea of turning RANGEINPUT into a full-blown type...

 I don't think that having an extra type around is so bad. It solves a
 lot of problems, and doesn't seem like it would get in the way. And it's
 only for the construction of ranges out of scalars, which seems like the
 most natural place where a cast might be required (similar to casting an
 unknown literal, which is fairly common).

What I'm concerned about is how elegantly we'd be able to tie up all
the loose ends. What'd be the result of
  select range(1,2)
for example? Or
  create table (r rangeinput)
for that matter.

I think we'd want to forbid both of these, and more or less every other
use except
  range(1,2)::some range type
but that seems to require special-casing RANGEINPUT in a lot of places.

If we don't restrict RANGEINPUT that way, I think we ought to provide
at least a basic set of operators and functions for it - e.g.
input, output, lower(), upper(), ...

*Pondering this*

But we can't do that easily, since RANGEINPUT would actually be a kind of
VARIANT type (i.e. can hold values of arbitrary types). That's something
that our type system doesn't really support. We do have RECORD, which is
similar in a way, but its implementation is about as intrusive as it
gets...

 Alternatively, we could replace RANGEINPUT simply with ANYARRAY,
 and add functions ANYRANGE-ANYRANGE which allow specifying the
 bound operator (, = respectively ,=) after construction.
 
 So you'd write (using the functions-as-fields syntax I believe
 we support)
  (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]'
 and
  ARRAY[NULL,2]::int8range for '[-inf,2]'
 
 I think we can rule this one out:
 * The functions-as-fields syntax is all but deprecated (or should be)

Is it? That's actually too bad, since I kinda like it. But anyway,
if that's a concern it could also be
  range_bounds(ARRAY[1,2]::int8range, '(]')

 * That's hardly a readability improvement

Granted, it won't win any beauty contest, but

 * It still suffers similar problems as casting back and forth to text:
 ANYARRAY is too general, doesn't really take advantage of the type
 system, and not a great fit anyway.

I believe it alleviates the gravest problems of casting back and forth
to text. It doesn't have quoting issues and it doesn't potentially lose
information.

In any case, I wouldn't expect this to *stay* the only way to construct
a range forever. But I does have it's virtues for a first incarnation of
range type, I believe, mostly because it's completely unintrusive and
won't cause any backwards-compatbility headaches in the future

 All assuming that modifying the type system to support polymorphic
 type resolution based on the return type is out of the question... ;-)
 
 It's still not out of the question, but I thought that the intermediate
 type would be a less-intrusive alternative (and Robert seemed concerned
 about how intrusive it was).

I fear that the intermediate type will turn out to be quite intrusive,
at least if we try to handle all the corner cases and loose ends. And if
we don't, I'm concerned that we're painting ourselves into a corner here...

 There also might be a little more effort educating users if we selected
 the function based on the return type, because they might think that
 casting the inputs explicitly would be enough to get it to pick the
 right function. If it were a new syntax like RANGE[]::int8range, then I
 think it would be easier to understand.

There's certainly a risk of confusion here, simply because the relationship
between ANYRANGE and ANYLEMENT will be quite different than that of
ANYARRAY and ANYLEMENT. All we can do is state this very clearly in the
docs I think, and explain that it must be that way to support multiple
range types over the same base type.

best regards,
Florian Pflug






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Robert Haas

On Tue, Jun 28, 2011 at 11:02 PM, Jeff Davis pg...@j-davis.com wrote:
 It's still not out of the question, but I thought that the intermediate
 type would be a less-intrusive alternative (and Robert seemed concerned
 about how intrusive it was).

I'm no great fan of our existing type system, and I'm not opposed to
trying to improve it.  However, I'm a bit wary of the theory that we
can just tweak X, Y, or Z and then everything will go more smoothly
for range types.  I fear that there will be knock-on consequences that
we'll spend a lot of time either (a) arguing about or (b) fixing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Jeff Davis

On Wed, 2011-06-29 at 13:35 +0200, Florian Pflug wrote:
 What I'm concerned about is how elegantly we'd be able to tie up all
 the loose ends. What'd be the result of
   select range(1,2)
 for example? Or
   create table (r rangeinput)
 for that matter.
 
 I think we'd want to forbid both of these, and more or less every other
 use except
   range(1,2)::some range type
 but that seems to require special-casing RANGEINPUT in a lot of places.

We could make it a pseudo-type and make the IO functions generate
exceptions. That should prevent most mistakes and effectively hide it
from the user (sure, they could probably use it somewhere if they really
want to, but I wouldn't be worried about breaking backwards
compatibility with undocumented usage like that). There are plenty of
types that are hidden from users in one way or another -- trigger, void,
internal, fdw_handler, etc., so I don't see this as special-casing at
all.

That might make it slightly harder to document, but I think it can be
done. All we have to do is document the range constructors saying you
must cast the result to a valid range type; trying to use the result of
these functions directly raises an exception. In fact, I think I'll
take back the hard to document claim from before: it will be pretty
easy to document, and if someone gets it wrong, we can throw a helpful
error and hint.

Robert didn't really seem to like the idea of throwing an error though
-- Robert, can you expand on your reasoning here?

I tend to lean toward throwing an error as well, but I don't really have
much of an opinion.

 If we don't restrict RANGEINPUT that way, I think we ought to provide
 at least a basic set of operators and functions for it - e.g.
 input, output, lower(), upper(), ...
 
 *Pondering this*
 
 But we can't do that easily, since RANGEINPUT would actually be a kind of
 VARIANT type (i.e. can hold values of arbitrary types). That's something
 that our type system doesn't really support. We do have RECORD, which is
 similar in a way, but its implementation is about as intrusive as it
 gets...

I don't want to go down the road of making this a fully supported type.
I don't see any use case for it at all, and I think it's a bad idea to
design something with no idea how people might want to use it.

 Is it? That's actually too bad, since I kinda like it. But anyway,
 if that's a concern it could also be
   range_bounds(ARRAY[1,2]::int8range, '(]')

What type would the result of that be? What value?

  * It still suffers similar problems as casting back and forth to text:
  ANYARRAY is too general, doesn't really take advantage of the type
  system, and not a great fit anyway.
 
 I believe it alleviates the gravest problems of casting back and forth
 to text. It doesn't have quoting issues and it doesn't potentially lose
 information.

I think it still circumvents the type system to a degree. We're just
putting stuff in an array with no intention of really using it that way.

 In any case, I wouldn't expect this to *stay* the only way to construct
 a range forever. But I does have it's virtues for a first incarnation of
 range type, I believe, mostly because it's completely unintrusive and
 won't cause any backwards-compatbility headaches in the future

I'm not sure that your overloading of arrays is completely immune from
backwards-compatibility problems, should we decide to change it later.

But regardless, we have quite a lot of time to make a decision before
9.2 is released; so let's do it once and do it right.

 I fear that the intermediate type will turn out to be quite intrusive,
 at least if we try to handle all the corner cases and loose ends. And if
 we don't, I'm concerned that we're painting ourselves into a corner here...

Can you expand on some of the corner-cases and loose ends you're
concerned about? Does marking it as a pseudotype and making the IO
functions throw exceptions handle them?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Jeff Davis

On Wed, 2011-06-29 at 08:52 -0400, Robert Haas wrote:
 On Tue, Jun 28, 2011 at 11:02 PM, Jeff Davis pg...@j-davis.com wrote:
  It's still not out of the question, but I thought that the intermediate
  type would be a less-intrusive alternative (and Robert seemed concerned
  about how intrusive it was).
 
 I'm no great fan of our existing type system, and I'm not opposed to
 trying to improve it.  However, I'm a bit wary of the theory that we
 can just tweak X, Y, or Z and then everything will go more smoothly
 for range types.  I fear that there will be knock-on consequences that
 we'll spend a lot of time either (a) arguing about or (b) fixing.

Agreed.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Robert Haas

On Wed, Jun 29, 2011 at 11:41 AM, Jeff Davis pg...@j-davis.com wrote:
 Robert didn't really seem to like the idea of throwing an error though
 -- Robert, can you expand on your reasoning here?

I guess I don't have any terribly well-thought out reasoning - maybe
it's fine.  It just seems strange to have a type that you can't
display.

But now that I'm thinking about this a little more, I'm worried about this case:

CREATE TABLE foo AS RANGE('something'::funkytype, 'somethingelse'::funktype);
DROP TYPE funkytype;

It seems to me that the first statement had better fail, or else the
second one is going to create a hopeless mess (imagine that a new type
comes along and gets the OID of funkytype).

It also seems a bit strange to me that we're contemplating a system
where users are always going to have to cast the return type.
Generally, casts are annoying and we want to minimize the need for
them.  I'm not sure what the alternative is, though, unless we create
separate constructor functions for each type: int8range_cc(1, 2).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread David E. Wheeler

On Jun 28, 2011, at 8:02 PM, Jeff Davis wrote:

 I think David Wheeler was trying to make a similar point, but I'm still
 not convinced.
 
 It's not a pair, because it can be made up of 0, 1, or 2 scalar values
 (unless you count infinity as one of those values, in which case 0 or
 2). And without ordering, it's not clear that those values are really
 bounds.
 
 The type needs to:
 * represent two values, either of which might be a special infinite
 value
 * represent the value empty
 * represent inclusivity/exclusivity of both values
 
 and those things seem fairly specific to ranges, so I don't really see
 what other use we'd have for such a type. But I'm open to suggestion.
 
 I don't think that having an extra type around is so bad. It solves a
 lot of problems, and doesn't seem like it would get in the way. And it's
 only for the construction of ranges out of scalars, which seems like the
 most natural place where a cast might be required (similar to casting an
 unknown literal, which is fairly common).

I'm fine with that, but my point is that if it's going to be exposed to users 
somehow, it needs to be useful on its own, without casting. Because some wit 
will make a column of this type. If it's not somehow useful on its own, then it 
should be an implementation detail or internal that I never see in SQL. IMHO.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread David E. Wheeler

On Jun 29, 2011, at 8:41 AM, Jeff Davis wrote:

 We could make it a pseudo-type and make the IO functions generate
 exceptions. That should prevent most mistakes and effectively hide it
 from the user (sure, they could probably use it somewhere if they really
 want to, but I wouldn't be worried about breaking backwards
 compatibility with undocumented usage like that). There are plenty of
 types that are hidden from users in one way or another -- trigger, void,
 internal, fdw_handler, etc., so I don't see this as special-casing at
 all.

That could work.

 I don't want to go down the road of making this a fully supported type.
 I don't see any use case for it at all, and I think it's a bad idea to
 design something with no idea how people might want to use it.

+1

I'm still not clear, though, on why the return type of range() should not be 
related to the types of its arguments. So

range(1, 5)

Should return intrange, and

range(1::int8, 5::int8)

Should return int8range, and

range('foo', 'f')

Should return textrange.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Florian Pflug

On Jun29, 2011, at 18:34 , Robert Haas wrote:
 It also seems a bit strange to me that we're contemplating a system
 where users are always going to have to cast the return type.
 Generally, casts are annoying and we want to minimize the need for
 them.  I'm not sure what the alternative is, though, unless we create
 separate constructor functions for each type: int8range_cc(1, 2).

Well, if we want multiple range types per base type (which we do), then
the user needs to specify which one to use somehow. A cast seems the most
natural way to do that to me - after all, casting is *the* way to coerce
value to a certain type.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Florian Pflug

On Jun29, 2011, at 19:05 , David E. Wheeler wrote:
 I'm still not clear, though, on why the return type of range()
 should not be related to the types of its arguments. So
 
range(1, 5)
 
 Should return intrange, and
 
range(1::int8, 5::int8)
 
 Should return int8range, and
 
range('foo', 'f')
 
 Should return textrange.

Because there might be more than one range type for a
base type. Say there are two range types over text, one
with collation 'de_DE' and one with collation 'en_US'.
What would the type of
  range('foo', 'f')
be?

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread David E. Wheeler

On Jun 29, 2011, at 10:13 AM, Florian Pflug wrote:

 Because there might be more than one range type for a
 base type. Say there are two range types over text, one
 with collation 'de_DE' and one with collation 'en_US'.
 What would the type of
  range('foo', 'f')
 be?

The one that corresponds to the current LC_COLLATE setting.

Best,

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-29 Thread Peter Eisentraut

On ons, 2011-06-29 at 10:15 -0700, David E. Wheeler wrote:
 On Jun 29, 2011, at 10:13 AM, Florian Pflug wrote:
 
  Because there might be more than one range type for a
  base type. Say there are two range types over text, one
  with collation 'de_DE' and one with collation 'en_US'.
  What would the type of
   range('foo', 'f')
  be?
 
 The one that corresponds to the current LC_COLLATE setting.

Yes, or more generally, we have logic that determines, for example, what
collation to use for

'foo'  'f'

The same logic can be used to determine what collation to use for

range('foo', 'f')

(In fact, if you implement range() as a user-space function, that will
happen automatically.)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Robert Haas

On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
 So, in effect, RANGEINPUT is a special type used only for range
 constructors. If someone tried to output it, it would throw an
 exception, and we'd even have enough information at that point to print
 a nice error message with a hint.

I don't think I like the idea of throwing an error when you try to
output it, but the rest seems reasonably sensible.

 Actually, this is pretty much exactly Florian's idea (thanks again,
 Florian), but at the time I didn't like it because pair didn't capture
 everything that I wanted to capture, like infinite bounds, etc. But
 there's no reason that it can't, and your point made me realize that --
 you are effectively just using TEXT as the intermediate type (which
 works, but has some undesirable characteristics).

What undesirable characteristics?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread David E. Wheeler

On Jun 27, 2011, at 8:42 PM, Jeff Davis wrote:

 Do we think that this is a good way forward? The only thing I can think
 of that's undesirable is that it's not normal to be required to cast the
 result of a function, and might be slightly difficult to explain in the
 documentation in a straightforward way

That's the part that bothers me. I think that if it's not cast it should 
somehow be useful. Maybe default to a text range or something?

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Jeff Davis

On Tue, 2011-06-28 at 10:58 -0400, Robert Haas wrote:
 On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
  So, in effect, RANGEINPUT is a special type used only for range
  constructors. If someone tried to output it, it would throw an
  exception, and we'd even have enough information at that point to print
  a nice error message with a hint.
 
 I don't think I like the idea of throwing an error when you try to
 output it, but the rest seems reasonably sensible.

I thought it might add a little confusion if people thought they had a
range type but really had RANGEINPUT. For instance, if you do a create
table as select range(1,2) then the result might be slightly
unexpected.

But it's probably no more unexpected than create table as select
'foo'. So, I suppose there's not much reason to throw an error. We can
just output it in the same format as a range type.

It's also much easier to explain something in the documentation that has
an output format, because at least it's tangible.

  Actually, this is pretty much exactly Florian's idea (thanks again,
  Florian), but at the time I didn't like it because pair didn't capture
  everything that I wanted to capture, like infinite bounds, etc. But
  there's no reason that it can't, and your point made me realize that --
  you are effectively just using TEXT as the intermediate type (which
  works, but has some undesirable characteristics).
 
 What undesirable characteristics?

Well, for one, outputting something as text and then reading it back in
does not always produce the same value. For instance, for float, it only
does that if you have extra_float_digits set to some high-enough value.
I suppose I could save the GUC, set it, and set it back; but that seems
like unnecessary ugliness.

There's also the deparsing/reparsing cycle. That might not really matter
for performance, but it seems unnecessary.

And there's always the fallback that we have types for a reason.
Wouldn't it be odd if you wrote a query like:
  select range(1,2) || 'foo'
and it succeeded? I'm sure that kind of thing can lead to some dangerous
situations.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Jeff Davis

On Tue, 2011-06-28 at 09:30 -0700, David E. Wheeler wrote:
 On Jun 27, 2011, at 8:42 PM, Jeff Davis wrote:
 
  Do we think that this is a good way forward? The only thing I can think
  of that's undesirable is that it's not normal to be required to cast the
  result of a function, and might be slightly difficult to explain in the
  documentation in a straightforward way
 
 That's the part that bothers me.

Yeah, that bothered me, too. 

 I think that if it's not cast it should somehow be useful.

Let's see, what can one do with a range that has no ordering yet? ;)

Robert suggested that we don't need to throw an error, and I think I
agree. Just having a working output function solves most of the
documentation problem, because it makes it less abstract.

The only operators that we could really support are accessors, which
seems somewhat reasonable. However, I'd have some concerns even about
that, because if you do range(10,1), then what's the upper bound?

 Maybe default to a text range or something?

That sounds a little dangerous:
  select range('1','09')
would fail before it could be cast to int4range.

We could invent an UNKNOWNRANGE type or something. But I don't
particularly like that; it would start out working nicely when people
only had one textrange type, and then their old queries would start
failing when they added another range type based on text.

I think it's fine if the RANGEINPUT type isn't too useful by itself.
It's already a common requirement to cast unknown literals, and this
isn't too much different. It's only for constructors, so it still fits
pretty closely with that idea.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Robert Haas

On Tue, Jun 28, 2011 at 12:58 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-06-28 at 10:58 -0400, Robert Haas wrote:
 On Mon, Jun 27, 2011 at 11:42 PM, Jeff Davis pg...@j-davis.com wrote:
  So, in effect, RANGEINPUT is a special type used only for range
  constructors. If someone tried to output it, it would throw an
  exception, and we'd even have enough information at that point to print
  a nice error message with a hint.

 I don't think I like the idea of throwing an error when you try to
 output it, but the rest seems reasonably sensible.

 I thought it might add a little confusion if people thought they had a
 range type but really had RANGEINPUT. For instance, if you do a create
 table as select range(1,2) then the result might be slightly
 unexpected.

True...

 But it's probably no more unexpected than create table as select
 'foo'. So, I suppose there's not much reason to throw an error. We can
 just output it in the same format as a range type.

+1.

 It's also much easier to explain something in the documentation that has
 an output format, because at least it's tangible.

+1.

  Actually, this is pretty much exactly Florian's idea (thanks again,
  Florian), but at the time I didn't like it because pair didn't capture
  everything that I wanted to capture, like infinite bounds, etc. But
  there's no reason that it can't, and your point made me realize that --
  you are effectively just using TEXT as the intermediate type (which
  works, but has some undesirable characteristics).

 What undesirable characteristics?

 Well, for one, outputting something as text and then reading it back in
 does not always produce the same value. For instance, for float, it only
 does that if you have extra_float_digits set to some high-enough value.
 I suppose I could save the GUC, set it, and set it back; but that seems
 like unnecessary ugliness.

Yeah, I don't think we want to go there.

 There's also the deparsing/reparsing cycle. That might not really matter
 for performance, but it seems unnecessary.

 And there's always the fallback that we have types for a reason.
 Wouldn't it be odd if you wrote a query like:
  select range(1,2) || 'foo'
 and it succeeded? I'm sure that kind of thing can lead to some dangerous
 situations.

That's pretty much what we tried to get rid of with the 8.3 casting
changes, so agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Florian Pflug

On Jun28, 2011, at 05:42 , Jeff Davis wrote:
 On Mon, 2011-06-27 at 14:50 -0400, Robert Haas wrote:
 Couldn't we also do neither of these things?  I mean, presumably
 '[1,10]'::int8range had better work.
 
 I think that if we combine this idea with Florian's PAIR suggestion
 here:
 http://archives.postgresql.org/message-id/ad4fc75d-db99-48ed-9082-52ee3a4d7...@phlo.org
 
 then I think we have a solution.
 
 If we add a type RANGEINPUT that is not a pseudotype, we can use that as
 an intermediate type that is returned by range constructors. Then, we
 add casts from RANGEINPUT to each range type. That would allow
  range(1,2)::int8range
 to work without changing the type system around, because range() would
 have the signature:
  range(ANYELEMENT, ANYELEMENT) - RANGEINPUT
 and then the cast would change it into an int8range. But we only need
 the one cast per range type, and we can also support all of the other
 kinds of constructors like:
  range_cc(ANYELEMENT, ANYELEMENT) - RANGEINPUT
  range_linf_c(ANYELEMENT) - RANGEINPUT
 without additional hassle.

Hm, so RANGEINPUT would actually be what was previously discussed as
the range as a pair of bounds definition, as opposed to the
range as a set of values definition. So essentially we'd add a
second concept of what a range is to work around the range input
troubles.

I believe if we go that route we should make RANGEINPUT a full-blown
type, having pair of bound semantics. Adding a lobotomized version
just for the sake of range input feels a bit like a kludge to me.

Alternatively, we could replace RANGEINPUT simply with ANYARRAY,
and add functions ANYRANGE-ANYRANGE which allow specifying the
bound operator (, = respectively ,=) after construction.

So you'd write (using the functions-as-fields syntax I believe
we support)
  (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]'
and
  ARRAY[NULL,2]::int8range for '[-inf,2]'

All assuming that modifying the type system to support polymorphic
type resolution based on the return type is out of the question... ;-)

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-28 Thread Jeff Davis

On Tue, 2011-06-28 at 22:20 +0200, Florian Pflug wrote:
 Hm, so RANGEINPUT would actually be what was previously discussed as
 the range as a pair of bounds definition, as opposed to the
 range as a set of values definition. So essentially we'd add a
 second concept of what a range is to work around the range input
 troubles.
 
 I believe if we go that route we should make RANGEINPUT a full-blown
 type, having pair of bound semantics. Adding a lobotomized version
 just for the sake of range input feels a bit like a kludge to me.

I think David Wheeler was trying to make a similar point, but I'm still
not convinced.

It's not a pair, because it can be made up of 0, 1, or 2 scalar values
(unless you count infinity as one of those values, in which case 0 or
2). And without ordering, it's not clear that those values are really
bounds.

The type needs to:
 * represent two values, either of which might be a special infinite
value
 * represent the value empty
 * represent inclusivity/exclusivity of both values

and those things seem fairly specific to ranges, so I don't really see
what other use we'd have for such a type. But I'm open to suggestion.

I don't think that having an extra type around is so bad. It solves a
lot of problems, and doesn't seem like it would get in the way. And it's
only for the construction of ranges out of scalars, which seems like the
most natural place where a cast might be required (similar to casting an
unknown literal, which is fairly common).

 Alternatively, we could replace RANGEINPUT simply with ANYARRAY,
 and add functions ANYRANGE-ANYRANGE which allow specifying the
 bound operator (, = respectively ,=) after construction.
 
 So you'd write (using the functions-as-fields syntax I believe
 we support)
   (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]'
 and
   ARRAY[NULL,2]::int8range for '[-inf,2]'

I think we can rule this one out:
 * The functions-as-fields syntax is all but deprecated (or should be)
 * That's hardly a readability improvement
 * It still suffers similar problems as casting back and forth to text:
ANYARRAY is too general, doesn't really take advantage of the type
system, and not a great fit anyway.

 All assuming that modifying the type system to support polymorphic
 type resolution based on the return type is out of the question... ;-)

It's still not out of the question, but I thought that the intermediate
type would be a less-intrusive alternative (and Robert seemed concerned
about how intrusive it was).

There also might be a little more effort educating users if we selected
the function based on the return type, because they might think that
casting the inputs explicitly would be enough to get it to pick the
right function. If it were a new syntax like RANGE[]::int8range, then I
think it would be easier to understand.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Darren Duncan


Jeff Davis wrote:

On Sun, 2011-06-26 at 00:57 -0700, Darren Duncan wrote:
I believe that the best general solution here is for every ordered base type to 
just have a single total order, which is always used with that type in any 
generic order-sensitive operation, including any ranges defined over it, or any 
ORDER BY or any ,,etc.  The built-in generic text type would have exactly 1 
system-defined collation that can't be changed, and it would be something simple 
and generic, such as simply sorting on the codepoint as integers.


Well, we're trying to support SQL, and SQL supports collations, so I
don't think we can just ignore that.


I'm not saying you can't support collations.  See also my reply to Tom.


I also agree with Tom that it's not a good idea. My reasons are:

 * Practical considerations, such as having a bunch of cruft from
duplicated types all over the system. With sufficient changes to the
type system, maybe that could be overcome. Or perhaps domains could be
used to make that work for range types (sort of), but the result would
not be very consistent with the rest of the system.


Yes, duplication can be avoided.


 * It doesn't seem to be based in any mathematical argument. A type is a
set of values, and there's no reason it can't have several total orders;
or no total order at all. So it appears to just be piggybacking on the
type system infrastructure as a place to hold the metadata for a total
order.


Yes, I agree that a type is a set of values, and a type can have 0..N total 
orders.  My proposal is just that, for those types that have at least 1 total 
order, exactly 1 of those is defined to be used implicitly in contexts where a 
total order is desired and no explicit collation is given, such as in ranges.



 * Who's to say that a compare function is the only way to specify a
total order? There might be other interfaces that would support
something closer to a lexicographic sort. So, from a theoretical
standpoint, trying to attach a single notion of total order to a type
seems strange, because there might be multiple interfaces for specifying
even one total order.


Thank you for bringing this up, the notion of multiple interfaces for specifying 
even one total order.  My example of a compare function was just an example, and 
it is valuable to consider that this may not be the only way to do it.



 * It would require extra explicit type annotations. If you have 12 text
types, the only way to practically use any text type is to constantly
specify which more-specific text type it actually is (probably using
the :: operator). That is not necessarily a bad choice if starting a
language from scratch and forming the syntax in a way that it's
reasonable to do. But this is SQL, and lots of type annotations are
un-SQL-like.


Well sometimes it doesn't hurt to suggest solutions from the point of view that 
one can start the language from scratch, because that provides a clean way to 
conceptualize and explain a feature.  And then people can find some middle 
ground that adapts benefits from that idea for the feature without changing SQL 
more than needed.  Witness the various improvements to Perl 5 that were first 
expressed in terms of Perl 6.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Florian Pflug

On Jun27, 2011, at 02:48 , Jeff Davis wrote:
 On Mon, 2011-06-27 at 00:56 +0200, Florian Pflug wrote:
 Well, there actually *is* some precedence for that kind of top-down
 (form a syntactic perspective) type inference. We *enforce* the cast
 in 
  array[]::arraytype
 and actually for a very similar reason - without the case, there's no
 way of knowing which type of empty array was meant. I think we also
 
 That's a good point.
 
 Although, I'm not sure whether that's an argument that we can make the
 type system work as-is, or if it means that we should add syntax like
 ARRAY[].

It was meant as an argument for the former, i.e. for extending the type
system (or rather the function call syntax, as I argue below).

 special-case
  'literal'::type
 to use the input function of type directly, instead of first creating
 a text value and later casting it to type.
 
 That is certainly true. Quoted strings never start out as text, they
 start out as unknown and wait for the type inference to determine the
 type. I'm not entirely sure whether a quoted string followed by a cast
 is briefly unknown and then cast, or if it's directly interpreted using
 the cast's type input function.

It's at least labelled with type unknown for a while AFAIK.

 I don't know if that's a good example though because it's near the end
 of the line and there's no function call in between the arguments and
 the cast. It might get more complex with cases like:
 
  range(lower(range(1,2)),upper(range(1,2)))::int8range
 
 but maybe that can be done more easily than I think?

I wouldn't take it that far. What I had in mind was to *only* support
the case where the cast directly follows the function call, i.e. the case
  f(...)::type

I view this more as an extension of the function call syntax than of
type inference. In other languages with polymorphism, there usually is
an explicit syntactic construct for specifying the type arguments to
a polymorphic function. For example, C++ you'd write
  make_rangeint(3,4)
to call the polymorphic function make_range() with it's (first)
type argument set to int. I think of
  f(...)::type
as essentially the same thing, but re-using already existing syntax
instead of inventing new one. 

I just checked - we currently special case array[]::type in transformExpr()
by detecting the case of an array expression being the immediate child
of a cast expression. I suggest we do the same for f(...)::type, i.e.
also special case a function call being the immediate child of a cast
expression and pass down the forced result type to the function call node.

Function call nodes would then usually ignore that passed-down result type,
except in the case of a polymorphic functions whose argument types don't
uniquely define its result type.

But I haven't tried doing that, so there might be stumbling block down
that road that I missed...

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Jeff Davis

On Mon, 2011-06-27 at 12:16 +0200, Florian Pflug wrote:
 I wouldn't take it that far. What I had in mind was to *only* support
 the case where the cast directly follows the function call, i.e. the case
   f(...)::type

OK, so instead of writing:
range(lower(range(1,2)),upper(range(1,2)))::int8range

users would write:
range(lower(range(1,2)::int8range),upper(range(1,2)::int8range))::int8range

A little more verbose, but it seems like it wouldn't be a practical
problem in very many cases. Multiple levels of constructors seem like
they'd be fairly uncommon, and probably a case where a function should
be written anyway.

OK, I'll have to think about this a little more, but it seems like a
reasonable approach.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Jeff Davis

On Sun, 2011-06-26 at 22:29 -0700, Darren Duncan wrote:
 Tom Lane wrote:
  Darren Duncan dar...@darrenduncan.net writes:
  I believe that the best general solution here is for every ordered base 
  type to 
  just have a single total order, which is always used with that type in any 
  generic order-sensitive operation, including any ranges defined over it, 
  or any 
  ORDER BY or any ,,etc.
  
  We've spent years and blood on making sure that Postgres could support
  multiple orderings for any datatype; and there are plenty of natural
  examples for the usefulness of that.  So I'm not at all impressed by
  any line of reasoning that starts out by baldly throwing that away.
 
 I'm not saying that you can't use multiple orderings with a data type.  I'm 
 just 
 saying that the type only has *at most* one (possibly none) *native* 
 ordering, 
 which is what is used when you do something ordered-sensitive with the type, 
 such as have a range.

So, are you saying that it would be impossible to have a range that uses
a different ordering? What about ORDER BY? What about BTrees?

And if those things can use different orders for the same type, then
what is the difference between what you are suggesting and a default
ordering for the type (which we already support)?

I suppose it's hard to tell what you mean by native.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Darren Duncan


Jeff Davis wrote:

On Sun, 2011-06-26 at 22:29 -0700, Darren Duncan wrote:

Tom Lane wrote:

Darren Duncan dar...@darrenduncan.net writes:
I believe that the best general solution here is for every ordered base type to 
just have a single total order, which is always used with that type in any 
generic order-sensitive operation, including any ranges defined over it, or any 
ORDER BY or any ,,etc.

We've spent years and blood on making sure that Postgres could support
multiple orderings for any datatype; and there are plenty of natural
examples for the usefulness of that.  So I'm not at all impressed by
any line of reasoning that starts out by baldly throwing that away.
I'm not saying that you can't use multiple orderings with a data type.  I'm just 
saying that the type only has *at most* one (possibly none) *native* ordering, 
which is what is used when you do something ordered-sensitive with the type, 
such as have a range.


So, are you saying that it would be impossible to have a range that uses
a different ordering? What about ORDER BY? What about BTrees?

And if those things can use different orders for the same type, then
what is the difference between what you are suggesting and a default
ordering for the type (which we already support)?

I suppose it's hard to tell what you mean by native.

Regards,
Jeff Davis


Maybe I'm just talking about default ordering then. -- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Robert Haas

On Sat, Jun 25, 2011 at 6:29 PM, Jeff Davis pg...@j-davis.com wrote:
 Different ranges over the same subtype make sense when using different
 total orders for the subtype. This is most apparent with text collation,
 but makes sense (at least mathematically, if not practically) for any
 subtype.

 For instance:
  [a, Z)
 is a valid range in en_US, but not in C, so it makes sense to have
 multiple ranges over the same subtype with different collations.

 But what if you have a function (like a constructor), of the form:
  (anyelement, anyelement) - anyrange
 ? To work with the type system, you need to be able to figure out the
 return type from the arguments; which means to support functions like
 this we need a mapping from the subtype to the range type.
 Unfortunately, that restricts us to one range type per subtype (this
 isn't a problem for ARRAYs, because there is only one useful array type
 for a given element type).

 This problem first came up a while ago:
 http://archives.postgresql.org/pgsql-hackers/2011-01/msg02788.php

 My workaround was to use domains, but that's not a very clean solution
 (you have to add a bunch of casts to make sure the right domain is
 chosen). It became entirely unworkable with collations, because people
 would be using different text collations a lot more frequently than,
 say, a different ordering for timestamptz. Tom mentioned that here:

 http://archives.postgresql.org/message-id/24831.1308579...@sss.pgh.pa.us

 I think Florian proposed the most promising line of attack here:

 http://archives.postgresql.org/message-id/ad4fc75d-db99-48ed-9082-52ee3a4d7...@phlo.org

 by suggesting that functions of the form:
  (anyelement, [other non-anyrange arguments]) - anyrange
 might be expendable. After all, they are only useful for constructors as
 far as we can tell. Other range functions will have an anyrange
 parameter, and we can use the actual type of the argument to know the
 range type (as well as the subtype).

 Although it's very nice to be able to say:
  range(1,10)
 and get an int4range out of it, it's not the only way, and it's not
 without its problems anyway. For instance, to get an int8range you have
 to do:
  range(1::int8, 10::int8)
 or similar.

 So, we could just make functions like:
  int4range(int4, int4)
  int8range(int8, int8)
  ...
 when creating the range type, and it would actually be a usability
 improvement.

Couldn't we also do neither of these things?  I mean, presumably
'[1,10]'::int8range had better work.

I'm not saying that's ideal from a usability perspective but I fear
this patch is going to be unmanageably large, and separating out the
things that you need for it to work at all from the things that you
need in order for it to be convenient might have some merit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-27 Thread Jeff Davis

On Mon, 2011-06-27 at 14:50 -0400, Robert Haas wrote:
 Couldn't we also do neither of these things?  I mean, presumably
 '[1,10]'::int8range had better work.

I think that if we combine this idea with Florian's PAIR suggestion
here:
http://archives.postgresql.org/message-id/ad4fc75d-db99-48ed-9082-52ee3a4d7...@phlo.org

then I think we have a solution.

If we add a type RANGEINPUT that is not a pseudotype, we can use that as
an intermediate type that is returned by range constructors. Then, we
add casts from RANGEINPUT to each range type. That would allow
  range(1,2)::int8range
to work without changing the type system around, because range() would
have the signature:
  range(ANYELEMENT, ANYELEMENT) - RANGEINPUT
and then the cast would change it into an int8range. But we only need
the one cast per range type, and we can also support all of the other
kinds of constructors like:
  range_cc(ANYELEMENT, ANYELEMENT) - RANGEINPUT
  range_linf_c(ANYELEMENT) - RANGEINPUT
without additional hassle.

The RANGEINPUT type itself would hold similar information to actual
range types: the subtype OID (instead of the range type, because it's
not a range yet), optionally the two bounds (depending on the flags),
and the flags byte. The cast to a real range type would read the
subtype, and try to coerce the bounds to the subtype of the range you're
casting to, set the range type oid, leave the flags byte the same, and
it's done.

So, in effect, RANGEINPUT is a special type used only for range
constructors. If someone tried to output it, it would throw an
exception, and we'd even have enough information at that point to print
a nice error message with a hint.

Actually, this is pretty much exactly Florian's idea (thanks again,
Florian), but at the time I didn't like it because pair didn't capture
everything that I wanted to capture, like infinite bounds, etc. But
there's no reason that it can't, and your point made me realize that --
you are effectively just using TEXT as the intermediate type (which
works, but has some undesirable characteristics).

Do we think that this is a good way forward? The only thing I can think
of that's undesirable is that it's not normal to be required to cast the
result of a function, and might be slightly difficult to explain in the
documentation in a straightforward way.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Darren Duncan


To eludicate my earlier comments on this subject ...

I believe that the best general solution here is for every ordered base type to 
just have a single total order, which is always used with that type in any 
generic order-sensitive operation, including any ranges defined over it, or any 
ORDER BY or any ,,etc.  The built-in generic text type would have exactly 1 
system-defined collation that can't be changed, and it would be something simple 
and generic, such as simply sorting on the codepoint as integers.


When we want to have some other native ordering for an existing type, such as 
when we want to use a different text collation, we do this by creating a *new 
base type*, using CREATE TYPE or some shorthand thereof, and this new type 
defines its own ordering, such as that a particular text collation is used.


For example:

  CREATE TYPE text__en_US AS (v text);

  CREATE TYPE text__C AS (v text);

These will not compare equal to each other or to text, and that is good, because 
having a different text collation implies that we consider 'foo'::text__en_US 
and 'foo'::text__C to be different values.


I believe that any other approach is worse, and in particular I believe that 
creating DOMAIN over text is worse, because DOMAIN are supposed to be subtypes 
of some other type, whose set of member values is a subset of the other type's 
values, and that have the same ordering.  Multiple CREATE type over the same 
base type don't interfere with each other like multiple DOMAIN could.


Assuming that what CREATE TYPE produces is actually a base type, I believe there 
is no better solution using the facilities that SQL provides.


If there is concern about performance related to CREATE TYPE being a composite 
type, I'm sure it is possible to engineer an optimization for when the type has 
just 1 attribute so that performance isn't an issue.  The main point I'm trying 
to raise here is about semantics and good type systems.


Likewise, don't let concern about syntax for using values of such composite 
types.  Once again, there can be shorthands if necessary.


In fact, Postgres could provide a general shorthand for creating a composite 
type of 1 attribute whose purpose is to make one type that is like but unequal 
to another, and using this shorthand could also cause the composite type to 
overload/polymorph all the operators of it's attribute type, so that the syntax 
to define one is very short.


For example:

  CREATE TYPE text__en_US WRAPS text COLLATE en_US;

... and I assume the name of that attribute would just be system-defined.

Note that the above is specific to wrapping text, and the COLLATE is just 
shorthand for defining an ordering function for text__en_US.  A more general 
form could be:


  CREATE TYPE bar WRAPS foo ORDER USING FUNCTION baz (lhs foo, rhs foo) ...;

And then we can say:

  RANGE OF text__en_US

  RANGE OF text

... similarly to how we declare array types with ARRAY.

One can also just define range values as they do array values, such as like 
this:

  range('foo','bar')  # default collation

  range('foo'::text__en_US, 'bar'::text__en_US)  # en_us collation

If that seems verbose, I have a few words for you:

1.  Users should in practice name their wrapper types over their intended 
meaning, not their mechanics, such as like this (not using text for variety), 
and that may be more terse:


  CREATE TYPE acct_num WRAPS integer;  # inherits integer ordering by default

2.  If the wrapper types overload the base operators, either automatically or 
selectively (does it make sense to multiply an acct_num?), one doesn't have to 
keep unpacking and packing them to use them in most cases.  For example, I'd 
expect many text wrappers to polymorph catenation or substring etc, so no extra 
syntax.


3.  In practice, most literal values come from applications and are given to SQL 
code either as function parameters or bind parameter arguments.  While lots of 
example code may have literal values in it, I would think that most real-work 
code would hardly have any, and hence you'd rarely see any 'foo'::text__en_US 
for example.  You'd more likely see the less common var::text__en_US or such.


So that's my position, CREATE TYPE on the regular types or the like is the best 
solution, and anything else is an inferior solution.


Such a design is also how I do collations and ranges in my Muldis D language.

-- Darren Duncan

Jeff Davis wrote:

Different ranges over the same subtype make sense when using different
total orders for the subtype. This is most apparent with text collation,
but makes sense (at least mathematically, if not practically) for any
subtype.

For instance:
 [a, Z)
is a valid range in en_US, but not in C, so it makes sense to have
multiple ranges over the same subtype with different collations.

But what if you have a function (like a constructor), of the form:
  (anyelement, anyelement) - anyrange
? To work with the type system, you need to be able to figure out

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Tom Lane

Darren Duncan dar...@darrenduncan.net writes:
 I believe that the best general solution here is for every ordered base type 
 to 
 just have a single total order, which is always used with that type in any 
 generic order-sensitive operation, including any ranges defined over it, or 
 any 
 ORDER BY or any ,,etc.

We've spent years and blood on making sure that Postgres could support
multiple orderings for any datatype; and there are plenty of natural
examples for the usefulness of that.  So I'm not at all impressed by
any line of reasoning that starts out by baldly throwing that away.

 When we want to have some other native ordering for an existing type, such 
 as 
 when we want to use a different text collation, we do this by creating a *new 
 base type*,

Nope.  This has all sorts of problems that you're conveniently ignoring,
beginning with the need to duplicate all of the infrastructure for the
type (such as non-ordering-related operators), and then moving into
difficulties arising from added ambiguity as to which operator is meant.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Jeff Davis

On Sun, 2011-06-26 at 00:57 -0700, Darren Duncan wrote:
 I believe that the best general solution here is for every ordered base type 
 to 
 just have a single total order, which is always used with that type in any 
 generic order-sensitive operation, including any ranges defined over it, or 
 any 
 ORDER BY or any ,,etc.  The built-in generic text type would have exactly 1 
 system-defined collation that can't be changed, and it would be something 
 simple 
 and generic, such as simply sorting on the codepoint as integers.

Well, we're trying to support SQL, and SQL supports collations, so I
don't think we can just ignore that.

I also agree with Tom that it's not a good idea. My reasons are:

 * Practical considerations, such as having a bunch of cruft from
duplicated types all over the system. With sufficient changes to the
type system, maybe that could be overcome. Or perhaps domains could be
used to make that work for range types (sort of), but the result would
not be very consistent with the rest of the system.

 * It doesn't seem to be based in any mathematical argument. A type is a
set of values, and there's no reason it can't have several total orders;
or no total order at all. So it appears to just be piggybacking on the
type system infrastructure as a place to hold the metadata for a total
order.

 * Who's to say that a compare function is the only way to specify a
total order? There might be other interfaces that would support
something closer to a lexicographic sort. So, from a theoretical
standpoint, trying to attach a single notion of total order to a type
seems strange, because there might be multiple interfaces for specifying
even one total order.

 * It would require extra explicit type annotations. If you have 12 text
types, the only way to practically use any text type is to constantly
specify which more-specific text type it actually is (probably using
the :: operator). That is not necessarily a bad choice if starting a
language from scratch and forming the syntax in a way that it's
reasonable to do. But this is SQL, and lots of type annotations are
un-SQL-like.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Florian Pflug

On Jun26, 2011, at 00:29 , Jeff Davis wrote:
 declare the return type of a function, and then use the declared type to
 infer the argument types. That would be nice because you would just have
 to do:
  range(1,10)::int8range
 However, that's kind of backwards from how our type inference system
 works now, and sounds like a big change.

Well, there actually *is* some precedence for that kind of top-down
(form a syntactic perspective) type inference. We *enforce* the cast
in 
  array[]::arraytype
and actually for a very similar reason - without the case, there's no
way of knowing which type of empty array was meant. I think we also
special-case
  'literal'::type
to use the input function of type directly, instead of first creating
a text value and later casting it to type.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Jeff Davis

On Mon, 2011-06-27 at 00:56 +0200, Florian Pflug wrote:
 Well, there actually *is* some precedence for that kind of top-down
 (form a syntactic perspective) type inference. We *enforce* the cast
 in 
   array[]::arraytype
 and actually for a very similar reason - without the case, there's no
 way of knowing which type of empty array was meant. I think we also

That's a good point.

Although, I'm not sure whether that's an argument that we can make the
type system work as-is, or if it means that we should add syntax like
ARRAY[].

 special-case
   'literal'::type
 to use the input function of type directly, instead of first creating
 a text value and later casting it to type.

That is certainly true. Quoted strings never start out as text, they
start out as unknown and wait for the type inference to determine the
type. I'm not entirely sure whether a quoted string followed by a cast
is briefly unknown and then cast, or if it's directly interpreted using
the cast's type input function.

I don't know if that's a good example though because it's near the end
of the line and there's no function call in between the arguments and
the cast. It might get more complex with cases like:

  range(lower(range(1,2)),upper(range(1,2)))::int8range

but maybe that can be done more easily than I think?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Range Types, constructors, and the type system

2011-06-26 Thread Darren Duncan


Tom Lane wrote:

Darren Duncan dar...@darrenduncan.net writes:
I believe that the best general solution here is for every ordered base type to 
just have a single total order, which is always used with that type in any 
generic order-sensitive operation, including any ranges defined over it, or any 
ORDER BY or any ,,etc.


We've spent years and blood on making sure that Postgres could support
multiple orderings for any datatype; and there are plenty of natural
examples for the usefulness of that.  So I'm not at all impressed by
any line of reasoning that starts out by baldly throwing that away.


I'm not saying that you can't use multiple orderings with a data type.  I'm just 
saying that the type only has *at most* one (possibly none) *native* ordering, 
which is what is used when you do something ordered-sensitive with the type, 
such as have a range.


To be specific, if the type system supports a concept like Perl 6 roles (or 
other languages have similar concepts) where types can declare themselves 
members of a union type such as Ordered, then types of that union would have 
the native ordering and other types wouldn't and then generic range operators 
could be declared over ANYORDERED or such.


When we want to have some other native ordering for an existing type, such as 
when we want to use a different text collation, we do this by creating a *new 
base type*,


Nope.  This has all sorts of problems that you're conveniently ignoring,
beginning with the need to duplicate all of the infrastructure for the
type (such as non-ordering-related operators), and then moving into
difficulties arising from added ambiguity as to which operator is meant.


Well a related solution is to have exactly 1 text wrapper type which has 2 
attributes, one being the text value and the other being the collation name. 
Then you just have 1 type that does the job instead a separate one per 
collation.  But to keep the semantics, the collation name is part of the 
identity of the type.  For example:


  CREATE TYPE collated_text AS (t text, c collation);

The key point I'm trying to support is that collation issues are firmly attached 
to the text type, not the range type.


Anyway, if a better solution can be arrived at for the problem at hand, then 
good for the team; meanwhile, what I've proposed is the best one I can think of.


-- Darren Duncan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

58 matches

Mail list logo