Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-21 Thread Jeff Davis
On Fri, 2010-04-16 at 23:18 -0700, Scott Bailey wrote:
 Well I've been doing a lot of work with range abstract data types in 
 Oracle lately. And I've got to say that the OO features in Oracle make 
 it really nice. Of course its Oracle, so its like a half baked OO in 
 cobol syntax, lol. But I for one think it would be great if Postgres had 
   object data types that had methods and could be subclassed.

That's interesting. I think the most critical piece of that is
subclassing in the sense of a type interface.

There have already been a couple threads about type interfaces:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
http://archives.postgresql.org/pgsql-hackers/2010-04/msg00443.php

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-17 Thread Scott Bailey

Jeff Davis wrote:

On Fri, 2010-04-09 at 12:50 -0500, Kevin Grittner wrote:

I just thought that if you were adding more type information,
oriented aournd the types themselves rather than index AMs, some form
of inheritence might fit in gracefully.


There are already some specific proposals for inheritance in database
theory literature. For instance: Databases, Types, and the Relational
Model by C.J. Date addresses inheritance explicitly (and the appendices
have some interesting discussion).

I'm not sure how compatible it is with SQL, though; and I am not very
optimistic that we could accomplish such a restructuring of the type
system while maintaining a reasonable level of backwards compatibility.

Either way, I think it's a separate topic. Two types that are not
related by any subtype/supertype relationship (like strings and ints)
can conform to the same interface (total ordering); while the very same
type can conform to two different interfaces.

Regards,
Jeff Davis



Well I've been doing a lot of work with range abstract data types in 
Oracle lately. And I've got to say that the OO features in Oracle make 
it really nice. Of course its Oracle, so its like a half baked OO in 
cobol syntax, lol. But I for one think it would be great if Postgres had 
 object data types that had methods and could be subclassed.


For those not familiar with ADT's in Oracle, here's an example:

CREATE TYPE period AS OBJECT (
  beginningDATE,
  ending   DATE,
  CONSTRUCTOR FUNCTION period (
self IN OUT NOCOPY period,
beginning DATE,
ending DATE
  )  RETURN SELF AS RESULT,
  -- config functions
  MEMBER FUNCTION granule RETURN INTERVAL DAY TO SECOND,
  MEMBER FUNCTION def_inc RETURN NUMBER,
  MEMBER FUNCTION range_union(p2 period) RETURN period
  ...
) NOT FINAL;

CREATE TYPE date_range UNDER period (
  OVERRIDING MEMBER FUNCTION granule RETURN INTERVAL DAY TO SECOND,
  OVERRIDING MEMBER FUNCTION def_inc RETURN NUMBER,
  ...
);

Scott Bailey

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-14 Thread Bruce Momjian
Robert Haas wrote:
 On Fri, Apr 9, 2010 at 11:07 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 
  From the implementers perspective, IMHO an extra catalog entry in pg_type
  is not bad on its own, you would have one anyway if the range type was
  explicitly programmed. About different kinds of range types - I would not
  know how to 'promote' integer into anything else but just one kind of 
  'range
  of integer' type. So the number of extra pg_types would be more like
  O(number of linear ordered base types).
 
  .. I now see the example of different ranges in your original mail with
  different unit increments. Making that more general so there could be
  continuous and discrete ranges and for the latter, what would the increment
  be.. OTOH is a range of integers with increment x a different type from
  range of integers with increment y, if xy? Maybe the increment step and
  continuous/discrete could be typmods.
 
 Nope, not enough bits available there.  This is fundamentally why the
 typid/typmod system is so broken - representing a type as a fixed size
 object is extremely limiting.  A fixed size object that MUST consist
 of a 32-bit unsigned OID and a 32-bit signed integer is even more
 limiting.

You mean the typmod system we developed 13 years ago needs updating? 
Seems unlikely.  ;-)

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Robert Haas wrote:

The advantage of specifying a + and a - in the type interface is that
the unit definition can then be specified as part of the type
declaration itself.  So you can do:

CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');
  
The problem with mixing units with ranges is that units are properties 
of some underlying datatype but not all types on which ranges can be 
defined.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Jeff Davis wrote:

To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php
  
Interesting, a join type for overlaps, which makes me think a bit of the 
staircase join for pre-post coordinates. However, does a join operator 
type need certain kinds of properties of the operator involved, e.g. 
being commutative, transitive etc? Else the join reordering fails. The 
latter fails for the overlap operator.


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga yebhavi...@gmail.com wrote:
 Jeff Davis wrote:

 To give some context, I started a thread a while ago:

 http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php


 Interesting, a join type for overlaps, which makes me think a bit of the
 staircase join for pre-post coordinates. However, does a join operator type
 need certain kinds of properties of the operator involved, e.g. being
 commutative, transitive etc? Else the join reordering fails. The latter
 fails for the overlap operator.

I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Yeb Havinga

Robert Haas wrote:

On Sat, Apr 10, 2010 at 12:05 PM, Yeb Havinga yebhavi...@gmail.com wrote:
  

Jeff Davis wrote:


To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

  

Interesting, a join type for overlaps, which makes me think a bit of the
staircase join for pre-post coordinates. However, does a join operator type
need certain kinds of properties of the operator involved, e.g. being
commutative, transitive etc? Else the join reordering fails. The latter
fails for the overlap operator.



I don't think I follow this.  As far as I know, the join order
constraints don't depend on the choice of operator.
  
I was thinking of a case for instance for ranges a,b,c in relations 
A,B,C respectively, where  a  b and b  c, but not a  c. Would the 
planner consider a join path of table A and C first, then that result 
with B. After looking in doxygen, it looks like having  defined 
without MERGES is what prevents this unwanted behaviour, since that 
prevents a,b and c to become members of the same equivalence class. 
Sorry for the spam on the list.


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Jeff Davis
On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
 I was thinking of a case for instance for ranges a,b,c in relations 
 A,B,C respectively, where  a  b and b  c, but not a  c. Would the 
 planner consider a join path of table A and C first, then that result 
 with B. After looking in doxygen, it looks like having  defined 
 without MERGES is what prevents this unwanted behaviour, since that 
 prevents a,b and c to become members of the same equivalence class. 

Interesting, I would have to make sure that didn't happen. Most likely
there would be a new property like RANGEMERGES, it wouldn't reuse the
existing MERGES property.

 Sorry for the spam on the list.

Not at all, it's an interesting point.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Fri, Apr 9, 2010 at 5:49 PM, Jeff Davis pg...@j-davis.com wrote:
 On Thu, 2010-04-08 at 22:29 -0400, Robert Haas wrote:
 1. knngist wants to use index scans to speed up queries of the form
 SELECT ... ORDER BY column op constant (as opposed to the
 existing machinery which only knows how to use an index for SELECT ...
 ORDER BY column).
 2. Window functions want to define windows over a range of values
 defined by the underlying data type.  To do this, we need to define
 what addition and subtraction mean for a particular data type.
 3. Jeff Davis is interested in implementing range types.  When the
 underlying base type is discrete, e.g. integers, you can say that
 [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in
 that order).

 To give some context, I started a thread a while ago:

 http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

 Tom provided some interesting suggestions in that thread, but I'm not
 sure they would work for #1 or #2.

rereads thread

The map  to  case is interesting.  It doesn't seem like it's
really a good candidate for type interfaces, because you you're not
really looking for the strictly-left-of operator; you're looking for
the strictly-left-of operator associated with the overlaps operator
actually specified.  And ideally there might be an index strategy
number available for , too, so that you could consider doing an
index scan instead of a sort, but not necessarily.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-10 Thread Robert Haas
On Sat, Apr 10, 2010 at 2:30 PM, Jeff Davis pg...@j-davis.com wrote:
 On Sat, 2010-04-10 at 20:25 +0200, Yeb Havinga wrote:
 I was thinking of a case for instance for ranges a,b,c in relations
 A,B,C respectively, where  a  b and b  c, but not a  c. Would the
 planner consider a join path of table A and C first, then that result
 with B. After looking in doxygen, it looks like having  defined
 without MERGES is what prevents this unwanted behaviour, since that
 prevents a,b and c to become members of the same equivalence class.

 Interesting, I would have to make sure that didn't happen. Most likely
 there would be a new property like RANGEMERGES, it wouldn't reuse the
 existing MERGES property.

 Sorry for the spam on the list.

 Not at all, it's an interesting point.

Yeah... I agree.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Dimitri Fontaine
Hi,

First, I like the way you got back to the needs before trying to
organize an approach to find a solution. Having said it allows me to cut
a lot of your text, it's the one I agree with :)

Robert Haas robertmh...@gmail.com writes:
 Given a type T, I think we'd like to be able to define a type U as
 the natural type to be added to or subtracted from T.  As Jeff
 pointed out to me, this is not necessarily the same as the underlying
 type.  For example, if T is a timestamp, U is an interval; if T is a
 numeric, U is also a numeric; if T is a cidr, U is an integer.  Then
 we'd like to define a canonical addition operator and a canonical
 subtraction operator.  I think that would be sufficient for the needs
 of RANGE BETWEEN ... PRECEDING AND ... FOLLOWING.  It would also be
 nearly sufficient for range types, but in that case you also need to
 specify the unit increment within U - i.e. a 1 value for the
 datatype.  It may or may not be worth building the concept of a unit
 increment into the type interface machinery, though: one could imagine
 two different range types built over the same base type with different
 unit increments - e.g. one timestamp range with unit increment = 1s,
 and one with unit increment = 1m.  Under the first type [4pm,5pm) =
 [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

Do we want to enable support for string based ranges, as in the
contributed prefix_range type?

 Thoughts?

I like the type interface approach and I think this concept has been
studied in great details in math and that we should start from existing
concepts, even if most of them are way over my head.

The ORDER BY problem refers to a metric space, defined by a distance
function. Continuing your proposal the distance function return type
would be of domain U. KNNGist is then a way to use the GiST index to
sort by distance.

  http://archives.postgresql.org/pgsql-hackers/2010-02/msg01107.php

You'll see in this mail a proposal for an operator group notion, which
could get renamed to type interface if we think we won't need rings and
such rather than just groups in the future. And there's opportunity for
multi-type interfaces too (think families), like what's the distance
between a point and a circle?

The math groups already have a notion of neutral element, which for the
addition is 0 (zero), we could expand our version of it with a unity
element, which would be in the T domain. 

Then the range type could expand on this and provide a different unity
value in their own interface, in the U domain this time. IMO tying the
precision of the range interval into the type interface is a bad
abstraction. As you said we want to possibly have several ranges types
atop this.

We can say that [1,3] = [1,4) when considering a default integer range
because 4-3 = unity(integer). When considering a range over timestamps
with a range interval unity of 1s, we are still able to do the math, and
we can have another range over timestamps with a range interval unity of
10 mins in the same database. (I'm using this later example with the
period datatype in a real application).

While speaking of all that, in the prefix_range case, it'd be useful to
have a new kind of typemod system at the range level, to allow for
defining prefix text range with the '/' separator, say. Then

   greater_prefix('/etc/bar', '/etc/baz') = '/etc' (or is it '/etc/'?)

Whereas currently

  = select '/etc/baz'::prefix_range | '/etc/bar';
 ?column?   
  --
   /etc/ba[r-z]
  (1 row)

Regards,
-- 
dim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Yeb Havinga

Robert Haas wrote:

Under the first type [4pm,5pm) =
[4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

Thoughts?
  
The examples with units look a lot like the IVLPQ datatype from HL7, 
see 
http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.htm


About a type interface, the HL7 spec talks about promotion from e.g. a 
timestamp to an interval (hl7 speak for range) of timestamps (a range), 
and demotion for the back direction. Every 'quantity type', which is any 
type with a (possibly partially) lineair ordered domain, can be promoted 
to an interval of that type. In PostgreSQL terms, this could perhaps 
mean that by 'tagging' a datatype as a lineair order, it could 
automatically have a range type defined on it, like done for the array 
types currently.


regards,
Yeb Havinga






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Robert Haas wrote:

 Under the first type [4pm,5pm) =
 [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

 Thoughts?


 The examples with units look a lot like the IVLPQ datatype from HL7, see
 http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.htm

 About a type interface, the HL7 spec talks about promotion from e.g. a
 timestamp to an interval (hl7 speak for range) of timestamps (a range), and
 demotion for the back direction. Every 'quantity type', which is any type
 with a (possibly partially) lineair ordered domain, can be promoted to an
 interval of that type. In PostgreSQL terms, this could perhaps mean that by
 'tagging' a datatype as a lineair order, it could automatically have a range
 type defined on it, like done for the array types currently.

The way we've handled array types is, quite frankly, horrible.  It's
bad enough that we now have two catalog entries in pg_type for each
base type; what's even worse is that if we actually wanted to enforce
things like the number of array dimensions we'd need even more - say,
seven per base type, one for the base type itself, one for a
one-dimensional array, one for a two-dimensional array, one for a
three-dimensional array.  And then if we want to support range types
that's another one for every base type, maybe more if there's more
than one kind of range over a base type.  It's just not feasible to
handle derived types in a way that require a new instance of each base
type to be created for each kind of derived type.  It scales as
O(number of base types * number of kinds of derived type), and that
rapidly gets completely out of hand

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 10:33 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 Robert Haas wrote:

 Under the first type [4pm,5pm) =
 [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

 Thoughts?


 The examples with units look a lot like the IVLPQ datatype from HL7, see
 http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.htm

 About a type interface, the HL7 spec talks about promotion from e.g. a
 timestamp to an interval (hl7 speak for range) of timestamps (a range), and
 demotion for the back direction. Every 'quantity type', which is any type
 with a (possibly partially) lineair ordered domain, can be promoted to an
 interval of that type. In PostgreSQL terms, this could perhaps mean that by
 'tagging' a datatype as a lineair order, it could automatically have a range
 type defined on it, like done for the array types currently.

 The way we've handled array types is, quite frankly, horrible.  It's
 bad enough that we now have two catalog entries in pg_type for each
 base type; what's even worse is that if we actually wanted to enforce
 things like the number of array dimensions we'd need even more - say,
 seven per base type, one for the base type itself, one for a
 one-dimensional array, one for a two-dimensional array, one for a
 three-dimensional array.  And then if we want to support range types
 that's another one for every base type, maybe more if there's more
 than one kind of range over a base type.  It's just not feasible to
 handle derived types in a way that require a new instance of each base
 type to be created for each kind of derived type.  It scales as
 O(number of base types * number of kinds of derived type), and that
 rapidly gets completely out of hand

...which by the way, doesn't mean that your idea is bad (although it
might not be what I would choose to do), just that I don't think our
current infrastructure can support it.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 4:10 AM, Dimitri Fontaine dfonta...@hi-media.com wrote:
 Do we want to enable support for string based ranges, as in the
 contributed prefix_range type?

Yes, probably, but that doesn't require as much knowledge of the
underlying data type, so I didn't feel it needed to be brought up in
this context.  There is no x such that ['a','b') = ['a',x]; it's
generally impossible to convert between open and closed intervals in
this type of range type.  That's the case where type interfaces are
needed; if you're not converting between different kinds of intervals
then you can probably get by with the existing system of using the
default btree opclass to find equality and comparison operators.

 I like the type interface approach and I think this concept has been
 studied in great details in math and that we should start from existing
 concepts, even if most of them are way over my head.

I'm not too excited about patterning this too closely after
mathematical concepts; I think we need to have a pragmatic approach
that focuses on what the database actually needs.  We need to think
generally enough about what we're trying to provide that we don't box
ourselves into a corner, but we're not trying to build a
theorem-prover.

 You'll see in this mail a proposal for an operator group notion, which
 could get renamed to type interface if we think we won't need rings and
 such rather than just groups in the future. And there's opportunity for
 multi-type interfaces too (think families), like what's the distance
 between a point and a circle?

Yeah, that needs some thought.

 The math groups already have a notion of neutral element, which for the
 addition is 0 (zero), we could expand our version of it with a unity
 element, which would be in the T domain.

I don't know what that would mean in this case.  We're trying to add
and subtract from T, so a unit or identity element makes sense for U,
but not for T.

 Then the range type could expand on this and provide a different unity
 value in their own interface, in the U domain this time. IMO tying the
 precision of the range interval into the type interface is a bad
 abstraction. As you said we want to possibly have several ranges types
 atop this.

Right - so I think there's no point in specifying this in the type
interface at all.  We can always add it later if we find a real need
for it.

 We can say that [1,3] = [1,4) when considering a default integer range
 because 4-3 = unity(integer). When considering a range over timestamps
 with a range interval unity of 1s, we are still able to do the math, and
 we can have another range over timestamps with a range interval unity of
 10 mins in the same database. (I'm using this later example with the
 period datatype in a real application).

 While speaking of all that, in the prefix_range case, it'd be useful to
 have a new kind of typemod system at the range level, to allow for
 defining prefix text range with the '/' separator, say. Then

   greater_prefix('/etc/bar', '/etc/baz') = '/etc' (or is it '/etc/'?)

 Whereas currently

  = select '/etc/baz'::prefix_range | '/etc/bar';
     ?column?
  --
   /etc/ba[r-z]
  (1 row)

Not sure I'm really following this.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Yeb Havinga

Robert Haas wrote:

On Fri, Apr 9, 2010 at 10:33 AM, Robert Haas robertmh...@gmail.com wrote:
  

On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga yebhavi...@gmail.com wrote:


Robert Haas wrote:
  

Under the first type [4pm,5pm) =
[4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

Thoughts?



The examples with units look a lot like the IVLPQ datatype from HL7, see
http://www.hl7.org/v3ballot/html/infrastructure/datatypes_r2/datatypes_r2.htm

About a type interface, the HL7 spec talks about promotion from e.g. a
timestamp to an interval (hl7 speak for range) of timestamps (a range), and
demotion for the back direction. Every 'quantity type', which is any type
with a (possibly partially) lineair ordered domain, can be promoted to an
interval of that type. In PostgreSQL terms, this could perhaps mean that by
'tagging' a datatype as a lineair order, it could automatically have a range
type defined on it, like done for the array types currently.
  

The way we've handled array types is, quite frankly, horrible.  It's
bad enough that we now have two catalog entries in pg_type for each
base type; what's even worse is that if we actually wanted to enforce
things like the number of array dimensions we'd need even more - say,
seven per base type, one for the base type itself, one for a
one-dimensional array, one for a two-dimensional array, one for a
three-dimensional array.  And then if we want to support range types
that's another one for every base type, maybe more if there's more
than one kind of range over a base type.  It's just not feasible to
handle derived types in a way that require a new instance of each base
type to be created for each kind of derived type.  It scales as
O(number of base types * number of kinds of derived type), and that
rapidly gets completely out of hand



...which by the way, doesn't mean that your idea is bad (although it
might not be what I would choose to do), just that I don't think our
current infrastructure can support it.
  
Well yeah the idea was to 'automagically' have a range type available, 
if the underlying type would support it, i.e. has a lineair order and 
therefore ,,= etc defined on it, just like the array types, from a 
user / datatype developer perspective.


From the implementers perspective, IMHO an extra catalog entry in 
pg_type is not bad on its own, you would have one anyway if the range 
type was explicitly programmed. About different kinds of range types - I 
would not know how to 'promote' integer into anything else but just one 
kind of 'range of integer' type. So the number of extra pg_types would 
be more like O(number of linear ordered base types).


regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Yeb Havinga


From the implementers perspective, IMHO an extra catalog entry in 
pg_type is not bad on its own, you would have one anyway if the range 
type was explicitly programmed. About different kinds of range types - 
I would not know how to 'promote' integer into anything else but just 
one kind of 'range of integer' type. So the number of extra pg_types 
would be more like O(number of linear ordered base types).
.. I now see the example of different ranges in your original mail with 
different unit increments. Making that more general so there could be 
continuous and discrete ranges and for the latter, what would the 
increment be.. OTOH is a range of integers with increment x a different 
type from range of integers with increment y, if xy? Maybe the 
increment step and continuous/discrete could be typmods.


regards
Yeb Havinga





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 11:07 AM, Yeb Havinga yebhavi...@gmail.com wrote:

 From the implementers perspective, IMHO an extra catalog entry in pg_type
 is not bad on its own, you would have one anyway if the range type was
 explicitly programmed. About different kinds of range types - I would not
 know how to 'promote' integer into anything else but just one kind of 'range
 of integer' type. So the number of extra pg_types would be more like
 O(number of linear ordered base types).

 .. I now see the example of different ranges in your original mail with
 different unit increments. Making that more general so there could be
 continuous and discrete ranges and for the latter, what would the increment
 be.. OTOH is a range of integers with increment x a different type from
 range of integers with increment y, if xy? Maybe the increment step and
 continuous/discrete could be typmods.

Nope, not enough bits available there.  This is fundamentally why the
typid/typmod system is so broken - representing a type as a fixed size
object is extremely limiting.  A fixed size object that MUST consist
of a 32-bit unsigned OID and a 32-bit signed integer is even more
limiting.  Fortunately, we don't need to solve that problem in order
to implement range types: we can just have people explicitly create
the ones they need.  This will, for example, avoid creating ranges
over every composite type that springs into existence because a table
is created, even though in most cases a fairly well-defined range type
could be constructed.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Joe Conway
On 04/09/2010 07:33 AM, Robert Haas wrote:
 On Fri, Apr 9, 2010 at 7:55 AM, Yeb Havinga yebhavi...@gmail.com wrote:
 'tagging' a datatype as a lineair order, it could automatically have a range
 type defined on it, like done for the array types currently.
 
 The way we've handled array types is, quite frankly, horrible.  It's
 bad enough that we now have two catalog entries in pg_type for each
 base type; what's even worse is that if we actually wanted to enforce
 things like the number of array dimensions we'd need even more - say,
 seven per base type, one for the base type itself, one for a
 one-dimensional array, one for a two-dimensional array, one for a
 three-dimensional array.  And then if we want to support range types
 that's another one for every base type, maybe more if there's more
 than one kind of range over a base type.  It's just not feasible to
 handle derived types in a way that require a new instance of each base
 type to be created for each kind of derived type.  It scales as
 O(number of base types * number of kinds of derived type), and that
 rapidly gets completely out of hand

Perhaps off the original topic (and thinking out loud), but I agree with
you on the handling of array types. I have long thought (and at least
once played with the idea) that a single array type, anyarray, made up
of elements, anyelement, could be made to work. Further, anyelement
should be defined to be any valid existing type, including anyarray.
Essentially, at least by my reading of the SQL spec, a multidimensional
array ought to be an array of arrays, which is different in subtle ways
from what we have today.

Joe



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 Given a type T, I think we'd like to be able to define a type U as
 the natural type to be added to or subtracted from T.  As Jeff
 pointed out to me, this is not necessarily the same as the
 underlying type.  For example, if T is a timestamp, U is an
 interval; if T is a numeric, U is also a numeric; if T is a cidr,
 U is an integer. Then we'd like to define a canonical addition
 operator and a canonical subtraction operator.
 
As it is de rigueur for someone to escalate the proposed complexity
of an idea by at least one order of magnitude, and everyone else has
fallen down on this one:  ;-)
 
I've often thought that if we rework the type system, it would be
very nice to support a concept of hierarchy.  If you could
subclass money to have a subclass like assessable, which in turn
has subclasses of fine, fee, restitution, etc. you could then
automatically do anything with a subclass which you could do with
the superclass, and support such things as treating the sum of
various classes as the lowest common subclass.  It seems like this
sort of approach, if done right, might allow some easier way to
establish sensible operations between types (like distance / speed =
time or speed * time = distance).
 
Just a thought
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 1:13 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 As it is de rigueur for someone to escalate the proposed complexity
 of an idea by at least one order of magnitude, and everyone else has
 fallen down on this one:  ;-)

Gee, thanks for filling in?

 I've often thought that if we rework the type system, it would be
 very nice to support a concept of hierarchy.  If you could
 subclass money to have a subclass like assessable, which in turn
 has subclasses of fine, fee, restitution, etc. you could then
 automatically do anything with a subclass which you could do with
 the superclass, and support such things as treating the sum of
 various classes as the lowest common subclass.  It seems like this
 sort of approach, if done right, might allow some easier way to
 establish sensible operations between types (like distance / speed =
 time or speed * time = distance).

 Just a thought

I dowanna rework the type system.  I'm not even 100% sure I want to
implement what I actually proposed.  I do want to find out if people
think the framework makes sense and whether it's the right way forward
for those projects that need these features.  What you're proposing
here sounds suspiciously like something that should be handled by
creating domains - but in any case it's almost entirely unrelated to
what I was talking about.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I dowanna rework the type system.  I'm not even 100% sure I want
 to implement what I actually proposed.  I do want to find out if
 people think the framework makes sense and whether it's the right
 way forward for those projects that need these features.
 
What you proposed sounds like it would be cleaner and less work than
further perverting the index system as a source of information about
types or hard-coding knowledge anywhere else.
 
 What you're proposing here sounds suspiciously like something that
 should be handled by creating domains
 
Not really.  Unless I've missed something domains are a single-level
layer over a data type.  I find them very useful and use them
heavily, but the standard implementation is rather limited.  Perhaps
that would be the area to add the functionality I suggested, though.
I'm totally at the hand-waving stage on it, with no concrete ideas.
I just thought that if you were adding more type information,
oriented aournd the types themselves rather than index AMs, some form
of inheritence might fit in gracefully.
 
 in any case it's almost entirely unrelated to what I was talking
 about.
 
OK
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Yeb Havinga

Robert Haas wrote:

On Fri, Apr 9, 2010 at 11:07 AM, Yeb Havinga yebhavi...@gmail.com wrote:
  

.. I now see the example of different ranges in your original mail with
different unit increments. Making that more general so there could be
continuous and discrete ranges and for the latter, what would the increment
be.. OTOH is a range of integers with increment x a different type from
range of integers with increment y, if xy? Maybe the increment step and
continuous/discrete could be typmods.



Nope, not enough bits available there.  This is fundamentally why the
typid/typmod system is so broken - representing a type as a fixed size
object is extremely limiting.  A fixed size object that MUST consist
of a 32-bit unsigned OID and a 32-bit signed integer is even more
limiting.  Fortunately, we don't need to solve that problem in order
to implement range types: we can just have people explicitly create
the ones they need.  This will, for example, avoid creating ranges
over every composite type that springs into existence because a table
is created, even though in most cases a fairly well-defined range type
could be constructed.
  
Ok, no typmod, not default extra types for base types, but the concept 
of an still there is one aspect of ranges (may I say intervals?) of 
'things' is something generic, and code to handle intervals of things 
could be shared between datatype implementations. A way to have generic 
support without automatic new types could be with something that looks like:


What about
CREATE TYPE ivl_int AS INTERVAL OF integer;

SELECT '[1;2]'::ivl_int;
etc

regards,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 4:01 PM, Yeb Havinga yebhavi...@gmail.com wrote:
 Robert Haas wrote:

 On Fri, Apr 9, 2010 at 11:07 AM, Yeb Havinga yebhavi...@gmail.com wrote:


 .. I now see the example of different ranges in your original mail with
 different unit increments. Making that more general so there could be
 continuous and discrete ranges and for the latter, what would the
 increment
 be.. OTOH is a range of integers with increment x a different type from
 range of integers with increment y, if xy? Maybe the increment step and
 continuous/discrete could be typmods.


 Nope, not enough bits available there.  This is fundamentally why the
 typid/typmod system is so broken - representing a type as a fixed size
 object is extremely limiting.  A fixed size object that MUST consist
 of a 32-bit unsigned OID and a 32-bit signed integer is even more
 limiting.  Fortunately, we don't need to solve that problem in order
 to implement range types: we can just have people explicitly create
 the ones they need.  This will, for example, avoid creating ranges
 over every composite type that springs into existence because a table
 is created, even though in most cases a fairly well-defined range type
 could be constructed.


 Ok, no typmod, not default extra types for base types, but the concept of an
 still there is one aspect of ranges (may I say intervals?) of 'things' is
 something generic, and code to handle intervals of things could be shared
 between datatype implementations. A way to have generic support without
 automatic new types could be with something that looks like:

 What about
 CREATE TYPE ivl_int AS INTERVAL OF integer;

 SELECT '[1;2]'::ivl_int;
 etc

Yeah, that's how it has to work, I think.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Jeff Davis
On Fri, 2010-04-09 at 12:50 -0500, Kevin Grittner wrote:
 I just thought that if you were adding more type information,
 oriented aournd the types themselves rather than index AMs, some form
 of inheritence might fit in gracefully.

There are already some specific proposals for inheritance in database
theory literature. For instance: Databases, Types, and the Relational
Model by C.J. Date addresses inheritance explicitly (and the appendices
have some interesting discussion).

I'm not sure how compatible it is with SQL, though; and I am not very
optimistic that we could accomplish such a restructuring of the type
system while maintaining a reasonable level of backwards compatibility.

Either way, I think it's a separate topic. Two types that are not
related by any subtype/supertype relationship (like strings and ints)
can conform to the same interface (total ordering); while the very same
type can conform to two different interfaces.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Jeff Davis
On Fri, 2010-04-09 at 11:14 -0400, Robert Haas wrote:
  range of integers with increment y, if xy? Maybe the increment step and
  continuous/discrete could be typmods.
 
 Nope, not enough bits available there. 

I think the problem is deeper than that. Typmods aren't carried along as
part of the result of a function call, so typmod is not really a part of
the type at all -- it's more a property of the column.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Jeff Davis
On Thu, 2010-04-08 at 22:29 -0400, Robert Haas wrote:
 1. knngist wants to use index scans to speed up queries of the form
 SELECT ... ORDER BY column op constant (as opposed to the
 existing machinery which only knows how to use an index for SELECT ...
 ORDER BY column).
 2. Window functions want to define windows over a range of values
 defined by the underlying data type.  To do this, we need to define
 what addition and subtraction mean for a particular data type.
 3. Jeff Davis is interested in implementing range types.  When the
 underlying base type is discrete, e.g. integers, you can say that
 [1,3] = [1,4), but only if you know that 3 and 4 are consecutive (in
 that order).

To give some context, I started a thread a while ago:

http://archives.postgresql.org/pgsql-hackers/2009-10/msg01403.php

Tom provided some interesting suggestions in that thread, but I'm not
sure they would work for #1 or #2.

 It may or may not be worth building the concept of a unit
 increment into the type interface machinery, though: one could imagine
 two different range types built over the same base type with different
 unit increments - e.g. one timestamp range with unit increment = 1s,
 and one with unit increment = 1m.  Under the first type [4pm,5pm) =
 [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

Right. Part of the interface could be a unit() function, and that can
return whatever you want.

I was originally thinking about it in terms of next() and prev(), but
you could build those from +, -, and unit().

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 5:49 PM, Jeff Davis pg...@j-davis.com wrote:
 It may or may not be worth building the concept of a unit
 increment into the type interface machinery, though: one could imagine
 two different range types built over the same base type with different
 unit increments - e.g. one timestamp range with unit increment = 1s,
 and one with unit increment = 1m.  Under the first type [4pm,5pm) =
 [4pm,4:59:59pm], while under the second [4pm,5pm) = [4pm,4:59pm].

 Right. Part of the interface could be a unit() function, and that can
 return whatever you want.

 I was originally thinking about it in terms of next() and prev(), but
 you could build those from +, -, and unit().

The advantage of specifying a + and a - in the type interface is that
the unit definition can then be specified as part of the type
declaration itself.  So you can do:

CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');

All of the stuff about defining + and - is hidden from the user - it's
part of the type interface, which is pre-created.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Nathan Boley
 The advantage of specifying a + and a - in the type interface is that
 the unit definition can then be specified as part of the type
 declaration itself.  So you can do:

 CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
 CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');

 All of the stuff about defining + and - is hidden from the user - it's
 part of the type interface, which is pre-created.


The disadvantage is that it does not permit irregularly spaced units.

-Nathan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 7:18 PM, Nathan Boley npbo...@gmail.com wrote:
 The advantage of specifying a + and a - in the type interface is that
 the unit definition can then be specified as part of the type
 declaration itself.  So you can do:

 CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
 CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');

 All of the stuff about defining + and - is hidden from the user - it's
 part of the type interface, which is pre-created.

 The disadvantage is that it does not permit irregularly spaced units.

True.  The only types I can think of that have irregularly spaced
units would be things based on floating points, and I was assuming
that people would only want continuous intervals on those.  If someone
really wants to be able to deduce that [1.0,3.0) = [1.0,3.0-epsilon),
then we need a different design.  But I find it hard to believe that's
very useful.  Maybe you feel otherwise?

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extended operator classes vs. type interfaces

2010-04-09 Thread Robert Haas
On Fri, Apr 9, 2010 at 7:53 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Apr 9, 2010 at 7:18 PM, Nathan Boley npbo...@gmail.com wrote:
 The advantage of specifying a + and a - in the type interface is that
 the unit definition can then be specified as part of the type
 declaration itself.  So you can do:

 CREATE TYPE ts_sec AS RANGE OVER timestamp (UNIT = '1s');
 CREATE TYPE ts_min AS RANGE OVER timestamp (UNIT = '1m');

 All of the stuff about defining + and - is hidden from the user - it's
 part of the type interface, which is pre-created.

 The disadvantage is that it does not permit irregularly spaced units.

 True.  The only types I can think of that have irregularly spaced
 units would be things based on floating points, and I was assuming
 that people would only want continuous intervals on those.  If someone
 really wants to be able to deduce that [1.0,3.0) = [1.0,3.0-epsilon),
 then we need a different design.  But I find it hard to believe that's
 very useful.  Maybe you feel otherwise?

Er, that [1.0,3.0) = [1.0,3.0-epsilon], rather.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers