Re: [HACKERS] Inheritance

2016-05-26 Thread Jim Nasby

On 5/25/16 8:19 PM, Craig Ringer wrote:

Postgres is well past the point where our relational features are
the big selling point. It's now about scale, an incredibly robust
storage engine, and all the extensiblity opportunities. We've moved
from being an RDBMS to being a "Data Platform". Improving our OO
capabilities just continues that.


Right. I'm just not convinced table inheritance is a useful part of that
picture.


In it's current state it's certainly not the best tool in the toolbox, 
but IMHO there's still plenty of room for the ability to support 
restricted polymorphism, because of all the benefits of not allowing 
willy-nilly stuff in the schema. There's certainly other cases (ie: 
per-customer custom fields) where willy-nilly is what you actually need.


And certainly things that reduce table modification pain for *any* 
operations are most welcome! I think allowing queued backgrounded 
processes would be a huge win there. If we had real stored procedures 
(IE: ability to control transaction state) and a modest background 
scheduler then it wouldn't be hard to build that.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-25 Thread Craig Ringer
On 26 May 2016 at 01:56, Jim Nasby  wrote:

> On 5/24/16 8:52 PM, Craig Ringer wrote:
> > Absolutely, and I use OO heavily. But a half-assed "object relational"
> > feature in the database that only kind-of works isn't OO, and it's
> > generally less useful than using existing relational-to-oo modelling
> > techniques like FK-related 1:1 child tables for specialisation.
>
> How is it less useful than that? To me, the FK "solution" is the absolute
> worst of everything: you still have all the separate child tables that you
> must explicitly query *and* you have to get all the joins correct as well.
> And hope it doesn't have horrible performance.
>
> Note that there was enough enthusiasm to adopt whole new database
>> engines, but not enough to use PostgreSQL's existing features for that.
>> Partly because they suck. In particular, people looking for this tend to
>> want to be able to create new subtypes without having to mess around
>> making schema changes and modelling everything.
>>
>
> Which is a decision people have come to regret, because then your codebase
> somehow has to deal with 38 different versions of what a "customer" is.


Oh, I totally agree there. It's almost as bad as people serialising Java
objects into the DB. Ugh. It's a bad, bad idea.

It's also what people seem to want to do, and I understand that somewhat
given the pain involved in full table rewrites under extended locks and the
hoop-jumping required to avoid them. It's particularly painful with
something app devs tend to need, but RDBMS designers prefer to ignore:
user-defined columns/attributes. Find me someone who *doesn't* want the
ability for their app users/deployers/etc to add arbitrary attributes to
customer records etc w/o code changes. jsonb helps a lot there, but you
lose Pg's type system and have to use a different query syntax etc.

Lower-pain ways to make schema changes and blend dynamic columns
(application-user-defined columns)  with normal columns would help a lot
there. A pseudo-column that can store a TOASTable record extension that's a
set of colname/coltype/coltypmod . Storage is the relatively easy bit
though, the problem is how to work with that though the planner and
executor and output useful results...

There's much more future in improving document-structured storage like
>> jsonb, and possibly extending in future toward hybrid storage with some
>> normal cols and some dynamic cols, than with Pg's
>> pseudo-object-relational inheritance feature.
>>
>
> I don't see why we can't do both. There's many cases where more
> flexibility in what output tuples look like would be very valuable. The
> JSON features are one aspect; crosstab is another.
>

Agreed. A real PIVOT, especially one we can use as a source of tuples for
subqueries etc, would be really useful and is the sort of thing we'd need
some degree of dynamic column handling for.


> Postgres is well past the point where our relational features are the big
> selling point. It's now about scale, an incredibly robust storage engine,
> and all the extensiblity opportunities. We've moved from being an RDBMS to
> being a "Data Platform". Improving our OO capabilities just continues that.


Right. I'm just not convinced table inheritance is a useful part of that
picture.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Inheritance

2016-05-25 Thread Jim Nasby

On 5/24/16 8:52 PM, Craig Ringer wrote:
> Absolutely, and I use OO heavily. But a half-assed "object relational"
> feature in the database that only kind-of works isn't OO, and it's
> generally less useful than using existing relational-to-oo modelling
> techniques like FK-related 1:1 child tables for specialisation.

How is it less useful than that? To me, the FK "solution" is the 
absolute worst of everything: you still have all the separate child 
tables that you must explicitly query *and* you have to get all the 
joins correct as well. And hope it doesn't have horrible performance.



Note that there was enough enthusiasm to adopt whole new database
engines, but not enough to use PostgreSQL's existing features for that.
Partly because they suck. In particular, people looking for this tend to
want to be able to create new subtypes without having to mess around
making schema changes and modelling everything.


Which is a decision people have come to regret, because then your 
codebase somehow has to deal with 38 different versions of what a 
"customer" is.



There's much more future in improving document-structured storage like
jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's
pseudo-object-relational inheritance feature.


I don't see why we can't do both. There's many cases where more 
flexibility in what output tuples look like would be very valuable. The 
JSON features are one aspect; crosstab is another.


Postgres is well past the point where our relational features are the 
big selling point. It's now about scale, an incredibly robust storage 
engine, and all the extensiblity opportunities. We've moved from being 
an RDBMS to being a "Data Platform". Improving our OO capabilities just 
continues that.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-25 Thread Merlin Moncure
On Tue, May 24, 2016 at 9:47 PM, Craig Ringer  wrote:
> On 24 May 2016 at 22:45, Konstantin Knizhnik 
> wrote:
>>
>> There is one aspect of inheritance support which was not mentioned:
>> polymorphic queries.
>> Actually polymorphism is the fundamental feature of OOP, without it there
>> is no behavioral inheritance and inheritance can be considered just as
>> "syntax sugar" for sharing some common subset of attributes between tables.
>>
>> The main problem with supporting polymorphic queries is that SQL query
>> returns set of tuples, not set of objects.
>> So there is no nice way to return both instances of based and derived
>> tables. There are several alternatives
>> (for example return joined set of attributes in all derived tables,
>> leaving missed as NULLs) but none of them is good.
>
>
> Exactly. We have a sort-of-object-ish storage option, but none of the
> surrounding stuff to make it useful for actual OO / object-relational work.
>
> The "joined set of attributes" approach is exactly what ORMs already do, and
> many direct implementations of the same idea will use too. So we'd offer no
> advantage over what they already do in a way that works with multiple
> DBMSes, except we might be able to do it faster. Maybe.
>
> The lack of polymorphism is critical. It's not really usefully OO but it
> costs you important relational features if you use it. We have some very
> limited polymorphism in the sense that you can query the parent table and
> see rows in child tables, but you only get the subset of cols that exists at
> that level of the heirarchy.
>
> One thing I'd like to explore one day is a nice, user-friendly way to
> express "SELECT this row and the corresponding sets of rows from [these
> tables and their children in turn] as a structured object". Right now users
> have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
> do multiple queries (n+1 selects), possibly expensively with repeated join
> work involved. Or they have to write pretty baroque queries to construct a
> jsonb object with jsonb_agg with multiple levels of group-by in
> subqueries-in-from. We should be able to do this for them, so they can say
>
> SELECTOBJECT customer
>   CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
> invoice_date > $1)
>   CHILD JOIN invoiceline USING (invoice_id)
>   CHILD JOIN address USING (customer_id)
> WHERE customer.in_debt_collections;
>
> instead of the current nested mess of aggregation and subqueries needed,
> like:
>
> SELECT
> to_jsonb(customer) || (
>   SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
>   FROM (
> SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
> jsonb_agg(invoice_line))
> FROM invoice
> LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
> invoice_line.invoice_id)
> WHERE invoice.customer_id = customer.customer_id AND invoice_date >=
> current_date
> GROUP BY invoice.invoice_id
>   ) invoice_obj
> ) || (
>   SELECT jsonb_build_object('addresses', jsonb_agg(address))
>   FROM address
>   WHERE address.customer_id = customer.customer_id
> )
> FROM customer
> WHERE customer.in_debt_collections

Well, I don't know. There's a lot of ways to write that type of thing.
Personally, I tend to prefer to delay the serialization to json as
long as possible (although it's sometimes unavoidable) because it
keeps the query cleaner.  I also sometimes use the array() subquery
syntax for sake of brevity, but this query could be restructured to
use proper aggregation on all levels if you're concerned about
performance (this query would tend to underperform yours for very
large compositions because of the second subquery scan vs the hash
join OTOH, it's a faster serialization model).  I didn't test the
syntax, but you get the idea.

SELECT to_json(q)
FROM
(
  SELECT
c.*,
array(
  SELECT
i.*,
array(
  SELECT il
  FROM invoice_line il
  WHERE il.invoice_id = i.invoice_id
) AS invoice_lines
  FROM invoice i
  WHERE i.customer_id = c.customer_id AND invoice_date >= current_date
) AS invoices
  FROM customer c
  WHERE c.in_debt_collections
) q

The point is this: the postgresql type system is flexible enough that
you can do arbitrary constructions pretty easy and the situation has
been one of continuous improvement over the last several releases.  It
isn't perfect, but json enhancements FWICT have made syntactical
approaches to the problem a dead end; json gets the job done is less
likely to cause problems with the SQL standard down the road.  For the
same set of reasons I no longer use crosstab.

In the 15+ years I've been watching postgres inheritance has gone
precisely nowhere, and there other ways to do the things it can do
that also supply a much broader range of use cases.  Plus, I'm biased:
I happen to think the 90's OO 

Re: [HACKERS] Inheritance

2016-05-24 Thread Craig Ringer
On 24 May 2016 at 22:45, Konstantin Knizhnik 
wrote:

> There is one aspect of inheritance support which was not mentioned:
> polymorphic queries.
> Actually polymorphism is the fundamental feature of OOP, without it there
> is no behavioral inheritance and inheritance can be considered just as
> "syntax sugar" for sharing some common subset of attributes between tables.
>
> The main problem with supporting polymorphic queries is that SQL query
> returns set of tuples, not set of objects.
> So there is no nice way to return both instances of based and derived
> tables. There are several alternatives
> (for example return joined set of attributes in all derived tables,
> leaving missed as NULLs) but none of them is good.
>

Exactly. We have a sort-of-object-ish storage option, but none of the
surrounding stuff to make it useful for actual OO / object-relational work.

The "joined set of attributes" approach is exactly what ORMs already do,
and many direct implementations of the same idea will use too. So we'd
offer no advantage over what they already do in a way that works with
multiple DBMSes, except we might be able to do it faster. Maybe.

The lack of polymorphism is critical. It's not really usefully OO but it
costs you important relational features if you use it. We have some very
limited polymorphism in the sense that you can query the parent table and
see rows in child tables, but you only get the subset of cols that exists
at that level of the heirarchy.

One thing I'd like to explore one day is a nice, user-friendly way to
express "SELECT this row and the corresponding sets of rows from [these
tables and their children in turn] as a structured object". Right now users
have to write series of LEFT JOINs and de-duplicate the left-hand sides. Or
do multiple queries (n+1 selects), possibly expensively with repeated join
work involved. Or they have to write pretty baroque queries to construct a
jsonb object with jsonb_agg with multiple levels of group-by in
subqueries-in-from. We should be able to do this for them, so they can say

SELECTOBJECT customer
  CHILD JOIN invoice ON (customer.customer_id = invoice.customer_id AND
invoice_date > $1)
  CHILD JOIN invoiceline USING (invoice_id)
  CHILD JOIN address USING (customer_id)
WHERE customer.in_debt_collections;

instead of the current nested mess of aggregation and subqueries needed,
like:

SELECT
to_jsonb(customer) || (
  SELECT jsonb_build_object('invoices', jsonb_agg(invoice_obj))
  FROM (
SELECT to_jsonb(invoice) || jsonb_build_object('invoice_lines',
jsonb_agg(invoice_line))
FROM invoice
LEFT OUTER JOIN invoice_line ON (invoice.invoice_id =
invoice_line.invoice_id)
WHERE invoice.customer_id = customer.customer_id AND invoice_date
>= current_date
GROUP BY invoice.invoice_id
  ) invoice_obj
) || (
  SELECT jsonb_build_object('addresses', jsonb_agg(address))
  FROM address
  WHERE address.customer_id = customer.customer_id
)
FROM customer
WHERE customer.in_debt_collections

which is just pure joy to read, and gets even more contorted as the layers
of parent/child relationships get deeper. The resulting query plan clearly
expresses the desired result, but writing it in SQL is horrible:

 QUERY PLAN

-
 Seq Scan on customer  (cost=0.00..130226.27 rows=1270 width=64)
   SubPlan 1
 ->  Aggregate  (cost=77.47..77.48 rows=1 width=24)
   ->  Subquery Scan on invoice_obj  (cost=77.37..77.46 rows=3
width=24)
 ->  HashAggregate  (cost=77.37..77.43 rows=3 width=104)
   Group Key: invoice.invoice_id
   ->  Hash Right Join  (cost=50.84..77.36 rows=3
width=104)
 Hash Cond: (invoice_line.invoice_id =
invoice.invoice_id)
 ->  Seq Scan on invoice_line
 (cost=0.00..22.00 rows=1200 width=68)
 ->  Hash  (cost=50.80..50.80 rows=3 width=40)
   ->  Seq Scan on invoice
 (cost=0.00..50.80 rows=3 width=40)
 Filter: ((customer_id =
customer.customer_id) AND (invoice_date >= ('now'::cstring)::date))
   SubPlan 2
 ->  Aggregate  (cost=25.02..25.03 rows=1 width=32)
   ->  Seq Scan on address  (cost=0.00..25.00 rows=6 width=32)
 Filter: (customer_id = customer.customer_id)
(16 rows)


Maybe grouping sets can help avoid the nested joins, but I couldn't figure
out how without wrapping the grouping set output query in another query to
aggregate the produced objects into a top level one.

Inheritance does nothing to help with this.



> But I do not think that presence of ORM excludes necessity to have
> internal support of OO in DBMS.
>
>
I'd 

Re: [HACKERS] Inheritance

2016-05-24 Thread Craig Ringer
On 24 May 2016 at 21:51, Jim Nasby  wrote:

> On 5/23/16 10:30 PM, Craig Ringer wrote:
>
>> I find it notable that no popular ORM has bothered adopting PostgreSQL's
>> inheritance features, and instead just use big left joins or repeated
>> SELECTs to implement parent/child relationships, with foreign keys
>> enforcing constraints.
>>
>
> Since when do we consider ORMs to be an example of how to do good database
> design?
>

I don't, but I do consider them a useful measure of what the majority of
users in the field are seeing and doing.


> I'd rather like to quietly deprecate inheritance and eventually remove
>> it once we have real partitioning and some time has passed...
>>
>
> IMO that's a very unfortunate attitude to have for the "Worlds Most
> Advanced Open Source Database". Now that OO programming isn't treated as a
> magic bullet it's proven to be an extremely powerful tool, especially when
> used to encapsulate complex data.


Absolutely, and I use OO heavily. But a half-assed "object relational"
feature in the database that only kind-of works isn't OO, and it's
generally less useful than using existing relational-to-oo modelling
techniques like FK-related 1:1 child tables for specialisation.

I'm so sick of seeing "object relational" in PostgreSQL's tagline. It's
nonsense. We do so much so well, why focus on this bogus half-feature so
much?



> The ndarray->Series->DataFrame->Panel inheritance tree is a great example
> of this. Feature-wise, we need to be doing *more* of this kind of work, not
> less. Lack of support for OO paradigms was one of the drivers for NoSQL.


I agree.

Note that there was enough enthusiasm to adopt whole new database engines,
but not enough to use PostgreSQL's existing features for that. Partly
because they suck. In particular, people looking for this tend to want to
be able to create new subtypes without having to mess around making schema
changes and modelling everything.

There's much more future in improving document-structured storage like
jsonb, and possibly extending in future toward hybrid storage with some
normal cols and some dynamic cols, than with Pg's pseudo-object-relational
inheritance feature.

I really wish we could just remove it. I know it's impossible because it's
used for partitioning and because there's a significant and important,
albeit tiny relative to total users, group who're utterly reliant on it for
their DB schemas. I know of one health engine provider who uses it heavily
for example. It's a pity, because deleting the complexity around
inheritance entirely would simplify a number of useful features. I'd
certainly be in favour of allowing new PostgreSQL features _not_ to support
tables with inheritance, making it a semi-deprecated feature that new
functionality doesn't have to support. (I know that goes against the usual
way of things, and I'm usually really against introducing things where X
works unless you have Y or A enabled or it's a full moon on a Thursday, but
IMO the benefit justifies the cost for making inheritance support optional).

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Inheritance

2016-05-24 Thread Konstantin Knizhnik
There is one aspect of inheritance support which was not mentioned: 
polymorphic queries.
Actually polymorphism is the fundamental feature of OOP, without it 
there is no behavioral inheritance and inheritance can be considered 
just as "syntax sugar" for sharing some common subset of attributes 
between tables.


The main problem with supporting polymorphic queries is that SQL query 
returns set of tuples, not set of objects.
So there is no nice way to return both instances of based and derived 
tables. There are several alternatives
(for example return joined set of attributes in all derived tables, 
leaving missed as NULLs) but none of them is good.


There is more specific problem - polymorphic function calls, see:

http://postgresql.nabble.com/Polymorphic-function-calls-td5784772.html

which in principle can be solved... But looks like there is no such 
intention.


I do not know how much people will be interested in normal OO support 
from DBMS.
Right now PostgreQSL inheritance is used for different purposes: 
partitioning, FDW-based replication or sharding, ...
But IMHO it is mostly because current inheritance implementation just 
not make it possible to use it for OOP.
As far as I know before there were a lot of discussions whether arrays, 
json and other non-traditional for RDBMS types is needed.
But right now them are quite popular and frequently used. The same thing 
can happen with OO support: it will be popular if provide necessary 
functionality.


Right now OO DBMS applications are mostly using ORMs and them are not 
using PostgreSQL inheritance.
Mostly because ORMs are used to be more flexible in providing different 
approaches for implementing inheritance (for example in most cases 
placing all derived classes in single table works better than approach 
based on appending results of several queries).
Another reason is that ORM usually support more than one DBMS, and since 
most of them have no built-in support for inheritance,

ORM has to implement inheritance itself.
But I do not think that presence of ORM excludes necessity to have 
internal support of OO in DBMS.




On 24.05.2016 16:51, Jim Nasby wrote:

On 5/23/16 10:30 PM, Craig Ringer wrote:

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.


Since when do we consider ORMs to be an example of how to do good 
database design?


In this case, I'm sure no ORM uses the feature because either the 
ORM's authors have no clue it exists (except maybe for partitioning) 
or because we're the only mainstream database that has it.



I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use
them for partitioning only by sacrificing a bunch of integrity
protection or creating messy chains of FKs between individual 
partitions.


There's ways around this issue, but I agree that it's a very unhappy 
situation right now.



I'd rather like to quietly deprecate inheritance and eventually remove
it once we have real partitioning and some time has passed...


IMO that's a very unfortunate attitude to have for the "Worlds Most 
Advanced Open Source Database". Now that OO programming isn't treated 
as a magic bullet it's proven to be an extremely powerful tool, 
especially when used to encapsulate complex data. The 
ndarray->Series->DataFrame->Panel inheritance tree is a great example 
of this. Feature-wise, we need to be doing *more* of this kind of 
work, not less. Lack of support for OO paradigms was one of the 
drivers for NoSQL.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-24 Thread Jim Nasby

On 5/23/16 10:30 PM, Craig Ringer wrote:

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.


Since when do we consider ORMs to be an example of how to do good 
database design?


In this case, I'm sure no ORM uses the feature because either the ORM's 
authors have no clue it exists (except maybe for partitioning) or 
because we're the only mainstream database that has it.



I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use
them for partitioning only by sacrificing a bunch of integrity
protection or creating messy chains of FKs between individual partitions.


There's ways around this issue, but I agree that it's a very unhappy 
situation right now.



I'd rather like to quietly deprecate inheritance and eventually remove
it once we have real partitioning and some time has passed...


IMO that's a very unfortunate attitude to have for the "Worlds Most 
Advanced Open Source Database". Now that OO programming isn't treated as 
a magic bullet it's proven to be an extremely powerful tool, especially 
when used to encapsulate complex data. The 
ndarray->Series->DataFrame->Panel inheritance tree is a great example of 
this. Feature-wise, we need to be doing *more* of this kind of work, not 
less. Lack of support for OO paradigms was one of the drivers for NoSQL.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Craig Ringer
On 24 May 2016 at 00:05, Merlin Moncure  wrote:


>
> This feature was very much a product of the time, at the height of the
> "Object Relational" fad.  The trend for postgres has been in the exact
> opposite direction, towards the SQL standard.  Further complicating
> matters, inheritance has been repurposed to be the foundation for
> table partitioning, making heavy changes problematic.
>

Indeed.

I find it notable that no popular ORM has bothered adopting PostgreSQL's
inheritance features, and instead just use big left joins or repeated
SELECTs to implement parent/child relationships, with foreign keys
enforcing constraints.

I consider inheritance mostly useless without the ability to have UNIQUE
indexes that span a parent relation and all its children. You can use them
for partitioning only by sacrificing a bunch of integrity protection or
creating messy chains of FKs between individual partitions.

I'd rather like to quietly deprecate inheritance and eventually remove it
once we have real partitioning and some time has passed...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Inheritance

2016-05-23 Thread Alvaro Herrera
Tom Lane wrote:

> My feeling about it is that we need to provide a partitioning feature
> that doesn't rely on the current notion of inheritance at all.  We've
> heard from multiple users who want to use large numbers of partitions,
> enough that simply having a separate relcache entry for each partition
> would be a performance problem, never mind the current approach to
> planning queries over inheritance trees.  So the partitions need to be
> objects much simpler than full-fledged tables.

Sorry to hijack the thread, but I agree on this, and I'm worried that
the patch being floated for partitioning may paint us on a corner from
which it may be difficult to get out.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Tom Lane
Joe Conway  writes:
> But then again, maybe we need to start with a clear notion of what
> problems people are trying to solve when they use partitions. At least
> some of the historic reasons are no longer valid.

That's true.  Just because people want to have a gazillion partitions
doesn't necessarily mean it's a good design that we need to support well.
Some investigation would be a smart use of time.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Joe Conway
On 05/23/2016 03:05 PM, Tom Lane wrote:
> Jim Nasby  writes:
>> I don't see why partitioning complicates fixing these issues. ISTM it's 
>> the exact same complaint for both inheritance and partitioning.
> 
> My feeling about it is that we need to provide a partitioning feature
> that doesn't rely on the current notion of inheritance at all.  We've
> heard from multiple users who want to use large numbers of partitions,
> enough that simply having a separate relcache entry for each partition
> would be a performance problem, never mind the current approach to
> planning queries over inheritance trees.  So the partitions need to be
> objects much simpler than full-fledged tables.

I wonder if it wouldn't make sense to define a partition as a list of
segments within a single table that represent the partition?

But then again, maybe we need to start with a clear notion of what
problems people are trying to solve when they use partitions. At least
some of the historic reasons are no longer valid.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Inheritance

2016-05-23 Thread Tom Lane
Jim Nasby  writes:
> On 5/23/16 11:05 AM, Merlin Moncure wrote:
>> This feature was very much a product of the time, at the height of the
>> "Object Relational" fad.  The trend for postgres has been in the exact
>> opposite direction, towards the SQL standard.  Further complicating
>> matters, inheritance has been repurposed to be the foundation for
>> table partitioning, making heavy changes problematic.

> I don't see why partitioning complicates fixing these issues. ISTM it's 
> the exact same complaint for both inheritance and partitioning.

My feeling about it is that we need to provide a partitioning feature
that doesn't rely on the current notion of inheritance at all.  We've
heard from multiple users who want to use large numbers of partitions,
enough that simply having a separate relcache entry for each partition
would be a performance problem, never mind the current approach to
planning queries over inheritance trees.  So the partitions need to be
objects much simpler than full-fledged tables.

If we had that, and encouraged people to migrate simple partitioning
use-cases to it, that might take off enough pressure that we could
afford to consider more-complicated inheritance schemes rather than
treating inheritance as an unfortunate legacy design.  But we're
some years away from being able to do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Jim Nasby

On 5/23/16 11:05 AM, Merlin Moncure wrote:

Postgres doesn't work that way, and the documentation disclaims this:
"Note: Although inheritance is frequently useful, it has not been
integrated with unique constraints or foreign keys, which limits its
usefulness. See Section 5.8 for more detail."

Personally, I don't think this will ever be fixed.  The reason why it
doesn't work is due to some foundational implementation decisions that
would have to be revisited.


If the complaint is really about FKs/UNIQUE (and really AFAIK it's only 
UNIQUE that's the problem), then I agree: it should be addressed. It's a 
major impediment to partitioning (and generic inheritance).



This feature was very much a product of the time, at the height of the
"Object Relational" fad.  The trend for postgres has been in the exact
opposite direction, towards the SQL standard.  Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.


I don't see why partitioning complicates fixing these issues. ISTM it's 
the exact same complaint for both inheritance and partitioning.


I also disagree about PK:PK FK's between a bunch of completely 
independent tables being a good way to model this stuff. It doubles the 
complexity of every query against a child table and doesn't perform 
nearly as well, because your data locality goes down the tubes.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Merlin Moncure
On Mon, May 23, 2016 at 10:21 AM, Jim Nasby  wrote:
> On 5/22/16 1:37 AM, Jan Johansson wrote:
>>
>>  - Allow single (behavior) inheritance (model here is quite a few modern
>> languages, such as C#, D, ...)
>>  - Allow multiple declarative inheritance (interface like, the
>> inheritance almost works like this today though)
>>
>> If, with these restrictions (or maybe only the first), do you think that
>> it will simplify implementation and make it more feature complete?
>
> I think you'll need to be a bit more specific to elicit a response. What
> exactly are you proposing to change?

I would guess OP is complaining about what everyone complains about.
What people want is for tables to have a base set of shared columns by
a varying set of derived type dependent columns.  Constraints on the
shared columns will be enforced on all the derived columns.

Postgres doesn't work that way, and the documentation disclaims this:
"Note: Although inheritance is frequently useful, it has not been
integrated with unique constraints or foreign keys, which limits its
usefulness. See Section 5.8 for more detail."

Personally, I don't think this will ever be fixed.  The reason why it
doesn't work is due to some foundational implementation decisions that
would have to be revisited.

This feature was very much a product of the time, at the height of the
"Object Relational" fad.  The trend for postgres has been in the exact
opposite direction, towards the SQL standard.  Further complicating
matters, inheritance has been repurposed to be the foundation for
table partitioning, making heavy changes problematic.

The classic SQL approach to the problem, establishing a base table
plus a type and derived tables with a pkey:pkey link isn't a very bad
one from a data modelling perspective and serialization to the
application is increasingly going to be handled by json going forward
as opposed to hacking the postgres type system.  This really reduces
the value proposition of heavy changes to the inheritance features.
If there was consensus on that point, I suppose the way forward is
some documentation restructuring, starting with removing the
increasingly baroque tutorial (trivia: the tutorial was the old manual
at one point).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance

2016-05-23 Thread Jim Nasby

On 5/22/16 1:37 AM, Jan Johansson wrote:

 - Allow single (behavior) inheritance (model here is quite a few modern
languages, such as C#, D, ...)
 - Allow multiple declarative inheritance (interface like, the
inheritance almost works like this today though)

If, with these restrictions (or maybe only the first), do you think that
it will simplify implementation and make it more feature complete?


I think you'll need to be a bit more specific to elicit a response. What 
exactly are you proposing to change?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-21 Thread Robert Haas
On Sun, Jun 21, 2015 at 5:45 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 21 June 2015 at 05:27, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Jun 20, 2015 at 6:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I propose instead the attached patch, which operates by identifying which
 of the append_rel_list entries actually contain subquery references, and
 copying only those; the other ones are just linked into the child's
 append_rel_list by reference, which is okay because they won't get
 modified.

 Better than unpatched, definitely!  Not sure how it compares to my patch.


 I tested on my machine (optimised build, asserts off). With HEAD I got:

 Q1: 8076ms
 Q2: 7165ms
 Q3: 4027ms
 Q4: OOM (killed by kernel, used  16GB RAM)
 Q5: 4131ms

 The machine only has 16GB of RAM and almost no swap, so it wasn't able to do 
 Q4.

 With Robert's patch:

 Q1: 1121ms
 Q2: 542ms
 Q3: 498ms
 Q4: 50763ms (used 3GB RAM)
 Q5: 556ms

 and with Tom's patch:

 Q1: 2264ms
 Q2: 3785ms
 Q3: 507ms
 Q4: 50851ms (used 3GB RAM)
 Q5: 558ms

 However, there's an obvious improvement that can be made to Tom's
 patch -- having computed modifiableARIindexes, you may as well use it
 in the innermost loop to only apply ChangeVarNodes() to those
 AppendRelInfo's that can actually change, rather than having it trawl
 through all the other ones that we know won't be touched.

 With that improvement (attached), the timings become:

 Q1: 1148ms
 Q2: 547ms
 Q3: 505ms
 Q4: 51325ms
 Q5: 544ms

 i.e., basically the same as Robert's patch.

Cool.  That sounds good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-21 Thread Tom Lane
Dean Rasheed dean.a.rash...@gmail.com writes:
 However, there's an obvious improvement that can be made to Tom's
 patch -- having computed modifiableARIindexes, you may as well use it
 in the innermost loop to only apply ChangeVarNodes() to those
 AppendRelInfo's that can actually change, rather than having it trawl
 through all the other ones that we know won't be touched.

Right.  Also, as Robert noted, we can short-circuit a few more things when
there are no subquery RTEs.  I'll combine these ideas and push something,
but probably not till tomorrow.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-20 Thread Thomas Munro
On Fri, Jun 19, 2015 at 9:20 AM, Robert Haas robertmh...@gmail.com wrote:
 The extraordinarily planning time for query 4 is caused by a
 completely different problem: SearchCatCache eats up huge amounts of
 CPU; its callers are get_attavgwidth and get_typlen.  It's not clear
 to me why doubling the number of relations causes such an enormous
 explosion in calls to those functions; I would expect the number of
 calls to double, but presumably the actual increase is much more.
 That's a separate problem, though, unconnected to
 c03ad5602f529787968fa3201b35c119bbc6d782 and not necessarily a
 regression.

I don't have a great high level understanding of the planner, and Q4
may be somehow asking for trouble or unrepresentative of anything
useful, but I did some profiling and instrumenting, and I noticed that
we spend tables^2 * columns time in get_attavgwidth.  I wonder if
estimate_rel_size (or some other function in that stack, or some new
function wrapper) should remember the result for each relation for the
scope of this planner invocation.  That should bring the calls to
get_attavgwidth down to the same order as Q3 (tables * columns).

Here is some profiler output from a 500 table, 500 column Q4 run:

160295.0ms   60.2% 95inheritance_planner
120064.0ms   45.1% 0 grouping_planner
119826.0ms   45.0% 2  query_planner
114204.0ms   42.9% 0   add_base_rels_to_query
114204.0ms   42.9% 0add_base_rels_to_query
114204.0ms   42.9% 151 build_simple_rel
113817.0ms   42.8% 57  build_simple_rel
113600.0ms   42.7% 19   get_relation_info
112123.0ms   42.1% 27estimate_rel_size
111557.0ms   41.9% 14139 get_rel_data_width
80152.0ms   30.1% 362  get_attavgwidth
79788.0ms   30.0% 282   SearchSysCache
79368.0ms   29.8% 52373SearchCatCache
13182.0ms4.9% 2125
CatalogCacheComputeHashValue

Here are some tables showing function call counts.  The columns are
ordered like this:

1: Query number
2: Number of child tables
3: Number of columns
4: Number of calls to add_base_rels_to_query
5: Number of calls to build_simple_rel
6: Number of calls to get_relation_info
7: Number of calls to estimate_rel_size
8: Number of calls to get_attavgwidth

Q3 10 10 22 11 11 11 131
Q3 10 20 22 11 11 11 241
Q3 10 30 22 11 11 11 351
Q3 20 10 42 21 21 21 251
Q3 20 20 42 21 21 21 461
Q3 20 30 42 21 21 21 671
Q3 30 10 62 31 31 31 371
Q3 30 20 62 31 31 31 681
Q3 30 30 62 31 31 31 991
Q3 500 500 1002 501 501 501 251501

Q4 10 10 33 143 143 132 1451
Q4 10 20 33 143 143 132 2661
Q4 10 30 33 143 143 132 3871
Q4 20 10 63 483 483 462 5291
Q4 20 20 63 483 483 462 9701
Q4 20 30 63 483 483 462 14111
Q4 30 10 93 1023 1023 992 11531
Q4 30 20 93 1023 1023 992 21141
Q4 30 30 93 1023 1023 992 30751
Q4 500 500 1503 252003 252003 251502 126002501

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Meanwhile, here is an updated patch.

I don't care for that patch too much: it seems a bit brute-force, and I'm
quite worried by the assumption that it's okay to destroy each child's
append_rel_list after processing the child.  That would fail if any of
the Vars/subexpressions in the lists get incorporated into the resulting
child Plan, which does not seem impossible.  (I think in many cases we'd
do a copyObject() when extracting an append_rel_list expression, but this
hardly seems guaranteed.)

I propose instead the attached patch, which operates by identifying which
of the append_rel_list entries actually contain subquery references, and
copying only those; the other ones are just linked into the child's
append_rel_list by reference, which is okay because they won't get
modified.

On my laptop, I get the following timings for your test queries from
unmodified HEAD (--cassert build):

# Q1: 41260.239 ms
# Q2: 45225.768 ms
# Q3: 43066.958 ms
# Q4: 193360.726 ms
# Q5: 40746.503 ms

and these with my patch:

# Q1: 1767.753 ms
# Q2: 3662.131 ms
# Q3: 814.293 ms
# Q4: 64468.914 ms
# Q5: 881.295 ms

which seems to be generally a better result.

 The extraordinarily planning time for query 4 is caused by a
 completely different problem: SearchCatCache eats up huge amounts of
 CPU; its callers are get_attavgwidth and get_typlen.  It's not clear
 to me why doubling the number of relations causes such an enormous
 explosion in calls to those functions; I would expect the number of
 calls to double, but presumably the actual increase is much more.

Actually, Q4 necessarily involves O(N^2) planning time, because for
each of N target relations you're considering a join to an N-member
inheritance tree.  A lot of those ultimately get thrown away by
constraint exclusion, but not before we've expended significant
cycles on them.  I do not think we are going to get much traction
on that --- even if we do something to knock off whatever the leading
term is, there'll still be more O(N^2) behavior right behind it.

regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 8afde2b7d5069707e346901f819bed888a2333ee..d7fee96ba01272efdf7231d5985ab688912bcf58 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** inheritance_planner(PlannerInfo *root)
*** 834,840 
  {
  	Query	   *parse = root-parse;
  	int			parentRTindex = parse-resultRelation;
! 	Bitmapset  *resultRTindexes = NULL;
  	int			nominalRelation = -1;
  	List	   *final_rtable = NIL;
  	int			save_rel_array_size = 0;
--- 834,842 
  {
  	Query	   *parse = root-parse;
  	int			parentRTindex = parse-resultRelation;
! 	Bitmapset  *resultRTindexes;
! 	Bitmapset  *subqueryRTindexes;
! 	Bitmapset  *modifiableARIindexes;
  	int			nominalRelation = -1;
  	List	   *final_rtable = NIL;
  	int			save_rel_array_size = 0;
*** inheritance_planner(PlannerInfo *root)
*** 845,850 
--- 847,853 
  	List	   *returningLists = NIL;
  	List	   *rowMarks;
  	ListCell   *lc;
+ 	Index		rti;
  
  	Assert(parse-commandType != CMD_INSERT);
  
*** inheritance_planner(PlannerInfo *root)
*** 867,874 
  	 * subqueries during planning, and so we must create copies of them too,
  	 * except where they are target relations, which will each only be used in
  	 * a single plan.
  	 */
! 	resultRTindexes = bms_add_member(resultRTindexes, parentRTindex);
  	foreach(lc, root-append_rel_list)
  	{
  		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
--- 870,879 
  	 * subqueries during planning, and so we must create copies of them too,
  	 * except where they are target relations, which will each only be used in
  	 * a single plan.
+ 	 *
+ 	 * To begin with, we'll need a bitmapset of the target relation relids.
  	 */
! 	resultRTindexes = bms_make_singleton(parentRTindex);
  	foreach(lc, root-append_rel_list)
  	{
  		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
*** inheritance_planner(PlannerInfo *root)
*** 878,889 
  			 appinfo-child_relid);
  	}
  
  	foreach(lc, root-append_rel_list)
  	{
  		AppendRelInfo *appinfo = (AppendRelInfo *) lfirst(lc);
  		PlannerInfo subroot;
  		Plan	   *subplan;
! 		Index		rti;
  
  		/* append_rel_list contains all append rels; ignore others */
  		if (appinfo-parent_relid != parentRTindex)
--- 883,937 
  			 appinfo-child_relid);
  	}
  
+ 	/*
+ 	 * Now, generate a bitmapset of the relids of the subquery RTEs, including
+ 	 * security-barrier RTEs that will become subqueries, as just explained.
+ 	 */
+ 	subqueryRTindexes = NULL;
+ 	rti = 1;
+ 	foreach(lc, parse-rtable)
+ 	{
+ 		RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);
+ 
+ 		if (rte-rtekind == RTE_SUBQUERY ||
+ 			(rte-securityQuals != NIL 
+ 			 !bms_is_member(rti, resultRTindexes)))
+ 			subqueryRTindexes = 

Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-20 Thread Robert Haas
On Sat, Jun 20, 2015 at 6:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Meanwhile, here is an updated patch.

 I don't care for that patch too much: it seems a bit brute-force, and I'm
 quite worried by the assumption that it's okay to destroy each child's
 append_rel_list after processing the child.  That would fail if any of
 the Vars/subexpressions in the lists get incorporated into the resulting
 child Plan, which does not seem impossible.  (I think in many cases we'd
 do a copyObject() when extracting an append_rel_list expression, but this
 hardly seems guaranteed.)

Well, if such a thing is possible, the regression tests don't catch it
- can we add one that would?

 I propose instead the attached patch, which operates by identifying which
 of the append_rel_list entries actually contain subquery references, and
 copying only those; the other ones are just linked into the child's
 append_rel_list by reference, which is okay because they won't get
 modified.

I thought about that approach, but wasn't sure if I could make it
simple enough to pass muster.  Note that I generally erred on the side
of deferring all work as long as possible and to the greatest extent
possible in a way that your patch does not.  We don't need to compute
modifiableARIindexes if subqueryRTindexes ends up empty, and we
definitely don't need to generate O(n^2) list cells in that case.  I
think that latter point, at least, is quite likely to be worth
optimizing.  Granted, spewing out extra ListCells is far less harmful
than doing the same thing with AppendRelInfos and their entire list of
translated_vars, but it's still not good.  Can't we move the loop that
copies root.append_rel_list inside if (final_rtable != NIL 
!bms_is_empty(subqueryRTindexes))?  If we don't take that branch,
root.append_rel_list isn't getting modified at all, so a shallow copy
is good enough.

 On my laptop, I get the following timings for your test queries from
 unmodified HEAD (--cassert build):

 # Q1: 41260.239 ms
 # Q2: 45225.768 ms
 # Q3: 43066.958 ms
 # Q4: 193360.726 ms
 # Q5: 40746.503 ms

 and these with my patch:

 # Q1: 1767.753 ms
 # Q2: 3662.131 ms
 # Q3: 814.293 ms
 # Q4: 64468.914 ms
 # Q5: 881.295 ms

 which seems to be generally a better result.

Better than unpatched, definitely!  Not sure how it compares to my patch.

 The extraordinarily planning time for query 4 is caused by a
 completely different problem: SearchCatCache eats up huge amounts of
 CPU; its callers are get_attavgwidth and get_typlen.  It's not clear
 to me why doubling the number of relations causes such an enormous
 explosion in calls to those functions; I would expect the number of
 calls to double, but presumably the actual increase is much more.

 Actually, Q4 necessarily involves O(N^2) planning time, because for
 each of N target relations you're considering a join to an N-member
 inheritance tree.  A lot of those ultimately get thrown away by
 constraint exclusion, but not before we've expended significant
 cycles on them.  I do not think we are going to get much traction
 on that --- even if we do something to knock off whatever the leading
 term is, there'll still be more O(N^2) behavior right behind it.

Hmm, maybe so.  On the other hand, if there's a way to significantly
shrink the constant factor on that O(N^2) stuff, it could bring a lot
of people some much-needed relief.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-19 Thread Tom Lane
Petr Jelinek p...@2ndquadrant.com writes:
 On 2015-06-19 01:04, Petr Jelinek wrote:
 On 2015-06-19 00:38, Petr Jelinek wrote:
 On 2015-06-18 22:04, Tom Lane wrote:
 By the by, the tablesample additions to range_table_mutator are
 obviously broken.

 Apparently it's not a good idea to do this at 1AM after long day :/

 The previous diff included some garbage in tests from my local 
 experimentations.

Pushed with minor adjustments.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Robert Haas
On Wed, Jun 17, 2015 at 9:56 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 17, 2015 at 9:32 AM, Thomas Munro
 thomas.mu...@enterprisedb.com wrote:
 We saw a rather extreme performance problem in a cluster upgraded from
 9.1 to 9.3.  It uses a largish number of child tables (partitions) and
 many columns.  Planning a simple UPDATE via the base table started
 using a very large amount of memory and CPU time.

 My colleague Rushabh Lathia tracked the performance change down to
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782
 .

 The call to copyObject in the loop introduced here seems to be
 problematic (copying append_rel_list for every element in
 append_rel_list unconditionally), though we're still trying to figure
 it out.  Attached is a simple repro script, with variables to tweak.

 Quite a few others have posted about this sort of thing and been
 politely reminded of the 100 table caveat [1][2] which is fair enough,
 but the situations seems to have got dramatically worse for some users
 after an upgrade.

 Yes.  The copyObject() call introduced by this commit seems to have
 complexity O(T^2*C) where T is the number of child tables and C is the
 number of columns per child.  That's because the List that is being
 copied is a list of AppendRelInfo nodes, each of which has a
 translated_vars member that is a list of every Var in one table, and
 we copy that list once per child.

 It appears that in a lot of cases this work is unnecessary.  The
 second (long) for loop in inheritance_planner copies root-rowMarks
 and root-append_rel_list so as to be able to apply ChangeVarNodes()
 to the result, but we only actually do that if the rte is of type
 RTE_SUBQUERY or if it has security quals.  In the common case where we
 reach inheritance_planner not because of UNION ALL but just because
 the table has a bunch of inheritance children (none of which have RLS
 policies applied), we copy everything and then modify none of it,
 using up startling large amounts of memory in ways that pre-9.2
 versions did not.

The attached patch helps.  It does two things:

1. It arranges for inheritance_planner to throw away the memory
consumed by the subroot's rowMarks and append_rel_list after the call
to grouping_planner for that subroot returns.  This prevents the
explosive growth of memory usage in all cases I've tested so far, but
planning is still really slow.

2. It arranges not to deep-copy append_rel_list when the root's
append_rel_list doesn't need to be modified for the subroot.  This
makes planning much much faster in simple cases, like a simple update
on a table with many partitions.  But if you do attach a FROM clause
containing a subquery to such an update, then this optimization
doesn't kick in any more and things are still very slow (though still
memory-bounded, due to part 1).

I feel I might be missing a trick here.  It seems unlikely to me that
we actually need the entire append_rel_list for every subquery; and we
almost certainly don't need to modify every element of the
append_rel_list for every subquery.  Even the ones that no
ChangeVarNodes() call mutates still get deep-copied.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 8afde2b..84c75f1 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -43,6 +43,7 @@
 #include parser/parsetree.h
 #include parser/parse_agg.h
 #include rewrite/rewriteManip.h
+#include utils/memutils.h
 #include utils/rel.h
 #include utils/selfuncs.h
 
@@ -845,9 +846,17 @@ inheritance_planner(PlannerInfo *root)
 	List	   *returningLists = NIL;
 	List	   *rowMarks;
 	ListCell   *lc;
+	MemoryContext	myctx;
+	MemoryContext	oldctx;
 
 	Assert(parse-commandType != CMD_INSERT);
 
+	myctx = AllocSetContextCreate(CurrentMemoryContext,
+  inheritance_planner,
+  ALLOCSET_DEFAULT_MINSIZE,
+  ALLOCSET_DEFAULT_INITSIZE,
+  ALLOCSET_DEFAULT_MAXSIZE);
+
 	/*
 	 * We generate a modified instance of the original Query for each target
 	 * relation, plan that, and put all the plans into a list that will be
@@ -908,18 +917,16 @@ inheritance_planner(PlannerInfo *root)
 
 		/*
 		 * The rowMarks list might contain references to subquery RTEs, so
-		 * make a copy that we can apply ChangeVarNodes to.  (Fortunately, the
-		 * executor doesn't need to see the modified copies --- we can just
-		 * pass it the original rowMarks list.)
+		 * make a copy that we can apply ChangeVarNodes to.  If any security
+		 * barrier quals are present, the rowMarks list may be further modified
+		 * by grouping_planner. (Fortunately, the executor doesn't need to see
+		 * the modified copies --- we can just pass it the original rowMarks
+		 * list.  For the same reason, we can arrange to throw away the copy
+		 * we make here relatively 

Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Dean Rasheed
On 18 June 2015 at 14:48, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 17, 2015 at 9:56 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 17, 2015 at 9:32 AM, Thomas Munro
 thomas.mu...@enterprisedb.com wrote:
 We saw a rather extreme performance problem in a cluster upgraded from
 9.1 to 9.3.  It uses a largish number of child tables (partitions) and
 many columns.  Planning a simple UPDATE via the base table started
 using a very large amount of memory and CPU time.

 My colleague Rushabh Lathia tracked the performance change down to
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782
 .

 The call to copyObject in the loop introduced here seems to be
 problematic (copying append_rel_list for every element in
 append_rel_list unconditionally), though we're still trying to figure
 it out.  Attached is a simple repro script, with variables to tweak.

 Quite a few others have posted about this sort of thing and been
 politely reminded of the 100 table caveat [1][2] which is fair enough,
 but the situations seems to have got dramatically worse for some users
 after an upgrade.

 Yes.  The copyObject() call introduced by this commit seems to have
 complexity O(T^2*C) where T is the number of child tables and C is the
 number of columns per child.  That's because the List that is being
 copied is a list of AppendRelInfo nodes, each of which has a
 translated_vars member that is a list of every Var in one table, and
 we copy that list once per child.

 It appears that in a lot of cases this work is unnecessary.  The
 second (long) for loop in inheritance_planner copies root-rowMarks
 and root-append_rel_list so as to be able to apply ChangeVarNodes()
 to the result, but we only actually do that if the rte is of type
 RTE_SUBQUERY or if it has security quals.  In the common case where we
 reach inheritance_planner not because of UNION ALL but just because
 the table has a bunch of inheritance children (none of which have RLS
 policies applied), we copy everything and then modify none of it,
 using up startling large amounts of memory in ways that pre-9.2
 versions did not.

 The attached patch helps.  It does two things:

 1. It arranges for inheritance_planner to throw away the memory
 consumed by the subroot's rowMarks and append_rel_list after the call
 to grouping_planner for that subroot returns.  This prevents the
 explosive growth of memory usage in all cases I've tested so far, but
 planning is still really slow.

 2. It arranges not to deep-copy append_rel_list when the root's
 append_rel_list doesn't need to be modified for the subroot.  This
 makes planning much much faster in simple cases, like a simple update
 on a table with many partitions.  But if you do attach a FROM clause
 containing a subquery to such an update, then this optimization
 doesn't kick in any more and things are still very slow (though still
 memory-bounded, due to part 1).

 I feel I might be missing a trick here.  It seems unlikely to me that
 we actually need the entire append_rel_list for every subquery; and we
 almost certainly don't need to modify every element of the
 append_rel_list for every subquery.  Even the ones that no
 ChangeVarNodes() call mutates still get deep-copied.


Yeah, you could probably pre-compute the indexes of the RTEs that need
to copied, outside of the big loop, and store them in a bitmapset.
Then, instead of copying the entire list of rowmarks/append_rel_infos
each time, you could just copy the ones that referred to those RTE
indexes (and only if the bitmapset was non-empty, which is the
equivalent of your second optimisation). However, for AppendRelInfos,
ChangeVarNodes() descends into the Vars in the translated_vars list,
so short-cutting the copying of the AppendRelInfo isn't obviously
safe. But, looking more closely, does ChangeVarNodes actually need to
examine translated_vars (the fall-through case) when child_relid isn't
the old rt_index? If not, that could be a big saving in cases like
this.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Tom Lane
Dean Rasheed dean.a.rash...@gmail.com writes:
 On 18 June 2015 at 14:48, Robert Haas robertmh...@gmail.com wrote:
 I feel I might be missing a trick here.  It seems unlikely to me that
 we actually need the entire append_rel_list for every subquery; and we
 almost certainly don't need to modify every element of the
 append_rel_list for every subquery.  Even the ones that no
 ChangeVarNodes() call mutates still get deep-copied.

 Yeah, you could probably pre-compute the indexes of the RTEs that need
 to copied, outside of the big loop, and store them in a bitmapset.
 Then, instead of copying the entire list of rowmarks/append_rel_infos
 each time, you could just copy the ones that referred to those RTE
 indexes (and only if the bitmapset was non-empty, which is the
 equivalent of your second optimisation). However, for AppendRelInfos,
 ChangeVarNodes() descends into the Vars in the translated_vars list,
 so short-cutting the copying of the AppendRelInfo isn't obviously
 safe. But, looking more closely, does ChangeVarNodes actually need to
 examine translated_vars (the fall-through case) when child_relid isn't
 the old rt_index? If not, that could be a big saving in cases like
 this.

I'm a bit surprised that duplicating the append_rel_list is a noticeable
performance problem.  It ought to be far smaller than the Query tree that
we've always duplicated in this loop --- in particular, it's really a
subset of what we have in the RTE list, no?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Robert Haas
On Thu, Jun 18, 2015 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Dean Rasheed dean.a.rash...@gmail.com writes:
 On 18 June 2015 at 14:48, Robert Haas robertmh...@gmail.com wrote:
 I feel I might be missing a trick here.  It seems unlikely to me that
 we actually need the entire append_rel_list for every subquery; and we
 almost certainly don't need to modify every element of the
 append_rel_list for every subquery.  Even the ones that no
 ChangeVarNodes() call mutates still get deep-copied.

 Yeah, you could probably pre-compute the indexes of the RTEs that need
 to copied, outside of the big loop, and store them in a bitmapset.
 Then, instead of copying the entire list of rowmarks/append_rel_infos
 each time, you could just copy the ones that referred to those RTE
 indexes (and only if the bitmapset was non-empty, which is the
 equivalent of your second optimisation). However, for AppendRelInfos,
 ChangeVarNodes() descends into the Vars in the translated_vars list,
 so short-cutting the copying of the AppendRelInfo isn't obviously
 safe. But, looking more closely, does ChangeVarNodes actually need to
 examine translated_vars (the fall-through case) when child_relid isn't
 the old rt_index? If not, that could be a big saving in cases like
 this.

 I'm a bit surprised that duplicating the append_rel_list is a noticeable
 performance problem.  It ought to be far smaller than the Query tree that
 we've always duplicated in this loop --- in particular, it's really a
 subset of what we have in the RTE list, no?

Well, append_rel_list has an AppendRelInfo for every RTE and that
contains a List (translated_vars) which in turn contains a Var node
for every column.  I'm not sure how that compares to the RTE itself.
I think it's the cost of copying the translated_vars list that is
really the problem here - you can have 200 or 500 columns in a table,
so that's a Var and a ListCell for each one.  In the problem cases,
the number of AppendRelInfo elements is a small percentage of the
number of Var nodes under them.

That having been said, I don't know how the size of all that compares
to the size of the Query.  But I think the Query must be smaller,
because arranging to discard the AppendRelInfo and its translated_vars
list, and the per-subroot rowMarks list, after every call to
grouping_planner stops the memory blowup.

The whole translated_vars representation seems needless inefficient.
For a subquery, you probably need something like that.  But for an
inheritance child, you just need to swap the varno and maybe remap
some varattnos.  Indexing into a C array to grab the varattno seems
like it would be a heck of a lot more efficient than calling
list_nth().  It might be worth making AppendRelInfo support a choice
of representations so that we can use something more optimized for the
simple case while not losing the full generality when we need it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Jun 18, 2015 at 3:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm a bit surprised that duplicating the append_rel_list is a noticeable
 performance problem.  It ought to be far smaller than the Query tree that
 we've always duplicated in this loop --- in particular, it's really a
 subset of what we have in the RTE list, no?

 Well, append_rel_list has an AppendRelInfo for every RTE and that
 contains a List (translated_vars) which in turn contains a Var node
 for every column.  I'm not sure how that compares to the RTE itself.

RTEs also have a per-column component, namely the lists of column alias
names.  So there's something odd going on here.  I'll dig into it when
I get a chance (possibly not during PGCon).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Tom Lane
I wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, append_rel_list has an AppendRelInfo for every RTE and that
 contains a List (translated_vars) which in turn contains a Var node
 for every column.  I'm not sure how that compares to the RTE itself.

 RTEs also have a per-column component, namely the lists of column alias
 names.

... although I see that range_table_mutator doesn't bother to copy/change
the column alias substructure.  (Wonder if that gives rise to any
observable EXPLAIN bugs...)  But it still seems like the append_rel_list
shouldn't be all that much bulkier than all the other crap that gets
generated inside this loop.  We're not doing anything at all to reclaim
space consumed inside subquery_planner, and you'd think that would be
a lot.

By the by, the tablesample additions to range_table_mutator are obviously
broken.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Robert Haas
On Thu, Jun 18, 2015 at 4:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... although I see that range_table_mutator doesn't bother to copy/change
 the column alias substructure.  (Wonder if that gives rise to any
 observable EXPLAIN bugs...)  But it still seems like the append_rel_list
 shouldn't be all that much bulkier than all the other crap that gets
 generated inside this loop.  We're not doing anything at all to reclaim
 space consumed inside subquery_planner, and you'd think that would be
 a lot.

 By the by, the tablesample additions to range_table_mutator are obviously
 broken.

Whee.

Meanwhile, here is an updated patch.  The attached script (a modified
version of something Thomas Munro sent me privately) contains a bunch
of test queries.  With the original patch I sent earlier, here are the
timings I got:

Q1 Time: 16215.887 ms
Q2 Time: 18674.139 ms
Q3 Time: 1029.093 ms
Q4 Time: 86497.781 ms
Q5 Time: 1143.851 ms

This version is about the same for the last three, but the first two
get much faster:

Q1 Time: 2951.231 ms
Q2 Time: 1251.809 ms
Q3 Time: 1049.235 ms
Q4 Time: 88477.803 ms
Q5 Time: 1172.965 ms

The speedup comes from the following trick: the first time we hit a
query that might requite a ChangeVarNodes() on the append_rel_list, we
compute a bitmapset of varnos that appear in that list.  Then, every
time we're thinking about doing a ChangeVarNodes from rti to new_rti,
we check whether rti appears in the Bitmapset.  If not, we can skip
ChangeVarNodes().  That seems to reduce the amount of object-copying
and object-walking attributable to this loop to something negligible
in all of these test cases.

The extraordinarily planning time for query 4 is caused by a
completely different problem: SearchCatCache eats up huge amounts of
CPU; its callers are get_attavgwidth and get_typlen.  It's not clear
to me why doubling the number of relations causes such an enormous
explosion in calls to those functions; I would expect the number of
calls to double, but presumably the actual increase is much more.
That's a separate problem, though, unconnected to
c03ad5602f529787968fa3201b35c119bbc6d782 and not necessarily a
regression.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


repro-planner-explosion.sh
Description: Bourne shell script
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 8afde2b..6737818 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -43,6 +43,7 @@
 #include parser/parsetree.h
 #include parser/parse_agg.h
 #include rewrite/rewriteManip.h
+#include utils/memutils.h
 #include utils/rel.h
 #include utils/selfuncs.h
 
@@ -845,9 +846,19 @@ inheritance_planner(PlannerInfo *root)
 	List	   *returningLists = NIL;
 	List	   *rowMarks;
 	ListCell   *lc;
+	MemoryContext	myctx;
+	MemoryContext	oldctx;
+	bool		did_extract = false;
+	Relids		append_rel_list_relids = NULL;
 
 	Assert(parse-commandType != CMD_INSERT);
 
+	myctx = AllocSetContextCreate(CurrentMemoryContext,
+  inheritance_planner,
+  ALLOCSET_DEFAULT_MINSIZE,
+  ALLOCSET_DEFAULT_INITSIZE,
+  ALLOCSET_DEFAULT_MAXSIZE);
+
 	/*
 	 * We generate a modified instance of the original Query for each target
 	 * relation, plan that, and put all the plans into a list that will be
@@ -908,18 +919,16 @@ inheritance_planner(PlannerInfo *root)
 
 		/*
 		 * The rowMarks list might contain references to subquery RTEs, so
-		 * make a copy that we can apply ChangeVarNodes to.  (Fortunately, the
-		 * executor doesn't need to see the modified copies --- we can just
-		 * pass it the original rowMarks list.)
+		 * make a copy that we can apply ChangeVarNodes to.  If any security
+		 * barrier quals are present, the rowMarks list may be further modified
+		 * by grouping_planner. (Fortunately, the executor doesn't need to see
+		 * the modified copies --- we can just pass it the original rowMarks
+		 * list.  For the same reason, we can arrange to throw away the copy
+		 * we make here relatively quickly.)
 		 */
+		oldctx = MemoryContextSwitchTo(myctx);
 		subroot.rowMarks = (List *) copyObject(root-rowMarks);
-
-		/*
-		 * The append_rel_list likewise might contain references to subquery
-		 * RTEs (if any subqueries were flattenable UNION ALLs).  So prepare
-		 * to apply ChangeVarNodes to that, too.
-		 */
-		subroot.append_rel_list = (List *) copyObject(root-append_rel_list);
+		MemoryContextSwitchTo(oldctx);
 
 		/*
 		 * Add placeholders to the child Query's rangetable list to fill the
@@ -942,6 +951,7 @@ inheritance_planner(PlannerInfo *root)
 		if (final_rtable != NIL)
 		{
 			ListCell   *lr;
+			bool		did_copy = false;
 
 			rti = 1;
 			foreach(lr, parse-rtable)
@@ -968,7 +978,39 @@ inheritance_planner(PlannerInfo *root)
 	newrti = list_length(subroot.parse-rtable) + 1;
 	ChangeVarNodes((Node *) subroot.parse, rti, newrti, 0);
 	ChangeVarNodes((Node *) 

Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Petr Jelinek

On 2015-06-18 22:04, Tom Lane wrote:


By the by, the tablesample additions to range_table_mutator are obviously
broken.



Bah, typos. Attached patch corrects them.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a2bcca5..931d464 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2870,9 +2870,9 @@ range_table_mutator(List *rtable,
 			case RTE_RELATION:
 if (rte-tablesample)
 {
-	MUTATE(rte-tablesample-args, rte-tablesample-args,
+	MUTATE(newrte-tablesample-args, rte-tablesample-args,
 		   List *);
-	MUTATE(rte-tablesample-repeatable,
+	MUTATE(newrte-tablesample-repeatable,
 		   rte-tablesample-repeatable, Node *);
 }
 break;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Petr Jelinek

On 2015-06-19 00:38, Petr Jelinek wrote:

On 2015-06-18 22:04, Tom Lane wrote:


By the by, the tablesample additions to range_table_mutator are obviously
broken.



Bah, typos. Attached patch corrects them.



Actually it should probably look more like this, sorry.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a2bcca5..1c89abb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2870,9 +2870,10 @@ range_table_mutator(List *rtable,
 			case RTE_RELATION:
 if (rte-tablesample)
 {
-	MUTATE(rte-tablesample-args, rte-tablesample-args,
+	FLATCOPY(newrte-tablesample, rte-tablesample, TableSampleClause);
+	MUTATE(newrte-tablesample-args, rte-tablesample-args,
 		   List *);
-	MUTATE(rte-tablesample-repeatable,
+	MUTATE(newrte-tablesample-repeatable,
 		   rte-tablesample-repeatable, Node *);
 }
 break;
diff --git a/src/test/regress/sql/tablesample.sql b/src/test/regress/sql/tablesample.sql
index 7b3eb9b..2be5d85 100644
--- a/src/test/regress/sql/tablesample.sql
+++ b/src/test/regress/sql/tablesample.sql
@@ -1,4 +1,5 @@
 CREATE TABLE test_tablesample (id int, name text) WITH (fillfactor=10); -- force smaller pages so we don't have to load too much data to get multiple pages
+CREATE TABLE test_tablesample_inh() INHERITS (test_tablesample);
 
 INSERT INTO test_tablesample SELECT i, repeat(i::text, 200) FROM generate_series(0, 9) s(i) ORDER BY i;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-18 Thread Petr Jelinek

On 2015-06-19 01:04, Petr Jelinek wrote:

On 2015-06-19 00:38, Petr Jelinek wrote:

On 2015-06-18 22:04, Tom Lane wrote:


By the by, the tablesample additions to range_table_mutator are
obviously
broken.



Bah, typos. Attached patch corrects them.



Actually it should probably look more like this, sorry.



Apparently it's not a good idea to do this at 1AM after long day :/

The previous diff included some garbage in tests from my local 
experimentations.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index a2bcca5..1c89abb 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2870,9 +2870,10 @@ range_table_mutator(List *rtable,
 			case RTE_RELATION:
 if (rte-tablesample)
 {
-	MUTATE(rte-tablesample-args, rte-tablesample-args,
+	FLATCOPY(newrte-tablesample, rte-tablesample, TableSampleClause);
+	MUTATE(newrte-tablesample-args, rte-tablesample-args,
 		   List *);
-	MUTATE(rte-tablesample-repeatable,
+	MUTATE(newrte-tablesample-repeatable,
 		   rte-tablesample-repeatable, Node *);
 }
 break;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance planner CPU and memory usage change since 9.3.2

2015-06-17 Thread Robert Haas
On Wed, Jun 17, 2015 at 9:32 AM, Thomas Munro
thomas.mu...@enterprisedb.com wrote:
 We saw a rather extreme performance problem in a cluster upgraded from
 9.1 to 9.3.  It uses a largish number of child tables (partitions) and
 many columns.  Planning a simple UPDATE via the base table started
 using a very large amount of memory and CPU time.

 My colleague Rushabh Lathia tracked the performance change down to
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=c03ad5602f529787968fa3201b35c119bbc6d782
 .

 The call to copyObject in the loop introduced here seems to be
 problematic (copying append_rel_list for every element in
 append_rel_list unconditionally), though we're still trying to figure
 it out.  Attached is a simple repro script, with variables to tweak.

 Quite a few others have posted about this sort of thing and been
 politely reminded of the 100 table caveat [1][2] which is fair enough,
 but the situations seems to have got dramatically worse for some users
 after an upgrade.

Yes.  The copyObject() call introduced by this commit seems to have
complexity O(T^2*C) where T is the number of child tables and C is the
number of columns per child.  That's because the List that is being
copied is a list of AppendRelInfo nodes, each of which has a
translated_vars member that is a list of every Var in one table, and
we copy that list once per child.

It appears that in a lot of cases this work is unnecessary.  The
second (long) for loop in inheritance_planner copies root-rowMarks
and root-append_rel_list so as to be able to apply ChangeVarNodes()
to the result, but we only actually do that if the rte is of type
RTE_SUBQUERY or if it has security quals.  In the common case where we
reach inheritance_planner not because of UNION ALL but just because
the table has a bunch of inheritance children (none of which have RLS
policies applied), we copy everything and then modify none of it,
using up startling large amounts of memory in ways that pre-9.2
versions did not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints - dropping isinternal triggers by a

2014-04-02 Thread Andrzej Mazurkiewicz
On Tuesday 01 of April 2014 11:06:00 you wrote:
 On Tue, Apr 1, 2014 at 9:13 AM, Andrzej Mazurkiewicz 
 
 andr...@mazurkiewicz.org wrote:
  It seems that if the trigger is internal (tgisinternal = true) it is not
  visible to the DROP TRIGGER command. So it cannot be deleted using DROP
  TRIGGER command, although the dependency type is DEPENDENCY_AUTOMATIC
 
 Try using a quoted identifier:
 
 DROP TRIGGER RI_ConstraintTrigger_c_19322 ON master;
 
 Regards,

 --
 Fabrízio de Royes Mello
 Consultoria/Coaching PostgreSQL
Good Afternoon.

Sorry for my mistake. I have forgotten about modifications of names to lower 
case.

The owner of a table can drop trigger using the command DROP TRIGGER.

However, where I can find information on philosophy of that approach.

For me internal triggers are marked isinternal not to be accessed using SQL 
commands. Removing internals, f. e. FK triggers, by hand by a programmer and 
even by an administrator seems not to be a recommended practice.

Kind Regards
Andrzej Mazurkiewicz


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints.

2014-04-01 Thread Andrzej Mazurkiewicz
Good Afternoon.

Enclosed please find continuation of the discussion of an accidental or 
malicious breaking a server consistency.

After reading please comment if there are more objections for changing the 
depedency type for trigger to constraint dependency from the 
DEPENDENCY_INTERNAL to DEPENDENCY_AUTO.

That change is necessary to reduce scope of modifications necessary for an 
implementation of the inheritance of foregn key constraints, particularly for 
removing of objects.

Kind Regards
Andrzej Mazurkiewicz


On Saturday 22 of March 2014 11:13:56 you wrote:
 Andrzej Mazurkiewicz andr...@mazurkiewicz.org writes:
  So in other words, somebody could (accidentally or maliciously) break the
  constraint by dropping one of its implementation triggers.  I doubt
  that's
  acceptable.

I have done some more digging in the subject.

All following tests are perfomed on my patched 9.3 postgres server where the 
depedency type for trigger to constraint dependency has been changed to the 
DEPENDENCY_AUTOMATIC.

It seems that if the trigger is internal (tgisinternal = true) it is not 
visible to the DROP TRIGGER command. So it cannot be deleted using DROP 
TRIGGER command, although the dependency type is DEPENDENCY_AUTOMATIC (ref. to 
the last SELECT).

Please have a look at the following actions.

Kind regards
Andrzej Mazurkiewicz



They are performed by a lipa user. The lipa user is not a superuser; 

postgres=# CREATE USER lipa;
CREATE ROLE
postgres=# CREATE DATABASE lipa OWNER lipa;
CREATE DATABASE


postgres93@tata:~$ psql -W lipa lipa
Password for user lipa: 
psql (9.3.3)
Type help for help.

lipa= SELECT CURRENT_USER;
 current_user 
--
 lipa
(1 row)

lipa= CREATE TABLE master (master_a int, CONSTRAINT pk_master PRIMARY KEY 
(master_a));
CREATE TABLE
lipa= CREATE TABLE detail (master_a int, detail_a int, CONSTRAINT fk0_detail 
FOREIGN KEY (master_a) REFERENCES master(master_a));
CREATE TABLE
lipa= SELECT oid, tgrelid, tgname FROM pg_trigger ;
  oid  | tgrelid |tgname
---+-+--
 19322 |   19313 | RI_ConstraintTrigger_a_19322
 19323 |   19313 | RI_ConstraintTrigger_a_19323
 19324 |   19318 | RI_ConstraintTrigger_c_19324
 19325 |   19318 | RI_ConstraintTrigger_c_19325
(4 rows)

lipa= DROP TRIGGER RI_ConstraintTrigger_c_19322 ON master;
ERROR:  trigger ri_constrainttrigger_c_19322 for table master does not 
exist
lipa= DROP TRIGGER RI_ConstraintTrigger_c_19322 ON detail;
ERROR:  trigger ri_constrainttrigger_c_19322 for table detail does not 
exist

lipa= SELECT oid, tgrelid, tgname, tgconstraint FROM pg_trigger ;
  oid  | tgrelid |tgname| tgconstraint 
---+-+--+--
 19322 |   19313 | RI_ConstraintTrigger_a_19322 |19321
 19323 |   19313 | RI_ConstraintTrigger_a_19323 |19321
 19324 |   19318 | RI_ConstraintTrigger_c_19324 |19321
 19325 |   19318 | RI_ConstraintTrigger_c_19325 |19321
(4 rows)

lipa= SELECT * FROM pg_depend WHERE refobjid = 19321;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype 
-+---+--++--+-+-
2620 | 19322 |0 |   2606 |19321 |   0 | a
2620 | 19323 |0 |   2606 |19321 |   0 | a
2620 | 19324 |0 |   2606 |19321 |   0 | a
2620 | 19325 |0 |   2606 |19321 |   0 | a
(4 rows)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints.

2014-04-01 Thread Fabrízio de Royes Mello
On Tue, Apr 1, 2014 at 9:13 AM, Andrzej Mazurkiewicz 
andr...@mazurkiewicz.org wrote:

 It seems that if the trigger is internal (tgisinternal = true) it is not
 visible to the DROP TRIGGER command. So it cannot be deleted using DROP
 TRIGGER command, although the dependency type is DEPENDENCY_AUTOMATIC
(ref. to
 the last SELECT).

 Please have a look at the following actions.

 They are performed by a lipa user. The lipa user is not a superuser;

 postgres=# CREATE USER lipa;
 CREATE ROLE
 postgres=# CREATE DATABASE lipa OWNER lipa;
 CREATE DATABASE


 postgres93@tata:~$ psql -W lipa lipa
 Password for user lipa:
 psql (9.3.3)
 Type help for help.

 lipa= SELECT CURRENT_USER;
  current_user
 --
  lipa
 (1 row)

 lipa= CREATE TABLE master (master_a int, CONSTRAINT pk_master PRIMARY KEY
 (master_a));
 CREATE TABLE
 lipa= CREATE TABLE detail (master_a int, detail_a int, CONSTRAINT
fk0_detail
 FOREIGN KEY (master_a) REFERENCES master(master_a));
 CREATE TABLE
 lipa= SELECT oid, tgrelid, tgname FROM pg_trigger ;
   oid  | tgrelid |tgname
 ---+-+--
  19322 |   19313 | RI_ConstraintTrigger_a_19322
  19323 |   19313 | RI_ConstraintTrigger_a_19323
  19324 |   19318 | RI_ConstraintTrigger_c_19324
  19325 |   19318 | RI_ConstraintTrigger_c_19325
 (4 rows)

 lipa= DROP TRIGGER RI_ConstraintTrigger_c_19322 ON master;
 ERROR:  trigger ri_constrainttrigger_c_19322 for table master does not
 exist
 lipa= DROP TRIGGER RI_ConstraintTrigger_c_19322 ON detail;
 ERROR:  trigger ri_constrainttrigger_c_19322 for table detail does not
 exist

 lipa= SELECT oid, tgrelid, tgname, tgconstraint FROM pg_trigger ;
   oid  | tgrelid |tgname| tgconstraint
 ---+-+--+--
  19322 |   19313 | RI_ConstraintTrigger_a_19322 |19321
  19323 |   19313 | RI_ConstraintTrigger_a_19323 |19321
  19324 |   19318 | RI_ConstraintTrigger_c_19324 |19321
  19325 |   19318 | RI_ConstraintTrigger_c_19325 |19321
 (4 rows)


Try using a quoted identifier:

DROP TRIGGER RI_ConstraintTrigger_c_19322 ON master;

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Inheritance of foregn key constraints.

2014-04-01 Thread Tom Lane
Andrzej Mazurkiewicz andr...@mazurkiewicz.org writes:
 After reading please comment if there are more objections for changing the 
 depedency type for trigger to constraint dependency from the 
 DEPENDENCY_INTERNAL to DEPENDENCY_AUTO.

I'm not sure which part of no you didn't understand, but we're not
doing that.

 lipa= DROP TRIGGER RI_ConstraintTrigger_c_19322 ON master;
 ERROR:  trigger ri_constrainttrigger_c_19322 for table master does not 
 exist

This has to do with case-folding and lack of double quotes, not anything
more subtle than that.  A correct test would've given results like this:

regression=# drop trigger RI_ConstraintTrigger_a_43528 on master;
ERROR:  cannot drop trigger RI_ConstraintTrigger_a_43528 on table master 
because constraint fk0_detail on table detail requires it
HINT:  You can drop constraint fk0_detail on table detail instead.

which is the behavior we need.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints.

2014-04-01 Thread Robert Haas
On Tue, Apr 1, 2014 at 8:13 AM, Andrzej Mazurkiewicz
andr...@mazurkiewicz.org wrote:
 That change is necessary to reduce scope of modifications necessary for an
 implementation of the inheritance of foregn key constraints, particularly for
 removing of objects.

Nobody here is going to accept that goal as a valid reason to set the
dependency type to the wrong value.  The value we assign for the
dependency type has important user-visible semantics which we are not
going to break for the purpose of making some feature simpler to
implement.  Of course, PostgreSQL is open source, so you can change
your own copy however you like.  But such modifications won't be
accepted here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints.

2014-03-22 Thread Andrzej Mazurkiewicz
Good Morning.

1. At the beginning some explanations.

I am a lazy person that tries not to reinvent a wheel.
So I try to use postgres way of automatic processing, i. e. automatic removing 
dependent objects (which I consider an elegant solution and I really like it). 
A a result, I have used the pg_depend table to force to remove dependent 
entries.

2. At the moment the following behavior is a standard one for postgres.
- a child table (inheriting form a parent table(s) no FK) can be dropped;
- a referred table (master) can be freely dropped with a CASCADE option 
(causing dropping of the FK);
- a referring table (detail) can be freely dropped (causing automatic dropping 
of the FK);
- a CHECK constraint is inherited and the inheritance can be removed freely 
although leaving the CHECK constraint (no FK);
- an inherited table with CHECK constraint can be freely dropped (no FK);
- inheritance can be added for existing tables and it can be removed (no FK).

3. The following decisions should be taken for the FK inheritance (partly 
common issues, however  I try to be precise).
- (GENERAL statement) Are modifications of a master side hierarchy (a referred 
side) allowed without dropping the FK?
- (GENERAL statement) Are modifications of a detail side hierarchy (a referred 
side) allowed without dropping the FK?
- Is detaching childs allowed in the master and detail hierarchy without 
dropping the FK?
- Is dropping tables allowed in the master and detail hierarchy without 
dropping the FK?
- Is adding inheritance allowed to the master and detail hierarchies without 
dropping the FK?
- Is creating inheriting tables allowed in the master and detail hierarchies 
without dropping the FK?

It would be good if the decisions were consistent with the existing behavior.

The consequences of the decisions are rather far going. For large databases 
adding the FK constraint might last hours or days or perhaps weeks.

For my databases, although such modification would last hours and sometimes I 
have strange and changing ideas - I can live with those hours.

Personally I would vote that the above modifications SHOULD BE ALLOWED. Simply, 
because we do not drop the whole master or detail hierarchy but modify it and 
it gives certain flexibility to manipulating the schema.

The above flexibility is similar to adding inheritance to the existing tables 
and removing inheritance for them. We do not need to create another inherited 
table and to move data into it from the existing table.

3. Perhaps , after making the above decisions, a discussion about an 
implementing changes should be continued.

4. 
  My patch need one change that might be of significance.
  A type of the depencencies (pg_depend) among the FK constraint
  (pg_constraint) and the corresponding RI_ConstraintTrigger triggers has
  to be changed from DEPENDENCY_INTERNAL to DEPENDENCY_AUTO.
 
 So in other words, somebody could (accidentally or maliciously) break the
 constraint by dropping one of its implementation triggers.  I doubt that's
 acceptable.

The present postgres behavior ALLOWS accidental or malicious break the 
constraint by dropping one of its implementation triggers. Please ref. to the 
following example.

The following script has been run by the postgres user.

CREATE DATABASE lipa;
\c lipa
CREATE TABLE master (master_a int, CONSTRAINT pk_master PRIMARY KEY 
(master_a));
CREATE TABLE detail (master_a int, detail_a int, CONSTRAINT fk0_detail FOREIGN 
KEY (master_a) REFERENCES master(master_a));
SELECT oid, tgrelid, tgname FROM pg_trigger ;
DELETE FROM pg_trigger WHERE oid = (SELECT min(oid) FROM pg_trigger WHERE 
tgname LIKE 'RI_ConstraintTrigger%' LIMIT 1);
SELECT oid, tgrelid, tgname FROM pg_trigger ;
DROP TABLE detail;
DROP TABLE master;
\c postgres
DROP DATABASE lipa;


The results of the run are as follows.

psql -f test-malicious-dropping-FK-triggers.sql postgres


CREATE DATABASE
You are now connected to database lipa as user postgres.
CREATE TABLE
CREATE TABLE
  oid  | tgrelid |tgname
---+-+--
 39898 |   39889 | RI_ConstraintTrigger_a_39898
 39899 |   39889 | RI_ConstraintTrigger_a_39899
 39900 |   39894 | RI_ConstraintTrigger_c_39900
 39901 |   39894 | RI_ConstraintTrigger_c_39901
(4 rows)

DELETE 1
  oid  | tgrelid |tgname
---+-+--
 39899 |   39889 | RI_ConstraintTrigger_a_39899
 39900 |   39894 | RI_ConstraintTrigger_c_39900
 39901 |   39894 | RI_ConstraintTrigger_c_39901
(3 rows)

psql:test-malicious-dropping-FK-triggers.sql:8: ERROR:  could not find tuple 
for trigger 39898
psql:test-malicious-dropping-FK-triggers.sql:9: ERROR:  could not find tuple 
for trigger 39898
You are now connected to database postgres as user postgres.
DROP DATABASE

 
  If this modification is not applied, the detail child table cannot be
  dropped without prevous dropping the whole FK constraint because the
  removing operation 

Re: [HACKERS] Inheritance of foregn key constraints.

2014-03-22 Thread Tom Lane
Andrzej Mazurkiewicz andr...@mazurkiewicz.org writes:
 So in other words, somebody could (accidentally or maliciously) break the
 constraint by dropping one of its implementation triggers.  I doubt that's
 acceptable.

 The present postgres behavior ALLOWS accidental or malicious break the 
 constraint by dropping one of its implementation triggers. Please ref. to the 
 following example.
 The following script has been run by the postgres user.

Well, right there you lost me, because superusers are exempt from all
permissions checks by definition; and in particular, direct manipulations
of the system catalogs by superusers are always out of scope for
discussions of what the system should try to protect itself against.
(Try delete from pg_proc; in a scratch database sometime.)

My point is that without the internal dependency, a normal user could do
standard SQL commands (ie DROP TRIGGER) and break the FK that way.
That's the case that's not acceptable.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance of foregn key constraints.

2014-03-21 Thread Tom Lane
Andrzej Mazurkiewicz andr...@mazurkiewicz.org writes:
 My patch need one change that might be of significance.
 A type of the depencencies (pg_depend) among the FK constraint 
 (pg_constraint) 
 and the corresponding RI_ConstraintTrigger triggers has to be changed from 
 DEPENDENCY_INTERNAL to DEPENDENCY_AUTO.

So in other words, somebody could (accidentally or maliciously) break the
constraint by dropping one of its implementation triggers.  I doubt that's
acceptable.

 If this modification is not applied, the detail child table cannot be dropped 
 without prevous dropping the whole FK constraint because the removing 
 operation depend on the FK constraint of its parent table.

Dropping a child table is going to have much larger problems than that,
no?  What about the values in the child table --- don't you risk orphaning
referencing rows?  Or are you only supporting this on the referencing
side?

In any case, it seems like DROP TABLE could remove the dependency entries
for itself, rather than taking the risk of weakening the dependency type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance and indexes

2014-01-14 Thread Marti Raudsepp
On Tue, Jan 14, 2014 at 12:07 PM, knizhnik knizh...@garret.ru wrote:
 But is it possible to use index for derived table at all?

Yes, the planner will do an index scan when it makes sense.

 Why sequential search is used for derived table in the example below:

 insert into derived_table values (2,2);
 create index derived_index on derived_table(x);
 explain select * from base_table where x=0;

With only 1 row in the table, the planner decides there's no point in
scanning the index. Try with more realistic data.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Inheritance, CREATE TABLE LIKE, and partitioned tables

2006-06-26 Thread elein
On Mon, Jun 26, 2006 at 12:31:24PM -0400, Greg Stark wrote:
 
 Currently analyze.c and tablecmds.c both have some very similar code to handle
 copying columns from parent tables. As long as they were just copying
 columns and in the case of tablecmds.c copying check constraints that wasn't
 really that bad an idea, the code is pretty simple.
 
 However with partitioned tables it becomes important to copy more table
 attributes like foreign key constraints and hopefully one day indexes. And it
 would be awfully convenient to have CREATE TABLE LIKE have options to copy the
 same things that inherited tables get. And have exactly the same semantics.

People will turn around and immediately as for create table like without us
making the assumptions it wanted all of the extras that come with inheritance.
COPY is a copy of the table, not additional functionality.  The added 
flexibility
of adding only what is necessary makes more sense than going in to guess
what was added and removing it later if it is not needed.

This does not preclude adding a copy table like (with extras) though if you 
must.

 
 So I'm suggesting refactoring the code from analyze.c and tablecmds.c into
 functions to copy the columns, constraints, indexes etc.
 
 For example I see a functions like:
 
 List *CopyTableColumns(relation source, List *target_schema)
 List *CopyTableCheckConstraints(relation source, List *target_schema)
 ...
 
 To do this though might require some changes in the algorithm used for
 inherited tables. Currently it builds up the list of merged columns
 incrementally. I'm thinking it would be more natural to accumulate all the
 columns from parents and then remove duplicates in a single pass. I think it
 should be possible to maintain precisely the same semantics doing this though.

Be careful because the code may be taking into account multiple inheritance 
here.
I'm not sure about this. It should be looked at carefully.

 
 I may be able to make AddInherits a consumer for these functions as well,
 though it would be a bit awkward since it would have to construct a fake list
 of ColumnDefs to act as the target schema. It would have the side effect of
 making the constraint comparison use change_varattnos_of_a_node and then
 compare the binary representations rather than decompiling the constraints to
 do the comparison. I'm not sure if that's the same semantics.
Again, be sure it is the same semantics before going that way.
 
 To a certain degree I feel like this is just make-work. The existing code
 works fine and I can just happily keep additing functionality to both
 analyze.c and tablecmds.c. And it's possible we won't always want to have the
 two match.
 
 Has anyone looked at applying the ADD INHERITS patch yet? Would it be more or
 less likely to be accepted if it were a bigger patch that refactored all this
 stuff like I'm talking about?
 
 -- 
 greg
 
---elein
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Inheritance, CREATE TABLE LIKE, and partitioned tables

2006-06-26 Thread Greg Stark
elein [EMAIL PROTECTED] writes:

 People will turn around and immediately as for create table like without us
 making the assumptions it wanted all of the extras that come with inheritance.
 COPY is a copy of the table, not additional functionality.  The added 
 flexibility
 of adding only what is necessary makes more sense than going in to guess
 what was added and removing it later if it is not needed.
 
 This does not preclude adding a copy table like (with extras) though if you 
 must.

I'm not too sure what you're saying here. But just to be clear, the spec
specifies what CREATE TABLE LIKE does and does not copy. Any behaviour
different from the spec would have to be explicitly requested with an optional
clause.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Inheritance, CREATE TABLE LIKE, and partitioned tables

2006-06-26 Thread Bruce Momjian

If you want to merge those functions, please do it as a separate patch
now that the patch has been applied.  Having too much unrelated stuff in
a patch does confuse things.

---

Greg Stark wrote:
 
 Currently analyze.c and tablecmds.c both have some very similar code to handle
 copying columns from parent tables. As long as they were just copying
 columns and in the case of tablecmds.c copying check constraints that wasn't
 really that bad an idea, the code is pretty simple.
 
 However with partitioned tables it becomes important to copy more table
 attributes like foreign key constraints and hopefully one day indexes. And it
 would be awfully convenient to have CREATE TABLE LIKE have options to copy the
 same things that inherited tables get. And have exactly the same semantics.
 
 So I'm suggesting refactoring the code from analyze.c and tablecmds.c into
 functions to copy the columns, constraints, indexes etc.
 
 For example I see a functions like:
 
 List *CopyTableColumns(relation source, List *target_schema)
 List *CopyTableCheckConstraints(relation source, List *target_schema)
 ...
 
 To do this though might require some changes in the algorithm used for
 inherited tables. Currently it builds up the list of merged columns
 incrementally. I'm thinking it would be more natural to accumulate all the
 columns from parents and then remove duplicates in a single pass. I think it
 should be possible to maintain precisely the same semantics doing this though.
 
 I may be able to make AddInherits a consumer for these functions as well,
 though it would be a bit awkward since it would have to construct a fake list
 of ColumnDefs to act as the target schema. It would have the side effect of
 making the constraint comparison use change_varattnos_of_a_node and then
 compare the binary representations rather than decompiling the constraints to
 do the comparison. I'm not sure if that's the same semantics.
 
 To a certain degree I feel like this is just make-work. The existing code
 works fine and I can just happily keep additing functionality to both
 analyze.c and tablecmds.c. And it's possible we won't always want to have the
 two match.
 
 Has anyone looked at applying the ADD INHERITS patch yet? Would it be more or
 less likely to be accepted if it were a bigger patch that refactored all this
 stuff like I'm talking about?
 
 -- 
 greg
 
 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings
 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-14 Thread Albert Cervera Areny
A Saturday 13 May 2006 08:33, Thomas Hallgren va escriure:
 Albert Cervera Areny wrote:
  Of course, that's an option for my case. Just wanted to know if this
  solution could be useful for PostgreSQL in general. Mainly because I'll
  add some triggers to check what maybe PostgreSQL should do itself but
  it's unimplemented.
 
  If that's not interesting or a proper solution for PostgreSQL I'll add it
  using the existing DDL in my application and that's all.
 
  What do you think?

 I think that if you want the database to improve its current inheritance
 behavior, then this trigger set is too limited. You need triggers that
 maintain both unique and primary keys and triggers that maintain cascade
 behavior.

True. I think those triggers should be used for all unique indexes, not only 
primary keys. What do you mean with triggers that maintain cascade behavior?


 In order to make it really good, you would also need to add some
 functionality to the mechanisms that maintain references. Today, they don't
 recognize inheritance at all.

Indeed, foreign keys should be inherited, as well as unique keys. And to look 
for the reference they should SELECT FROM instead of SELECT FROM ONLY.



 Personally, I use Hibernate. It tries to compensate for the lack of these
 features but since it is a middle-tier (or client) solution, it's not
 ideal. Another client can still violate the rules and to maintain integrity
 in the client is negative from a performance standpoint. I think it would
 be great if PostgreSQL could provide a more complete set of features that
 would enable inheritance. A good start would be to extend it with the
 functionality needed to maintain references, cascade actions, and enforce
 unique constraints.

 On the other hand, inheritance is a tricky business and a good OO-RDB
 mapper will give you several choices of how it should be mapped. There's no
 one size fits all. The best solution is probably if someone (you
 perhaps?) writes an external OO-RDB mapper module that executes in the
 backend. The author of such a tool would of course need some new nifty
 backend API's in order to do whats needed with references etc.

 I actually wrote something similar using Oracle a couple of years ago. It
 was based on type inheritance and views rather then tables and used
 'instead of' actions on all views (Oracles own mechanisms where far to
 limited). In some respect, I think that is a better solution. Inheritance
 and all that comes with it is more a 'type' thing then a 'table' thing in
 my world. A view is then used to _map_ the types to persistent storage,
 i.e. the 'tables'.

The library I'm developing (http://kandau.berlios.de) aims for very easy 
object persistency, and it offers a default O-R mapping schema. If the user 
wants, she can write her own, but as I'm working with PostgreSQL, I wanted to 
use the inheritance mechanism and extend it to fit the needs of this 
application. I think that inheritance at the database level as it's 
implemented in PostgreSQL is a very smart solution and I'd like it to be the 
default for my application.


 Regards,
 Thomas Hallgren

Thanks for your comments



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-14 Thread Thomas Hallgren

Albert Cervera Areny wrote:
 ...What do you mean with triggers that maintain cascade behavior?


It ties on to how references are handled. Since they currently ignore the inheritance 
aspect, you need triggers that enforce 'on cascade delete/update'. They will become obsolete 
if that changes (i.e. SELECT FROM instead of SELECT FROM ONLY).


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-14 Thread Hannu Krosing
Ühel kenal päeval, N, 2006-05-11 kell 23:28, kirjutas Albert Cervera
Areny:
 Of course, that's an option for my case. Just wanted to know if this solution 
 could be useful for PostgreSQL in general. Mainly because I'll add some 
 triggers to check what maybe PostgreSQL should do itself but it's 
 unimplemented. 
 
 If that's not interesting or a proper solution for PostgreSQL I'll add it 
 using the existing DDL in my application and that's all. 
 
 What do you think?
 
 A Tuesday 09 May 2006 21:43, Simon Riggs va escriure:
  On Tue, 2006-05-09 at 01:20 +0200, Albert Cervera Areny wrote:
 In my particular case (don't know about the SQL standard or other
   cases), it'd be enough if when an inherited table is created:
 - A primary key in the inherited table is created with the same columns
   as the super table.

This is the simple part.

 - A trigger is created in the new table that ensures that this primary
   key doesn't exist in the super table.

This grows very expensive fast, once you have more thean one inherited
table

 - A trigger is created in the super table that ensures that this primary
   key doesn't exist in it's sub tables.

Ditto

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-13 Thread Thomas Hallgren

Albert Cervera Areny wrote:

Of course, that's an option for my case. Just wanted to know if this solution
could be useful for PostgreSQL in general. Mainly because I'll add some
triggers to check what maybe PostgreSQL should do itself but it's
unimplemented.

If that's not interesting or a proper solution for PostgreSQL I'll add it
using the existing DDL in my application and that's all.

What do you think?

I think that if you want the database to improve its current inheritance behavior, then this 
trigger set is too limited. You need triggers that maintain both unique and primary keys and 
triggers that maintain cascade behavior.


In order to make it really good, you would also need to add some functionality to the 
mechanisms that maintain references. Today, they don't recognize inheritance at all.


Personally, I use Hibernate. It tries to compensate for the lack of these features but since 
it is a middle-tier (or client) solution, it's not ideal. Another client can still violate 
the rules and to maintain integrity in the client is negative from a performance standpoint. 
I think it would be great if PostgreSQL could provide a more complete set of features that 
would enable inheritance. A good start would be to extend it with the functionality needed 
to maintain references, cascade actions, and enforce unique constraints.


On the other hand, inheritance is a tricky business and a good OO-RDB mapper will give you 
several choices of how it should be mapped. There's no one size fits all. The best 
solution is probably if someone (you perhaps?) writes an external OO-RDB mapper module that 
executes in the backend. The author of such a tool would of course need some new nifty 
backend API's in order to do whats needed with references etc.


I actually wrote something similar using Oracle a couple of years ago. It was based on type 
inheritance and views rather then tables and used 'instead of' actions on all views (Oracles 
own mechanisms where far to limited). In some respect, I think that is a better solution. 
Inheritance and all that comes with it is more a 'type' thing then a 'table' thing in my 
world. A view is then used to _map_ the types to persistent storage, i.e. the 'tables'.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-13 Thread Albert Cervera Areny
Of course, that's an option for my case. Just wanted to know if this solution 
could be useful for PostgreSQL in general. Mainly because I'll add some 
triggers to check what maybe PostgreSQL should do itself but it's 
unimplemented. 

If that's not interesting or a proper solution for PostgreSQL I'll add it 
using the existing DDL in my application and that's all. 

What do you think?

A Tuesday 09 May 2006 21:43, Simon Riggs va escriure:
 On Tue, 2006-05-09 at 01:20 +0200, Albert Cervera Areny wrote:
  In my particular case (don't know about the SQL standard or other
  cases), it'd be enough if when an inherited table is created:
  - A primary key in the inherited table is created with the same columns
  as the super table.
  - A trigger is created in the new table that ensures that this primary
  key doesn't exist in the super table.
  - A trigger is created in the super table that ensures that this primary
  key doesn't exist in it's sub tables.

 Why not add these in your design rather than into the database?

 All of the above can be added using existing DDL and you can group
 things together in a transaction and call when required.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-12 Thread Albert Cervera Areny
Of course, that's an option for my case. Just wanted to know if this solution
could be useful for PostgreSQL in general. Mainly because I'll add some
triggers to check what maybe PostgreSQL should do itself but it's
unimplemented.

If that's not interesting or a proper solution for PostgreSQL I'll add it
using the existing DDL in my application and that's all.

What do you think?

A Tuesday 09 May 2006 21:43, Simon Riggs va escriure:
 On Tue, 2006-05-09 at 01:20 +0200, Albert Cervera Areny wrote:
  In my particular case (don't know about the SQL standard or other
  cases), it'd be enough if when an inherited table is created:
  - A primary key in the inherited table is created with the same columns
  as the super table.
  - A trigger is created in the new table that ensures that this primary
  key doesn't exist in the super table.
  - A trigger is created in the super table that ensures that this primary
  key doesn't exist in it's sub tables.

 Why not add these in your design rather than into the database?

 All of the above can be added using existing DDL and you can group
 things together in a transaction and call when required.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-09 Thread Simon Riggs
On Tue, 2006-05-09 at 01:20 +0200, Albert Cervera Areny wrote:

   In my particular case (don't know about the SQL standard or other 
 cases),
 it'd be enough if when an inherited table is created:
   - A primary key in the inherited table is created with the same columns 
 as
 the super table.
   - A trigger is created in the new table that ensures that this primary 
 key
 doesn't exist in the super table.
   - A trigger is created in the super table that ensures that this 
 primary key
 doesn't exist in it's sub tables.

Why not add these in your design rather than into the database?

All of the above can be added using existing DDL and you can group
things together in a transaction and call when required.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Inheritance

2002-09-07 Thread Oliver Elphick

On Fri, 2002-09-06 at 19:00, elein wrote:
 
 
 There was a comment earlier that was not really addressed.
 What can you do with table inheritance that you can not do
 with a relational implementation?  Or what would work *better*
 as inheritance?  (you define better)

There is nothing that you cannot do in some way; that way may not be
very convenient compared to the use of inheritance.  I consider
simplicity to be preferable to conceptual purity.

 This is a genuine question, not a snarky comment.  I really
 want to know.  This is the reason I can think of to use
 inheritance: Several tables have a common set of attributes and
 there is some reason for these tables to be separate AND there
 is some reason for the common columns to be queried en masse.
 What kinds of some reasons are there, though?  And if my
 condition for using table inheritance is lacking or misguided, what should
 be the criteria for using table inheritance?

I use it when a group of tables are closely related; they are all
members of some higher class.  For example:

   person ... address
  |
   +--+--+
   | |
 organisation   individual .. pay_tax
   | |
  +++  +-+-+
  |||  | | |
 customer  supplier ...etc...   staff  homeworker ...etc...
  |
 ++-+
 |  |
home_customerexport_customer

It is convenient to use a higher class when you are interested in all
its members and only in the attributes of the higher class.  So I can
say

   SELECT * FROM person,address
WHERE address.person = person.id AND
  address.town = 'London';

to get all rows for people in London.  I will only get those attributes
that are in person itself; if I want to know about credit limits, that
is only relevant in the customer hierarchy and I have to SELECT from
customer instead..

Similarly, I can use the whole customer hierarchy when changing or
reporting on outstanding customer balances.

If foreign key relations were valid against an inheritance tree, I could
implement it for a table of addresses referencing the highest level
(every person has an address) and of pay and tax records at the
individual level.  These don't change as you go down the hierarchy, but
a purely relational implementation has to be redone at each level.  A
reciprocal relation requires an extra table to hold all the hierarchy's
keys and that in turn needs triggers to keep that table maintained.
(I.e., person should have a FK reference to address and address to
person; instead, address needs a reference to person_keys, which I have
to create because FK against a hierarchy isn't valid.)  The lack of
inherited RI makes the design more complex and more difficult to
understand.

-- 
Oliver Elphick[EMAIL PROTECTED]
Isle of Wight, UK
http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
 
 For whosoever shall call upon the name of the Lord 
  shall be saved. Romans 10:13 


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-09-06 Thread Hannu Krosing

On Fri, 2002-09-06 at 07:37, Curt Sampson wrote:
 On 5 Sep 2002, Hannu Krosing wrote:
 
  Suppose you have a table CITIZEN with table-level constraint IS_GOOD
  which is defined as kills_not_others(CITIZEN). and there is table
  CIVIL_SERVANT (..) UNDER CITIZEN. Now you have just one table MILITARY
  (...) UNDER CIVIL_SERVANT, where you have other criteria for IS_GOOD
 
 This I very much disagree with.
 
 In most object-oriented languages (Eiffel being a notable exception, IIRC),
 you can't specify constraints on objects. But in a relational database,
 you can specify constraints on tables, and it should *never* *ever* be
 possible to violate those constraints, or the constraints are pointless.

That's not how real world (which data is supposed to model) operates ;)

As Greg already pointed out, there are two kinds of constraints -
database integrity constraints (foreign key, unique, not null, check),
which should never be overridden and business-rule constraints which
should be overridable in child tables.

one can argue that the latter are not constraints at all, but they sure
look like constraints to me ;)

To elaborate on Gregs example if you have table GOODS and under it a
table CAMPAIGN_GOODS then you may place a general overridable constraint
valid_prices on GOODS which checks that you dont sell cheaper than you
bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
you override the constraint for CAMPAIGN_GOODS.

 So if I have a constraint that says, no rows appearing in this
 table will ever violate constraint X, and then you go and create
 a way of inserting rows into that table that violate that constraint,
 I think you've just made the database into a non-relational database.

SQL standard constraints should be non-overridable. I still think that
Constraint triggers should be overridable/dynamic. 

Or maybe it is better to just make the check function should be
dynamically dispatched, so the constraint will always hold, it just can
mean different things for different types.

 I really don't want to break postgres' relational side for some
 inheritance features of dubious utility. Constraints should be explicitly
 removed from tables if they are no longer needed, not implicitly removed
 through the creation of another table.
 
 I think we should settle this point before going any further.

It seems that the dynamic dispatch of trigger function should be enough
for business-rule constraints. 

And it is also simpler and cleaner (both conceptually and to implement)
if constraints themselves are not overridable.

So in my CAMPAIGN_GOODS example you just have different
valid_prices(GOODS) and valid_prices(CAMPAIGN_GOODS), but one constraint
on GOODS which states that price must be valid . 

Doing it this way ensures that you are not able to have a record in
GOODS for which valid_price(ROW) does not hold.

If you don't want inherited tables to be able to override valid_price()
use it in CHECK constraint in GOODS, which should use the
valid_prices(cast(ROW as GOODS)) for any inherited type.

-
Hannu











---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-06 Thread Curt Sampson

On 6 Sep 2002, Hannu Krosing wrote:

  In most object-oriented languages (Eiffel being a notable exception, IIRC),
  you can't specify constraints on objects. But in a relational database,
  you can specify constraints on tables, and it should *never* *ever* be
  possible to violate those constraints, or the constraints are pointless.

 That's not how real world (which data is supposed to model) operates ;)

Sure it is. Please don't blame the language for being wrong when you
incorrectly model things for your purposes. To chose a much simpler
and more obvious example: if you stored birthdate as a date only, and
someone complained that you're not born all day, but at a particular
time on that day, you don't blame the language for having the date type
not store the time of day. You fix your problem to use both a date and a
time to store that value.

If the language specifies that contstraints on tables are not to be
violated, then don't use those constraints when you don't want them.

 To elaborate on Gregs example if you have table GOODS and under it a
 table CAMPAIGN_GOODS then you may place a general overridable constraint
 valid_prices on GOODS which checks that you dont sell cheaper than you
 bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
 you override the constraint for CAMPAIGN_GOODS.

This looks like a classic case of incorrect modelling to me. Does the
good itself change when it becomes a campaign_good? No. The price
changes, but that's obviously not an integral part of the good itself.
So separate your price information from your good information, and then
you can do things like have campaign prices, multiple prices per good
(since you probably want to keep the original price information as
well), and so on.

I'm really getting the feeling a lot of these applications that
want table inheritance want it just to be different, not because
it provides anything useful.

I am completely committed to object-oriented programming, and use
inheritance heavily, so it's not that I don't understand or like the
concepts. But just because a concept works well in one type of use does
not mean it will do any good, or even not do harm, when brought into a
completely different world.

 SQL standard constraints should be non-overridable. I still think that
 Constraint triggers should be overridable/dynamic.

I still don't like it. Eiffel had good reasons for making the
constraints non-overridable. Other OO languages don't have constraints,
or they would probably do the same.

That said, I could live with dynamic dispatch, if the default were
to make it non-dynamic, and you had to add a special flag to make it
dynamic. That way it would be obvious to the casual user or a DBA
familiar with other databases but not postgres that something unusual is
going on.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-06 Thread Hannu Krosing

On Fri, 2002-09-06 at 09:53, Curt Sampson wrote:
 
 If the language specifies that contstraints on tables are not to be
 violated, then don't use those constraints when you don't want them.

But what _should_ i use then if i want the same business rule on most
top-level types, but a changed one on some down the hierarchy ?
 
  To elaborate on Gregs example if you have table GOODS and under it a
  table CAMPAIGN_GOODS then you may place a general overridable constraint
  valid_prices on GOODS which checks that you dont sell cheaper than you
  bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
  you override the constraint for CAMPAIGN_GOODS.
 
 This looks like a classic case of incorrect modelling to me. Does the
 good itself change when it becomes a campaign_good? No. The price
 changes, but that's obviously not an integral part of the good itself.

Perhaps we mean different things by good. I meant a GOOD to be a THING 
bought with the purpose of reselling. Price (actually prices: 
selling_price and buying_price) is what makes it a GOOD and thus it is
an integral part of it.

 So separate your price information from your good information, and then
 you can do things like have campaign prices, multiple prices per good
 (since you probably want to keep the original price information as
 well), and so on.

It does not solve the problem described above - the price at which the
good is soled is still constrained differently for orninary and campaign
goods.

in standard relational model you would make the distinction inside the
constraint (CHECK (selling_price  buying_price) OR is_campaign_good)
but this localises the check in wrong place - in OO model I'd expect it
to be possible to define the constraint near the child type, not change
the parent constraint each time I derive new child types.

 I'm really getting the feeling a lot of these applications that
 want table inheritance want it just to be different, not because
 it provides anything useful.

As with any other inheritance, it is just a way to organize stuff.

In case of being able to override constraints for child tables it can
also be a significant performance boost - if you have 10 000 000 goods
in a table you don't want to change a constraint on GOODS to allow
campaign goods to be sold cheaper than bought as it would have to check
all goods for validity according to new constraint - putting the
constraint on just CAMPAIGN_GOODS will enable the DB engine to check
just tuples in CAMPAIGN_GOODS.

 I am completely committed to object-oriented programming, and use
 inheritance heavily, so it's not that I don't understand or like the
 concepts. But just because a concept works well in one type of use does
 not mean it will do any good, or even not do harm, when brought into a
 completely different world.

  Surely great caution is needed when defining the desired behaviour.

  SQL standard constraints should be non-overridable. I still think that
  Constraint triggers should be overridable/dynamic.
 
 I still don't like it. Eiffel had good reasons for making the
 constraints non-overridable. Other OO languages don't have constraints,
 or they would probably do the same.
 
 That said, I could live with dynamic dispatch, if the default were
 to make it non-dynamic, and you had to add a special flag to make it
 dynamic. That way it would be obvious to the casual user or a DBA
 familiar with other databases but not postgres that something unusual is
 going on.

That seems about the right compromise between constraining and developer
freedom.

-
Hannu








---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-06 Thread Greg Copeland

On Thu, 2002-09-05 at 15:51, Hannu Krosing wrote:
 On Fri, 2002-09-06 at 03:19, Greg Copeland wrote:
  
  What about the concept of columns being public or private?  That is,
  certain columns may not be inherited by a child?  Any thought to such a
  concept?  Perhaps different types of table inheritance can be considered
  in our model...has-a, is-a, etc...
 
 I can't fit this in my mental model of table inheritance for two reasons
 
 1) all parent table columns must be present in child

Okay, I must admit, I'm not really sure why.  If we look at it in a
physical versus logical manner, even if it's physically there, why must
it be logically exposed?  Can you help me understand why it would even
need to physically be there.  After all, if a child can't update it,
they don't need to see it.

 
 2) granting some right to parent should automatically allow selecting
 from children

Unless the parent deemed it inappropriate access (private)?

If a column were deemed private, that would have a couple of
stipulations on it.  That is, it would have to ensure that NOT NULL
where not one of the constraints, or, if it did, ensure that a default
value were also provided.

 
 both are required for select/insert/update/delete to work on table and
 its children (i.e. without ONLY)
 
 
 But maybe i just need to think more about it ;)
 

Well, I guess I'm lagging behind you on this manner.  Perhaps holding
my hand and explaining it a bit will allow you to work through it some
more and help bring me in line with what you're thinking.

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-09-06 Thread Greg Copeland

On Fri, 2002-09-06 at 07:53, Hannu Krosing wrote:
 On Fri, 2002-09-06 at 09:53, Curt Sampson wrote:
  This looks like a classic case of incorrect modelling to me. Does the
  good itself change when it becomes a campaign_good? No. The price
  changes, but that's obviously not an integral part of the good itself.
 
 Perhaps we mean different things by good. I meant a GOOD to be a THING 
 bought with the purpose of reselling. Price (actually prices: 
 selling_price and buying_price) is what makes it a GOOD and thus it is
 an integral part of it.

No matter now you look at the example, invalidating it does not address
the issue raised as it still exists.  Either way, Hannu and I seem to
agree that some class of constraints need to be able to be overridden.

 In case of being able to override constraints for child tables it can
 also be a significant performance boost - if you have 10 000 000 goods
 in a table you don't want to change a constraint on GOODS to allow
 campaign goods to be sold cheaper than bought as it would have to check
 all goods for validity according to new constraint - putting the
 constraint on just CAMPAIGN_GOODS will enable the DB engine to check
 just tuples in CAMPAIGN_GOODS.

I had not considered this before.  Does that still hold true if we go
with a parent contains all columns implementation?  Of are you simply
saying that it doesn't matter as when the constraint were applied it
would only scan the rows the below to the child?  Perhaps this doesn't
matter for this portion of the conversation.  But hey, I was curious. 
:)

 
   SQL standard constraints should be non-overridable. I still think that
   Constraint triggers should be overridable/dynamic.
  
  I still don't like it. Eiffel had good reasons for making the
  constraints non-overridable. Other OO languages don't have constraints,
  or they would probably do the same.

Well Curt, as you outlined above (clipped out) about it being a
different world...I think also applies here.  IMO, we are treading
lightly on new and perhaps thin ground so we need to be careful that we
apply common parallels and idioms only we are certain that they need
apply.  What I'm trying to say is, just because it's not allowed in
Eiffel does have to mean the same applies here.

  
  That said, I could live with dynamic dispatch, if the default were
  to make it non-dynamic, and you had to add a special flag to make it
  dynamic. That way it would be obvious to the casual user or a DBA
  familiar with other databases but not postgres that something unusual is
  going on.
 
 That seems about the right compromise between constraining and developer
 freedom.
 

I agree.  That does appear to be pointing us in a conservatively sane
and safe direction.


Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-09-06 Thread cbbrowne

 On Fri, 2002-09-06 at 07:37, Curt Sampson wrote:
  On 5 Sep 2002, Hannu Krosing wrote:
  
   Suppose you have a table CITIZEN with table-level constraint IS_GOOD
   which is defined as kills_not_others(CITIZEN). and there is table
   CIVIL_SERVANT (..) UNDER CITIZEN. Now you have just one table MILITARY
   (...) UNDER CIVIL_SERVANT, where you have other criteria for IS_GOOD
  
  This I very much disagree with.
  
  In most object-oriented languages (Eiffel being a notable exception, IIRC),
  you can't specify constraints on objects. But in a relational database,
  you can specify constraints on tables, and it should *never* *ever* be
  possible to violate those constraints, or the constraints are pointless.
 
 That's not how real world (which data is supposed to model) operates ;)
 
 As Greg already pointed out, there are two kinds of constraints -
 database integrity constraints (foreign key, unique, not null, check),
 which should never be overridden and business-rule constraints which
 should be overridable in child tables.
 
 one can argue that the latter are not constraints at all, but they sure
 look like constraints to me ;)
 
 To elaborate on Gregs example if you have table GOODS and under it a
 table CAMPAIGN_GOODS then you may place a general overridable constraint
 valid_prices on GOODS which checks that you dont sell cheaper than you
 bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
 you override the constraint for CAMPAIGN_GOODS.

What that tells me is that the constraint, valid_prices, shouldn't have been 
on GOODS in the first place.  If it is not a legitimate constraint for the 
children, then it is not a legitimate constraint for the parent.

In human inheritance, if you marry someone with funny coloured skin, you 
don't get to choose that your children won't have funny coloured skin.  
That's a pretty forcible constraint.  :-).

For the GOODS situation, the constraint ought not to be on GOODS in the first 
place.  There ought to be a table ORDINARY_GOODS, or some such thing, to which 
the constraint applies, and from which CAMPAIGN_GOODS will _not_ be inheriting.

  So if I have a constraint that says, no rows appearing in this
  table will ever violate constraint X, and then you go and create
  a way of inserting rows into that table that violate that constraint,
  I think you've just made the database into a non-relational database.
 
 SQL standard constraints should be non-overridable. I still think that
 Constraint triggers should be overridable/dynamic. 
 
 Or maybe it is better to just make the check function should be
 dynamically dispatched, so the constraint will always hold, it just can
 mean different things for different types.

Or maybe if someone is doing an Object Oriented design, and making extensive 
use of inheritance, they'll need to apply constraints in a manner that allow 
them to be properly inherited.
--
(concatenate 'string aa454 @freenet.carleton.ca)
http://cbbrowne.com/info/
If a cow laughed, would milk come out its nose? 



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inheritance

2002-09-06 Thread Greg Copeland

On Fri, 2002-09-06 at 08:57, [EMAIL PROTECTED] wrote:
  On Fri, 2002-09-06 at 07:37, Curt Sampson wrote:
   On 5 Sep 2002, Hannu Krosing wrote:
  
  To elaborate on Gregs example if you have table GOODS and under it a
  table CAMPAIGN_GOODS then you may place a general overridable constraint
  valid_prices on GOODS which checks that you dont sell cheaper than you
  bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
  you override the constraint for CAMPAIGN_GOODS.
 
 What that tells me is that the constraint, valid_prices, shouldn't have been 
 on GOODS in the first place.  If it is not a legitimate constraint for the 
 children, then it is not a legitimate constraint for the parent.
 

I don't agree with you on that point.  This concept is common to many
OO-implementations.  Unless you can come up with a powerful argument as
to why our to-be picture should never do this, I'm less than
convinced.

 In human inheritance, if you marry someone with funny coloured skin, you 
 don't get to choose that your children won't have funny coloured skin.  
 That's a pretty forcible constraint.  :-).
 

Fine, but that only works for YOUR specific example.  In that example,
the color constraint should be non-virtual, meaning, the child should
not be able to change it.  On the other hand, if I replace human with
metal product, hopefully I won't be stuck with gun metal gray for
every derived product.  Hopefully, somewhere along the lines, I'll be
able to override the parent's color constraint.

  Or maybe it is better to just make the check function should be
  dynamically dispatched, so the constraint will always hold, it just can
  mean different things for different types.
 
 Or maybe if someone is doing an Object Oriented design, and making extensive 
 use of inheritance, they'll need to apply constraints in a manner that allow 
 them to be properly inherited.

The problem with that assumption is that there is normally nothing wrong
with having seemingly mutually exclusive sets of *business rules* for a
parent and child.

Greg




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-09-06 Thread cbbrowne

Oops! [EMAIL PROTECTED] (Greg Copeland) was seen spray-painting on a wall:
 --=-eu74lKXry3SVx8eZ/qBD
 Content-Type: text/plain
 Content-Transfer-Encoding: quoted-printable
 On Fri, 2002-09-06 at 08:57, [EMAIL PROTECTED] wrote:
  On Fri, 2002-09-06 at 07:37, Curt Sampson wrote:
   On 5 Sep 2002, Hannu Krosing wrote:
  To elaborate on Gregs example if you have table GOODS and under it a
  table CAMPAIGN_GOODS then you may place a general overridable constraint
  valid_prices on GOODS which checks that you dont sell cheaper than you
  bought, but you still want sell CAMPAIGN_GOODS under aquiring price, so
  you override the constraint for CAMPAIGN_GOODS.

 What that tells me is that the constraint, valid_prices, shouldn't
 have been on GOODS in the first place.  If it is not a legitimate
 constraint for the children, then it is not a legitimate constraint
 for the parent.

 I don't agree with you on that point.  This concept is common to
 many OO-implementations.  Unless you can come up with a powerful
 argument as to why our to-be picture should never do this, I'm
 less than convinced.

If the plan is for table CAMPAIGN_GOODS to virtually be a view on GOODS,
then I'd say it _is_ necessary.

 In human inheritance, if you marry someone with funny coloured skin, yo=
 u=20
 don't get to choose that your children won't have funny coloured skin.=
 =20=20
 That's a pretty forcible constraint.  :-).
=20

Is there something broken with your mailer?  It's reformatting quotes
rather horribly...

 Fine, but that only works for YOUR specific example.  In that
 example, the color constraint should be non-virtual, meaning, the
 child should not be able to change it.  On the other hand, if I
 replace human with metal product, hopefully I won't be stuck
 with gun metal gray for every derived product.  Hopefully, somewhere
 along the lines, I'll be able to override the parent's color
 constraint.

That happens by _adding_ an additional characteristic, presumably that
of what kind of paint the metal is covered with.  That doesn't
override the fundamental constraint that if it's a metal product,
there _will_ be metallic properties.

If you decide to add in some non-metallic products, then it would be
_silly_ to have them inherit all their characteristics from
METAL_PRODUCTS; they should head back up the class hierarchy and
inherit their basic characteristics from the _appropriate_ parent.

Reality, with the GOODS/CAMPAIGN_GOODS example, is that GOODS isn't
the appropriate parent class for CAMPAIGN_GOODS.  Both should be
inheriting the common characteristics from some common ancestor.  If
that is done, then there's nothing to override.

  Or maybe it is better to just make the check function should be
  dynamically dispatched, so the constraint will always hold, it just can
  mean different things for different types.
=20
 Or maybe if someone is doing an Object Oriented design, and making extens=
 ive=20
 use of inheritance, they'll need to apply constraints in a manner that al=
 low=20
 them to be properly inherited.

 The problem with that assumption is that there is normally nothing
 wrong with having seemingly mutually exclusive sets of *business
 rules* for a parent and child.

If the rules are totally different, it begs the question of why they
_should_ be considered to be related in a parent/child relationship.

It may well be that they _aren't_ related as parent/child.  They may
merely be cousins, sharing some common ancestors.
-- 
(concatenate 'string chris @cbbrowne.com)
http://cbbrowne.com/info/spreadsheets.html
Note that if I can get you  to `su and say' something just by asking,
you have a very serious security problem on your system and you should
look into it.  -- Paul Vixie, vixie-cron 3.0.1 installation notes

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-06 Thread elein



There was a comment earlier that was not really addressed.
What can you do with table inheritance that you can not do
with a relational implementation?  Or what would work *better*
as inheritance?  (you define better)

This is a genuine question, not a snarky comment.  I really
want to know.  This is the reason I can think of to use
inheritance: Several tables have a common set of attributes and
there is some reason for these tables to be separate AND there
is some reason for the common columns to be queried en masse.
What kinds of some reasons are there, though?  And if my
condition for using table inheritance is lacking or misguided, what should
be the criteria for using table inheritance?

Creating indexes across tables is a project.  Is it the most important
project?  Will it benefit the most users?  Will it benefit any users?
Theory is great and important, but if no one uses the functionality,
who cares?  If these changes will enable people to use the functionality
that until now had been too much of a PITA then it might be worth
it.  However,  I suspect the majority of people who would use these
changes are participating in these discussions.

These features were never widely used in Illustra nor Informix although
their implementations were a little smoother imho.

To weigh in on the constraints issues, it seems problematic
that currently some constraints (check) are inherited and
others are not (foreign keys).  The chcheers,oice of which ones are
or aren't is clear to people familiar with the implementation
but what about the rest of the world who just want some
consistent rule.

I also agree with the people who say, if we inherit constrainsts,
then we must be able to override them in the subtables.
I like the suggested LOCAL keyword, myself.

cheers,

elein


:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:
 [EMAIL PROTECTED]   (510)543-6079
 Taking a Trip. Not taking a Trip. --anonymous
:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:~:


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-06 Thread Greg Copeland

On Fri, 2002-09-06 at 11:05, [EMAIL PROTECTED] wrote:
 Oops! [EMAIL PROTECTED] (Greg Copeland) was seen spray-painting on a wall:
  That's a pretty forcible constraint.  :-).
 =20
 
 Is there something broken with your mailer?  It's reformatting quotes
 rather horribly...

Hmm...not that I know off.  Never had complaints before anyways.  Looks
like an issue with MIME contents...perhaps your mailer doesn't properly
parse some MIME and/or mine is hosing it some how.  Not really sure.

 Reality, with the GOODS/CAMPAIGN_GOODS example, is that GOODS isn't
 the appropriate parent class for CAMPAIGN_GOODS.  Both should be
 inheriting the common characteristics from some common ancestor.  If
 that is done, then there's nothing to override.
 

You can complain about and redefine the model to suit your needs all day
long and get no where.  It doesn't change the need for it.  Fact is, it
would be nice to allow.  Fact is, OO-implementations tend to allow
this.  I'm quite happy to let you go to every OO computer language camp
and inform them that they've done it all wrong.  ;)

Citing that a specific example is all wrong hardly invalidates the
concept.  Since we are pretty much at the conceptual stage, I welcome a
conceptual argument on why this is bad and should never be done. 
Please, be high level and generic.  After all, I too can give you a
hundred specific reasons why a cat is not dog (i.e. bad model)...but it
does nothing to facilitate the topic at hand.

  The problem with that assumption is that there is normally nothing
  wrong with having seemingly mutually exclusive sets of *business
  rules* for a parent and child.
 
 If the rules are totally different, it begs the question of why they
 _should_ be considered to be related in a parent/child relationship.

Because this is how the real world works.  Often there are exceptions to
the rules.  When these rules differ, I've not seen a valid high level
conceptual reason that should prevent it.

Example:

animal
quadruped (has 4-trunk limbs)
dog
injuredDog (has 0 or more trunk limbs)

Hopefully we can agree that a dog is still a dog even if it only has
three legs?  Hopefully you'll realize this was given to illustrate an
example and to prove a point.  Sometimes a model needs to allow for
exceptions to the rule.  You can argue that a three-legged dog is no
longer a quadruped but I prefer to believe that it is a quadruped which
just happens to be an exception to the rule.

 
 It may well be that they _aren't_ related as parent/child.  They may
 merely be cousins, sharing some common ancestors.

Yes, it's true.  Sometimes the wrong model is applied but that hardly
invalidates the concept or alleviates the need.

Regards,

Greg Copeland





signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 03:57, Curt Sampson wrote:
 On Tue, 3 Sep 2002, Bruce Momjian wrote:
 
  Yep, this is where we are stuck;  having an index span multiple tables
  in some way.
 
 Or implementing it by keeping all data in the table in which it
 was declared. (I.e., supertable holds all rows; subtable holds
 only the primary key and those columns of the row that are not
 in the supertable.)

How would you do it for _multiple_ inheritance ?

When implementing it on top of standard relational model you have more
or less two ways to slice the problem 

1) the way you describe (parent holding common columns + child tables
for added child columns), which makes it easy to define constraints but
hard to do inserts/updates/deletes on inherited tables

2) the postgresql way (a new table for each child), which makes it hard
to define constraints but easy to do inserts/updates/deletes.

 From looking at the various discussions of this in books, and what
 it appears to me that the SQL standard says, it seems that their
 overall vision of table inheritance is to be consistent with the
 implementation that I described above.

Yes. The SQL99 standard specifies only _single_ inheritance for tables +
LIKE in column definition part, making the model somewhat similar to
Java's (single inheritance + interfaces).

This way it could probably be done even more effectively than you
describe by:

1) keeping _all_ (not only the inherited columns)  the data for
inheritance hierarchy in the same physical file.

2) having partial indexes (involving tableoid=thiskindoftable) for
possible speeding up of SELECT .. ONLY queries.

3) no changes to (unique) indexes - they still reference simple TID's
without additional table part.

4) update/delete of all child tables are trivial as they are actually
done in the same table and not using joins


It seems that single inheritance avoids other conceptual problems, like
what to do with primary keys when inheriting from two tables that have
them.


Hannu



---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On 5 Sep 2002, Hannu Krosing wrote:

 On Thu, 2002-09-05 at 03:57, Curt Sampson wrote:

  Or implementing it by keeping all data in the table in which it
  was declared. (I.e., supertable holds all rows; subtable holds
  only the primary key and those columns of the row that are not
  in the supertable.)

 How would you do it for _multiple_ inheritance ?

Exactly the same way. Each column resides in only one physical table,
so you need only find the table it resides in, and do the insert there.
I'll be happy to provide an example if this is not clear.

 1) the way you describe (parent holding common columns + child tables
 for added child columns), which makes it easy to define constraints but
 hard to do inserts/updates/deletes on inherited tables

I wouldn't say it makes it hard to do inserts, updates and deletes.
Postgres already has pretty near all of the code it needs to support
these updates, because these are the semantic equivalant of the separate
actions applied to the separate tables within one transaction.

 2) the postgresql way (a new table for each child), which makes it hard
 to define constraints but easy to do inserts/updates/deletes.

I agree that making constraints work in this model is very difficult and
a lot of work.

 This way it could probably be done even more effectively than you
 describe by:

 1) keeping _all_ (not only the inherited columns)  the data for
 inheritance hierarchy in the same physical file.

You appear to have delved into a different database layer than one
I'm looking at, here. I was examining storage on the table level,
which is unrelated to files. (E.g., postgres sometimes stores a
table in one file, sometimes in more than one. MS SQL Server stores
many tables in one file. It doesn't matter which approach is used when
discussing the two inheritance implementation options above.)

 4) update/delete of all child tables are trivial as they are actually
 done in the same table and not using joins

Or are you talking about storing all of the columns in a single
table? That's a possibility, but wouldn't it be costly to update
the entire table every time you add a new child table? And table
scans on child tables would certainly be more costly if you had
many of them, becuase the effective row width would be much wider.
But it might be worth thinking about.

 It seems that single inheritance avoids other conceptual problems, like
 what to do with primary keys when inheriting from two tables that have
 them.

I don't see where there's a conceptual problem here, either. With
multiple inheritance you can simply demote both keys to candidate
keys, and continue on as normal. (The only difference between a
primary key and a candidate key is that you can leave out the column
names when declaring foreign keys in another table.)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 09:28, Curt Sampson wrote:
 On 5 Sep 2002, Hannu Krosing wrote:
 
  On Thu, 2002-09-05 at 03:57, Curt Sampson wrote:
 
   Or implementing it by keeping all data in the table in which it
   was declared. (I.e., supertable holds all rows; subtable holds
   only the primary key and those columns of the row that are not
   in the supertable.)
 
  How would you do it for _multiple_ inheritance ?
 
 Exactly the same way. Each column resides in only one physical table,
 so you need only find the table it resides in, and do the insert there.
 I'll be happy to provide an example if this is not clear.
 
  1) the way you describe (parent holding common columns + child tables
  for added child columns), which makes it easy to define constraints but
  hard to do inserts/updates/deletes on inherited tables
 
 I wouldn't say it makes it hard to do inserts, updates and deletes.
 Postgres already has pretty near all of the code it needs to support
 these updates, because these are the semantic equivalant of the separate
 actions applied to the separate tables within one transaction.

What I meant was that it is relatively more costly to update several
physical tables than updating one .

  2) the postgresql way (a new table for each child), which makes it hard
  to define constraints but easy to do inserts/updates/deletes.
 
 I agree that making constraints work in this model is very difficult and
 a lot of work.

But again this is not _conceptually_ hard, just hard to implement
efficiently.

  This way it could probably be done even more effectively than you
  describe by:
 
  1) keeping _all_ (not only the inherited columns)  the data for
  inheritance hierarchy in the same physical file.
 
 You appear to have delved into a different database layer than one
 I'm looking at, here.

probably. I was describing to a way to efficiently implement single
inheritance. 

The layer was somewhere between physical files and logical tables, i.e.
above splitting stuff into main/toast and also above splitting big files
to 1Gb chunks, but below logical tables, which are (or are not when
omitting ONLY ;) still separate logically.

Perhaps it could be named logical file.

 I was examining storage on the table level, which is unrelated to files.

 (E.g., postgres sometimes stores a table in one file, sometimes in more
 than one. MS SQL Server stores many tables in one file.
 It doesn't matter which approach is used when
 discussing the two inheritance implementation options above.)

It does not matter in case you are assuming that the storage model can't
be changed. The trick with inherited tables is that in some sense they
are the same table and in another sense they are separate tables.

  4) update/delete of all child tables are trivial as they are actually
  done in the same table and not using joins
 
 Or are you talking about storing all of the columns in a single
 table? That's a possibility, but wouldn't it be costly to update
 the entire table every time you add a new child table?

You should not need it, as the storage for existing tuples does not
change - even now you can do ADD COLUMN without touching existing
tuples.

 And table
 scans on child tables would certainly be more costly if you had
 many of them, becuase the effective row width would be much wider.

It would not be noticably wider (only 1 bit/column) even if I did
propose storing all columns.

What I was actually trying to describe was that the tuple format would
be what it is currently, just stored in the same table with parent.

 But it might be worth thinking about.
 
  It seems that single inheritance avoids other conceptual problems, like
  what to do with primary keys when inheriting from two tables that have
  them.
 
 I don't see where there's a conceptual problem here, either. With
 multiple inheritance you can simply demote both keys to candidate
 keys, and continue on as normal. (The only difference between a
 primary key and a candidate key is that you can leave out the column
 names when declaring foreign keys in another table.)

That's one possibility. The other would be to keep the one from the
first table as primary and demote onlly the other primary keys.

With single inheritance you don't even have to think about it.

-
Hannu




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On 5 Sep 2002, Hannu Krosing wrote:

 What I meant was that it is relatively more costly to update several
 physical tables than updating one .

Oh, I see. Not that this is that big a deal, I think. Given that
it doesn't work correctly at the moment, making it work fast is a
definite second priority, I would think.

Once it's working right, one can always replace the internals with
something else that does the same job but is more efficient.

  I agree that making constraints work in this model is very difficult and
  a lot of work.

 But again this is not _conceptually_ hard, just hard to implement
 efficiently.

No, it's conceptually hard. Not all constraints are implemented with
just a unique index you know. And changing a constraint means you have
to check all the child tables, etc. etc. It's difficult just to track
down down all the things you have to try to preserve. Not to mention,
there's always the question of what happens to triggers and suchlike
when handed a tuple with extra columns from what it expects, and having
it modify the insert into a different table.

The beauty of storing all supertable columns in the supertable itself is
that the behaviour is automatically correct.

 What I was actually trying to describe was that the tuple format would
 be what it is currently, just stored in the same table with parent.

So what you're saying is that each tuple in the table would have a
format appropriate for its subtype, and the table would be full of
tuples of varying types? At first blush, that seems like a reasonable
approach, if it can be done.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-09-05 Thread Jeff Davis

I have a question about inheritance:

You have 2 tables: Programmer and employee. Programmer inherits employee. You 
put in a generic employee record for someone, but then she becomes a 
programmer. What do you do? (I borrowed this example from a book by C.J. 
Date, who posed this question). Do you DELETE then INSERT? Something seems 
wrong with that somehow. Are the postgres developers agreed upon how that 
situation should be handled? What about the database users, and their 
expectations of the behavior? 

I am not advocating that we remove inheritence (I say this because this topic 
has generated some significant discussion about that). However, I will stick 
to the well-defined relational model until I see something useful from the 
inheritance system that is as well-defined. I agree it saves a few keystrokes 
(and can help organize things for you, as do objects in a programming 
language), but mind is more at peace when I am actually sure of what's 
happening. I can always throw more rules/views/triggers at the situation 
until I have a nice set of things to work with in the application.

Or, I suppose, if someone shows me something that I can't do in the relational 
model, but can with inheritance, I might be convinced otherwise.

Regards,
Jeff Davis



On Thursday 05 September 2002 01:05 am, Hannu Krosing wrote:
 On Thu, 2002-09-05 at 03:57, Curt Sampson wrote:
  On Tue, 3 Sep 2002, Bruce Momjian wrote:
   Yep, this is where we are stuck;  having an index span multiple tables
   in some way.
 
  Or implementing it by keeping all data in the table in which it
  was declared. (I.e., supertable holds all rows; subtable holds
  only the primary key and those columns of the row that are not
  in the supertable.)

 How would you do it for _multiple_ inheritance ?

 When implementing it on top of standard relational model you have more
 or less two ways to slice the problem

 1) the way you describe (parent holding common columns + child tables
 for added child columns), which makes it easy to define constraints but
 hard to do inserts/updates/deletes on inherited tables

 2) the postgresql way (a new table for each child), which makes it hard
 to define constraints but easy to do inserts/updates/deletes.

  From looking at the various discussions of this in books, and what
  it appears to me that the SQL standard says, it seems that their
  overall vision of table inheritance is to be consistent with the
  implementation that I described above.

 Yes. The SQL99 standard specifies only _single_ inheritance for tables +
 LIKE in column definition part, making the model somewhat similar to
 Java's (single inheritance + interfaces).

 This way it could probably be done even more effectively than you
 describe by:

 1) keeping _all_ (not only the inherited columns)  the data for
 inheritance hierarchy in the same physical file.

 2) having partial indexes (involving tableoid=thiskindoftable) for
 possible speeding up of SELECT .. ONLY queries.

 3) no changes to (unique) indexes - they still reference simple TID's
 without additional table part.

 4) update/delete of all child tables are trivial as they are actually
 done in the same table and not using joins


 It seems that single inheritance avoids other conceptual problems, like
 what to do with primary keys when inheriting from two tables that have
 them.

 
 Hannu



 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?

 http://archives.postgresql.org


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On Thu, 5 Sep 2002, Jeff Davis wrote:

 You have 2 tables: Programmer and employee. Programmer inherits employee. You
 put in a generic employee record for someone, but then she becomes a
 programmer. What do you do? (I borrowed this example from a book by C.J.
 Date, who posed this question). Do you DELETE then INSERT? Something seems
 wrong with that somehow.

This is not so wrong. If you think about it, you have the same
problem in most object-oriented programming languages: a person
object can't generally easily become a subclass of itself after
being created.

This is a case, I would say, where you simply don't want to use
inheritance. A person has-a job, not is-a job.

 What about the database users, and their expectations of the behavior?

Nobody really knows; table inheritance in databases is not well-defined.
(Though perhaps the latest SQL spec. changes that.)

 However, I will stick to the well-defined relational model until I see
 something useful from the inheritance system that is as well-defined.

Amen! :-)

 Or, I suppose, if someone shows me something that I can't do in the
 relational model, but can with inheritance, I might be convinced
 otherwise.

I think that most people are at this point agreed that table
inheritance, at least as currently implemented in any known system,
doesn't offer anything that can't easily be done relationally.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 10:52, Curt Sampson wrote:
 On 5 Sep 2002, Hannu Krosing wrote:
 
  What I meant was that it is relatively more costly to update several
  physical tables than updating one .
 
 Oh, I see. Not that this is that big a deal, I think. Given that
 it doesn't work correctly at the moment, making it work fast is a
 definite second priority, I would think.

But choosing an implementation that _can_be_ made to work fast is not.

 Once it's working right, one can always replace the internals with
 something else that does the same job but is more efficient.

I still think that choosing the right implementation can also help in
making it work right.

   I agree that making constraints work in this model is very difficult and
   a lot of work.
 
  But again this is not _conceptually_ hard, just hard to implement
  efficiently.
 
 No, it's conceptually hard. Not all constraints are implemented with
 just a unique index you know. And changing a constraint means you have
 to check all the child tables, etc. etc. It's difficult just to track
 down down all the things you have to try to preserve.

It may be a lot of work, but not _conceptually_ hard. Conceptually you
have to do the same thing as for a single table, but just for all
inherited tables.

 Not to mention,
 there's always the question of what happens to triggers and suchlike
 when handed a tuple with extra columns from what it expects, and having
 it modify the insert into a different table.

IMHO that the trigger should not be aware of underlying implementation -
so it needs not worry about modifying the insert into a different table.

 The beauty of storing all supertable columns in the supertable itself is
 that the behaviour is automatically correct.

But automatically correct may not be what you want ;)

What about trigger that generates a cached printname using function
printname(row) that is different for each table - here you definitely do
not want to run the function defined for base table for anything
inherited.

  What I was actually trying to describe was that the tuple format would
  be what it is currently, just stored in the same table with parent.
 
 So what you're saying is that each tuple in the table would have a
 format appropriate for its subtype, and the table would be full of
 tuples of varying types? At first blush, that seems like a reasonable
 approach, if it can be done.

At least it makes some parts easier ;)


Hannu









---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On 5 Sep 2002, Hannu Krosing wrote:

  Oh, I see. Not that this is that big a deal, I think. Given that
  it doesn't work correctly at the moment, making it work fast is a
  definite second priority, I would think.

 But choosing an implementation that _can_be_ made to work fast is not.

I would say it definitely is. A correctly working implementation
can be replaced. An incorrectly working implementation destroys
data integrety.

Which is more important for PostgreSQL? Speed or maintaining data
integrity?

  Not to mention,
  there's always the question of what happens to triggers and suchlike
  when handed a tuple with extra columns from what it expects, and having
  it modify the insert into a different table.

 IMHO that the trigger should not be aware of underlying implementation -
 so it needs not worry about modifying the insert into a different table.

I agree.

  The beauty of storing all supertable columns in the supertable itself is
  that the behaviour is automatically correct.

 But automatically correct may not be what you want ;)

 What about trigger that generates a cached printname using function
 printname(row) that is different for each table - here you definitely do
 not want to run the function defined for base table for anything
 inherited.

Right. But that will be automatically correct when you store all
base data in the base table. It's when you start storing those data
in other tables that the trigger can get confused.

Or are you saying that when I insert a row into just a child
table, the trigger shouldn't be invoked on the parent table
portion of that insert? If so, I'd strongly disagree. If that
trigger is acting as an integrety constraint on the base table,
you might destroy the table's integrity.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-05 Thread Jeff Davis


 This is not so wrong. If you think about it, you have the same
 problem in most object-oriented programming languages: a person
 object can't generally easily become a subclass of itself after
 being created.

 This is a case, I would say, where you simply don't want to use
 inheritance. A person has-a job, not is-a job.


But a person is-a employee (allow me to momentarily step aside from the rules 
of english grammer, if you would), and a person is-a programmer. That's why I 
didn't call my table job :)  [1]

I don't like the way some OO programming languages handle objects, if they 
mean to say you can't change an object's type without performing a logical 
data copy to a new object. If you don't use some kind of extra layer of 
abstraction in C, you will end up with that problem: you'd need to copy all 
that RAM over to change from one struct to another. Most people would rather 
take that RAM copying hit than all the hits for allowing room to expand (at 
least in some applications). However, postgres needs to provide that room to 
expand for each tuple anyway, so to go through the same copying seems bad 
(especially since we're no longer just talking RAM). 

Take as an example python... it's easy to emulate other objects: just assign 
to the attribute, even if it's not there yet, it'll add the attribute. Same 
with python, it's providing room to expand for it's objects already, so why 
do all the copying? Now compare with Java, and see why you'd be annoyed. It 
has the facilities to change the objects all around, but you can't do it.

Even if you disregard all implementation details, and assume that the database 
is intelligent enough to not redundantly write data (and if you could name 
one such database, I would like to know), you're still doing something that 
doesn't logically make sense: you're deleting and inserting atomically, when 
the more obvious logical path is to expand on the data you already carry 
about an entity.

I like entities to be mutable, at least as far as makes sense to an 
application. Try telling an employee that as part of a promotion, they're 
going to be fired, lose their workstation, then be re-hired, and get a new 
workstation; I bet he'd have an interesting expression on his face (hey, at 
least postgres guarantees the A in ACID, or else bad things could happen to 
that poor guy :)

Thanks for responding, and I agreed with everything else you said. As you 
might have guessed, I don't much like most object-oriented languages if 
that's what they're going to try to tell me I have to do. Python works 
nicely, however :)

Regards,
Jeff Davis

[1] Come to think of it, the JOIN operator seems to, at least on a first 
thought, represent the has-a relationship you describe. You could have the 
tuples manager and programmer in the table job and join with a people 
table. Don't ask about inheritance yet for this model, I'm still thinking 
about that one (does has-a even have an analogue to inheriteance?). Send me 
your thoughts about this, if you should have any.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 11:34, Curt Sampson wrote:
 On 5 Sep 2002, Hannu Krosing wrote:
 
   Oh, I see. Not that this is that big a deal, I think. Given that
   it doesn't work correctly at the moment, making it work fast is a
   definite second priority, I would think.
 
  But choosing an implementation that _can_be_ made to work fast is not.
 
 I would say it definitely is. A correctly working implementation
 can be replaced. An incorrectly working implementation destroys
 data integrety.
 
 Which is more important for PostgreSQL? Speed or maintaining data
 integrity?

Both of course. The lack of one often makes the other unusable.

But as MySQL experience suggest, often people select speed over data
integrity. OTOH sometimes you happily accept a 10sec delay in updates to
have data integrity (like when doing a money transfer over internet;)

   Not to mention,
   there's always the question of what happens to triggers and suchlike
   when handed a tuple with extra columns from what it expects, and having
   it modify the insert into a different table.
 
  IMHO that the trigger should not be aware of underlying implementation -
  so it needs not worry about modifying the insert into a different table.
 
 I agree.
 
   The beauty of storing all supertable columns in the supertable itself is
   that the behaviour is automatically correct.
 
  But automatically correct may not be what you want ;)
 
  What about trigger that generates a cached printname using function
  printname(row) that is different for each table - here you definitely do
  not want to run the function defined for base table for anything
  inherited.
 
 Right. But that will be automatically correct when you store all
 base data in the base table. It's when you start storing those data
 in other tables that the trigger can get confused.
 
 Or are you saying that when I insert a row into just a child
 table, the trigger shouldn't be invoked on the parent table
 portion of that insert? If so, I'd strongly disagree.

Conceptually there are no portions of table - the trigger is invoked
on one _tuple_ exactly (pg has only row-level triggers), and each tuple
belongs to only one table regardless how it is implemented internally.

 If that
 trigger is acting as an integrety constraint on the base table,
 you might destroy the table's integrity.

What I try to say is that you should have the same freedom with triggers
that you have with select/insert/update/delete - you must be able to
choose if the trigger is on the parent table ONLY or on parent and all
children. 

And you should be able to override a trigger for child table even if it
is defined on parent as applying to all children - I guess that
overriding by trigger _name_ would be what most people expect.

Suppose you have a table CITIZEN with table-level constraint IS_GOOD
which is defined as kills_not_others(CITIZEN). and there is table
CIVIL_SERVANT (..) UNDER CITIZEN. Now you have just one table MILITARY
(...) UNDER CIVIL_SERVANT, where you have other criteria for IS_GOOD so
you must either be able to override the trigger for that table (and its
children) or make sure that the functions used are dynamically mached to
the actual tuple type (header in Relational Model parlance) so that
kills_not_others(MILITARY) will be used, which presents the system
MILITARYs view of the being good ;)

What I'm after here is dynamic (and automatic) row level dispach of the
right function based on row type - so that for rows in CITIZEN or
CIVIL_SERVANT the function kills_not_others(CITIZEN) will be used but
for rows in MILITAY the kills_not_others(MILITARY) is used.

-
  Hannu



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 12:29, Jeff Davis wrote:
 
  This is not so wrong. If you think about it, you have the same
  problem in most object-oriented programming languages: a person
  object can't generally easily become a subclass of itself after
  being created.
 
  This is a case, I would say, where you simply don't want to use
  inheritance. A person has-a job, not is-a job.
 
 
 But a person is-a employee (allow me to momentarily step aside from the rules 
 of english grammer, if you would), and a person is-a programmer. That's why I 
 didn't call my table job :)  [1]
 
 I don't like the way some OO programming languages handle objects, if they 
 mean to say you can't change an object's type without performing a logical 
 data copy to a new object. If you don't use some kind of extra layer of 
 abstraction in C, you will end up with that problem: you'd need to copy all 
 that RAM over to change from one struct to another. Most people would rather 
 take that RAM copying hit than all the hits for allowing room to expand (at 
 least in some applications). However, postgres needs to provide that room to 
 expand for each tuple anyway, so to go through the same copying seems bad 
 (especially since we're no longer just talking RAM). 

I would like to have UPDATEs both up and down the inheritance hierarchy,
so that when I have hierarchy

OBJECT(id serial primary key)
 + HUMAN(name text,age int)
   + EMPLOYEE(salary numeric)
 + ENGINEER(workstation computer)
 + PHB(laptop computer)

and ENGINEER named Bob

I could do

UPDATE ENGINEER 
TO PHB
   SET salary = salary * 2 + age * 1000,
   laptop.disksize = max(laptop.disksize ,
 workstation.disksize + 100)
 WHERE name='Bob'
;

to promote Bob from an engineer to phb, give him a salary rise and a
laptop with default configuration ensuring big enough disk to keep all
his old files, but still keep all FK related records.

 Take as an example python... it's easy to emulate other objects: just assign 
 to the attribute, even if it's not there yet, it'll add the attribute. Same 
 with python, it's providing room to expand for it's objects already, so why 
 do all the copying?

that's unless you use the new-style objects and __slots__

 class myobj(object):
... __slots__ = ['a','b']
...
 M = myobj()
 M.a =1
 M.c =1
Traceback (most recent call last):
  File stdin, line 1, in ?
AttributeError: 'myobj' object has no attribute 'c'
 

 Same with python, it's providing room to expand for it's objects already,
 so why do all the copying?


 [1] Come to think of it, the JOIN operator seems to, at least on a first 
 thought, represent the has-a relationship you describe. You could have the 
 tuples manager and programmer in the table job and join with a people 
 table. Don't ask about inheritance yet for this model, I'm still thinking 
 about that one (does has-a even have an analogue to inheriteance?).

Not in inheritance, but in OO world attributes are used to express has-a
relations. So

 bob = people(name='Bob')
 bob.job = job('Manager')

makes an has-a relation between Bob and his job in python

BTW, good programming guidelines in python tell you not to test if bob
is-a something but rather test if the interface for something exists -
to see if you can iterate over bob you do not test if bob is a sequence
but just try it:

try:
for idea in bob:
examine(idea)
except TypeError:
print 'Failed to iterate over %s %s !' % (bob,job.name, bob.name)

---
Hannu






---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Inheritance

2002-09-05 Thread Tom Lane

I really like Hannu's idea of storing an entire (single-inheritance)
hierarchy in a single file.

I guess the question we need to ask ourselves is if we're prepared to
abandon support of multiple inheritance.  Personally I am, but...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-05 Thread Ross J. Reedstrom

On Thu, Sep 05, 2002 at 10:23:02AM -0400, Tom Lane wrote:
 I really like Hannu's idea of storing an entire (single-inheritance)
 hierarchy in a single file.

Wouldn't this require solving the ALTER TABLE ADD COLUMN (to parent)
column ordering problem? 

 I guess the question we need to ask ourselves is if we're prepared to
 abandon support of multiple inheritance.  Personally I am, but...

No opinion - I've not used the inheritance much, since I'm not willing to
give up referential integrity.

Ross

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Thu, 2002-09-05 at 19:23, Tom Lane wrote:
 I really like Hannu's idea of storing an entire (single-inheritance)
 hierarchy in a single file.
 
 I guess the question we need to ask ourselves is if we're prepared to
 abandon support of multiple inheritance.  Personally I am, but...

So am I, but I think we should move in stages -

1) first implement the SQL99 standard 
   CREATE TABLE mytable() UNDER parenttable ;
   using the above idea and make it work right vs constraints,
   triggers, functions, etc.

   This should include the ability to include other table structures
   using LIKE :

   CREATE TABLE engine(...);
   CREATE TABLE vehicule(...);
   CREATE TABLE car (
  model text,
  wheels  wheel[],
  LIKE engine,
   ) UNDER vehicule;

   which could then hopefully be used for migrating most code of form

   CREATE TABLE car (
  model text primary key,
  wheels  wheel[]
   ) INHERITS (vehicule, engine);

   it would be nice (maybe even neccessary) to keep the current
   functionality that columns introduced by LIKE are automatically
   added/renamed/deleted when LIKE's base table changes.

2) when it is working announce non-SQL99-standard-and-broken INHERITS
   to be deprecated and removed in future.

3) give people time for some releases to move over to UNDER + LIKE .
   Or if someone comes up with bright ideas/impementations for fixing
   multiple inheritance, then un-deprecate and keep it.

4) else try to remove INHERITS.

5) if too many people object, goto 3)  ;)

---
Hannu



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On Thu, 5 Sep 2002, Jeff Davis wrote:

 But a person is-a employee (allow me to momentarily step aside from
 the rules of english grammer, if you would), and a person is-a
 programmer. That's why I didn't call my table job :) [1]

Certainly it's not the case that a person is-a job, by virtue of the
fact that a person can have no job. Nor is it the case that a person
is-a programmer; not all people are programmers.

Perhaps you're reversing the sense of is-a? One says subtype is-a
supertype, not supertype is-a subtype.

But even reversing these, it's not the case that job is-a person, by
virtue of the fact that you cannot use a job anywhere you can use a
person. (A person can file his tax return, a job can't.) That might be a
matter of bad mappings of object names to real-world concepts, though.

As for programmer is-a person, yes, you could model things that way if
you really wanted to. But it's a bad way to do it because, as you point
out, a person can change his job, or not have a job. Now what do you do
with that programmer-subtype-of-person object you created? I think in
this case English misled you: we do say that he is a programmer, but
what we really mean is that one of the characteristics of that person
is that he programs. So create a separate characteristic type and have
the person object have-a as many or as few of those characteristics as
you need.

 I don't like the way some OO programming languages handle objects, if they
 mean to say you can't change an object's type without performing a logical
 data copy to a new object.

That's not a problem with the programming language; that's you
modelling things badly.

 Take as an example python... it's easy to emulate other objects: just assign
 to the attribute, even if it's not there yet, it'll add the attribute. Same
 with python, it's providing room to expand for it's objects already, so why
 do all the copying? Now compare with Java, and see why you'd be annoyed. It
 has the facilities to change the objects all around, but you can't do it.

Yes, you can't do it in Java because you Can't Do It in a language where
you can specify static typing. If I have field that holds a String, I'm
given a guarantee that, if I can put a reference in that field, it is
and always will be a String.

In non-statically-typed languages that give you the option of changing
types, you might give a referenc to a string, change the objects type on
me, and then I might blow up when I try to use it later. These bugs tend
to be quite difficult to track down because the source and manifestation
of the problem can be widely separated in code and in time. That's why
most languages don't allow this.

 ...when the more obvious logical path is to expand on the data you
 already carry about an entity.

Yes, that's the perfectly obvious path. And that's just what the
relational model lets us do, and do very well.

Why do you want to use an ill-fitting, error-prone model when you've
already got something that works better?

 [1] Come to think of it, the JOIN operator seems to, at least on a first
 thought, represent the has-a relationship you describe.

You bet! Hey, this relational stuff doesn't suck so badly after
all, does it? Especially for a 30-year old theory. :-)

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-05 Thread Greg Copeland

On Thu, 2002-09-05 at 08:15, Hannu Krosing wrote:
 On Thu, 2002-09-05 at 11:34, Curt Sampson wrote:
  On 5 Sep 2002, Hannu Krosing wrote:

  If that
  trigger is acting as an integrety constraint on the base table,
  you might destroy the table's integrity.
 
 What I try to say is that you should have the same freedom with triggers
 that you have with select/insert/update/delete - you must be able to
 choose if the trigger is on the parent table ONLY or on parent and all
 children. 

Sounds like a mechanism to make the distinction between virtual (child
can override parent) and non-virtual (child is constrained by the
parent) constraints are needed.

After all, there are two basic needs for constraints.  One is for
relational integrity and the other is business rule integrity.  That is,
one seeks to ensure that the database makes sense in respect to the data
model (a shoe is a product) while the other is to enforce business rules
(products are never free).  Seems like the DBA should be able to dictate
which domain his constraint falls into in some manner.

 
 And you should be able to override a trigger for child table even if it
 is defined on parent as applying to all children - I guess that
 overriding by trigger _name_ would be what most people expect.
 

That's the reason I used virtual and non-virtual above.  If we think
using C++ idioms, the child is stuck with it if it's deemed
non-virtual.  Generally speaking, if someone designed something with
that expectation in mind, there's probably a good reason for it.  In
this case, we could assume that such non-virtual constraints would be to
help ensure proper RI.  Something that otherwise, IMO, would be tossed
out with the bath water.

 What I'm after here is dynamic (and automatic) row level dispach of the
 right function based on row type - so that for rows in CITIZEN or
 CIVIL_SERVANT the function kills_not_others(CITIZEN) will be used but
 for rows in MILITAY the kills_not_others(MILITARY) is used.

I think we're touching on some form of RTTI information here.  That is,
triggers and even functions may need to be able to dynamically determine
the row type that is actively being worked on.

If we're on the same page, I think that seemingly makes a lot of sense.

What about the concept of columns being public or private?  That is,
certain columns may not be inherited by a child?  Any thought to such a
concept?  Perhaps different types of table inheritance can be considered
in our model...has-a, is-a, etc...


Regards,

Greg Copeland




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-09-05 Thread Hannu Krosing

On Fri, 2002-09-06 at 03:19, Greg Copeland wrote:
 On Thu, 2002-09-05 at 08:15, Hannu Krosing wrote:
  On Thu, 2002-09-05 at 11:34, Curt Sampson wrote:
   On 5 Sep 2002, Hannu Krosing wrote:
 
   If that
   trigger is acting as an integrety constraint on the base table,
   you might destroy the table's integrity.
  
  What I try to say is that you should have the same freedom with triggers
  that you have with select/insert/update/delete - you must be able to
  choose if the trigger is on the parent table ONLY or on parent and all
  children. 
 
 Sounds like a mechanism to make the distinction between virtual (child
 can override parent) and non-virtual (child is constrained by the
 parent) constraints are needed.
 
 After all, there are two basic needs for constraints.  One is for
 relational integrity and the other is business rule integrity.  That is,
 one seeks to ensure that the database makes sense in respect to the data
 model (a shoe is a product) while the other is to enforce business rules
 (products are never free).  Seems like the DBA should be able to dictate
 which domain his constraint falls into in some manner.

  And you should be able to override a trigger for child table even if
it
  is defined on parent as applying to all children - I guess that
  overriding by trigger _name_ would be what most people expect.
  
 
 That's the reason I used virtual and non-virtual above.  If we think
 using C++ idioms, the child is stuck with it if it's deemed
 non-virtual.  Generally speaking, if someone designed something with
 that expectation in mind, there's probably a good reason for it.  In
 this case, we could assume that such non-virtual constraints would be to
 help ensure proper RI.  Something that otherwise, IMO, would be tossed
 out with the bath water.

I agree to this.

What I described (making overriding decision solely in child) is
probably a bad idea.

  What I'm after here is dynamic (and automatic) row level dispach of the
  right function based on row type - so that for rows in CITIZEN or
  CIVIL_SERVANT the function kills_not_others(CITIZEN) will be used but
  for rows in MILITAY the kills_not_others(MILITARY) is used.
 
 I think we're touching on some form of RTTI information here.  That is,
 triggers and even functions may need to be able to dynamically determine
 the row type that is actively being worked on.

Should be easy if the row comes directly from a table : just use
tableoid column.

 If we're on the same page, I think that seemingly makes a lot of sense.
 
 What about the concept of columns being public or private?  That is,
 certain columns may not be inherited by a child?  Any thought to such a
 concept?  Perhaps different types of table inheritance can be considered
 in our model...has-a, is-a, etc...

I can't fit this in my mental model of table inheritance for two reasons

1) all parent table columns must be present in child

2) granting some right to parent should automatically allow selecting
from children

both are required for select/insert/update/delete to work on table and
its children (i.e. without ONLY)


But maybe i just need to think more about it ;)

--
Hannu












---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On 5 Sep 2002, Hannu Krosing wrote:

 Suppose you have a table CITIZEN with table-level constraint IS_GOOD
 which is defined as kills_not_others(CITIZEN). and there is table
 CIVIL_SERVANT (..) UNDER CITIZEN. Now you have just one table MILITARY
 (...) UNDER CIVIL_SERVANT, where you have other criteria for IS_GOOD

This I very much disagree with.

In most object-oriented languages (Eiffel being a notable exception, IIRC),
you can't specify constraints on objects. But in a relational database,
you can specify constraints on tables, and it should *never* *ever* be
possible to violate those constraints, or the constraints are pointless.

So if I have a constraint that says, no rows appearing in this
table will ever violate constraint X, and then you go and create
a way of inserting rows into that table that violate that constraint,
I think you've just made the database into a non-relational database.
I really don't want to break postgres' relational side for some
inheritance features of dubious utility. Constraints should be explicitly
removed from tables if they are no longer needed, not implicitly removed
through the creation of another table.

I think we should settle this point before going any further.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-09-05 Thread Curt Sampson

On 5 Sep 2002, Greg Copeland wrote:

 Sounds like a mechanism to make the distinction between virtual (child
 can override parent) and non-virtual (child is constrained by the
 parent) constraints are needed.

Oh, I should mention that I have no problem with being able to declare a
constraint overridable by subtables, so long as it's not the default,
and it's clear from the table definition that it might be overridden.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-09-04 Thread Curt Sampson

On Tue, 3 Sep 2002, Bruce Momjian wrote:

 Yep, this is where we are stuck;  having an index span multiple tables
 in some way.

Or implementing it by keeping all data in the table in which it
was declared. (I.e., supertable holds all rows; subtable holds
only the primary key and those columns of the row that are not
in the supertable.)

From looking at the various discussions of this in books, and what
it appears to me that the SQL standard says, it seems that their
overall vision of table inheritance is to be consistent with the
implementation that I described above.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inheritance

2002-09-03 Thread Peter Gulutzan

The August draft of the SQL:200n standard (9075-2 Foundation) says in
Section 4.17.2: Every table constraint specified for base table T is
implicitly a constraint on every subtable of T, by virtue of the fact
that every row in a subtable is considered to have a corresponding
superrow in every one of its supertables.

Peter Gulutzan
Co-Author, SQL-99 Complete, Really
Co-Author, SQL Performance Tuning

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Inheritance

2002-09-03 Thread Bruce Momjian

Peter Gulutzan wrote:
 The August draft of the SQL:200n standard (9075-2 Foundation) says in
 Section 4.17.2: Every table constraint specified for base table T is
 implicitly a constraint on every subtable of T, by virtue of the fact
 that every row in a subtable is considered to have a corresponding
 superrow in every one of its supertables.

Yep, this is where we are stuck;  having an index span multiple tables
in some way.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-08-19 Thread Zeugswetter Andreas SB SD


  Seems with above you are not able to constrain what qualifies for a
  supertable row, you would only be able to specify constraints that
  apply to all it's subtables.
 
 Yes, that's the whole point. If I have a constraint on a table, I think
 it should *never* be possible for that constraint to be violated. If a
 subtable should not have constraint the supertable has, it shouldn't
 inherit from the supertable.

If you want that, you simply need to only create constraints that apply to
all tables in the hierarchy. Note that you *can* do this. It should imho be 
the default behavior.

 
 To do otherwise breaks the relational model.


That is probably a point of argument. Imho the inheritance feature
is something orthogonal to the relational model. It is something else, and 
thus cannot break the relational model.

 
  The SQL inheritance is a class/subclass thing. All tables have
  instances (==rows) that are not (by itself) related. (Even if
  they happen to share all attribute values with another row of a
  supertable.) If you want that, then you need to resort to 3NF (or
  ROWREF's which iirc is another SQL99 feature).
 
 As I understand it, SQL99 has the restriction that a row with the same
 primary key appearing in a supertable and/or any of its subtables must
 be the result of a single INSERT statement. Thus, SQL99 doesn't allow
 what you're saying, if I understand what you're saying. (I'm not sure
 that I do.)

I was not talking about primary key, not all tables have a primary key.
If the supertable has a PK, then of course an exact match of columns is not 
possible in supertable/subtable rows.
 
 Am I to take it that you think the inheritance should be inheritance
 of type information only? That is, if I have supertable A and
 subtable A', inserting a row into A' does not make a row appear in
 A? If so, I've got not real problem with that at present, but it's
 not what postgres currently does, nor would it conform to SQL99.

No, not at all. All I am saying is that I want to be able to create a 
constraint that only applies to the supertable rows, and not the
subtable rows. I would *not* want this as default behavior when creating a
constraint though.

Andreas

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-08-19 Thread Curt Sampson

On Mon, 19 Aug 2002, Zeugswetter Andreas SB SD wrote:

  Yes, that's the whole point. If I have a constraint on a table, I think
  it should *never* be possible for that constraint to be violated. If a
  subtable should not have constraint the supertable has, it shouldn't
  inherit from the supertable.

 If you want that, you simply need to only create constraints that apply to
 all tables in the hierarchy. Note that you *can* do this. It should imho be
 the default behavior.

So what you're saying is that constraints shouldn't be inherited?

  To do otherwise breaks the relational model.

 That is probably a point of argument. Imho the inheritance feature
 is something orthogonal to the relational model. It is something else, and
 thus cannot break the relational model.

So then constraints must be inherited. The relational model, if I
am not incorrect here, says that, given a table definition such as
this:

CREATE TABLE my_table (
my_key int PRIMARY KEY,
my_value text UNIQUE,
my_other_value int CHECK (my_other_value  0)
)

You will never, ever, when selecting from this table, have returned to you

1. two rows with the same value of my_key but different values
for the other columns,

2. two rows with the same value of my_value but different values
for the other columns, or

3. a row in which the value of my_other_value is not greater than zero.

Breaking these sorts of guarantees under any circumstances really
doesn't do it for me; what's the point of having guarantees if they
aren't guarantees?

  As I understand it, SQL99 has the restriction that a row with the same
  primary key appearing in a supertable and/or any of its subtables must
  be the result of a single INSERT statement. Thus, SQL99 doesn't allow
  what you're saying, if I understand what you're saying. (I'm not sure
  that I do.)

 I was not talking about primary key, not all tables have a primary key.

Well, for those that do

Also, I should amend that; I suspect (though I could well be wrong,
knowing how screwed up SQL is at times) that this really applies
to all candidate keys in the table.

(And this is one of my complaints about SQL; it's possible for a table to
exist without candidate keys. So much for set theory!)

 No, not at all. All I am saying is that I want to be able to create a
 constraint that only applies to the supertable rows, and not the
 subtable rows.

I would strongly object to that. It should not be possible to SELECT
data from a table that violates the constraints that that table is
guaranteeing on the data.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-08-19 Thread Zeugswetter Andreas SB SD


   Yes, that's the whole point. If I have a constraint on a table, I think
   it should *never* be possible for that constraint to be violated. If a
   subtable should not have constraint the supertable has, it shouldn't
   inherit from the supertable.
 
  If you want that, you simply need to only create constraints that apply to
  all tables in the hierarchy. Note that you *can* do this. It should imho be
  the default behavior.
 
 So what you're saying is that constraints shouldn't be inherited?

No. I even said that inheriting should be the default.
 
   To do otherwise breaks the relational model.
 
  That is probably a point of argument. Imho the inheritance feature
  is something orthogonal to the relational model. It is something else, and
  thus cannot break the relational model.
 
 So then constraints must be inherited. The relational model, if I
 am not incorrect here, says that, given a table definition such as
 this:
 
 CREATE TABLE my_table (
   my_key int PRIMARY KEY,
   my_value text UNIQUE,
   my_other_value int CHECK (my_other_value  0)
 )

A local constraint should be made obvious from looking at the schema, 
a possible syntax (probably both ugly :-):
CHECK my_table ONLY (my_other_value  0)
or
CHECK LOCAL (my_other_value  0)

 
 You will never, ever, when selecting from this table, have 
 returned to you
 
 1. two rows with the same value of my_key but different values
 for the other columns,
 
 2. two rows with the same value of my_value but different values
 for the other columns, or
 
 3. a row in which the value of my_other_value is not 
 greater than zero.
 

Well, that is where I do not think this is flexible enough, and keep in mind 
that all triggers and rules would then also need such restrictions. 

 I would strongly object to that.

Regardless whether your objection is *strong* or not :-)
If you don't like the feature (to add a local constraint), don't use it.
(Remember you are talking about removing an implemented feature)

Andreas

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-08-19 Thread Curt Sampson

On Mon, 19 Aug 2002, Zeugswetter Andreas SB SD wrote:

  So what you're saying is that constraints shouldn't be inherited?

 No. I even said that inheriting should be the default.

Ah. So you think it should be possible not to inherit constraints.

 A local constraint should be made obvious from looking at the schema,

Ok, this now I could live with. Though I'm not sure that its
theoretically very defensible, or worth the effort. Other languages
that offer constraints, such as Eiffel (and soon Java), do not allow
constraints that are not inherited, as far as I know. Do you have some
counterexamples.

 Well, that is where I do not think this is flexible enough, and keep in mind
 that all triggers and rules would then also need such restrictions.

Yes, all triggers, rules, and everything else would have to be inherited.

 Regardless whether your objection is *strong* or not :-)
 If you don't like the feature (to add a local constraint), don't use it.
 (Remember you are talking about removing an implemented feature)

1. It's not exactly an implemented feature, it's an accident of an
incomplete implementation of inheritance done in a certain way.

2. Should we change the way we decide to implement inheritance,
perhaps to make fixing the current problems much easier, it might
be a lot of work to add this.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Inheritance

2002-08-19 Thread Oliver Elphick

On Mon, 2002-08-19 at 15:42, Curt Sampson wrote:
  A local constraint should be made obvious from looking at the schema,
 
 Ok, this now I could live with. Though I'm not sure that its
 theoretically very defensible, or worth the effort. Other languages
 that offer constraints, such as Eiffel (and soon Java), do not allow
 constraints that are not inherited, as far as I know. Do you have some
 counterexamples.

In Eiffel, at least, I can say invariant feature_x and redefine
feature_x in a descendant class, thus effectively redefining the
constraint.  If we decide to inherit constraints unconditionally, the
application writer can achieve similar flexibility by moving the logic
of the constraint into a function whose behaviour depends on which table
it is used on.  This would put the burden on the application rather than
requiring additional syntax in PostgreSQL.

-- 
Oliver Elphick[EMAIL PROTECTED]
Isle of Wight, UK
http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
 
 For every one that asketh receiveth; and he that 
  seeketh findeth; and to him that knocketh it shall be 
  opened. Luke 11:10 


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-08-19 Thread Greg Copeland

On Mon, 2002-08-19 at 09:42, Curt Sampson wrote:
 On Mon, 19 Aug 2002, Zeugswetter Andreas SB SD wrote:
 
   So what you're saying is that constraints shouldn't be inherited?
 
  No. I even said that inheriting should be the default.
 
 Ah. So you think it should be possible not to inherit constraints.

I've been silent for a bit because I wanted to kick the concept around
in my head.  After some thought, I say that I support children
inheriting constraints.  In a more abstract sense, we are really setting
conditions for all entities of a given type (class) which must be met to
classify as a defined type.  Again, in an abstract sense, if I say all
candies (type/class, candy) must have sugar (constraint), and I go on
to create a subclass of candy which I desire not to have sugar, then
there is a fundamental problem.  Either I incorrectly identified my
problem domain and didn't properly create my entities which address my
domain needs or what I'm trying to express really isn't a candy at all. 
In other words, it sounds like candy should of been a subclass of a more
abstract base entity.  Likewise, the newly desired class which doesn't
have sugar should also inherit from the newly created base class and not
be derived from candy at all.


 
  A local constraint should be made obvious from looking at the schema,
 
 Ok, this now I could live with. Though I'm not sure that its
 theoretically very defensible, or worth the effort. Other languages
 that offer constraints, such as Eiffel (and soon Java), do not allow
 constraints that are not inherited, as far as I know. Do you have some
 counterexamples.

I tend to agree.  Constraints should be inherited.  See above.

 
  Well, that is where I do not think this is flexible enough, and keep in mind
  that all triggers and rules would then also need such restrictions.
 
 Yes, all triggers, rules, and everything else would have to be inherited.

Agreed.

 
  Regardless whether your objection is *strong* or not :-)
  If you don't like the feature (to add a local constraint), don't use it.
  (Remember you are talking about removing an implemented feature)
 
 1. It's not exactly an implemented feature, it's an accident of an
 incomplete implementation of inheritance done in a certain way.
 
 2. Should we change the way we decide to implement inheritance,
 perhaps to make fixing the current problems much easier, it might
 be a lot of work to add this.
 

I'm still trying to figure out if subclasses should be allowed to have
localized constraints.  I tend to think yes even though it's certainly
possible to create seemingly illogical/incompatible/conflicting
constraints with parent classes.  Then again, my gut feeling is, that's
more and an architectural/design issue rather than a fundamental issue
with the concept.


--Greg Copeland





signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-08-17 Thread Curt Sampson

On Fri, 16 Aug 2002, Zeugswetter Andreas SB SD wrote:

  Note that the other obvious way to solve this would be to store all of
  the information inherited from the parent in the parent table, so that
  you don't have to do anything special to make all of the constraints and
  whatnot apply.

 Seems with above you are not able to constrain what qualifies for a
 supertable row, you would only be able to specify constraints that
 apply to all it's subtables.

Yes, that's the whole point. If I have a constraint on a table, I think
it should *never* be possible for that constraint to be violated. If a
subtable should not have constraint the supertable has, it shouldn't
inherit from the supertable.

To do otherwise breaks the relational model.

 The SQL inheritance is a class/subclass thing. All tables have
 instances (==rows) that are not (by itself) related. (Even if
 they happen to share all attribute values with another row of a
 supertable.) If you want that, then you need to resort to 3NF (or
 ROWREF's which iirc is another SQL99 feature).

As I understand it, SQL99 has the restriction that a row with the same
primary key appearing in a supertable and/or any of its subtables must
be the result of a single INSERT statement. Thus, SQL99 doesn't allow
what you're saying, if I understand what you're saying. (I'm not sure
that I do.)

Am I to take it that you think the inheritance should be inheritance
of type information only? That is, if I have supertable A and
subtable A', inserting a row into A' does not make a row appear in
A? If so, I've got not real problem with that at present, but it's
not what postgres currently does, nor would it conform to SQL99.

What do others think of this idea?

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-08-16 Thread Zeugswetter Andreas SB SD


  It's nonlocal constraints that are the problem, and here foreign keys
  and UNIQUE constraints are certainly the canonical examples.  Both of
  these would be largely solved with table-spanning indexes I think.
 
 Note that the other obvious way to solve this would be to store all of
 the information inherited from the parent in the parent table, so that
 you don't have to do anything special to make all of the constraints and
 whatnot apply.

Seems with above you are not able to constrain what qualifies for a supertable row,
you would only be able to specify constraints that apply to all it's subtables.
To me, the current implementation looks superior and more efficient.

The SQL inheritance is a class/subclass thing. 
All tables have instances (==rows) that are not (by itself) related.
(Even if they happen to share all attribute values with another row of a supertable.)
If you want that, then you need to resort to 3NF (or ROWREF's which iirc is another
SQL99 feature).

Andreas

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Inheritance

2002-08-14 Thread Don Baccus

Bruce Momjian wrote:
 Christopher Kings-Lynne wrote:
 
1. The current implementation is broken.

2. We have no proper description of how a fixed implementation
should work.

Surely 99% of the implementation problems could be solved with an index type
that can span tables?
 
 
 Right.  Instead of talking in circles, let's figure out how to do it.
 If the issue is only sequence numbers, can we force a column to _only_
 get values from the sequence counter,

Even if primary keys were forced to be generated from a sequence (a very 
artificial restriction), unique constraints are also implemented by 
index.  And people also join on columns other than their primary key so 
will want indexes on these columns to span tables, also.


-- 
Don Baccus
Portland, OR
http://donb.photo.net, http://birdnotes.net, http://openacs.org


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-08-14 Thread Tom Lane

Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Is it theoretically possible to add support to btree for storing table along
 with the indexed value?

That's what we need, all right.

 This would obviously add overhead, so it would only
 be done for spanning indexes.  The index would also take up more space on
 disk I guess.
 When a new inherited table is created, all parent indices would be dropped
 and recreated as spanning indices and vice versa.

Seems like the hard way.  Instead use a t_infomask bit in indextuples to
indicate that the index entry points to a table other than the one its
index is nominally associated with; if and only if this bit is set, the
table OID follows the indextuple header.  This way, you don't have to
reindex just to create a child table, and you also don't pay any extra
space cost for index entries that in fact point at the parent.

There are a veritable ton of other issues to be resolved --- like how do
we (efficiently) find all the indexes relevant to a given child table
--- but the physical storage doesn't seem too complicated.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Inheritance

2002-08-14 Thread Greg Copeland

On Wed, 2002-08-14 at 08:59, Tom Lane wrote:
 There are a veritable ton of other issues to be resolved --- like how do
 we (efficiently) find all the indexes relevant to a given child table
 --- but the physical storage doesn't seem too complicated.


Tom, seems we have yet another false start.  Thanks for offering your
comments on the topic at hand.  Since you seem to have a good grasp on
the the is case is, would you be willing to offer up some additional
details on what you feel the (veritable ton of) outstanding issues
are?

Seems everyone clearly wants a cure and is itching to get there, yet I
don't fully understand the disease.  I suspect that there are others in
the same boat.  I feel that this is important for us all of understand. 
I think we need to understand what our to-be picture is as well as
what points need to be addressed before we can say we've arrived.

Willing to help spell this out?

Regards,
Greg Copeland




signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-08-14 Thread Greg Copeland

On Tue, 2002-08-13 at 23:43, Curt Sampson wrote:
 Just my opinion of course, but I think it would be best to have a
 detailed description of how everything in inheritance is supposed to
 work, write a set of tests from that, and then fix the implementation to
 conform to the tests.
 
 And I think a detailed description comes most easily when you have
 a logical model to work from.

I completely agree.  This is why I want/wanted to pursue the theory and
existing implementations angle.

Seems like everyone trying to jump on index spanning is premature.

Doesn't Oracle have table inheritance?  Hmmm...I might have to go do
some reading to find out one way or anther...  ;)

Sign,
Greg Copeland



signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Inheritance

2002-08-14 Thread Ross J. Reedstrom

On Wed, Aug 14, 2002 at 09:39:06AM -0500, Greg Copeland wrote:
 On Tue, 2002-08-13 at 23:43, Curt Sampson wrote:
  Just my opinion of course, but I think it would be best to have a
  detailed description of how everything in inheritance is supposed to
  work, write a set of tests from that, and then fix the implementation to
  conform to the tests.
  
  And I think a detailed description comes most easily when you have
  a logical model to work from.
 
 I completely agree.  This is why I want/wanted to pursue the theory and
 existing implementations angle.

In theory, it sounds like a good idea. In practice ... ;-)

 Seems like everyone trying to jump on index spanning is premature.

Seems like some people haven't looked at the history of the OO
implementation in PostgreSQL.

Actually, I think you'll find that once a PostgreSQL DBA gets to
the point of designing a sufficently complex schema that inheritance
might be useful, they quickly bump up against the lack of index and
constraint spanning (most notably, referential integrity), and stop
right there. This means that there is little community experience with
the existing implementation, beyond the OO die hards. ;-)

I'm not sure, but Bruce's suggestion of getting index spanning working
first might move the existing implementation over the hump from
'interesting toy' to 'less than perfect implementation'. Then, the
community can get some real world experience.

Bruce has archived some of the emails - check your local pgsql source tree,
under $PGSQLHOME/doc/TODO.detail/inheritance

There was also some theoretical OO discussion, back when the change for
default SELECT behavior on an inhertiance tree was made. (You used to
have to say: SELECT foo from parent* to get foo from the parent and all
children) Take a look at the archives and see if there's anything in that
discussion that interests you: providing summary posts of old discussions
is often a good way to restart and move an unresolved topic along.

Ross

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-08-14 Thread Joe Conway

Ross J. Reedstrom wrote:
 Actually, I think you'll find that once a PostgreSQL DBA gets to
 the point of designing a sufficently complex schema that inheritance
 might be useful, they quickly bump up against the lack of index and
 constraint spanning (most notably, referential integrity), and stop
 right there. This means that there is little community experience with
 the existing implementation, beyond the OO die hards. ;-)

I'd have to agree wholeheartedly with this, because this was exactly my 
experience the one time I wanted to use inherited tables.

FWIW, one thought I've had before related to inheritance (but pretty 
much orthognal to this discussion) is this: if inheritance included 
shared indexes and constraints, we would be not too far from having 
Oracle style table partitioning.

Joe


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Inheritance

2002-08-14 Thread Greg Copeland

On Wed, 2002-08-14 at 10:17, Ross J. Reedstrom wrote:
 On Wed, Aug 14, 2002 at 09:39:06AM -0500, Greg Copeland wrote:
  I completely agree.  This is why I want/wanted to pursue the theory and
  existing implementations angle.
 
 In theory, it sounds like a good idea. In practice ... ;-)
 

LOL.  :)

  Seems like everyone trying to jump on index spanning is premature.
 
 Seems like some people haven't looked at the history of the OO
 implementation in PostgreSQL.

[waving hand...]

 
 Bruce has archived some of the emails - check your local pgsql source tree,
 under $PGSQLHOME/doc/TODO.detail/inheritance
 
 There was also some theoretical OO discussion, back when the change for
 default SELECT behavior on an inhertiance tree was made. (You used to
 have to say: SELECT foo from parent* to get foo from the parent and all
 children) Take a look at the archives and see if there's anything in that
 discussion that interests you: providing summary posts of old discussions
 is often a good way to restart and move an unresolved topic along.

Thanks!  I briefly read something about that in the archives.  Excellent
pointers.  I'll check that out.  If I have time, I'll try to summarize
and post.


Greg Copeland




signature.asc
Description: This is a digitally signed message part


  1   2   >