Re: [HACKERS] Partitioning WIP patch

2015-03-03 Thread Bruce Momjian
On Fri, Feb 27, 2015 at 09:09:35AM +0900, Amit Langote wrote:
 On 27-02-2015 AM 03:24, Andres Freund wrote:
  On 2015-02-26 12:15:17 +0900, Amit Langote wrote:
  On 26-02-2015 AM 05:15, Josh Berkus wrote:
  I would love to have it for 9.5, but I guess the
  patch isn't nearly baked enough for that?
  
  I'm not quite sure what would qualify as baked enough for 9.5 though we
  can surely try to reach some consensus on various implementation aspects
  and perhaps even get it ready in time for 9.5.
  
  I think it's absolutely unrealistic to get this into 9.5. There's barely
  been any progress on the current (last!) commitfest - where on earth
  should the energy come to make this patch ready? And why would that be
  fair against all the others that have submitted in time?
  
 
 I realize and I apologize that it was irresponsible of me to have said
 that; maybe got a bit too excited. I do not want to unduly draw people's
 time on something that's not quite ready while there are other things
 people have worked hard on to get in time. In all earnestness, I say we
 spend time perfecting those things.
 
 I'll add this into CF-JUN'15. I will keep posting updates meanwhile so
 that when that commitfest finally starts, we will have something worth
 considering.

I am _very_ glad you have started on this.  There is a huge need for
this, and I am certainly excited about it.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Jim Nasby

On 2/26/15 3:22 AM, Andres Freund wrote:

On 2015-02-26 02:20:21 -0600, Jim Nasby wrote:

The reason I'd like to do this with partitioning vs plain inheritance is
presumably as we build out partitioning we'll get very useful things like
the ability to have FKs to properly partitioned tables. Insert tuple routing
could also be useful.


The problem there imo isn't so much inheritance, but lack of working
unique checks across partitions. That's something we can implement
independent of this, it's just not trivial.


There's been discussion of allowing for uniqueness when we can guarantee 
no overlap between partitions, and the partition key is part of the 
unique constraint. That's the particular use case I was thinking of.


I suspect there's other partitioning features that would be useful in a 
generic inheritance setup as well; that's why I'd love to see both 
features work together... but I fear there's enough work to get there 
that it may not happen, and I don't want us to accidentally start mixing 
the two and have users start relying on it.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Jim Nasby

On 2/26/15 3:09 AM, Amit Langote wrote:

Yes. If it helps, the exact use-case I have in mind is using list-based
partitioning + additional columns in some/all children (different
between children). For example, if you need to track different types of
customer payment methods, you'd have a payment parent table, a list
partition for credit  debit cards, a different list partition for bank
accounts, etc.

The reason I'd like to do this with partitioning vs plain inheritance is
presumably as we build out partitioning we'll get very useful things
like the ability to have FKs to properly partitioned tables. Insert
tuple routing could also be useful.


Unless I'm missing something again, isn't allowing partitions to have
heterogeneous rowtypes a problem in the long run? I'm afraid I'm
confused as to your stand regarding inheritance vs. new partitioning. To
be specific, children with heterogeneous schemas sounds much like what
inheritance would be good for as you say. But then isn't that why we
have to plan children individually which you said new partitioning
should get away from?


Apologies if I haven't been clear enough. What I'd like to see is the 
best of both worlds; fast partitioning when not using inheritance, and 
perhaps somewhat slower when using inheritance, but still with the 
features partitioning gives you.


But my bigger concern from a project standpoint is that we not put 
ourselves in a position of supporting something that we really don't 
want to support (a partitioning system that's got inheritance mixed in). 
As much as I'd personally like to have both features together, I think 
it would be bad for the community to go down that road without careful 
thought.



With explicit partitioning, shouldn't we go in direction where we remove
some restrictions imposed by inheritance (think multiple inheritance)? I
recall a link Alvaro had started the discussion with think link to a
Tom's remark about something very related:

http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us


That post looks like Tom figured out a way to eliminate a problem that
hurts inheritance, so that's good.

My fear is that at some point we'll hit a problem with partitioning that
we can't solve in the inheritance model. If we allow inheritance
features into partitioning now we'll painted into a corner. If we
disallow those features now we can always re-enable them if we get to
the point where we're in the clear.

Does that make sense?

Yes, it does. In fact, I do intend to keep them separate the first
attempt of which is to choose to NOT transform a PARTITION OF parent
clause into INHERITS parent. Any code that may look like it's trying to
do that is because the patch is not fully baked yet.


Ok, good to know. That's why I was asking about ALTER TABLE ADD COLUMN 
on a partition. If we release something without that being restricted 
it'll probably cause trouble later on.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Josh Berkus
On 02/25/2015 07:15 PM, Amit Langote wrote:
 On 26-02-2015 AM 05:15, Josh Berkus wrote:
 On 02/24/2015 12:13 AM, Amit Langote wrote:
 Here is an experimental patch that attempts to implement this.

 This looks awesome. 
 
 Thanks!
 
 I would love to have it for 9.5, but I guess the
 patch isn't nearly baked enough for that?

 
 I'm not quite sure what would qualify as baked enough for 9.5 though we
 can surely try to reach some consensus on various implementation aspects
 and perhaps even get it ready in time for 9.5.

Well, we don't have long at all to do that.  I guess I'm asking what
kind of completeness of code we have; is this basically done pending API
changes and bugs, or are there major bits (like, say, pg_dump  and
EXPLAIN support) which are completely unimplemented?

 where key_spec consists of partition key column names and optional
 operator class per column. Currently, there are restrictions on the
 key_spec such as allowing only column names (not arbitrary expressions
 of them), only one column for list strategy, etc.

 What's the obstacle to supporting expressions and/or IMMUTABLE
 functions?  I think it's fine to add this feature without them
 initially, I'm just asking about the roadmap for eventually supporting
 expressions in the key spec.

 
 Only one concern I can remember someone had raised is that having to
 evaluate an expression for every row during bulk-inserts may end up
 being pretty expensive. Though, we might have to live with that.

Well, it's not more expensive than having to materialize the value from
a trigger and store it on disk.  The leading one here would be functions
over timestamp; for example, the data has a timestamptz, but you want to
partition by week.

 
 I think one idea is to learn from ability to use expressions in indexes.

Sure.  So a feature to implement for the 2nd release.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Andres Freund
On 2015-02-26 12:15:17 +0900, Amit Langote wrote:
 On 26-02-2015 AM 05:15, Josh Berkus wrote:
  On 02/24/2015 12:13 AM, Amit Langote wrote:
  Here is an experimental patch that attempts to implement this.

  I would love to have it for 9.5, but I guess the
  patch isn't nearly baked enough for that?

 I'm not quite sure what would qualify as baked enough for 9.5 though we
 can surely try to reach some consensus on various implementation aspects
 and perhaps even get it ready in time for 9.5.

I think it's absolutely unrealistic to get this into 9.5. There's barely
been any progress on the current (last!) commitfest - where on earth
should the energy come to make this patch ready? And why would that be
fair against all the others that have submitted in time?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Amit Langote
On 26-02-2015 PM 05:20, Jim Nasby wrote:
 On 2/25/15 7:57 PM, Amit Langote wrote:
 AIUI, as far as we stay with a design where partitions (children) are
 individually planned, that might be OK. But, I guess things will get
 more complicated. I think the role of a parent in planning would remain
 limited to drive partition-pruning. Am I missing something?
 
 Isn't the point of adding explicit partitioning to make it faster than
 plain inheritance? Presumably as part of that we'll eventually want to
 NOT plan children individually.
 

Yes, we'd definitely want to get to a point where planning children
individually is not necessary. But I am afraid we will have to get there
a step at a time. IMHO, solving one problem of partition-pruning would
be a good start. And that will definitely be part of parent's planning
using partition bounds list (not pruning children one-by-one with
relation_excluded_by_constraints()).

 I would certainly prefer that we support the capabilities of inheritance
 along with partitioning (because in some cases you want both). But it's
 going to limit what we can do internally.

 Just to clarify are you referring to inheritance relationship between a
 partitioned table and partitions?
 
 Yes. If it helps, the exact use-case I have in mind is using list-based
 partitioning + additional columns in some/all children (different
 between children). For example, if you need to track different types of
 customer payment methods, you'd have a payment parent table, a list
 partition for credit  debit cards, a different list partition for bank
 accounts, etc.
 
 The reason I'd like to do this with partitioning vs plain inheritance is
 presumably as we build out partitioning we'll get very useful things
 like the ability to have FKs to properly partitioned tables. Insert
 tuple routing could also be useful.
 

Unless I'm missing something again, isn't allowing partitions to have
heterogeneous rowtypes a problem in the long run? I'm afraid I'm
confused as to your stand regarding inheritance vs. new partitioning. To
be specific, children with heterogeneous schemas sounds much like what
inheritance would be good for as you say. But then isn't that why we
have to plan children individually which you said new partitioning
should get away from?

 With explicit partitioning, shouldn't we go in direction where we remove
 some restrictions imposed by inheritance (think multiple inheritance)? I
 recall a link Alvaro had started the discussion with think link to a
 Tom's remark about something very related:

 http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us
 
 That post looks like Tom figured out a way to eliminate a problem that
 hurts inheritance, so that's good.
 
 My fear is that at some point we'll hit a problem with partitioning that
 we can't solve in the inheritance model. If we allow inheritance
 features into partitioning now we'll painted into a corner. If we
 disallow those features now we can always re-enable them if we get to
 the point where we're in the clear.
 
 Does that make sense?

Yes, it does. In fact, I do intend to keep them separate the first
attempt of which is to choose to NOT transform a PARTITION OF parent
clause into INHERITS parent. Any code that may look like it's trying to
do that is because the patch is not fully baked yet.

Regards,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Andres Freund
On 2015-02-26 02:20:21 -0600, Jim Nasby wrote:
 The reason I'd like to do this with partitioning vs plain inheritance is
 presumably as we build out partitioning we'll get very useful things like
 the ability to have FKs to properly partitioned tables. Insert tuple routing
 could also be useful.

The problem there imo isn't so much inheritance, but lack of working
unique checks across partitions. That's something we can implement
independent of this, it's just not trivial.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Jim Nasby

On 2/25/15 7:57 PM, Amit Langote wrote:

On 26-02-2015 AM 10:31, Jim Nasby wrote:

On 2/25/15 7:24 PM, Amit Langote wrote:

Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still
operate the same as today? I'd like to see us continue to support that,
but perhaps it would be wise to not paint ourselves into that corner
just yet.

Nothing prevents that from working, at least at the moment.


Ok, but is that what we really want? If we release it that way we'll be
stuck with it forever.



AIUI, as far as we stay with a design where partitions (children) are
individually planned, that might be OK. But, I guess things will get
more complicated. I think the role of a parent in planning would remain
limited to drive partition-pruning. Am I missing something?


Isn't the point of adding explicit partitioning to make it faster than 
plain inheritance? Presumably as part of that we'll eventually want to 
NOT plan children individually.



I would certainly prefer that we support the capabilities of inheritance
along with partitioning (because in some cases you want both). But it's
going to limit what we can do internally.


Just to clarify are you referring to inheritance relationship between a
partitioned table and partitions?


Yes. If it helps, the exact use-case I have in mind is using list-based 
partitioning + additional columns in some/all children (different 
between children). For example, if you need to track different types of 
customer payment methods, you'd have a payment parent table, a list 
partition for credit  debit cards, a different list partition for bank 
accounts, etc.


The reason I'd like to do this with partitioning vs plain inheritance is 
presumably as we build out partitioning we'll get very useful things 
like the ability to have FKs to properly partitioned tables. Insert 
tuple routing could also be useful.



With explicit partitioning, shouldn't we go in direction where we remove
some restrictions imposed by inheritance (think multiple inheritance)? I
recall a link Alvaro had started the discussion with think link to a
Tom's remark about something very related:

http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us


That post looks like Tom figured out a way to eliminate a problem that 
hurts inheritance, so that's good.


My fear is that at some point we'll hit a problem with partitioning that 
we can't solve in the inheritance model. If we allow inheritance 
features into partitioning now we'll painted into a corner. If we 
disallow those features now we can always re-enable them if we get to 
the point where we're in the clear.


Does that make sense?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Michael Paquier
On Fri, Feb 27, 2015 at 3:24 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2015-02-26 12:15:17 +0900, Amit Langote wrote:
 On 26-02-2015 AM 05:15, Josh Berkus wrote:
  On 02/24/2015 12:13 AM, Amit Langote wrote:
  Here is an experimental patch that attempts to implement this.

  I would love to have it for 9.5, but I guess the
  patch isn't nearly baked enough for that?

 I'm not quite sure what would qualify as baked enough for 9.5 though we
 can surely try to reach some consensus on various implementation aspects
 and perhaps even get it ready in time for 9.5.

 I think it's absolutely unrealistic to get this into 9.5. There's barely
 been any progress on the current (last!) commitfest - where on earth
 should the energy come to make this patch ready? And why would that be
 fair against all the others that have submitted in time?

+1. There are many other patches pending the in CF app waiting for
feedback, while this one showed up after the last CF deadline for 9.5
and needs design and spec decisions that should not be taken lightly
at the end of a major release development cycle. Please let's not rush
into something we may regret.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Amit Langote
On 27-02-2015 AM 03:18, Josh Berkus wrote:
 On 02/25/2015 07:15 PM, Amit Langote wrote:
 I'm not quite sure what would qualify as baked enough for 9.5 though we
 can surely try to reach some consensus on various implementation aspects
 and perhaps even get it ready in time for 9.5.
 
 Well, we don't have long at all to do that.  I guess I'm asking what
 kind of completeness of code we have; is this basically done pending API
 changes and bugs, or are there major bits (like, say, pg_dump  and
 EXPLAIN support) which are completely unimplemented?
 

I would say I am not entirely sure/satisfied about some decisions I have
made (or not) when writing even the basic patch. Yes,
pg_dump/EXPLAIN/psql, etc. are not touched. So, it seems it might not be
fair to claim it's actually something for 9.5. Let me just call it WIP
for a while while keep I working on it and receive feedback.

 Only one concern I can remember someone had raised is that having to
 evaluate an expression for every row during bulk-inserts may end up
 being pretty expensive. Though, we might have to live with that.
 
 Well, it's not more expensive than having to materialize the value from
 a trigger and store it on disk.  The leading one here would be functions
 over timestamp; for example, the data has a timestamptz, but you want to
 partition by week.
 

 I think one idea is to learn from ability to use expressions in indexes.
 
 Sure.  So a feature to implement for the 2nd release.

Actually, I'm trying to add that and see how it works. I will post an
updated patch soon if it looks good enough.

Thanks,
Amit




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Amit Langote
On 27-02-2015 AM 03:24, Andres Freund wrote:
 On 2015-02-26 12:15:17 +0900, Amit Langote wrote:
 On 26-02-2015 AM 05:15, Josh Berkus wrote:
 I would love to have it for 9.5, but I guess the
 patch isn't nearly baked enough for that?
 
 I'm not quite sure what would qualify as baked enough for 9.5 though we
 can surely try to reach some consensus on various implementation aspects
 and perhaps even get it ready in time for 9.5.
 
 I think it's absolutely unrealistic to get this into 9.5. There's barely
 been any progress on the current (last!) commitfest - where on earth
 should the energy come to make this patch ready? And why would that be
 fair against all the others that have submitted in time?
 

I realize and I apologize that it was irresponsible of me to have said
that; maybe got a bit too excited. I do not want to unduly draw people's
time on something that's not quite ready while there are other things
people have worked hard on to get in time. In all earnestness, I say we
spend time perfecting those things.

I'll add this into CF-JUN'15. I will keep posting updates meanwhile so
that when that commitfest finally starts, we will have something worth
considering.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-26 Thread Amit Langote
On 27-02-2015 AM 03:01, Jim Nasby wrote:
 On 2/26/15 3:09 AM, Amit Langote wrote:
 Unless I'm missing something again, isn't allowing partitions to have
 heterogeneous rowtypes a problem in the long run? I'm afraid I'm
 confused as to your stand regarding inheritance vs. new partitioning. To
 be specific, children with heterogeneous schemas sounds much like what
 inheritance would be good for as you say. But then isn't that why we
 have to plan children individually which you said new partitioning
 should get away from?
 
 Apologies if I haven't been clear enough. What I'd like to see is the
 best of both worlds; fast partitioning when not using inheritance, and
 perhaps somewhat slower when using inheritance, but still with the
 features partitioning gives you.
 

I get the distinction, thanks.

Actually I wasn't quite thinking of altering the way any part of the
current partitioning based on inheritance works nor am I proposing to
get rid of it. It all stays as is. Not sure how we could say if it will
support features of the new partitioning before those features actually
begin to materialize.

 But my bigger concern from a project standpoint is that we not put
 ourselves in a position of supporting something that we really don't
 want to support (a partitioning system that's got inheritance mixed in).
 As much as I'd personally like to have both features together, I think
 it would be bad for the community to go down that road without careful
 thought.
 
 Yes, it does. In fact, I do intend to keep them separate the first
 attempt of which is to choose to NOT transform a PARTITION OF parent
 clause into INHERITS parent. Any code that may look like it's trying to
 do that is because the patch is not fully baked yet.
 
 Ok, good to know. That's why I was asking about ALTER TABLE ADD COLUMN
 on a partition. If we release something without that being restricted
 it'll probably cause trouble later on.

Yes, I agree. More generally, I think the patch/approach is in need of a
clear separation of internal implementation concerns and user-facing
notions even at this point. This may be one of them. For example, with
the patch, a partition is defined as:

CREATE TABLE name PARTITION OF parent ...

Unless that turns into something like:

CREATE PARTITION name OF parent ...

we may not be able to put all the restrictions we'd want to put on a
partition for the sake of what would be partitioning internals.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Josh Berkus
On 02/24/2015 12:13 AM, Amit Langote wrote:
 Here is an experimental patch that attempts to implement this.

This looks awesome.  I would love to have it for 9.5, but I guess the
patch isn't nearly baked enough for that?

 It implements the following syntax:
 
 * Syntax for defining partition key:
 CREATE TABLE table_name(columns)PARTITION BY {RANGE|LIST} ON (key_spec);
 
 where key_spec consists of partition key column names and optional
 operator class per column. Currently, there are restrictions on the
 key_spec such as allowing only column names (not arbitrary expressions
 of them), only one column for list strategy, etc.

What's the obstacle to supporting expressions and/or IMMUTABLE
functions?  I think it's fine to add this feature without them
initially, I'm just asking about the roadmap for eventually supporting
expressions in the key spec.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Amit Langote
On 26-02-2015 AM 10:31, Jim Nasby wrote:
 On 2/25/15 7:24 PM, Amit Langote wrote:
 Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still
 operate the same as today? I'd like to see us continue to support that,
 but perhaps it would be wise to not paint ourselves into that corner
 just yet.
 Nothing prevents that from working, at least at the moment.
 
 Ok, but is that what we really want? If we release it that way we'll be
 stuck with it forever.
 

AIUI, as far as we stay with a design where partitions (children) are
individually planned, that might be OK. But, I guess things will get
more complicated. I think the role of a parent in planning would remain
limited to drive partition-pruning. Am I missing something?

 I would certainly prefer that we support the capabilities of inheritance
 along with partitioning (because in some cases you want both). But it's
 going to limit what we can do internally.

Just to clarify are you referring to inheritance relationship between a
partitioned table and partitions?

With explicit partitioning, shouldn't we go in direction where we remove
some restrictions imposed by inheritance (think multiple inheritance)? I
recall a link Alvaro had started the discussion with think link to a
Tom's remark about something very related:

http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Amit Langote
On 26-02-2015 AM 05:15, Josh Berkus wrote:
 On 02/24/2015 12:13 AM, Amit Langote wrote:
 Here is an experimental patch that attempts to implement this.
 
 This looks awesome. 

Thanks!

 I would love to have it for 9.5, but I guess the
 patch isn't nearly baked enough for that?
 

I'm not quite sure what would qualify as baked enough for 9.5 though we
can surely try to reach some consensus on various implementation aspects
and perhaps even get it ready in time for 9.5.

 where key_spec consists of partition key column names and optional
 operator class per column. Currently, there are restrictions on the
 key_spec such as allowing only column names (not arbitrary expressions
 of them), only one column for list strategy, etc.
 
 What's the obstacle to supporting expressions and/or IMMUTABLE
 functions?  I think it's fine to add this feature without them
 initially, I'm just asking about the roadmap for eventually supporting
 expressions in the key spec.
 

Only one concern I can remember someone had raised is that having to
evaluate an expression for every row during bulk-inserts may end up
being pretty expensive. Though, we might have to live with that.

I think one idea is to learn from ability to use expressions in indexes.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-25 Thread Jim Nasby

On 2/24/15 2:13 AM, Amit Langote wrote:

-- a plain table
CREATE TABLE parent_monthly(year int, month int, day int);

-- a partitioned table
-- x: number of partitions
CREATE TABLE parent_monthly_x(LIKE parent_monthly) PARTITION BY
RANGE ON(year, month);


To be clear, in this example parent_table_x is in no way related to 
parent_monthly, just like a normal CREATE TABLE (LIKE table), right?



-- partitions
CREATE TABLE parent_monthly_x_201401 PARTITION OF
parent_monthly_00100_201401 FOR VALUES BETWEEN (2014, 1) AND (2014, 2);


And the partitions are still inheritance children?

Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still 
operate the same as today? I'd like to see us continue to support that, 
but perhaps it would be wise to not paint ourselves into that corner 
just yet.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Jim Nasby

On 2/25/15 7:24 PM, Amit Langote wrote:

Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still
operate the same as today? I'd like to see us continue to support that,
but perhaps it would be wise to not paint ourselves into that corner
just yet.

Nothing prevents that from working, at least at the moment.


Ok, but is that what we really want? If we release it that way we'll be 
stuck with it forever.


I would certainly prefer that we support the capabilities of inheritance 
along with partitioning (because in some cases you want both). But it's 
going to limit what we can do internally.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Amit Langote
On 26-02-2015 AM 09:28, Jim Nasby wrote:
 On 2/24/15 2:13 AM, Amit Langote wrote:
 -- a plain table
 CREATE TABLE parent_monthly(year int, month int, day int);

 -- a partitioned table
 -- x: number of partitions
 CREATE TABLE parent_monthly_x(LIKE parent_monthly) PARTITION BY
 RANGE ON(year, month);
 
 To be clear, in this example parent_table_x is in no way related to
 parent_monthly, just like a normal CREATE TABLE (LIKE table), right?
 

Yes, there is no relation at all. I was maybe just trying to save few
keystrokes. Sorry, that may be confusing.

parent_monthly is just a regular table, part of the example.

 -- partitions
 CREATE TABLE parent_monthly_x_201401 PARTITION OF
 parent_monthly_00100_201401 FOR VALUES BETWEEN (2014, 1) AND (2014, 2);
 
 And the partitions are still inheritance children?
 

At this point, they *almost* are; more for the sake of Append. Though
sooner than later, we will have to invent a version of Append for
partitioned tables that does more than just append the outputs of
underlying plans. For example, it would use partitioninfo cached in
relation descriptor of the parent to drive partition-pruning for
starters. Pruning child relations individually by way of
constraint_exclusion doesn't scale as is well known.

To clarify things a bit more, transformCreateStmt() transforms PARTITION
OF into LIKE INCLUDING ALL, not INHERITS. And as mentioned, for Append
to work, I have made ATExecAddInherit() to do some of the things
ATExecAttachPartition() does. Again, that is a temporary arrangement.

 Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still
 operate the same as today? I'd like to see us continue to support that,
 but perhaps it would be wise to not paint ourselves into that corner
 just yet.

Nothing prevents that from working, at least at the moment.

CREATE TABLE parent_monthly_00012(LIKE parent_monthly) PARTITION BY
RANGE ON(year, month);

CREATE TABLE parent_monthly_00012_201401 PARTITION OF
parent_monthly_00012 FOR VALUES BETWEEN (2014, 1) AND (2014, 2);

snip

CREATE TABLE parent_monthly_00012_201412 PARTITION OF
parent_monthly_00012 FOR VALUES BETWEEN (2014, 12) AND (2015, 1);

# INSERT INTO parent_monthly_00012 VALUES (2014, 07, 01);
INSERT 0 1

# SELECT * FROM parent_monthly_00012;
 year | month | day
--+---+-
 2014 | 7 |   1
(1 row)

# INSERT INTO parent_monthly_00012 VALUES (2014, 08, 01);
INSERT 0 1

# ALTER TABLE parent_monthly_00012_201408 ADD COLUMN hour int;
ALTER TABLE

# INSERT INTO parent_monthly_00012_201408 VALUES (2014, 08, 01, 10);
INSERT 0 1

# SELECT * FROM parent_monthly_00012;
 year | month | day
--+---+-
 2014 | 7 |   1
 2014 | 8 |   1
 2014 | 8 |   1
(3 rows)

# SELECT * FROM parent_monthly_00012_201408;
 year | month | day | hour
--+---+-+--
 2014 | 8 |   1 |
 2014 | 8 |   1 |   10
(2 rows)

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch

2015-02-25 Thread Amit Langote
On 26-02-2015 AM 10:24, Amit Langote wrote:
 To clarify things a bit more, transformCreateStmt() transforms PARTITION
 OF into LIKE INCLUDING ALL, not INHERITS. And as mentioned, for Append
 to work, I have made ATExecAddInherit() to do some of the things
 ATExecAttachPartition() does. Again, that is a temporary arrangement.
 

I misspoke. Should have said:

... for Append to work, I have made ATExecAttachPartition() to do some
of the things ATExecAddInherit() does. Again, that is a temporary
arrangement.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-25 Thread Amit Langote
On 24-02-2015 PM 05:13, Amit Langote wrote:
 On 21-01-2015 PM 07:26, Amit Langote wrote:

 Ok, I will limit myself to focusing on following things at the moment:

 * Provide syntax in CREATE TABLE to declare partition key
 * Provide syntax in CREATE TABLE to declare a table as partition of a
 partitioned table and values it contains
 * Arrange to have partition key and values stored in appropriate
 catalogs (existing or new)
 * Arrange to cache partitioning info of partitioned tables in relcache

 
 Here is an experimental patch that attempts to implement this.

I divided the patch into two for convenience:

1) 0001_partition_syntax_catalog - adds commands, catalog and
partitioned table relation descriptor related WIP code

2) 0002_tuple-routing-poc - an experimental patch to test how well
binary search approach works for tuple routing in ExecInsert().

Please take a look.

Thanks,
Amit
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..bf02730 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -14,7 +14,7 @@ OBJS = catalog.o dependency.o heap.o index.o indexing.o namespace.o aclchk.o \
objectaccess.o objectaddress.o pg_aggregate.o pg_collation.o \
pg_constraint.o pg_conversion.o \
pg_depend.o pg_enum.o pg_inherits.o pg_largeobject.o pg_namespace.o \
-   pg_operator.o pg_proc.o pg_range.o pg_db_role_setting.o pg_shdepend.o \
+   pg_operator.o pg_partition.o pg_proc.o pg_range.o pg_db_role_setting.o pg_shdepend.o \
pg_type.o storage.o toasting.o
 
 BKIFILES = postgres.bki postgres.description postgres.shdescription
@@ -41,6 +41,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
 	pg_foreign_data_wrapper.h pg_foreign_server.h pg_user_mapping.h \
 	pg_foreign_table.h pg_policy.h \
 	pg_default_acl.h pg_seclabel.h pg_shseclabel.h pg_collation.h pg_range.h \
+	pg_partitioned_rel.h pg_partition.h\
 	toasting.h indexing.h \
 )
 
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index bacb242..8bdb34b 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -43,6 +43,8 @@
 #include catalog/pg_opclass.h
 #include catalog/pg_operator.h
 #include catalog/pg_opfamily.h
+#include catalog/pg_partition.h
+#include catalog/pg_partitioned_rel.h
 #include catalog/pg_policy.h
 #include catalog/pg_proc.h
 #include catalog/pg_rewrite.h
@@ -157,7 +159,9 @@ static const Oid object_classes[MAX_OCLASS] = {
 	DefaultAclRelationId,		/* OCLASS_DEFACL */
 	ExtensionRelationId,		/* OCLASS_EXTENSION */
 	EventTriggerRelationId,		/* OCLASS_EVENT_TRIGGER */
-	PolicyRelationId			/* OCLASS_POLICY */
+	PolicyRelationId,			/* OCLASS_POLICY */
+	PartitionedRelRelationId,	/* OCLASS_PARTITIONED_REL */
+	PartitionRelationId			/* OCLASS_PARTITION */
 };
 
 
@@ -1265,6 +1269,14 @@ doDeletion(const ObjectAddress *object, int flags)
 			RemovePolicyById(object-objectId);
 			break;
 
+		case OCLASS_PARTITIONED_REL:
+			RemovePartitionKeyByRelId(object-objectId);
+			break;
+
+		case OCLASS_PARTITION:
+			RemovePartitionDefByRelId(object-objectId);
+			break;
+
 		default:
 			elog(ERROR, unrecognized object class: %u,
  object-classId);
@@ -2373,6 +2385,12 @@ getObjectClass(const ObjectAddress *object)
 
 		case PolicyRelationId:
 			return OCLASS_POLICY;
+
+		case PartitionedRelRelationId:
+			return OCLASS_PARTITIONED_REL;
+
+		case PartitionRelationId:
+			return OCLASS_PARTITION;
 	}
 
 	/* shouldn't get here */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 17f7266..0308c0b 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -47,6 +47,8 @@
 #include catalog/pg_foreign_table.h
 #include catalog/pg_inherits.h
 #include catalog/pg_namespace.h
+#include catalog/pg_partition.h
+#include catalog/pg_partitioned_rel.h
 #include catalog/pg_statistic.h
 #include catalog/pg_tablespace.h
 #include catalog/pg_type.h
@@ -110,6 +112,7 @@ static Node *cookConstraint(ParseState *pstate,
 			   Node *raw_constraint,
 			   char *relname);
 static List *insert_ordered_unique_oid(List *list, Oid datum);
+static void SetRelationIsPartitioned(Oid relationId, bool relispartitioned);
 
 
 /* 
@@ -2968,3 +2971,306 @@ insert_ordered_unique_oid(List *list, Oid datum)
 	lappend_cell_oid(list, prev, datum);
 	return list;
 }
+
+/*
+ * StorePartitionKey
+ *
+ * Store the partition key of relation rel into system catalog
+ * pg_partitioned_rel
+ */
+void
+StorePartitionKey(Relation rel, int nattrs,
+AttrNumber *partKeyAttrNumbers,
+Oid *partClassOids,
+char strategy)
+{
+	int i;
+	int2vector *partkey;
+	oidvector  *partclass;
+	Datum		values[Natts_pg_partitioned_rel];
+	bool		nulls[Natts_pg_partitioned_rel];
+	Relation	pg_partitioned_rel;
+	HeapTuple	tuple;
+	ObjectAddress   myself;
+	ObjectAddress   target;
+
+	/*
+	 * this check is currently unused 

[HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-24 Thread Amit Langote
On 21-01-2015 PM 07:26, Amit Langote wrote:

 Ok, I will limit myself to focusing on following things at the moment:

 * Provide syntax in CREATE TABLE to declare partition key
 * Provide syntax in CREATE TABLE to declare a table as partition of a
 partitioned table and values it contains
 * Arrange to have partition key and values stored in appropriate
 catalogs (existing or new)
 * Arrange to cache partitioning info of partitioned tables in relcache


Here is an experimental patch that attempts to implement this.

It implements the following syntax:

* Syntax for defining partition key:
CREATE TABLE table_name(columns)PARTITION BY {RANGE|LIST} ON (key_spec);

where key_spec consists of partition key column names and optional
operator class per column. Currently, there are restrictions on the
key_spec such as allowing only column names (not arbitrary expressions
of them), only one column for list strategy, etc.

* Syntax for declaring a table as partition of a partitioned table:
CREATE TABLE table_name PARTITION OF parent_name FOR VALUES values_clause;

where values_clause can be:

IN (list_of_values), or
BETWEEN (range_lower_bounds) AND (range_upper_bounds);

The semantics for a range is [range_lower_bounds,range_upper_bounds),
that is, lower inclusive, upper exclusive. (this might later change
subject to choice regarding preferred/desired syntax)

Additionally, a partition can itself be further partitioned (though I
have not worked on the implementation details of multilevel partitioning
yet):

CREATE TABLE table_name PARTITION OF parent_name PARTITION BY
{RANGE|LIST} ON(key_columns) FOR VALUES values_clause;

There are two system catalogs pg_partitioned_rel and pg_partition which
store partition key spec and partition value spec, respectively.
(remember to initdb if interested in trying)

Please note that the above syntax and/or catalog may not be very
appealing nor that they won't change/disappear. I am posting the patch
more for examining the approach of internal representation of the
metadata for partitioning and get some general comments.

The approach I have implemented in this patch consists of loading
the partition key info and a list of partitions into the relation
descriptor for a partitioned table. In case of range partitioning, this
list is sorted on the range max bound. To see if that works any good, I
hacked ExecInsert() to make it find a partition for a tuple by
adding a ExecFindPartition() just before heap_insert(). It accepts
resultRelInfo of the parent and a tuple slot. It binary-searches for and
returns the descriptor of the chosen partition which ExecInsert() then
uses to perform heap_insert() and inserting index tuples. If no
partition is found, parent relation itself is used. heap_insert() and
ExecInsertIndexTuples() are the only things for which partition relation
is used. All of this is just experimental and most probably wrong in
details; but is done just to see what kind of performance to expect from
the chosen internal representation. Another thing is the approach that
tuple-routing ( partition-pruning) is a function of partitioned
relation and the tuple (or restrict quals). It will be more significant
when we'll get to implementing a partition-pruning function.

See below an example to show that having an extra ExecFindPartition()
does not degrade the performance of inserting a tuple much:

-- a plain table
CREATE TABLE parent_monthly(year int, month int, day int);

-- a partitioned table
-- x: number of partitions
CREATE TABLE parent_monthly_x(LIKE parent_monthly) PARTITION BY
RANGE ON(year, month);

-- partitions
CREATE TABLE parent_monthly_x_201401 PARTITION OF
parent_monthly_00100_201401 FOR VALUES BETWEEN (2014, 1) AND (2014, 2);

CREATE TABLE parent_monthly_x_201402 PARTITION OF
parent_monthly_00100_201402 FOR VALUES BETWEEN (2014, 2) AND (2014, 3);

CREATE TABLE parent_monthly_x_201403 PARTITION OF
parent_monthly_00100_201403 FOR VALUES BETWEEN (2014, 3) AND (2014, 4);

snip

CREATE TABLE parent_monthly_x_mm PARTITION OF
parent_monthly_00100_mm FOR VALUES BETWEEN (, mm AND (, mm);

-- insert 1 tuple into the plain table
INSERT INTO parent_monthly VALUES (2013, 12, 01);
INSERT 0 1
Time: 3.303 ms


-- insert 1 tuple into the partitioned table
-- #part: number of partitions
-- case 1: find no valid partition
-- case 2: find a valid partition

 #parts  case 1  case 2

10  3.248 ms3.509 ms
100 3.546 ms3.269 ms
500 3.497 ms3.048 ms
10003.364 ms5.379 ms
1   4.943 ms5.076 ms

Thoughts?

Thanks,
Amit
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index a403c64..bf02730 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -14,7 +14,7 @@ OBJS = catalog.o dependency.o heap.o index.o indexing.o namespace.o aclchk.o \

Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-24 Thread Amit Langote
On 24-02-2015 PM 05:13, Amit Langote wrote:
 -- partitions
 CREATE TABLE parent_monthly_x_201401 PARTITION OF
 parent_monthly_00100_201401 FOR VALUES BETWEEN (2014, 1) AND (2014, 2);
 
 CREATE TABLE parent_monthly_x_201402 PARTITION OF
 parent_monthly_00100_201402 FOR VALUES BETWEEN (2014, 2) AND (2014, 3);
 
 CREATE TABLE parent_monthly_x_201403 PARTITION OF
 parent_monthly_00100_201403 FOR VALUES BETWEEN (2014, 3) AND (2014, 4);
 
 snip
 
 CREATE TABLE parent_monthly_x_mm PARTITION OF
 parent_monthly_00100_mm FOR VALUES BETWEEN (, mm AND (, mm);
 

Oops, hand-edited PARTITION OF parent_monthly_ wrongly. Supposed to be -

CREATE TABLE parent_monthly_x_201401 PARTITION OF
parent_monthly_x FOR VALUES BETWEEN (2014, 1) AND (2014, 2);

and so on.

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-24 Thread Amit Langote
On 24-02-2015 PM 05:13, Amit Langote wrote:
 Additionally, a partition can itself be further partitioned (though I
 have not worked on the implementation details of multilevel partitioning
 yet):
 
 CREATE TABLE table_name PARTITION OF parent_name PARTITION BY
 {RANGE|LIST} ON(key_columns) FOR VALUES values_clause;

One more blunder, supposed to be:

CREATE TABLE table_name PARTITION OF parent_name FOR VALUES
values_clause PARTITION BY {RANGE|LIST} ON(key_columns);

Thanks,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-24 Thread Corey Huinker
I think it's confusing to use BETWEEN to mean [low,high) when it already
means [low,high] in WHERE clauses.

Why not leverage range notation instead?

CREATE TABLE parent_monthly_x_201401 PARTITION OF
parent_monthly_x FOR VALUES IN RANGE '[2014-04-01,2014-05-01)'

IN RANGE could easily be WITHIN RANGE or WITHIN or something else.

Clearly, this example above assumes that the partitioning is on a single
column.

For partitioning on a set of columns you're essentially creating a custom
composite type with major-minor collation, could that custom type be
created at table creation time? Could an existing composite type be
declared as the partition key?

CREATE TYPE year_month( year int, month int );

(CREATE OPERATOR... for  =  )

CREATE TABLE parent_monthly(year int, month int, day int) PARTITION BY
RANGE ON year_month(year, month);







On Tue, Feb 24, 2015 at 5:53 AM, Amit Langote langote_amit...@lab.ntt.co.jp
 wrote:

 On 24-02-2015 PM 05:13, Amit Langote wrote:
  Additionally, a partition can itself be further partitioned (though I
  have not worked on the implementation details of multilevel partitioning
  yet):
 
  CREATE TABLE table_name PARTITION OF parent_name PARTITION BY
  {RANGE|LIST} ON(key_columns) FOR VALUES values_clause;

 One more blunder, supposed to be:

 CREATE TABLE table_name PARTITION OF parent_name FOR VALUES
 values_clause PARTITION BY {RANGE|LIST} ON(key_columns);

 Thanks,
 Amit



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Partitioning WIP patch (was: Partitioning: issues/ideas)

2015-02-24 Thread Amit Langote
On 25-02-2015 AM 01:13, Corey Huinker wrote:
 I think it's confusing to use BETWEEN to mean [low,high) when it already
 means [low,high] in WHERE clauses.
 

Yeah, I'm not really attached to that syntax.

 Why not leverage range notation instead?
 
 CREATE TABLE parent_monthly_x_201401 PARTITION OF
 parent_monthly_x FOR VALUES IN RANGE '[2014-04-01,2014-05-01)'
 
 IN RANGE could easily be WITHIN RANGE or WITHIN or something else.
 
 Clearly, this example above assumes that the partitioning is on a single
 column.
 
 For partitioning on a set of columns you're essentially creating a custom
 composite type with major-minor collation, could that custom type be
 created at table creation time? Could an existing composite type be
 declared as the partition key?
 

The answer to the latter is yes as long as there is an operator class
that supports a strategy compatible with the chosen partitioning
strategy. For example, record/composite type has built-in support for
various btree strategies via record_ops.

As for the former, I tend to think creating new user-space
types/operators transparently might not be a very good idea. One concern
would be pg_dump related. Though I also agree there is some duplication
of major-minor collation logic in case of multi-column keys.

 CREATE TYPE year_month( year int, month int );
 
 (CREATE OPERATOR... for  =  )
 
 CREATE TABLE parent_monthly(year int, month int, day int) PARTITION BY
 RANGE ON year_month(year, month);
 

Regards,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers