On Mon, Feb 9, 2015 at 7:54 PM, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote:
>> Well, that's debatable IMO (especially your claim that variable-size
>> partitions would be needed by a majority of users). But in any case,
>> partitioning behavior that is emergent from a bunch of independent pieces
>> of information scattered among N tables seems absolutely untenable from
>> where I sit. Whatever we support, the behavior needs to be described by
>> *one* chunk of information --- a sorted list of bin bounding values,
>> perhaps.
>
> I'm a bit confused here. I got an impression that partitioning formula
> as you suggest would consist of two pieces of information - an origin
> point & a bin width. Then routing a tuple consists of using exactly
> these two values to tell a bin number and hence a partition in O(1) time
> assuming we've made all partitions be exactly bin-width wide.
>
> You mention here a sorted list of bin bounding values which we can very
> well put together for a partitioned table in its relation descriptor
> based on whatever information we stored in catalog. That is, we can
> always have a *one* chunk of partitioning information as *internal*
> representation irrespective of how generalized we make our on-disk
> representation. We can get O(log N) if not O(1) from that I'd hope. In
> fact, that's what I had in mind about this.
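
For readers following along, the two routing schemes mentioned in the quoted text can be sketched roughly as below. This is an illustrative Python sketch, not PostgreSQL code: the function names are hypothetical, and the half-open [lower, upper) convention for partitions is an assumption made only for this example.

```python
from bisect import bisect_right


def route_fixed_width(key, origin, width):
    """O(1) routing: every bin is exactly `width` wide, starting at `origin`."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Floor division; keys below the origin map to negative bin numbers.
    return (key - origin) // width


def route_sorted_bounds(key, bounds):
    """O(log N) routing: partition i covers the range [bounds[i], bounds[i+1])."""
    # Binary search for the last bound that is <= key.
    i = bisect_right(bounds, key) - 1
    if i < 0 or i >= len(bounds) - 1:
        return None  # key falls outside every partition
    return i


# Equal-width bins with origin 0 and width 10.
print(route_fixed_width(25, 0, 10))
# Variable-width bins: [0, 10), [10, 100), [100, 1000).
print(route_sorted_bounds(42, [0, 10, 100, 1000]))
```

The first function only works when all partitions have the same width; the second handles arbitrary widths at the cost of a binary search over the bounds.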

Sure, we can always assemble data into a relation descriptor from
across multiple catalog entries. I think the question is whether
there is any good reason to split up the information across multiple
relations or whether it might not be better, as I have suggested
multiple times, to serialize it using nodeToString() and stuff it in a
single column in pg_class. There may be such a reason, but if you
said what it was, I missed that.

This thread started as a discussion about using range types, and I
think it's pretty clear that's a bad idea, because:

1. There's no guarantee that a range type for the datatype exists at all.

2. If it does, there's no guarantee that it uses the same opclass that
we want to use for partitioning, and I certainly think it would be
strange if we refused to let the user pick the opclass she wants to
use.

3. Even if there is a suitable range type available, it's a poor
representational choice here, because it will be considerably more
verbose than just storing a sorted list of partition bounds. In the
common case where the ranges are adjacent, you'll end up storing two
copies of every bound but the first and last for no discernable
benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
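
As a footnote to point 3, the duplication is easy to see in a small Python sketch. The helper name and the sample bounds are made up for illustration; this is not how PostgreSQL stores anything, only a count of how many values each representation would hold when the ranges are adjacent.

```python
def bounds_to_ranges(bounds):
    """Expand a sorted list of N+1 bounds into N adjacent (lower, upper) ranges."""
    return list(zip(bounds, bounds[1:]))


bounds = [0, 10, 100, 1000]
ranges = bounds_to_ranges(bounds)

values_in_bounds = len(bounds)      # N + 1 stored values
values_in_ranges = 2 * len(ranges)  # 2N stored values; interior bounds appear twice
print(ranges, values_in_bounds, values_in_ranges)
```

With N adjacent partitions, the sorted list holds N + 1 values while the range form holds 2N, because every interior bound is both one range's upper end and the next range's lower end.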