On Thu, 2 Mar 2017 18:33:42 +0530 amul sul <sula...@gmail.com> wrote:
Thank you for the patch. This is very interesting. I'm going to look into your code and write a feedback later. > On Wed, Mar 1, 2017 at 3:50 PM, Yugo Nagata <nag...@sraoss.co.jp> wrote: > > > [....] > > > > I Agree that it is unavoidable partitions number in modulo hashing, > > > but we can do in other hashing technique. Have you had thought about > > > Linear hashing[1] or Consistent hashing[2]? This will allow us to > > > add/drop > > > partition with minimal row moment. > > > > Thank you for your information of hash technique. I'll see them > > and try to allowing the number of partitions to be changed. > > > > > Thanks for showing interest, I was also talking about this with Robert Haas > and > hacking on this, here is what we came up with this. > > If we want to introduce hash partitioning without syntax contort and minimal > movement while changing hash partitions (ADD-DROP/ATTACH-DETACH operation), > at start I thought we could pick up linear hashing, because of in both the > hashing we might need to move approx tot_num_of_tuple/tot_num_of_partitions > tuples at adding new partition and no row moment required at dropping > partitioning. > > With further thinking and talking through the idea of using linear hashing > with my team, we realized that has some problems specially during pg_dump > and pg_upgrade. Both a regular pg_dump and the binary-upgrade version of > pg_dump which is used by pg_restore need to maintain the identity of the > partitions. We can't rely on things like OID order which may be unstable, or > name order which might not match the order in which partitions were added. > So > somehow the partition position would need to be specified explicitly. > > So later we came up with some syntax like this (just fyi, this doesn't add > any new keywords): > > create table foo (a integer, b text) partition by hash (a); > create table foo1 partition of foo with (modulus 4, remainder 0); > create table foo2 partition of foo with (modulus 8, remainder 1); -- > legal, modulus doesn't need to match > create table foo3 partition of foo with (modulus 8, remainder 4); -- > illegal, overlaps foo1 > > Here we need to enforce a rule that every modulus must be a factor of the > next > larger modulus. So, for example, if you have a bunch of partitions that all > have > modulus 5, you can add a new partition with modulus 10 or a new partition > with > modulus 15, but you cannot add both a partition with modulus 10 and a > partition > with modulus 15, because 10 is not a factor of 15. However, you could > simultaneously use modulus 4, modulus 8, modulus 16, and modulus 32 if you > wished, because each modulus is a factor of the next larger one. You could > also use modulus 10, modulus 20, and modulus 60. But you could not use > modulus > 10, modulus 15, and modulus 60, because while both of the smaller module are > factors of 60, it is not true that each is a factor of the next. > > Other advantages with this rule are: > > 1. Dropping (or detaching) and adding (or attaching) a partition can never > cause the rule to be violated. > > 2. We can easily build a tuple-routing data structure based on the largest > modulus. > > For example: If the user has > partition 1 with (modulus 2, remainder 1), > partition 2 with (modulus 4, remainder 2), > partition 3 with (modulus 8, remainder 0) and > partition 4 with (modulus 8, remainder 4), > > then we can build the following tuple routing array in the relcache: > > == lookup table for hashvalue % 8 == > 0 => p3 > 1 => p1 > 2 => p2 > 3 => p1 > 4 => p4 > 5 => p1 > 6 => p2 > 7 => p1 > > 3. It's also quite easy to test with a proposed new partition overlaps with > any > existing partition. Just build the mapping array and see if you ever end up > trying to assign a partition to a slot that's already been assigned to some > other partition. > > We can still work on the proposed syntax - and I am open for suggestions. > One > more thought is to use FOR VALUES HAVING like: > CREATE TABLE foo1 PARTITION OF foo FOR VALUES HAVING (modulus 2, remainder > 1); > > But still more thoughts/inputs welcome here. > > Attached patch implements former syntax, here is quick demonstration: > > 1.CREATE : > create table foo (a integer, b text) partition by hash (a); > create table foo1 partition of foo with (modulus 2, remainder 1); > create table foo2 partition of foo with (modulus 4, remainder 2); > create table foo3 partition of foo with (modulus 8, remainder 0); > create table foo4 partition of foo with (modulus 8, remainder 4); > > 2. Display parent table info: > postgres=# \d+ foo > Table "public.foo" > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > --------+---------+-----------+----------+---------+----------+--------------+------------- > a | integer | | | | plain | > | > b | text | | | | extended | > | > Partition key: HASH (a) > Partitions: foo1 WITH (modulus 2, remainder 1), > foo2 WITH (modulus 4, remainder 2), > foo3 WITH (modulus 8, remainder 0), > foo4 WITH (modulus 8, remainder 4) > > 3. Display child table info: > postgres=# \d+ foo1 > Table "public.foo1" > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > --------+---------+-----------+----------+---------+----------+--------------+------------- > a | integer | | | | plain | > | > b | text | | | | extended | > | > Partition of: foo WITH (modulus 2, remainder 1) > > 4. INSERT: > postgres=# insert into foo select i, 'abc' from generate_series(1,10) i; > INSERT 0 10 > > postgres=# select tableoid::regclass as part, * from foo; > part | a | b > ------+----+----- > foo1 | 3 | abc > foo1 | 4 | abc > foo1 | 7 | abc > foo1 | 10 | abc > foo2 | 1 | abc > foo2 | 2 | abc > foo2 | 9 | abc > foo3 | 6 | abc > foo4 | 5 | abc > foo4 | 8 | abc > (10 rows) > > TODOs. > 1. Maybe need some work in the CREATE TABLE .. PARTITION OF .. syntax. > 2. Trim regression tests (if require). > 3. Documentation > > Thoughts/Comments? -- Yugo Nagata <nag...@sraoss.co.jp> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers