On 2016/11/17 20:27, Amit Langote wrote: > On 2016/11/16 4:21, Robert Haas wrote: >> Have you done any performance testing on the tuple routing code? >> Suppose we insert a million (or 10 million) tuples into an >> unpartitioned table, a table with 10 partitions, a table with 100 >> partitions, a table with 1000 partitions, and a table that is >> partitioned into 10 partitions each of which has 10 subpartitions. >> Ideally, the partitioned cases would run almost as fast as the >> unpartitioned case, but probably there will be some overhead. >> However, it would be useful to know how much. Also, it would be >> useful to set up the same cases with inheritance using a PL/pgsql ON >> INSERT trigger for tuple routing and compare. Hopefully the tuple >> routing code is far faster than a trigger, but we should make sure >> that's the case and look for optimizations if not. Also, it would be >> useful to know how much slower the tuple-mapping-required case is than >> the no-tuple-mapping-required case. > > OK, I will share the performance results soon.
Sorry about the delay; here are some numbers with the following partitioning schema: # plain table create table plain (a date, b int, c int); # partitioned table create table ptab (a date, b int, c int) partition by range (a, b); Partitions (not the full commands): ptab_00001 for values from ('2016-11-29', 1) to ('2016-11-29', 1000); ptab_00002 for values from ('2016-11-29', 1000) to ('2016-11-29', 2000); ... ptab_00005 for values from ('2016-11-29', 4000) to ('2016-11-29', 5000); ptab_00006 for values from ('2016-11-30', 1) to ('2016-11-30', 1000); ... ... ptab_NNNNN for values from ('20XX-XX-XX', 4000) to ('20XX-XX-XX', 5000); # inheritance partitioned table create table itab (a date, b int, c int); create table itab_00001 ( check part_check check (a = '2016-11-29' and b >= 1 and b < 1000) ) inherits (itab); ... create table itab_00005 ( check part_check check (a = '2016-11-29' and b >= 4000 and b < 5000) ) inherits (itab); create table itab_0006 ( check part_check check (a = '2016-11-30' and b >= 1and b < 1000) ) inherits (itab); ... ... create table itab_NNNNN ( check part_check check (a = '2016-11-29' and b >= 4000 and b < 5000) ) inherits (itab); The BR trigger (on itab) procedure as follows: CREATE OR REPLACE FUNCTION itab_ins_trig() RETURNS TRIGGER AS $$ DECLARE partno text; BEGIN SELECT to_char((NEW.a - '2016-11-29'::date) * 5 + NEW.b / 1000 + 1, 'fm00000') INTO partno; EXECUTE 'INSERT INTO itab_' || partno || ' SELECT $1.*' USING NEW; RETURN NULL; END; $$ LANGUAGE plpgsql; Note that the tuple-routing procedure above assumes a fixed-stride range partitioning scheme (shown as tg-direct-map below). In other cases, the simplest approach involves defining a if-else ladder, which I tried too (shown as tg-if-else below), but reporting times only for up to 200 partitions at most (I'm sure there might be ways to be smarter there somehow, but I didn't; the point here may only be to compare the new tuple-routing code's overhead vs. trigger overhead in the traditional method). # All times in seconds (on my modestly-powerful development VM) # # nrows = 10,000,000 generated using: # # INSERT INTO $tab # SELECT '$last'::date - ((s.id % $maxsecs + 1)::bigint || 's')::interval, # (random() * 5000)::int % 4999 + 1, # case s.id % 10 # when 0 then 'a' # when 1 then 'b' # when 2 then 'c' # ... # when 9 then 'j' # end # FROM generate_series(1, $nrows) s(id) # ORDER BY random(); # # The first item in the select list is basically a date that won't fall # outside the defined partitions. Time for a plain table = 98.1 sec #part parted tg-direct-map tg-if-else ===== ====== ============= ========== 10 114.3 1483.3 742.4 50 112.5 1476.6 2016.8 100 117.1 1498.4 5386.1 500 125.3 1475.5 -- 1000 129.9 1474.4 -- 5000 137.5 1491.4 -- 10000 154.7 1480.9 -- Then for a 2-level partitioned table with each of the above partitions partitioned by list (c), with 10 sub-partitions each as follows: ptab_NNNNN_a for values in ('a'); ptab_NNNNN_b for values in ('b'); ... ptab_NNNNN_k for values in ('j'); I didn't include the times for inheritance table with a routing trigger in this case, as it seems that the results would look something like the above: Time for a plain table = 98.1 sec #part (sub-)parted ===== ============ 10 127.0 50 152.3 100 156.6 500 191.8 1000 187.3 Regarding tuple-mapping-required vs no-tuple-mapping-required, all cases currently require tuple-mapping, because the decision is based on the result of comparing parent and partition TupleDesc using equalTupleDescs(), which fails so quickly because TupleDesc.tdtypeid are not the same. Anyway, I simply commented out the tuple-mapping statement in ExecInsert() to observe just slightly improved numbers as follows (comparing with numbers in the table just above): #part (sub-)parted ===== ================= 10 113.9 (vs. 127.0) 100 135.7 (vs. 156.6) 500 182.1 (vs. 191.8) Thanks, Amit -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers