Re: [HACKERS] Writing Trigger Functions in C
Date: Sat, 29 Dec 2012 23:45:06 -0500 Subject: Re: [HACKERS] Writing Trigger Functions in C From: robertmh...@gmail.com To: charle...@outlook.com CC: cbbro...@gmail.com; pgsql-hackers@postgresql.org On Mon, Dec 24, 2012 at 10:43 AM, Charles Gomes charle...@outlook.com wrote: And I had no performance improvements at all. Took the same time as with the previous EXECUTE statement; I don't see what am I doing wrong. You might not be doing anything wrong. Triggers ARE slow. If you have perf on your system, you could use perf top or perf record -a to find out where the CPU time is going while you're doing stuff that fires this trigger. That might provide some clues about how to optimize. But it may be that you'll get a completely flat profile, or something that otherwise boils down to ... triggers are slow. In answer to your original question, there is a C language trigger in contrib/tcn. But, without some proof that the use of PL/pgsql is the problem, I don't know how far down that road it's worth going. It might be worth writing a C trigger that does nothing but return the original tuple, or even a PL/pgsql one. This obviously wouldn't accomplish anything as far as partitioning goes, but it would let you measure the overhead of calling a no-op trigger, which could be a useful thing to know. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company I've translated the trigger to C and performance had not increased, just like you guys said. I've created an article with the trigger and the metrics in case anyone becomes interested in the future http://www.charlesrg.com/linux/71-postgresql-partitioning-the-database-the-fastest-way -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
On Mon, Dec 24, 2012 at 10:43 AM, Charles Gomes charle...@outlook.com wrote: And I had no performance improvements at all. Took the same time as with the previous EXECUTE statement; I don't see what am I doing wrong. You might not be doing anything wrong. Triggers ARE slow. If you have perf on your system, you could use perf top or perf record -a to find out where the CPU time is going while you're doing stuff that fires this trigger. That might provide some clues about how to optimize. But it may be that you'll get a completely flat profile, or something that otherwise boils down to ... triggers are slow. In answer to your original question, there is a C language trigger in contrib/tcn. But, without some proof that the use of PL/pgsql is the problem, I don't know how far down that road it's worth going. It might be worth writing a C trigger that does nothing but return the original tuple, or even a PL/pgsql one. This obviously wouldn't accomplish anything as far as partitioning goes, but it would let you measure the overhead of calling a no-op trigger, which could be a useful thing to know. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
From: charle...@outlook.com To: cbbro...@gmail.com CC: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Writing Trigger Functions in C Date: Fri, 21 Dec 2012 12:27:26 -0500 Date: Fri, 21 Dec 2012 11:56:25 -0500 Subject: Re: [HACKERS] Writing Trigger Functions in C From: cbbro...@gmail.com To: charle...@outlook.com CC: pgsql-hackers@postgresql.org On Fri, Dec 21, 2012 at 11:25 AM, Charles Gomes charle...@outlook.commailto:charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? I'd want to be very careful about assuming that implementing the trigger function in C would necessarily improve performance. It's pretty likely that it wouldn't help much, as a fair bit of the cost of firing a trigger have to do with figuring out which function to call, marshalling arguments, and calling the function, none of which would magically disappear by virtue of implementing in C. A *major* cost that your existing implementation has is that it's re-planning the queries for every single invocation. This is an old, old problem from the Lisp days, EVAL considered evil http://stackoverflow.com/questions/2571401/why-exactly-is-eval-evil The EXECUTE winds up replanning queries every time the trigger fires. If you can instead enumerate the partitions explicitly, putting them into (say) a CASE clause, the planner could generate the plan once, rather than a million times, which would be a HUGE savings, vastly greater than you could expect from recoding into C. The function might look more like: create or replace function quotes_insert_trigger () returns trigger as $$ declare c_rt text; begin c_rt := to_char(new.received_time, '_MM_DD'); case c_rt when '2012_03_01' then insert into 2012_03_01 values (NEW.*) using new; when '2012_03_02' then insert into 2012_03_02 values (NEW.*) using new; else raise exception 'Need a new partition function for %', c_rt; end case; end $$ language plpgsql; You'd periodically need to change the function to reflect the existing set of partitions, but that's cheaper than creating a new partition. The case statement gets more expensive (in effect O(n) on the number of partitions, n) as the number of partitions increases. You could split the date into pieces (e.g. - years, months, days) to diminish that cost. But at any rate, this should be *way* faster than what you're running now, and not at any heinous change in development costs (as would likely be the case reimplementing using SPI). -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this? I will change and implement it this way, I was not aware of such optimization. Will post back after my benchmark runs. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers I've to have add 2 weeks of data at a time, therefore I had to keep two weeks of case statements Replaced the short trigger function to: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$ DECLARE r_date text; BEGIN r_date = to_char(new.received_time, '_MM_DD'); case r_date when '2012_09_10' then insert into quotes_2012_09_10 values (NEW.*) using new; return; when '2012_09_11' then insert into quotes_2012_09_11 values (NEW.*) using new; return; when '2012_09_12' then insert into quotes_2012_09_12 values (NEW.*) using new; return; when '2012_09_13' then insert into quotes_2012_09_13 values (NEW.*) using new; return; when '2012_09_14' then insert into quotes_2012_09_14 values (NEW.*) using new; return; when '2012_09_15' then insert into quotes_2012_09_15 values (NEW.*) using new; return; when '2012_09_16' then insert into quotes_2012_09_16 values (NEW.*) using new; return; when '2012_09_17' then insert into quotes_2012_09_17 values (NEW.*) using new; return; when '2012_09_18' then insert into quotes_2012_09_18 values (NEW.*) using new; return; when '2012_09_19' then insert
Re: [HACKERS] Writing Trigger Functions in C
On Fri, Dec 21, 2012 at 10:25 AM, Charles Gomes charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $$ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? If you have some code to paste so I can start from I will really appreciate. Honestly I'd leave the trigger alone and modify the client code in performance sensitive places to insert directly to the correct partition table. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
Hello you can find lot of examples in PostgreSQL source code - see postgresql/contrib/spi directory and documentation http://www.postgresql.org/docs/9.0/static/trigger-example.html Regards Pavel Stehule 2012/12/21 Charles Gomes charle...@outlook.com: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $$ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? If you have some code to paste so I can start from I will really appreciate. Thanks -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
On 12/21/2012 08:39 AM, Merlin Moncure wrote: On Fri, Dec 21, 2012 at 10:25 AM, Charles Gomes charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $$ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? If you have some code to paste so I can start from I will really appreciate. Honestly I'd leave the trigger alone and modify the client code in performance sensitive places to insert directly to the correct partition table. I second that recommendation -- your performance will be much, much, better. Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
On Fri, Dec 21, 2012 at 11:25 AM, Charles Gomes charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? I'd want to be very careful about assuming that implementing the trigger function in C would necessarily improve performance. It's pretty likely that it wouldn't help much, as a fair bit of the cost of firing a trigger have to do with figuring out which function to call, marshalling arguments, and calling the function, none of which would magically disappear by virtue of implementing in C. A *major* cost that your existing implementation has is that it's re-planning the queries for every single invocation. This is an old, old problem from the Lisp days, EVAL considered evil http://stackoverflow.com/questions/2571401/why-exactly-is-eval-evil The EXECUTE winds up replanning queries every time the trigger fires. If you can instead enumerate the partitions explicitly, putting them into (say) a CASE clause, the planner could generate the plan once, rather than a million times, which would be a HUGE savings, vastly greater than you could expect from recoding into C. The function might look more like: create or replace function quotes_insert_trigger () returns trigger as $$ declare c_rt text; begin c_rt := to_char(new.received_time, '_MM_DD'); case c_rt when '2012_03_01' then insert into 2012_03_01 values (NEW.*) using new; when '2012_03_02' then insert into 2012_03_02 values (NEW.*) using new; else raise exception 'Need a new partition function for %', c_rt; end case; end $$ language plpgsql; You'd periodically need to change the function to reflect the existing set of partitions, but that's cheaper than creating a new partition. The case statement gets more expensive (in effect O(n) on the number of partitions, n) as the number of partitions increases. You could split the date into pieces (e.g. - years, months, days) to diminish that cost. But at any rate, this should be *way* faster than what you're running now, and not at any heinous change in development costs (as would likely be the case reimplementing using SPI). -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this?
Re: [HACKERS] Writing Trigger Functions in C
2012/12/21 Joe Conway m...@joeconway.com: On 12/21/2012 08:39 AM, Merlin Moncure wrote: On Fri, Dec 21, 2012 at 10:25 AM, Charles Gomes charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $$ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $$ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? If you have some code to paste so I can start from I will really appreciate. Honestly I'd leave the trigger alone and modify the client code in performance sensitive places to insert directly to the correct partition table. I second that recommendation -- your performance will be much, much, better. sure Pavel Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Writing Trigger Functions in C
Date: Fri, 21 Dec 2012 11:56:25 -0500 Subject: Re: [HACKERS] Writing Trigger Functions in C From: cbbro...@gmail.com To: charle...@outlook.com CC: pgsql-hackers@postgresql.org On Fri, Dec 21, 2012 at 11:25 AM, Charles Gomes charle...@outlook.commailto:charle...@outlook.com wrote: Hello guys, I've been finding performance issues when using a trigger to modify inserts on a partitioned table. If using the trigger the total time goes from 1 Hour to 4 hours. The trigger is pretty simple: CREATE OR REPLACE FUNCTION quotes_insert_trigger() RETURNS trigger AS $ BEGIN EXECUTE 'INSERT INTO quotes_'|| to_char(new.received_time,'_MM_DD') ||' VALUES (($1).*)' USING NEW ; RETURN NULL; END; $ LANGUAGE plpgsql; I've seen that some of you guys have worked on writing triggers in C. Does anyone have had an experience writing a trigger for partitioning in C ? I'd want to be very careful about assuming that implementing the trigger function in C would necessarily improve performance. It's pretty likely that it wouldn't help much, as a fair bit of the cost of firing a trigger have to do with figuring out which function to call, marshalling arguments, and calling the function, none of which would magically disappear by virtue of implementing in C. A *major* cost that your existing implementation has is that it's re-planning the queries for every single invocation. This is an old, old problem from the Lisp days, EVAL considered evil http://stackoverflow.com/questions/2571401/why-exactly-is-eval-evil The EXECUTE winds up replanning queries every time the trigger fires. If you can instead enumerate the partitions explicitly, putting them into (say) a CASE clause, the planner could generate the plan once, rather than a million times, which would be a HUGE savings, vastly greater than you could expect from recoding into C. The function might look more like: create or replace function quotes_insert_trigger () returns trigger as $$ declare c_rt text; begin c_rt := to_char(new.received_time, '_MM_DD'); case c_rt when '2012_03_01' then insert into 2012_03_01 values (NEW.*) using new; when '2012_03_02' then insert into 2012_03_02 values (NEW.*) using new; else raise exception 'Need a new partition function for %', c_rt; end case; end $$ language plpgsql; You'd periodically need to change the function to reflect the existing set of partitions, but that's cheaper than creating a new partition. The case statement gets more expensive (in effect O(n) on the number of partitions, n) as the number of partitions increases. You could split the date into pieces (e.g. - years, months, days) to diminish that cost. But at any rate, this should be *way* faster than what you're running now, and not at any heinous change in development costs (as would likely be the case reimplementing using SPI). -- When confronted by a difficult problem, solve it by reducing it to the question, How would the Lone Ranger handle this? I will change and implement it this way, I was not aware of such optimization. Will post back after my benchmark runs. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers