Re: [PERFORM] Partition insert trigger using C language
2013/2/13 Matheus de Oliveira matioli.math...@gmail.com Hi Ali, That is probably because you did not passed a parameter when defined the trigger. You can follow the model at [1]. When creating the trigger, you have to use a string parameter with the name of the field with the date value used for the partition. Let me know if you find any other problem. [1] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/test/spi/schema.sql Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres Hi Matheus, Yes. You are right. Now it's OK. I thought that the trigger function cannont vehicle any arguments. I had mis-interpreted this phrase of the Documentationhttp://www.postgresql.org/docs/9.2/static/sql-createtrigger.html: *function_name** * *A user-supplied function that is declared as taking no arguments and returning type trigger, which is executed when the trigger fires.* Thanks and best regards Ali
Re: [PERFORM] Partition insert trigger using C language
On Mon, Feb 11, 2013 at 1:24 PM, Ali Pouya alipou...@gmail.com wrote: Hi Matheus, I try to use your partition_insert_trigger_spi.c code for testing SPI partitionning. But at execution time the *trigdata-tg_trigger-tgargs* pointer is null. Do you know why ? Thanks a lot Best Reagrds Ali Pouya Hi Ali, That is probably because you did not passed a parameter when defined the trigger. You can follow the model at [1]. When creating the trigger, you have to use a string parameter with the name of the field with the date value used for the partition. Let me know if you find any other problem. [1] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/test/spi/schema.sql Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
Hi Matheus, I try to use your partition_insert_trigger_spi.c code for testing SPI partitionning. But at execution time the *trigdata-tg_trigger-tgargs* pointer is null. Do you know why ? Thanks a lot Best Reagrds Ali Pouya 2013/1/11 Matheus de Oliveira matioli.math...@gmail.com On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10.01.2013 21:48, Matheus de Oliveira wrote: I have made a small modification to keep the plans, and it got from 33957.768ms to 43782.376ms. If I'm reading results.txt correctly, the avg runtimes are: C and SPI_execute_with_args: 58567.708 ms C and SPI_(prepare/keepplan/execute_**plan): 43782.376 ms C and heap_insert: 33957.768 ms So switching to prepared plans helped quite a lot, but it's still slower than direct heap_inserts. Humm... You are right, I misread what it before, sorry. The 33957.768ms was with heap_insert. One thing that caught my eye: CREATE OR REPLACE FUNCTION partition_insert_trigger_spi() RETURNS trigger LANGUAGE C VOLATILE STRICT AS 'partition_insert_trigger_spi'**,'partition_insert_trigger_**spi' SET DateStyle TO 'ISO'; Calling a function with SET options has a fair amount of overhead, to set/restore the GUC on every invocation. That should be avoided in a performance critical function like this. I (stupidly) used SPI_getvalue [1] and expected it to always return as -MM-DD, but them I remembered it would do that only with DateStyle=ISO. But the truth is that I couldn't see any overhead, because the function was without that on my first tests, and after that I saw no difference on the tests. I think I should use SPI_getbinvalue instead, but I don't know how to parse the result to get year and month, any help on that? [1] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c#L103 Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 21:48, Matheus de Oliveira wrote: I have made a small modification to keep the plans, and it got from 33957.768ms to 43782.376ms. If I'm reading results.txt correctly, the avg runtimes are: C and SPI_execute_with_args: 58567.708 ms C and SPI_(prepare/keepplan/execute_plan): 43782.376 ms C and heap_insert: 33957.768 ms So switching to prepared plans helped quite a lot, but it's still slower than direct heap_inserts. One thing that caught my eye: CREATE OR REPLACE FUNCTION partition_insert_trigger_spi() RETURNS trigger LANGUAGE C VOLATILE STRICT AS 'partition_insert_trigger_spi','partition_insert_trigger_spi' SET DateStyle TO 'ISO'; Calling a function with SET options has a fair amount of overhead, to set/restore the GUC on every invocation. That should be avoided in a performance critical function like this. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Partition insert trigger using C language
On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10.01.2013 21:48, Matheus de Oliveira wrote: I have made a small modification to keep the plans, and it got from 33957.768ms to 43782.376ms. If I'm reading results.txt correctly, the avg runtimes are: C and SPI_execute_with_args: 58567.708 ms C and SPI_(prepare/keepplan/execute_**plan): 43782.376 ms C and heap_insert: 33957.768 ms So switching to prepared plans helped quite a lot, but it's still slower than direct heap_inserts. Humm... You are right, I misread what it before, sorry. The 33957.768ms was with heap_insert. One thing that caught my eye: CREATE OR REPLACE FUNCTION partition_insert_trigger_spi() RETURNS trigger LANGUAGE C VOLATILE STRICT AS 'partition_insert_trigger_spi'**,'partition_insert_trigger_**spi' SET DateStyle TO 'ISO'; Calling a function with SET options has a fair amount of overhead, to set/restore the GUC on every invocation. That should be avoided in a performance critical function like this. I (stupidly) used SPI_getvalue [1] and expected it to always return as -MM-DD, but them I remembered it would do that only with DateStyle=ISO. But the truth is that I couldn't see any overhead, because the function was without that on my first tests, and after that I saw no difference on the tests. I think I should use SPI_getbinvalue instead, but I don't know how to parse the result to get year and month, any help on that? [1] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c#L103 Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
On Thu, Jan 10, 2013 at 5:51 PM, Charles Gomes charles.go...@benchmarksolutions.com wrote: ** ** *From:* pgsql-performance-ow...@postgresql.org [mailto: pgsql-performance-ow...@postgresql.org] *On Behalf Of *Matheus de Oliveira *Sent:* Thursday, January 10, 2013 2:12 PM *To:* Heikki Linnakangas *Cc:* pgsql-performance; Charles Gomes *Subject:* Re: [PERFORM] Partition insert trigger using C language ** ** ** ** On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10.01.2013 20:45, Matheus de Oliveira wrote: Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good, I tried to use SPI [2] to create a query to insert into the correct child table, but it took almost no improvement compared with the PL/pgSQL code. The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. Yeah. I thought about that, but the problem was that I assumed the INSERTs came with random date, so in the worst scenario I would have to keep the plans of all of the child partitions. Am I wrong? But thinking better, even with hundreds of partitions, it wouldn't use to much memory/resource, would it? In fact, I didn't give to much attention to SPI method, because the other one is where we can have more fun, =P. Anyway, I'll change the code (maybe now), and see if it gets closer to the other method (that uses heap_insert), and will post back the results here. Interesting that you got an improvement. In my case I get almost no improvement at all: ** ** PL/SQL – Dynamic Trigger 4:15:54 PL/SQL - CASE / WHEN Statements 4:12:29 PL/SQL - IF Statements 4:12:39 C Trigger 4:10:49 ** ** Here is my code, I’m using heap insert and updating the indexes. With a similar approach of yours. The trigger is aware of http://www.charlesrg.com/~charles/pgsql/partition2.c ** Humm... Looking at your code, I saw no big difference from mine. The only thing I saw is that you don't fire triggers, but it would be even faster this way. Another thing that could cause that is the number of partitions, I tried only with 12. Could you make a test suite? Or try to run with my function in your scenario? It would be easy to make it get the partitions by day [1]. [1] https://gist.github.com/4509782 Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
On Fri, Jan 11, 2013 at 9:02 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 11.01.2013 12:36, Matheus de Oliveira wrote: On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangashlinnakangas@** vmware.com hlinnakan...@vmware.com wrote: One thing that caught my eye: CREATE OR REPLACE FUNCTION partition_insert_trigger_spi() RETURNS trigger LANGUAGE C VOLATILE STRICT AS 'partition_insert_trigger_spi','partition_insert_trigger_*** *spi' SET DateStyle TO 'ISO'; Calling a function with SET options has a fair amount of overhead, to set/restore the GUC on every invocation. That should be avoided in a performance critical function like this. I (stupidly) used SPI_getvalue [1] and expected it to always return as -MM-DD, but them I remembered it would do that only with DateStyle=ISO. But the truth is that I couldn't see any overhead, because the function was without that on my first tests, and after that I saw no difference on the tests. Oh, ok then. I would've expected it to make a measurable difference. I think I should use SPI_getbinvalue instead, but I don't know how to parse the result to get year and month, any help on that? The fastest way is probably to use j2date like date_out does: DateADT date = DatumGetDateADT(x) int year, month, mday; if (DATE_NOT_FINITE(date)) elog(ERROR, date must be finite); j2date(date + POSTGRES_EPOCH_JDATE, year, month, mday); - Heikki Nice. With the modifications you suggested I did saw a good improvement on the function using SPI (and a little one with heap_insert). So I was wrong to think that change the GUC would not make to much difference, the SPI code now runs almost as fast as the heap_insert: heap_insert: 31896.098 ms SPI: 36558.564 Of course I still could make some improvements on it, like using a LRU to keep the plans, or something like that. The new code is at github. Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 20:45, Matheus de Oliveira wrote: Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good, I tried to use SPI [2] to create a query to insert into the correct child table, but it took almost no improvement compared with the PL/pgSQL code. The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Partition insert trigger using C language
On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10.01.2013 20:45, Matheus de Oliveira wrote: Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good, I tried to use SPI [2] to create a query to insert into the correct child table, but it took almost no improvement compared with the PL/pgSQL code. The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. Yeah. I thought about that, but the problem was that I assumed the INSERTs came with random date, so in the worst scenario I would have to keep the plans of all of the child partitions. Am I wrong? But thinking better, even with hundreds of partitions, it wouldn't use to much memory/resource, would it? In fact, I didn't give to much attention to SPI method, because the other one is where we can have more fun, =P. Anyway, I'll change the code (maybe now), and see if it gets closer to the other method (that uses heap_insert), and will post back the results here. Thanks, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
2013/1/10 Heikki Linnakangas hlinnakan...@vmware.com: On 10.01.2013 20:45, Matheus de Oliveira wrote: Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good, I tried to use SPI [2] to create a query to insert into the correct child table, but it took almost no improvement compared with the PL/pgSQL code. The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. This a problematic for partitioning, because you need too much plans - and direct access is probably better - I am thinking. On second hand, there is relative high possibility to get inconsistent relations - broken indexes, if somebody don't write trigger well. Maybe we can enhance copy to support partitioning better. Now I have a prototype for fault tolerant copy and it can work nice together with some partitioning support Regards Pavel - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Partition insert trigger using C language
On 10.01.2013 21:11, Matheus de Oliveira wrote: On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangashlinnakan...@vmware.com wrote: The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. Yeah. I thought about that, but the problem was that I assumed the INSERTs came with random date, so in the worst scenario I would have to keep the plans of all of the child partitions. Am I wrong? But thinking better, even with hundreds of partitions, it wouldn't use to much memory/resource, would it? Right, a few hundred saved plans would probably still be ok. And if that ever becomes a problem, you could keep the plans in a LRU list and only keep the last 100 plans or so. - Heikki -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Partition insert trigger using C language
On Thu, Jan 10, 2013 at 5:22 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10.01.2013 21:11, Matheus de Oliveira wrote: On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangashlinnakangas@** vmware.com hlinnakan...@vmware.com wrote: The right way to do this with SPI is to prepare each insert-statement on first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that (SPI_execute_with_args). If you construct and plan the query on every invocation, it's not surprising that it's no different from PL/pgSQL performance. Yeah. I thought about that, but the problem was that I assumed the INSERTs came with random date, so in the worst scenario I would have to keep the plans of all of the child partitions. Am I wrong? But thinking better, even with hundreds of partitions, it wouldn't use to much memory/resource, would it? Right, a few hundred saved plans would probably still be ok. And if that ever becomes a problem, you could keep the plans in a LRU list and only keep the last 100 plans or so. I have made a small modification to keep the plans, and it got from 33957.768ms to 43782.376ms. I'm not sure if I did something wrong/stupid on the code [1], or if something else broke my test. I can't rerun the test today, but I'll do that as soon as I have time. [1] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgres
Re: [PERFORM] Partition insert trigger using C language
From: matioli.math...@gmail.com Date: Thu, 10 Jan 2013 16:45:43 -0200 Subject: Partition insert trigger using C language To: pgsql-performance@postgresql.org CC: charle...@outlook.com Hi All, Inspired by Charles' thread and the work of Emmanuel [1], I have made some experiments trying to create a trigger to make partitioning using C language. The first attempt was not good, I tried to use SPI [2] to create a query to insert into the correct child table, but it took almost no improvement compared with the PL/pgSQL code. Then, I used the Emmanuel's code and digged into the PG source code (basically at copy.c) to create a trigger function that insert to the partitioned table direct (using heap_insert instead of SQL) [3], and the improvement was about 100% (not 4/5 times like got by Emmanuel ). The function has no other performance trick, like caching the relations or something like that. The function does partition based on month/year, but it's easy to change to day/month/year or something else. And, of course, it's not ready for production, as I'm not sure if it can break things. The tests were made using a PL/pgSQL code to insert 1 milion rows, and I don't know if this is a real-life-like test (probably not). And there is a table partitioned by month, with a total of 12 partitions (with the insertions randomly distributed through all 2012). I put the trigger and the experiments on a repository at GitHub: https://github.com/matheusoliveira/pg_partitioning_tests I don't know if this is the right list for the topic, and I thought the old one has to many messages, so I created this one to show this tirgger sample and see if someone has a comment about it. PS: I'd be glad if someone could revise the code to make sure it don't brake in some corner case. I'm made some tests [4], but not sure if they covered everything. PS2: It surely will not work on old versions of PostgreSQL, perhaps not even 9.1 (not tested). [1] http://archives.postgresql.org/pgsql-hackers/2008-12/msg01221.php and http://archives.postgresql.org/pgsql-performance/2012-12/msg00189.php [2] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c [3] https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/partition_insert_trigger.c [4] https://github.com/matheusoliveira/pg_partitioning_tests/tree/master/test/regress Regards, -- Matheus de Oliveira Analista de Banco de Dados Dextra Sistemas - MPS.Br nível F! www.dextra.com.br/postgreshttp://www.dextra.com.br/postgres/ Interesting that you got an improvement. In my case I get almost no improvement at all: PL/SQL – Dynamic Trigger 4:15:54 PL/SQL - CASE / WHEN Statements 4:12:29 PL/SQL - IF Statements 4:12:39 C Trigger 4:10:49 Here is my code, I’m using heap insert and updating the indexes. With a similar approach of yours. The trigger is aware of http://www.charlesrg.com/~charles/pgsql/partition2.c -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance