Re: [PERFORM] Partition insert trigger using C language

2013-02-15 Thread Ali Pouya
2013/2/13 Matheus de Oliveira matioli.math...@gmail.com


 Hi Ali,

 That is probably because you did not passed a parameter when defined the
 trigger. You can follow the model at [1]. When creating the trigger, you
 have to use a string parameter with the name of the field with the date
 value used for the partition.

 Let me know if you find any other problem.

 [1]
 https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/test/spi/schema.sql

 Regards,
 --
 Matheus de Oliveira
 Analista de Banco de Dados
 Dextra Sistemas - MPS.Br nível F!
 www.dextra.com.br/postgres


 Hi Matheus,
Yes. You are right. Now it's OK.
I thought that the trigger function cannont vehicle any arguments. I had
mis-interpreted this phrase of the
Documentationhttp://www.postgresql.org/docs/9.2/static/sql-createtrigger.html:
*function_name** *

*A user-supplied function that is declared as taking no arguments and
returning type trigger, which is executed when the trigger fires.*

Thanks and best regards
Ali


Re: [PERFORM] Partition insert trigger using C language

2013-02-13 Thread Matheus de Oliveira
On Mon, Feb 11, 2013 at 1:24 PM, Ali Pouya alipou...@gmail.com wrote:

 Hi Matheus,
 I try to use your partition_insert_trigger_spi.c code for testing SPI
 partitionning.
 But at execution time the *trigdata-tg_trigger-tgargs* pointer is null.
 Do you know why ?
 Thanks a lot
 Best Reagrds
 Ali Pouya



Hi Ali,

That is probably because you did not passed a parameter when defined the
trigger. You can follow the model at [1]. When creating the trigger, you
have to use a string parameter with the name of the field with the date
value used for the partition.

Let me know if you find any other problem.

[1]
https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/test/spi/schema.sql

Regards,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-02-11 Thread Ali Pouya
Hi Matheus,
I try to use your partition_insert_trigger_spi.c code for testing SPI
partitionning.
But at execution time the *trigdata-tg_trigger-tgargs* pointer is null.
Do you know why ?
Thanks a lot
Best Reagrds
Ali Pouya


2013/1/11 Matheus de Oliveira matioli.math...@gmail.com


 On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 10.01.2013 21:48, Matheus de Oliveira wrote:

 I have made a small modification to keep the plans, and it got from
 33957.768ms to 43782.376ms.


 If I'm reading results.txt correctly, the avg runtimes are:

 C and SPI_execute_with_args: 58567.708 ms
 C and SPI_(prepare/keepplan/execute_**plan): 43782.376 ms
 C and heap_insert: 33957.768 ms

 So switching to prepared plans helped quite a lot, but it's still slower
 than direct heap_inserts.


 Humm... You are right, I misread what it before, sorry. The 33957.768ms
 was with heap_insert.



 One thing that caught my eye:

  CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
 RETURNS trigger
 LANGUAGE C
 VOLATILE STRICT
 AS 'partition_insert_trigger_spi'**,'partition_insert_trigger_**spi'
 SET DateStyle TO 'ISO';


 Calling a function with SET options has a fair amount of overhead, to
 set/restore the GUC on every invocation. That should be avoided in a
 performance critical function like this.


 I (stupidly) used SPI_getvalue [1] and expected it to always return as
 -MM-DD, but them I remembered it would do that only with DateStyle=ISO.

 But the truth is that I couldn't see any overhead, because the function
 was without that on my first tests, and after that I saw no difference on
 the tests. I think I should use SPI_getbinvalue instead, but I don't know
 how to parse the result to get year and month, any help on that?

 [1]
 https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c#L103

 Regards,

 --
 Matheus de Oliveira
 Analista de Banco de Dados
 Dextra Sistemas - MPS.Br nível F!
 www.dextra.com.br/postgres




Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Heikki Linnakangas

On 10.01.2013 21:48, Matheus de Oliveira wrote:

I have made a small modification to keep the plans, and it got from
33957.768ms to 43782.376ms.


If I'm reading results.txt correctly, the avg runtimes are:

C and SPI_execute_with_args: 58567.708 ms
C and SPI_(prepare/keepplan/execute_plan): 43782.376 ms
C and heap_insert: 33957.768 ms

So switching to prepared plans helped quite a lot, but it's still slower 
than direct heap_inserts.


One thing that caught my eye:


CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
RETURNS trigger
LANGUAGE C
VOLATILE STRICT
AS 'partition_insert_trigger_spi','partition_insert_trigger_spi'
SET DateStyle TO 'ISO';


Calling a function with SET options has a fair amount of overhead, to 
set/restore the GUC on every invocation. That should be avoided in a 
performance critical function like this.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 10.01.2013 21:48, Matheus de Oliveira wrote:

 I have made a small modification to keep the plans, and it got from
 33957.768ms to 43782.376ms.


 If I'm reading results.txt correctly, the avg runtimes are:

 C and SPI_execute_with_args: 58567.708 ms
 C and SPI_(prepare/keepplan/execute_**plan): 43782.376 ms
 C and heap_insert: 33957.768 ms

 So switching to prepared plans helped quite a lot, but it's still slower
 than direct heap_inserts.


Humm... You are right, I misread what it before, sorry. The 33957.768ms was
with heap_insert.



 One thing that caught my eye:

  CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()
 RETURNS trigger
 LANGUAGE C
 VOLATILE STRICT
 AS 'partition_insert_trigger_spi'**,'partition_insert_trigger_**spi'
 SET DateStyle TO 'ISO';


 Calling a function with SET options has a fair amount of overhead, to
 set/restore the GUC on every invocation. That should be avoided in a
 performance critical function like this.


I (stupidly) used SPI_getvalue [1] and expected it to always return as
-MM-DD, but them I remembered it would do that only with DateStyle=ISO.

But the truth is that I couldn't see any overhead, because the function was
without that on my first tests, and after that I saw no difference on the
tests. I think I should use SPI_getbinvalue instead, but I don't know how
to parse the result to get year and month, any help on that?

[1]
https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c#L103

Regards,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Thu, Jan 10, 2013 at 5:51 PM, Charles Gomes 
charles.go...@benchmarksolutions.com wrote:

  ** **

 *From:* pgsql-performance-ow...@postgresql.org [mailto:
 pgsql-performance-ow...@postgresql.org] *On Behalf Of *Matheus de Oliveira
 *Sent:* Thursday, January 10, 2013 2:12 PM
 *To:* Heikki Linnakangas
 *Cc:* pgsql-performance; Charles Gomes
 *Subject:* Re: [PERFORM] Partition insert trigger using C language

 ** **

 ** **

 On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 10.01.2013 20:45, Matheus de Oliveira wrote:

 Inspired by Charles' thread and the work of Emmanuel [1], I have made some
 experiments trying to create a trigger to make partitioning using C
 language.

 The first attempt was not good, I tried to use SPI [2] to create a query
 to
 insert into the correct child table, but it took almost no improvement
 compared with the PL/pgSQL code.



 The right way to do this with SPI is to prepare each insert-statement on
 first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after
 that (SPI_execute_with_args).

 If you construct and plan the query on every invocation, it's not
 surprising that it's no different from PL/pgSQL performance.


 Yeah. I thought about that, but the problem was that I assumed the
 INSERTs came with random date, so in the worst scenario I would have to
 keep the plans of all of the child partitions. Am I wrong?

 But thinking better, even with hundreds of partitions, it wouldn't use to
 much memory/resource, would it?

 In fact, I didn't give to much attention to SPI method, because the other
 one is where we can have more fun, =P.

 Anyway, I'll change the code (maybe now), and see if it gets closer to
 the other method (that uses heap_insert), and will post back the results
 here.





 

 Interesting that you got an improvement. In my case I  get almost no
 improvement at all:

 ** **

 PL/SQL – Dynamic Trigger

  4:15:54

 PL/SQL - CASE / WHEN Statements

  4:12:29

 PL/SQL - IF Statements

  4:12:39

 C Trigger

  4:10:49

 ** **

 Here is my code, I’m using heap insert and updating the indexes. With a
 similar approach of yours.

 The trigger is aware of 

 http://www.charlesrg.com/~charles/pgsql/partition2.c

 **


Humm... Looking at your code, I saw no big difference from mine. The only
thing I saw is that you don't fire triggers, but it would be even faster
this way. Another thing that could cause that is the number of partitions,
I tried only with 12.

Could you make a test suite? Or try to run with my function in your
scenario? It would be easy to make it get the partitions by day [1].

[1] https://gist.github.com/4509782

Regards,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-11 Thread Matheus de Oliveira
On Fri, Jan 11, 2013 at 9:02 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 11.01.2013 12:36, Matheus de Oliveira wrote:

 On Fri, Jan 11, 2013 at 8:19 AM, Heikki Linnakangashlinnakangas@**
 vmware.com hlinnakan...@vmware.com

 wrote:


  One thing that caught my eye:

   CREATE OR REPLACE FUNCTION partition_insert_trigger_spi()

 RETURNS trigger
 LANGUAGE C
 VOLATILE STRICT
 AS 'partition_insert_trigger_spi','partition_insert_trigger_***
 *spi'
 SET DateStyle TO 'ISO';


 Calling a function with SET options has a fair amount of overhead, to
 set/restore the GUC on every invocation. That should be avoided in a
 performance critical function like this.


 I (stupidly) used SPI_getvalue [1] and expected it to always return as
 -MM-DD, but them I remembered it would do that only with
 DateStyle=ISO.

 But the truth is that I couldn't see any overhead, because the function
 was
 without that on my first tests, and after that I saw no difference on the
 tests.


 Oh, ok then. I would've expected it to make a measurable difference.


  I think I should use SPI_getbinvalue instead, but I don't know how
 to parse the result to get year and month, any help on that?


 The fastest way is probably to use j2date like date_out does:

   DateADT   date = DatumGetDateADT(x)
   int year, month, mday;

   if (DATE_NOT_FINITE(date))
 elog(ERROR, date must be finite);
   j2date(date + POSTGRES_EPOCH_JDATE, year, month, mday);

 - Heikki


Nice. With the modifications you suggested I did saw a good improvement on
the function using SPI (and a little one with heap_insert). So I was wrong
to think that change the GUC would not make to much difference, the SPI
code now runs almost as fast as the heap_insert:

heap_insert: 31896.098 ms
SPI: 36558.564

Of course I still could make some improvements on it, like using a LRU to
keep the plans, or something like that.

The new code is at github.

Regards,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Heikki Linnakangas

On 10.01.2013 20:45, Matheus de Oliveira wrote:

Inspired by Charles' thread and the work of Emmanuel [1], I have made some
experiments trying to create a trigger to make partitioning using C
language.

The first attempt was not good, I tried to use SPI [2] to create a query to
insert into the correct child table, but it took almost no improvement
compared with the PL/pgSQL code.


The right way to do this with SPI is to prepare each insert-statement on 
first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after 
that (SPI_execute_with_args).


If you construct and plan the query on every invocation, it's not 
surprising that it's no different from PL/pgSQL performance.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Matheus de Oliveira
On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 10.01.2013 20:45, Matheus de Oliveira wrote:

 Inspired by Charles' thread and the work of Emmanuel [1], I have made some
 experiments trying to create a trigger to make partitioning using C
 language.

 The first attempt was not good, I tried to use SPI [2] to create a query
 to
 insert into the correct child table, but it took almost no improvement
 compared with the PL/pgSQL code.


 The right way to do this with SPI is to prepare each insert-statement on
 first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after
 that (SPI_execute_with_args).

 If you construct and plan the query on every invocation, it's not
 surprising that it's no different from PL/pgSQL performance.


Yeah. I thought about that, but the problem was that I assumed the INSERTs
came with random date, so in the worst scenario I would have to keep the
plans of all of the child partitions. Am I wrong?

But thinking better, even with hundreds of partitions, it wouldn't use to
much memory/resource, would it?

In fact, I didn't give to much attention to SPI method, because the other
one is where we can have more fun, =P.

Anyway, I'll change the code (maybe now), and see if it gets closer to the
other method (that uses heap_insert), and will post back the results here.

Thanks,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Pavel Stehule
2013/1/10 Heikki Linnakangas hlinnakan...@vmware.com:
 On 10.01.2013 20:45, Matheus de Oliveira wrote:

 Inspired by Charles' thread and the work of Emmanuel [1], I have made some
 experiments trying to create a trigger to make partitioning using C
 language.

 The first attempt was not good, I tried to use SPI [2] to create a query
 to
 insert into the correct child table, but it took almost no improvement
 compared with the PL/pgSQL code.


 The right way to do this with SPI is to prepare each insert-statement on
 first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after that
 (SPI_execute_with_args).

 If you construct and plan the query on every invocation, it's not surprising
 that it's no different from PL/pgSQL performance.

This a problematic for partitioning, because you need too much plans -
and direct access is probably better - I am thinking.

On second hand, there is relative high possibility to get inconsistent
relations - broken indexes, if somebody don't write trigger well.

Maybe we can enhance copy to support partitioning better. Now I have a
prototype for fault tolerant copy and it can work nice together with
some partitioning support

Regards

Pavel


 - Heikki


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Heikki Linnakangas

On 10.01.2013 21:11, Matheus de Oliveira wrote:

On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangashlinnakan...@vmware.com

wrote:



The right way to do this with SPI is to prepare each insert-statement on
first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after
that (SPI_execute_with_args).

If you construct and plan the query on every invocation, it's not
surprising that it's no different from PL/pgSQL performance.


Yeah. I thought about that, but the problem was that I assumed the INSERTs
came with random date, so in the worst scenario I would have to keep the
plans of all of the child partitions. Am I wrong?

But thinking better, even with hundreds of partitions, it wouldn't use to
much memory/resource, would it?


Right, a few hundred saved plans would probably still be ok. And if that 
ever becomes a problem, you could keep the plans in a LRU list and only 
keep the last 100 plans or so.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Matheus de Oliveira
On Thu, Jan 10, 2013 at 5:22 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 10.01.2013 21:11, Matheus de Oliveira wrote:

 On Thu, Jan 10, 2013 at 4:54 PM, Heikki Linnakangashlinnakangas@**
 vmware.com hlinnakan...@vmware.com

 wrote:


  The right way to do this with SPI is to prepare each insert-statement on
 first invocation (SPI_prepare + SPI_keepplan), and reuse the plan after
 that (SPI_execute_with_args).

 If you construct and plan the query on every invocation, it's not
 surprising that it's no different from PL/pgSQL performance.


 Yeah. I thought about that, but the problem was that I assumed the INSERTs
 came with random date, so in the worst scenario I would have to keep the
 plans of all of the child partitions. Am I wrong?

 But thinking better, even with hundreds of partitions, it wouldn't use to
 much memory/resource, would it?


 Right, a few hundred saved plans would probably still be ok. And if that
 ever becomes a problem, you could keep the plans in a LRU list and only
 keep the last 100 plans or so.


I have made a small modification to keep the plans, and it got from
33957.768ms to 43782.376ms. I'm not sure if I did something wrong/stupid on
the code [1], or if something else broke my test. I can't rerun the test
today, but I'll do that as soon as I have time.

[1]
https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c

-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres


Re: [PERFORM] Partition insert trigger using C language

2013-01-10 Thread Charles Gomes

 From: matioli.math...@gmail.com 
 Date: Thu, 10 Jan 2013 16:45:43 -0200 
 Subject: Partition insert trigger using C language 
 To: pgsql-performance@postgresql.org 
 CC: charle...@outlook.com 
  
 Hi All, 
  
 Inspired by Charles' thread and the work of Emmanuel [1], I have made  
 some experiments trying to create a trigger to make partitioning using  
 C language. 
  
 The first attempt was not good, I tried to use SPI [2] to create a  
 query to insert into the correct child table, but it took almost no  
 improvement compared with the PL/pgSQL code. 
  
 Then, I used the Emmanuel's code and digged into the PG source code  
 (basically at copy.c) to create a trigger function that insert to the  
 partitioned table direct (using heap_insert instead of SQL) [3], and  
 the improvement was about 100% (not 4/5 times like got by Emmanuel ).  
 The function has no other performance trick, like caching the relations  
 or something like that. 
  
 The function does partition based on month/year, but it's easy to  
 change to day/month/year or something else. And, of course, it's not  
 ready for production, as I'm not sure if it can break things. 
  
 The tests were made using a PL/pgSQL code to insert 1 milion rows, and  
 I don't know if this is a real-life-like test (probably not). And there  
 is a table partitioned by month, with a total of 12 partitions (with  
 the insertions randomly distributed through all 2012). 
  
 I put the trigger and the experiments on a repository at GitHub: 
  
 https://github.com/matheusoliveira/pg_partitioning_tests 
  
 I don't know if this is the right list for the topic, and I thought the  
 old one has to many messages, so I created this one to show this  
 tirgger sample and see if someone has a comment about it. 
  
 PS: I'd be glad if someone could revise the code to make sure it don't  
 brake in some corner case. I'm made some tests [4], but not sure if  
 they covered everything. 
 PS2: It surely will not work on old versions of PostgreSQL, perhaps not  
 even 9.1 (not tested). 
  
  
 [1] http://archives.postgresql.org/pgsql-hackers/2008-12/msg01221.php  
 and  
 http://archives.postgresql.org/pgsql-performance/2012-12/msg00189.php 
 [2]  
 https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/spi/partition_insert_trigger_spi.c
  
 [3]  
 https://github.com/matheusoliveira/pg_partitioning_tests/blob/master/src/partition_insert_trigger.c
  
 [4]  
 https://github.com/matheusoliveira/pg_partitioning_tests/tree/master/test/regress
  
  
  
 Regards, 
 --  
 Matheus de Oliveira 
 Analista de Banco de Dados 
 Dextra Sistemas - MPS.Br nível F! 
 www.dextra.com.br/postgreshttp://www.dextra.com.br/postgres/ 


Interesting that you got an improvement. In my case I  get
almost no improvement at all:

 


 
  
  PL/SQL – Dynamic Trigger
  
  
   4:15:54
  
 
 
  
  PL/SQL - CASE / WHEN Statements
  
  
   4:12:29
  
 
 
  
  PL/SQL
  - IF Statements
  
  
   4:12:39
  
 
 
  
  C
  Trigger
  
  
   4:10:49
  
 


 

Here is my code, I’m using heap insert and updating the indexes.
With a similar approach of yours.

The trigger is aware of 

http://www.charlesrg.com/~charles/pgsql/partition2.c
  

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance