Re: [PERFORM] Duplicate deletion optimizations

2012-01-09 Thread antoine
Hello, Thanks for your numerous and complete answers! For those who have asked for more information about the process and hardware: The goal of the process is to compute data from a nosql cluster and write results in a PostgreSQL database. This process is triggered every 5 minutes for the

Re: [PERFORM] Duplicate deletion optimizations

2012-01-08 Thread Pierre C
That's almost identical to my tables. You explained your problem very well ;) I certainly will. Many thanks for those great lines of SQL! You're welcome! Strangely I didn't receive the mail I posted to the list (received yours though). -- Sent via pgsql-performance mailing list

Re: [PERFORM] Duplicate deletion optimizations

2012-01-08 Thread Strange, John W
-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of anto...@inaps.org Sent: Friday, January 06, 2012 8:36 AM To: pgsql-performance@postgresql.org Subject: [PERFORM] Duplicate deletion optimizations Hello, I've a table with approximately 50 million rows

Re: [PERFORM] Duplicate deletion optimizations

2012-01-08 Thread Jochen Erwied
Saturday, January 7, 2012, 1:21:02 PM you wrote: where t_imp.id is null and test.id=t_imp.id; = where t_imp.id is not null and test.id=t_imp.id; You're right, overlooked that one. But the increase in execution time for the corrected query is - maybe not completely - surprisingly minimal. Because the query
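For context, the predicate being corrected belongs to a staging-table merge: an earlier UPDATE fills t_imp.id for triplets that already exist in the target, so the follow-up step must select rows where the id *is not* null. A hedged sketch of that two-step pattern (the table names test and t_imp come from the quoted snippet; the column names and the update rule are assumptions based on the original post):

```sql
-- Step 1: mark staging rows that already exist in the target by copying
-- the target's id into the staging table (t_imp.id starts out NULL).
UPDATE t_imp
   SET id = test.id
  FROM test
 WHERE test.t_value   = t_imp.t_value
   AND test.t_record  = t_imp.t_record
   AND test.output_id = t_imp.output_id;

-- Step 2: the corrected predicate from the quote - update target rows
-- from staging rows that found a match (id IS NOT NULL).
UPDATE test
   SET count = t_imp.count
  FROM t_imp
 WHERE t_imp.id IS NOT NULL
   AND test.id = t_imp.id;
```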

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Jochen Erwied
Friday, January 6, 2012, 4:21:06 PM you wrote: Every 5 minutes, a process has to insert a few thousand rows into this table, but sometimes the process has to insert an already existing row (based on the values in the triplet (t_value, t_record, output_id)). In this case, the row must be updated

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Pierre C
It's a fairly tricky problem. I have a number of sensors producing energy data about every 5 minutes, but at random times between 1 and 15 minutes. I can't change that as that's the way the hardware of the sensors works. These feed into another unit, which accumulates them and forwards them in

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Marc Mamin
] Duplicate deletion optimizations Friday, January 6, 2012, 4:21:06 PM you wrote: Every 5 minutes, a process has to insert a few thousand rows into this table, but sometimes the process has to insert an already existing row (based on the values in the triplet (t_value, t_record, output_id)). In this case

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Misa Simic
Regards, Misa Sent from my Windows Phone From: Jochen Erwied Sent: 07/01/2012 12:58 To: anto...@inaps.org Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Duplicate deletion optimizations Friday, January 6, 2012, 4:21:06 PM you wrote: Every 5 minutes, a process has to insert a few

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Jochen Erwied
Saturday, January 7, 2012, 3:02:10 PM you wrote: • insert into live from temp where col1, col2 and col3 not exists in live 'not exists' is something I'm trying to avoid, even if the optimizer is able to handle it. -- Jochen Erwied | home: joc...@erwied.eu +49-208-38800-18, FAX:
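An anti-join is the usual way to express the quoted filter without NOT EXISTS. A sketch of that alternative (the table names live and temp follow the quoted bullet; the column names col1-col3 are the placeholders used there):

```sql
-- Insert staging rows whose (col1, col2, col3) triplet has no match in live,
-- written as a LEFT JOIN anti-join instead of NOT EXISTS.
-- Assumes live.col1 is NOT NULL, so a NULL after the join means "no match".
INSERT INTO live (col1, col2, col3)
SELECT t.col1, t.col2, t.col3
  FROM temp t
  LEFT JOIN live l
    ON l.col1 = t.col1
   AND l.col2 = t.col2
   AND l.col3 = t.col3
 WHERE l.col1 IS NULL;
```

On modern PostgreSQL the planner typically produces the same anti-join plan for both spellings, so the choice is largely stylistic.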

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Jeff Janes
On Fri, Jan 6, 2012 at 6:35 AM, anto...@inaps.org wrote: Hello, I've a table with approximately 50 million rows with a schema like this: id bigint NOT NULL DEFAULT nextval('stats_5mn'::regclass), t_value integer NOT NULL DEFAULT 0, t_record integer NOT NULL DEFAULT 0,

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Misa Simic
... But I think it is more a personal feeling which is better than real... Sent from my Windows Phone From: Jochen Erwied Sent: 07/01/2012 15:18 To: Misa Simic Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Duplicate deletion optimizations Saturday, January 7, 2012, 3:02:10 PM you wrote

Re: [PERFORM] Duplicate deletion optimizations

2012-01-07 Thread Marc Eberhard
Hi Pierre! On 7 January 2012 12:20, Pierre C li...@peufeu.com wrote: I'm stuck home with flu, so I'm happy to help ;) [...] I'll build an example setup to make it clearer... [...] That's almost identical to my tables. :-) Note that the distance field represents the distance (in time) between

[PERFORM] Duplicate deletion optimizations

2012-01-06 Thread antoine
Hello, I've a table with approximately 50 million rows with a schema like this: id bigint NOT NULL DEFAULT nextval('stats_5mn'::regclass), t_value integer NOT NULL DEFAULT 0, t_record integer NOT NULL DEFAULT 0, output_id integer NOT NULL DEFAULT 0, count bigint NOT NULL
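The schema quoted above is cut off by the archive; the visible columns can be reconstructed roughly as follows (only the columns shown in the snippet are reproduced; any further columns and constraints from the original post are unknown):

```sql
-- Sketch of the ~50-million-row table from the original post, as far as
-- the truncated snippet shows it. The sequence name is taken verbatim
-- from the quoted DEFAULT clause.
CREATE TABLE stats_5mn (
    id        bigint  NOT NULL DEFAULT nextval('stats_5mn'::regclass),
    t_value   integer NOT NULL DEFAULT 0,
    t_record  integer NOT NULL DEFAULT 0,
    output_id integer NOT NULL DEFAULT 0,
    count     bigint  NOT NULL
);
```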

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread antoine
On Fri, 06 Jan 2012 15:35:36 +0100, anto...@inaps.org wrote: Hello, I've a table with approximately 50 million rows with a schema like this: id bigint NOT NULL DEFAULT nextval('stats_5mn'::regclass), t_value integer NOT NULL DEFAULT 0, t_record integer NOT NULL DEFAULT 0,

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread Samuel Gendler
On Fri, Jan 6, 2012 at 6:35 AM, anto...@inaps.org wrote: Hello, I've a table with approximately 50 million rows with a schema like this: id bigint NOT NULL DEFAULT nextval('stats_5mn'::regclass), t_value integer NOT NULL DEFAULT 0, t_record integer NOT NULL DEFAULT 0,

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread Marc Eberhard
Hi Samuel! On 6 January 2012 20:02, Samuel Gendler sgend...@ideasculptor.com wrote: Have you considered doing the insert by doing a bulk insert into a temp table and then pulling rows that don't exist across to the final table in one query and updating rows that do exist in another query?  I
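The bulk-load-then-merge approach Samuel describes can be sketched as below. This is an assumption-laden sketch, not the thread's exact SQL: it reuses the stats_5mn schema from the original post, and whether an existing row's count should be overwritten or accumulated is not specified in the quoted text (overwrite is shown here). On PostgreSQL of that era there was no INSERT ... ON CONFLICT, so this two-statement merge was the idiomatic pattern:

```sql
BEGIN;

-- Bulk-load the new batch into a throwaway staging table.
CREATE TEMP TABLE t_imp (LIKE stats_5mn INCLUDING DEFAULTS) ON COMMIT DROP;
COPY t_imp (t_value, t_record, output_id, count) FROM STDIN;

-- Update rows whose (t_value, t_record, output_id) triplet already exists.
UPDATE stats_5mn s
   SET count = i.count
  FROM t_imp i
 WHERE s.t_value   = i.t_value
   AND s.t_record  = i.t_record
   AND s.output_id = i.output_id;

-- Insert the rows that did not exist yet.
INSERT INTO stats_5mn (t_value, t_record, output_id, count)
SELECT i.t_value, i.t_record, i.output_id, i.count
  FROM t_imp i
 WHERE NOT EXISTS (SELECT 1
                     FROM stats_5mn s
                    WHERE s.t_value   = i.t_value
                      AND s.t_record  = i.t_record
                      AND s.output_id = i.output_id);

COMMIT;
```

Doing both statements in one transaction keeps the batch atomic; a unique index on the triplet would be needed to make this safe against concurrent writers.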

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread Samuel Gendler
On Fri, Jan 6, 2012 at 12:22 PM, Marc Eberhard eberhar...@googlemail.comwrote: Hi Samuel! On 6 January 2012 20:02, Samuel Gendler sgend...@ideasculptor.com wrote: Have you considered doing the insert by doing a bulk insert into a temp table and then pulling rows that don't exist across to

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread Marc Eberhard
On 6 January 2012 20:38, Samuel Gendler sgend...@ideasculptor.com wrote: On Fri, Jan 6, 2012 at 12:22 PM, Marc Eberhard eberhar...@googlemail.com wrote: On 6 January 2012 20:02, Samuel Gendler sgend...@ideasculptor.com wrote: Have you considered doing the insert by doing a bulk insert into a

Re: [PERFORM] Duplicate deletion optimizations

2012-01-06 Thread Misa Simic
Windows Phone From: anto...@inaps.org Sent: 06/01/2012 15:36 To: pgsql-performance@postgresql.org Subject: [PERFORM] Duplicate deletion optimizations Hello, I've a table with approximately 50 million rows with a schema like this: id bigint NOT NULL DEFAULT nextval('stats_5mn'::regclass