BTW: thank you for the idea

Begin forwarded message:

From: Yves Vindevogel <[EMAIL PROTECTED]>
Date: Tue 19 Jul 2005 12:20:34 CEST
To: Richard Huxton <dev@archonet.com>
Subject: Re: [PERFORM] Insert performance (OT?)


On 19 Jul 2005, at 11:39, Richard Huxton wrote:

Yves Vindevogel wrote:
Hi,
Suppose I have a table with 4 fields (f1, f2, f3, f4).
I define 2 unique indexes: u1 (f1, f2, f3) and u2 (f1, f2, f4).
I have 3 records:
A, B, C, D (this will be inserted)
A, B, C, E (this will pass u2, but not u1, and thus not be inserted)
A, B, F, D (this will pass u1, but not u2, and thus not be inserted)
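In SQL terms, that setup would look something like this (the table name "t" and the column types are just for illustration):

    CREATE TABLE t (f1 text, f2 text, f3 text, f4 text);
    CREATE UNIQUE INDEX u1 ON t (f1, f2, f3);
    CREATE UNIQUE INDEX u2 ON t (f1, f2, f4);

    INSERT INTO t VALUES ('A', 'B', 'C', 'D');  -- inserted
    INSERT INTO t VALUES ('A', 'B', 'C', 'E');  -- fails u1: same (f1, f2, f3)
    INSERT INTO t VALUES ('A', 'B', 'F', 'D');  -- fails u2: same (f1, f2, f4)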

Are you saying you want to know whether they will be inserted before you try to do so?

No, that is not an issue. The problem is that when I use a big "insert into ... select" query and one record is wrong (like above), the complete insert is abandoned.
Therefore, I must do it another way. Or I must be able to say: insert them and dump the rest.

Now, for performance ...
I have tables like this with 500,000 records, where each new upload adds approx. 20,000 records.
It is only now that we consider index u2 necessary. So, until now, I did something like: insert into ... select f1, f2, f2, max(f4) group by f1, f2, f3
That is ok ... and also logically ok because of the data definition.

I'm confused here - assuming you meant "select f1, f2, f3", then I don't see how you guarantee the row doesn't already exist.

No, I meant it with max(f4) because my table has 4 fields. And no, I can't guarantee that; that is exactly my problem.
But with the unique indexes, I'm certain it will not get into my database.

I cannot do this with 2 group by's. I tried it on paper and I'm not succeeding.

I don't see how you can have two group-by's, or what that would mean if you did.

select ... from (select ... from ... group by ...) as foo group by ...

So, I must use a function that checks against u1 and u2, and then inserts if it is ok.
I know that such a function is way slower than my insert query.
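Just as a sketch, that function would be something like this (the names "upload" and "main_table" are placeholders for my staging and target tables):

    CREATE OR REPLACE FUNCTION insert_upload() RETURNS integer AS $$
    DECLARE
        r RECORD;
        n integer := 0;
    BEGIN
        FOR r IN SELECT f1, f2, f3, f4 FROM upload LOOP
            BEGIN
                -- the insert itself is checked against u1 and u2
                INSERT INTO main_table (f1, f2, f3, f4)
                VALUES (r.f1, r.f2, r.f3, r.f4);
                n := n + 1;
            EXCEPTION WHEN unique_violation THEN
                NULL;  -- row would violate u1 or u2: dump it
            END;
        END LOOP;
        RETURN n;  -- number of rows actually inserted
    END;
    $$ LANGUAGE plpgsql;

The per-row exception handling is exactly what makes it so much slower than a set-based insert.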

So - you have a table called something like "upload" with 20,000 rows, and you'd like to know whether it is safe to insert them. Well, it's easy enough to identify which ones are duplicates.

SELECT * FROM upload u JOIN main_table m ON u.f1 = m.f1 AND u.f2 = m.f2 AND u.f3 = m.f3;
SELECT * FROM upload u JOIN main_table m ON u.f1 = m.f1 AND u.f2 = m.f2 AND u.f4 = m.f4;

That is a good idea. I can delete the ones that would fail my first unique index this way, then delete the ones that would fail my second unique index, and then upload the rest.
Hmm, why did I not think of that myself?
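Something like this, I suppose (assuming "upload" has the same columns f1 ... f4 as the main table):

    -- drop rows that would trip u1 (f1, f2, f3)
    DELETE FROM upload
    WHERE EXISTS (SELECT 1 FROM main_table m
                  WHERE m.f1 = upload.f1 AND m.f2 = upload.f2 AND m.f3 = upload.f3);

    -- drop rows that would trip u2 (f1, f2, f4)
    DELETE FROM upload
    WHERE EXISTS (SELECT 1 FROM main_table m
                  WHERE m.f1 = upload.f1 AND m.f2 = upload.f2 AND m.f4 = upload.f4);

    -- what is left no longer conflicts with the main table; duplicates
    -- within the upload itself would still need handling first
    INSERT INTO main_table SELECT * FROM upload;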

Are you saying that deleting these rows and then inserting takes too long?

This goes very fast; a function that checks each record one by one does not.

--
Richard Huxton
Archonet Ltd



Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements

Mail: [EMAIL PROTECTED] - Mobile: +32 (478) 80 82 91
Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76
Web: http://www.implements.be

First they ignore you. Then they laugh at you. Then they fight you. Then you win.
Mahatma Gandhi.

