Thomas F. O'Connell wrote:

The problem comes in importing new data into the tables for which the counts are maintained. The current import process does some preprocessing and then does a COPY from the filesystem to one of the tables on which counts are maintained. This means that for each row being inserted by COPY, a trigger is fired. This didn't seem like a big deal to me until testing began on realistic data sets.
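For reference, a minimal sketch of the kind of per-row count-maintaining trigger being described; the orders / order_counts tables and the trigger and function names are hypothetical stand-ins, not taken from the original setup:

CREATE TABLE orders (id integer);
CREATE TABLE order_counts (n bigint NOT NULL);
INSERT INTO order_counts VALUES (0);

CREATE FUNCTION bump_order_count() RETURNS trigger AS $$
BEGIN
    -- One UPDATE of the single summary row per inserted row;
    -- COPY fires this once for every row it loads.
    UPDATE order_counts SET n = n + 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_count_trg
    AFTER INSERT ON orders
    FOR EACH ROW EXECUTE PROCEDURE bump_order_count();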


For a 5,000-record import, preprocessing plus the COPY took about 5 minutes. Once the triggers used for maintaining the counts were added, this grew to 25 minutes. While I knew there would be a slowdown per row affected, I expected something closer to 2x than to 5x.
Is there any way I can get better performance out of this scenario?


I have been seeing similar behavior whilst testing sample code for the 8.0
docs (the summary table plpgsql trigger example).

I think the nub of the problem is dead tuple bloat in the summary /
count table: each additional triggered update becomes more and more
expensive as time goes on. I suspect the performance decrease is
exponential in the number of rows to be processed.
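To make the mechanism concrete: every UPDATE of the one-row summary table writes a new row version and leaves the old one behind as a dead tuple, so each later update has to wade through the accumulated dead versions to find the live row until a VACUUM cleans them out. With the hypothetical order_counts table sketched above, the effect is easy to observe:

-- Each triggered update rewrites the one summary row:
UPDATE order_counts SET n = n + 1;   -- executed once per imported row

-- After the load, VACUUM shows the accumulated garbage:
VACUUM VERBOSE order_counts;
-- reports roughly one dead row version per update since the last
-- vacuum; skipping over these to find the live row is what makes
-- each successive triggered update slower than the one before it.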


Would it be absurd to drop the triggers during the import, recreate them afterward, and update the counts in a single summary update based on information from the import process?


That's the conclusion I came to :-)
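A sketch of that approach, using the same hypothetical names as above (on 8.0 the trigger has to be dropped and recreated; later releases also offer ALTER TABLE ... DISABLE TRIGGER):

BEGIN;

-- Dropping the trigger takes a lock on orders, so concurrent inserts
-- can't slip through uncounted while it is gone.
DROP TRIGGER orders_count_trg ON orders;

COPY orders FROM '/path/to/import.dat';   -- path is illustrative

-- One summary update instead of one per imported row.
UPDATE order_counts SET n = (SELECT count(*) FROM orders);

CREATE TRIGGER orders_count_trg
    AFTER INSERT ON orders
    FOR EACH ROW EXECUTE PROCEDURE bump_order_count();

COMMIT;

If the preprocessing step already knows how many rows it is loading, the fix-up can instead just add that delta to the existing count rather than recounting the whole table.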

regards

Mark

