Thomas F. O'Connell wrote:

The problem comes in importing new data into the tables for which the counts are maintained. The current import process does some preprocessing and then does a COPY from the filesystem to one of the tables on which counts are maintained. This means that for each row being inserted by COPY, a trigger is fired. This didn't seem like a big deal to me until testing began on realistic data sets.

For a 5,000-record import, preprocessing plus the COPY took about 5 minutes. Once the triggers used for maintaining the counts were added, this grew to 25 minutes. While I knew there would be a slowdown per row affected, I expected something closer to 2x than to 5x.
Is there a better way to get acceptable performance out of this scenario?

I've been seeing similar behavior while testing sample code for the 8.0
docs (the summary-table plpgsql trigger example).
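For concreteness, the kind of setup being discussed looks roughly like the sketch below (table and function names are hypothetical, not from either poster's schema): a row-level trigger that bumps a per-group count in a summary table on every insert.

```sql
-- Hypothetical schema: a detail table plus a one-row-per-group summary.
CREATE TABLE detail (grp integer, payload text);
CREATE TABLE detail_count (grp integer PRIMARY KEY, n bigint NOT NULL);

-- Row-level trigger function: bump the count for the inserted row's group,
-- creating the summary row on first sight of a new group.
CREATE OR REPLACE FUNCTION maintain_count() RETURNS trigger AS $$
BEGIN
    UPDATE detail_count SET n = n + 1 WHERE grp = NEW.grp;
    IF NOT FOUND THEN
        INSERT INTO detail_count (grp, n) VALUES (NEW.grp, 1);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER detail_count_trig
    AFTER INSERT ON detail
    FOR EACH ROW EXECUTE PROCEDURE maintain_count();
```

With this in place, a COPY of N rows fires the trigger N times, and every firing is an UPDATE against the summary table — which is exactly where the trouble starts.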

I think the nub of the problem is dead-tuple bloat in the summary/count
table: under MVCC, each triggered UPDATE leaves behind a dead row
version, so each additional update has to step over more and more dead
tuples and becomes progressively more expensive. I suspect the slowdown
is superlinear — roughly quadratic — in the number of rows processed.
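One way to see the bloat directly (a diagnostic sketch, assuming the hypothetical summary table name used above) is to vacuum the summary table right after such an import and look at how many dead row versions it reports:

```sql
-- After a 5,000-row triggered import, the one-row-per-group summary
-- table can carry thousands of dead row versions; VACUUM VERBOSE
-- reports how many removable dead tuples it found.
VACUUM VERBOSE detail_count;
```

Vacuuming the summary table periodically during a long import would blunt the effect, but it treats the symptom rather than the per-row-trigger cause.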

Would it be absurd to drop the triggers during the import, recreate them afterward, and update the counts in a single summary UPDATE based on information gathered during the import process?

That's the conclusion I came to :-)
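That approach might be sketched as follows (all names are hypothetical, and the staging table import_counts is assumed to be filled by the preprocessing step with per-group totals for the batch):

```sql
BEGIN;

-- Disable per-row count maintenance for the duration of the bulk load.
DROP TRIGGER detail_count_trig ON detail;

-- Bulk load the preprocessed data (hypothetical path).
COPY detail FROM '/tmp/import.dat';

-- Fold the whole batch into the summary in one statement, instead of
-- one triggered UPDATE per row; assumes import_counts(grp, cnt) was
-- populated during preprocessing.
UPDATE detail_count
   SET n = n + ic.cnt
  FROM import_counts ic
 WHERE detail_count.grp = ic.grp;

-- Restore the trigger for normal row-at-a-time traffic.
CREATE TRIGGER detail_count_trig
    AFTER INSERT ON detail
    FOR EACH ROW EXECUTE PROCEDURE maintain_count();

COMMIT;
```

Doing it all in one transaction keeps the counts consistent: concurrent readers see either the old counts or the fully updated ones, never a half-applied batch.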



