On 23/02/2014 14:05, Stephen R. van den Berg wrote:
>james wrote:
>>On 23/02/2014 11:08, Stephen R. van den Berg wrote:
>>>Collecting all ids to be deleted in one temp table is not a good idea
>>>to begin with.  Collecting the ids could become a scaling problem.
>>I disagree.  Temp tables are lightweight and ids are small; work_mem
>>could easily be more of a problem.  And if it's really a concern that
>>so much has become garbage (in which case a big transaction is also a
>>big problem) then just enforce an upper limit in the select with a
>>max count.  I'd wager that you'd probably set that count pretty high.
>Setting that max count is essential.  Omitting that is a recipe for
>disaster on larger scales.

Well, no more of a disaster than the current query, and that is a long-running update query that may well create assorted temp tables itself. So let's keep it in perspective.

>With respect to "pretty high", it depends
>on what you consider "pretty high".  In my case the optimum is close to
>1000, which does not qualify as "pretty high", I think.  It results in
>work runs per delete of, on average, 30 seconds, with a minimum of
>15 seconds and a rough maximum of 4 minutes.
>-- Stephen.
The max count that limits the size of the temp table of candidate ids has nothing to do (well, not necessarily anything to do) with the number of ids processed in each batch that does the deletion.
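To make that concrete, the first number only caps how many candidate ids get collected in one pass. A sketch of what I mean (the table, column and condition names here are invented, not taken from your query):

    -- gather at most N candidate ids; N is the temp-table cap,
    -- not the delete batch size
    CREATE TEMP TABLE doomed AS
        SELECT id
        FROM   big_table
        WHERE  is_garbage      -- stand-in for whatever identifies garbage rows
        LIMIT  1000000;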

Personally I would probably look at a temp table maximum of a million or so, and test with a batch size for each delete of below 100. It might even work best with very small batches, perhaps even one id at a time. Total run time will be somewhat higher, but you are less likely to impact online responsiveness by doing lots of small commits, especially with async commit turned on while you do so.
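Roughly like this, assuming the candidate ids are already sitting in a temp table called doomed as sketched above (again only a sketch with invented names; the loop is driven from the client in the same session, one small transaction per batch, repeated until a batch deletes nothing):

    SET synchronous_commit = off;   -- async commit for these small transactions

    BEGIN;
    WITH batch AS (
        DELETE FROM doomed
        WHERE id IN (SELECT id FROM doomed LIMIT 100)
        RETURNING id
    )
    DELETE FROM big_table
    WHERE id IN (SELECT id FROM batch);
    COMMIT;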

I'm not sure I would encourage batches that take 30 seconds for something that can potentially impact user responsiveness. That still seems unnecessarily chunky to me.
