Re: [PERFORM] Slow concurrent update of same row in a given table

2005-10-04 Thread Jim C. Nasby
On Thu, Sep 29, 2005 at 07:59:34AM +0530, Rajesh Kumar Mallah wrote:
  I see. These problems regularly come up in database design. The best thing
  you can do is modify your database design/application such that instead of
  incrementing a count in a single row, you insert a row into a table,
  recording the 'dispatch_id'. Counting the number of rows for a given
  dispatch id will give you your count.
 
 
 Sorry, but I would be accumulating a huge number of rows in a separate
 table with no extra info when I really just want the count. Do you have
 a better database design in mind?

 Also, I run into the same problem when implementing read counts for
 articles on sites and when counting banner impressions, where the same
 row gets updated by multiple processes frequently.

Databases like to work on *sets* of data, not individual rows. Something
like this would probably perform much better than what you've got now,
and would prevent having a huge table lying around:

INSERT INTO holding_table ... -- Done for every incoming connection/what-have-you

CREATE OR REPLACE FUNCTION summarize() RETURNS void AS $$
DECLARE
    v_rows int;
BEGIN
    -- Drain the holding table and remember how many rows were removed.
    DELETE FROM holding_table;
    GET DIAGNOSTICS v_rows = ROW_COUNT;

    -- Fold that count into the summary row in one set-oriented update.
    UPDATE count_table
       SET count = count + v_rows;
END;
$$ LANGUAGE plpgsql;

Periodically (say, once a minute):
SELECT summarize();
VACUUM holding_table;
VACUUM count_table;
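
For completeness, here is a minimal sketch of the supporting objects the
above assumes. The column definitions are hypothetical (not from your
schema); adjust them to whatever you actually need to record:

-- Hypothetical tables backing the sketch above.
CREATE TABLE holding_table (
    dispatch_id integer NOT NULL,
    hit_time    timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE count_table (
    count bigint NOT NULL
);
INSERT INTO count_table (count) VALUES (0);  -- the single row summarize() updates

-- What each incoming connection runs instead of contending on one row:
INSERT INTO holding_table (dispatch_id) VALUES (1234);

The point of the design is that inserts never block each other, so the hot
path scales; only the once-a-minute summarize() call touches the single
count row.
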
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com        work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Slow concurrent update of same row in a given table

2005-09-28 Thread Gavin Sherry
On Wed, 28 Sep 2005, Rajesh Kumar Mallah wrote:

   Number of Copies | Updates per Sec
   -----------------+-----------------
                  1 | 119
                  2 |  59
                  3 |  38
                  4 |  28
                  5 |  22
                  6 |  19
                  7 |  16
                  8 |  14
                  9 |  11
                 10 |  11
                 11 |  10
 
  So, 11 instances result in 10 updated rows per second, database wide or
  per instance? If it is per instance, then 11 * 10 is close to the
  performance for one connection.


 Sorry, I do not understand the difference between database-wide
 and per instance.

Per instance.


 
  That being said, when you've got 10 connections fighting over one row, I
  wouldn't be surprised if you had bad performance.
 
  Also, at 119 updates a second, you're more than doubling the table's
  initial size (dead tuples) each second. How often are you vacuuming and
  are you using vacuum or vacuum full?


 Yes, I realize the obvious phenomenon now (and the uselessness of the
 script); we should not consider it a performance degradation.

 I am having a performance issue in my live database, which is why I tried
 to simulate the situation (maybe the script was an over-stresser).

 My original problem is that I send hundreds of thousands of emails carrying
 a beacon for tracking readership every Tuesday, and on Wednesday I see a
 lot of the said queries in pg_stat_activity. Each of these queries updates
 the SAME row, the one corresponding to the previous day's dispatch, and that
 is when I face the performance problem.

 I think I can only post further details next Wednesday. Please let me know
 how I should deal with the situation if each of the updates takes 100 times
 longer than a normal update.

I see. These problems regularly come up in database design. The best thing
you can do is modify your database design/application such that instead of
incrementing a count in a single row, you insert a row into a table,
recording the 'dispatch_id'. Counting the number of rows for a given
dispatch id will give you your count.
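
Roughly like this; the table and column names below are only illustrative,
not something from your schema:

-- Illustrative only: one row per beacon hit instead of one contended counter.
CREATE TABLE dispatch_hits (
    dispatch_id integer NOT NULL,
    hit_time    timestamptz NOT NULL DEFAULT now()
);

-- Each beacon hit becomes a plain insert, so there is no row-level contention:
INSERT INTO dispatch_hits (dispatch_id) VALUES (1234);

-- The readership count for a given dispatch is then just:
SELECT count(*) FROM dispatch_hits WHERE dispatch_id = 1234;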

Thanks,

Gavin



Re: [PERFORM] Slow concurrent update of same row in a given table

2005-09-28 Thread Gavin Sherry
On Thu, 29 Sep 2005, Rajesh Kumar Mallah wrote:

 On 9/29/05, Gavin Sherry [EMAIL PROTECTED] wrote:
  On Wed, 28 Sep 2005, Rajesh Kumar Mallah wrote:
 
 Number of Copies | Updates per Sec
 -----------------+-----------------
                1 | 119
                2 |  59
                3 |  38
                4 |  28
                5 |  22
                6 |  19
                7 |  16
                8 |  14
                9 |  11
               10 |  11
               11 |  10
   
So, 11 instances result in 10 updated rows per second, database wide or
per instance? If it is per instance, then 11 * 10 is close to the
performance for one connection.
  
  
   Sorry, I do not understand the difference between database-wide
   and per instance.
 
  Per instance.
 
  
   
That being said, when you've got 10 connections fighting over one row, I
wouldn't be surprised if you had bad performance.
   
Also, at 119 updates a second, you're more than doubling the table's
initial size (dead tuples) each second. How often are you vacuuming and
are you using vacuum or vacuum full?
  
  
   Yes, I realize the obvious phenomenon now (and the uselessness of the
   script); we should not consider it a performance degradation.

   I am having a performance issue in my live database, which is why I tried
   to simulate the situation (maybe the script was an over-stresser).

   My original problem is that I send hundreds of thousands of emails carrying
   a beacon for tracking readership every Tuesday, and on Wednesday I see a
   lot of the said queries in pg_stat_activity. Each of these queries updates
   the SAME row, the one corresponding to the previous day's dispatch, and that
   is when I face the performance problem.

   I think I can only post further details next Wednesday. Please let me know
   how I should deal with the situation if each of the updates takes 100 times
   longer than a normal update.
 
  I see. These problems regularly come up in database design. The best thing
  you can do is modify your database design/application such that instead of
  incrementing a count in a single row, you insert a row into a table,
  recording the 'dispatch_id'. Counting the number of rows for a given
  dispatch id will give you your count.
 

 Sorry, but I would be accumulating a huge number of rows in a separate
 table with no extra info when I really just want the count. Do you have
 a better database design in mind?

 Also, I run into the same problem when implementing read counts for
 articles on sites and when counting banner impressions, where the same
 row gets updated by multiple processes frequently.

As I said in private email, accumulating large numbers of rows is not a
problem. In your current application, you are write bound, not read bound.
I've designed many similar systems which have hundreds of millions of rows.
It takes a while to generate the count, but you just do it periodically in
non-busy periods.
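
For instance, a rollup run from cron in a quiet period into a small summary
table; the table names here are again just placeholders, reusing the
hypothetical dispatch_hits table sketched earlier:

-- Hypothetical summary table, recomputed off-peak so reads stay cheap.
CREATE TABLE dispatch_counts (
    dispatch_id integer PRIMARY KEY,
    hits        bigint NOT NULL
);

BEGIN;
DELETE FROM dispatch_counts;
INSERT INTO dispatch_counts (dispatch_id, hits)
SELECT dispatch_id, count(*)
  FROM dispatch_hits
 GROUP BY dispatch_id;
COMMIT;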

With 8.1, constraint exclusion will give you significantly better
performance with this system, as well.
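
As a sketch of what that could look like, assuming the hypothetical
dispatch_hits table from earlier and partitioning it by month with
inheritance plus CHECK constraints (all names here are assumptions):

-- Parent table (as in the earlier sketch) plus one monthly partition.
CREATE TABLE dispatch_hits (
    dispatch_id integer NOT NULL,
    hit_time    timestamptz NOT NULL DEFAULT now()
);

CREATE TABLE dispatch_hits_2005_10 (
    CHECK (hit_time >= '2005-10-01' AND hit_time < '2005-11-01')
) INHERITS (dispatch_hits);

SET constraint_exclusion = on;

-- With the CHECK constraints in place, a count restricted by hit_time only
-- scans the partitions whose constraints can match:
SELECT count(*)
  FROM dispatch_hits
 WHERE dispatch_id = 1234
   AND hit_time >= '2005-10-01' AND hit_time < '2005-11-01';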

Thanks,

Gavin
