Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-31 Thread Josh Berkus
Simon,

 Well that's pretty weird. That code path clearly only happens once per
 transaction and ought to be fast. The other code paths that take
 ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
 ought to spend more time holding the lock. Presumably you are running
 with a fair number of SERIALIZABLE transactions?

Given that this is TPCC-analog, I'd assume that we are.

Jignesh?

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-31 Thread Jignesh K. Shah
Yep, quite a few transactions.. But the piece that's slow is where it 
is cleared up in CommitTransaction().
I am not sure how ProcArrayLock is designed to work, and hence it is not 
clear whether what we are seeing is what we expect.


Do we have some design doc on ProcArrayLock to understand its purpose?

Thanks.
Regards,
Jignesh


Josh Berkus wrote:

Simon,

  

Well that's pretty weird. That code path clearly only happens once per
transaction and ought to be fast. The other code paths that take
ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
ought to spend more time holding the lock. Presumably you are running
with a fair number of SERIALIZABLE transactions?



Given that this is TPCC-analog, I'd assume that we are.

Jignesh?

  




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-30 Thread Jignesh K. Shah
With CLOG 16 the drop comes at about 1150 users, with the following lock 
stats

bash-3.00# ./4_lwlock_waits.d 16404

Lock Id                  Mode       Count
  ProcArrayLock  Shared   2
 XidGenLock   Exclusive   2
 XidGenLock  Shared   4
  WALInsertLock   Exclusive   7
CLogControlLock  Shared   8
   WALWriteLock   Exclusive  46
  ProcArrayLock   Exclusive  64
CLogControlLock   Exclusive 263

Lock Id                  Mode       Combined Time (ns)
     XidGenLock          Exclusive              528300
  ProcArrayLock          Shared                 968800
  WALInsertLock          Exclusive             4090900
     XidGenLock          Shared               73987600
   WALWriteLock          Exclusive            86200700
  ProcArrayLock          Exclusive           130756000
CLogControlLock          Shared              240471000
CLogControlLock          Exclusive          4115158500

So I think 32 is a better option for CLOG buffers before ProcArrayLock 
becomes the bottleneck.


Though I haven't seen what we can do about the ProcArrayLock problem.


Regards,
Jignesh



Jignesh K. Shah wrote:
Using CLOG Buffers 32 and the commit sibling check patch I still see a 
drop at 1200-1300 users..




bash-3.00# ./4_lwlock_waits.d 18250

Lock Id                  Mode       Count
 XidGenLock  Shared   1
CLogControlLock  Shared   2
  ProcArrayLock  Shared   2
 XidGenLock   Exclusive   4
CLogControlLock   Exclusive  15
  WALInsertLock   Exclusive  18
   WALWriteLock   Exclusive  38
  ProcArrayLock   Exclusive  77

Lock Id                  Mode       Combined Time (ns)
     XidGenLock          Shared                  88700
  WALInsertLock          Exclusive            69556000
  ProcArrayLock          Shared               95656800
     XidGenLock          Exclusive           139634100
CLogControlLock          Exclusive           148822200
CLogControlLock          Shared                  16163
   WALWriteLock          Exclusive           332781800
  ProcArrayLock          Exclusive          5688265500

bash-3.00# ./4_lwlock_waits.d 18599

Lock Id                  Mode       Count
  ProcArrayLock  Shared   2
 XidGenLock   Exclusive   3
 XidGenLock  Shared   4
CLogControlLock  Shared   5
  WALInsertLock   Exclusive  10
CLogControlLock   Exclusive  21
   WALWriteLock   Exclusive  28
  ProcArrayLock   Exclusive  54

Lock Id                  Mode       Combined Time (ns)
     XidGenLock          Exclusive             5688800
  WALInsertLock          Exclusive            11424700
CLogControlLock          Shared               55589100
  ProcArrayLock          Shared              135220400
   WALWriteLock          Exclusive           177906900
     XidGenLock          Shared              524146500
CLogControlLock          Exclusive           524563900
  ProcArrayLock          Exclusive          5828744500

bash-3.00#
bash-3.00# ./6_lwlock_stack.d 4 18599

Lock Id                  Mode       Count
  ProcArrayLock  Shared   1
  ProcArrayLock   Exclusive  52

Lock Id                  Mode       Combined Time (ns)
  ProcArrayLock  Shared 41428300
  ProcArrayLock   Exclusive   3858386500

Lock Id   Combined Time (ns)


 postgres`LWLockAcquire+0x1f0
 postgres`GetSnapshotData+0x120
 postgres`GetTransactionSnapshot+0x80
 postgres`PortalStart+0x198
 postgres`exec_bind_message+0x84c
 postgres`PostgresMain+0x17f8
 postgres`BackendRun+0x2f8
 postgres`ServerLoop+0x680
 postgres`PostmasterMain+0xda8
 postgres`main+0x3d0
 postgres`_start+0x17c
 Shared 41428300

 postgres`LWLockAcquire+0x1f0
 postgres`CommitTransaction+0x104
 postgres`CommitTransactionCommand+0xbc
 postgres`finish_xact_command+0x78
 postgres`exec_execute_message+0x42c
 postgres`PostgresMain+0x1838
 postgres`BackendRun+0x2f8
 postgres`ServerLoop+0x680
 postgres`PostmasterMain+0xda8
 postgres`main+0x3d0
 postgres`_start+0x17c
 Exclusive3858386500


-Jignesh



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Alvaro Herrera
Jignesh K. Shah wrote:

 Top Wait time   seems to come from the following code path for 
 ProcArrayLock:

 Lock Id                  Mode       Count
   ProcArrayLock   Exclusive  21

 Lock Id   Combined Time (ns)
   ProcArrayLock   5255937500

 Lock Id   Combined Time (ns)


  postgres`LWLockAcquire+0x1f0
  postgres`CommitTransaction+0x104

Yeah, ProcArrayLock is pretty contended.  I think it would be kinda neat
if we could split it up in partitions.  This lock is quite particular
though.
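
For reference, 8.2 already partitions the lock manager's LWLock this way;
a rough sketch of that idiom, simplified from src/include/storage/lock.h
(from memory, so check the tree):

/* Sketch of the 8.2-era lock-manager partitioning idiom: a hash spreads
 * contention across NUM_LOCK_PARTITIONS independent LWLocks instead of a
 * single global lock. */
#define NUM_LOCK_PARTITIONS  16

#define LockHashPartition(hashcode) \
	((hashcode) % NUM_LOCK_PARTITIONS)
#define LockHashPartitionLock(hashcode) \
	((LWLockId) (FirstLockMgrLock + LockHashPartition(hashcode)))

The catch with ProcArrayLock is that GetSnapshotData() needs a consistent
view of every entry, so a reader would have to sweep all the partitions;
that is presumably what makes this lock "quite particular".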

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Jignesh K. Shah

Yes, I can try to break up the shared and exclusive time..

Also yes, I use commit_delay = 10; it helps out a lot in reducing IO load..

I will try out the patch soon.

-Jignesh


Simon Riggs wrote:

On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
  

 Lock Id   Combined Time (ns)
  XidGenLock194966200
   WALInsertLock517955000
 CLogControlLock679665100
WALWriteLock   2838716200
   ProcArrayLock  44181002600



Is this the time the lock is held for or the time that we wait for that
lock? It would be good to see the break down of time separately for
shared and exclusive.

Can we have a table like this:
LockId,LockMode,SumTimeLockHeld,SumTimeLockWait

  
Top Wait time   seems to come from the following code path for 
ProcArrayLock:


 Lock Id                  Mode       Count
   ProcArrayLock   Exclusive  21

 Lock Id   Combined Time (ns)
   ProcArrayLock   5255937500

 Lock Id   Combined Time (ns)


  postgres`LWLockAcquire+0x1f0
  postgres`CommitTransaction+0x104
  postgres`CommitTransactionCommand+0xbc
  postgres`finish_xact_command+0x78



Well that's pretty weird. That code path clearly only happens once per
transaction and ought to be fast. The other code paths that take
ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
ought to spend more time holding the lock. Presumably you are running
with a fair number of SERIALIZABLE transactions? 


Are you running with commit_delay > 0? It's possible that the call to
CountActiveBackends() is causing pinging of the procarray by other
backends while we're trying to read it during CommitTransaction(). If
so, try the attached patch.

  




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Simon Riggs
On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
  Lock Id   Combined Time (ns)
   XidGenLock194966200
WALInsertLock517955000
  CLogControlLock679665100
 WALWriteLock   2838716200
ProcArrayLock  44181002600

Is this the time the lock is held for or the time that we wait for that
lock? It would be good to see the break down of time separately for
shared and exclusive.

Can we have a table like this:
LockId,LockMode,SumTimeLockHeld,SumTimeLockWait

 Top Wait time   seems to come from the following code path for 
 ProcArrayLock:
 
  Lock Id                  Mode       Count
ProcArrayLock   Exclusive  21
 
  Lock Id   Combined Time (ns)
ProcArrayLock   5255937500
 
  Lock Id   Combined Time (ns)
 
 
   postgres`LWLockAcquire+0x1f0
   postgres`CommitTransaction+0x104
   postgres`CommitTransactionCommand+0xbc
   postgres`finish_xact_command+0x78

Well that's pretty weird. That code path clearly only happens once per
transaction and ought to be fast. The other code paths that take
ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
ought to spend more time holding the lock. Presumably you are running
with a fair number of SERIALIZABLE transactions? 

Are you running with commit_delay > 0? It's possible that the call to
CountActiveBackends() is causing pinging of the procarray by other
backends while we're trying to read it during CommitTransaction(). If
so, try the attached patch.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.245
diff -c -r1.245 xact.c
*** src/backend/access/transam/xact.c	7 Jun 2007 21:45:58 -	1.245
--- src/backend/access/transam/xact.c	27 Jul 2007 09:09:08 -
***
*** 820,827 
  			 * are fewer than CommitSiblings other backends with active
  			 * transactions.
  			 */
! 			if (CommitDelay > 0 && enableFsync &&
! 				CountActiveBackends() >= CommitSiblings)
  pg_usleep(CommitDelay);
  
  			XLogFlush(recptr);
--- 820,826 
  			 * are fewer than CommitSiblings other backends with active
  			 * transactions.
  			 */
! 			if (CommitDelay > 0 && enableFsync)
  pg_usleep(CommitDelay);
  
  			XLogFlush(recptr);



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Simon Riggs
On Fri, 2007-07-27 at 04:58 -0400, Alvaro Herrera wrote:
 Jignesh K. Shah wrote:
 
  Top Wait time   seems to come from the following code path for 
  ProcArrayLock:
 
  Lock Id                  Mode       Count
ProcArrayLock   Exclusive  21
 
  Lock Id   Combined Time (ns)
ProcArrayLock   5255937500
 
  Lock Id   Combined Time (ns)
 
 
   postgres`LWLockAcquire+0x1f0
   postgres`CommitTransaction+0x104
 
 Yeah, ProcArrayLock is pretty contended.  I think it would be kinda neat
 if we could split it up in partitions.  This lock is quite particular
 though.

Maybe; if we did, we should set the partitions according to the number of
users, so lower numbers of users are all in one partition.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Jignesh K. Shah
I tried CLOG buffers at 32 and the performance is as good as with 64.. (I 
haven't tried 16 yet though..) I am going to try your second patch now..


Also here is the breakup by mode. The combined time is the total time 
spent waiting across all counts.



Lock Id                  Mode       Count
  ProcArrayLock          Shared          1
CLogControlLock          Exclusive       4
CLogControlLock          Shared          4
     XidGenLock          Shared          4
     XidGenLock          Exclusive       7
  WALInsertLock          Exclusive      21
   WALWriteLock          Exclusive      62
  ProcArrayLock          Exclusive      79

Lock Id                  Mode       Combined Time (ns)
CLogControlLock          Exclusive              325200
CLogControlLock          Shared                4509200
     XidGenLock          Exclusive            11839600
  ProcArrayLock          Shared               40506600
     XidGenLock          Shared              119013700
  WALInsertLock          Exclusive           148063100
   WALWriteLock          Exclusive           347052100
  ProcArrayLock          Exclusive          1054780600

Here is another one at higher user count 1600:

bash-3.00# ./4_lwlock_waits.d 9208

Lock Id                  Mode       Count
CLogControlLock          Exclusive       1
CLogControlLock          Shared          2
     XidGenLock          Shared          7
  WALInsertLock          Exclusive      12
   WALWriteLock          Exclusive      50
  ProcArrayLock          Exclusive      82

Lock Id                  Mode       Combined Time (ns)
CLogControlLock          Exclusive               27300
     XidGenLock          Shared               14689300
CLogControlLock          Shared               72664900
  WALInsertLock          Exclusive           101431300
   WALWriteLock          Exclusive           534357400
  ProcArrayLock          Exclusive          4110350300

Now I will try with your second patch.

Regards,
Jignesh

Simon Riggs wrote:

On Thu, 2007-07-26 at 17:17 -0400, Jignesh K. Shah wrote:
  

 Lock Id   Combined Time (ns)
  XidGenLock194966200
   WALInsertLock517955000
 CLogControlLock679665100
WALWriteLock   2838716200
   ProcArrayLock  44181002600



Is this the time the lock is held for or the time that we wait for that
lock? It would be good to see the break down of time separately for
shared and exclusive.

Can we have a table like this:
LockId,LockMode,SumTimeLockHeld,SumTimeLockWait

  
Top Wait time   seems to come from the following code path for 
ProcArrayLock:


 Lock Id                  Mode       Count
   ProcArrayLock   Exclusive  21

 Lock Id   Combined Time (ns)
   ProcArrayLock   5255937500

 Lock Id   Combined Time (ns)


  postgres`LWLockAcquire+0x1f0
  postgres`CommitTransaction+0x104
  postgres`CommitTransactionCommand+0xbc
  postgres`finish_xact_command+0x78



Well that's pretty weird. That code path clearly only happens once per
transaction and ought to be fast. The other code paths that take
ProcArrayLock like TransactionIdIsInProgress() and GetSnapshotData()
ought to spend more time holding the lock. Presumably you are running
with a fair number of SERIALIZABLE transactions? 


Are you running with commit_delay > 0? It's possible that the call to
CountActiveBackends() is causing pinging of the procarray by other
backends while we're trying to read it during CommitTransaction(). If
so, try the attached patch.

  



Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.245
diff -c -r1.245 xact.c
*** src/backend/access/transam/xact.c   7 Jun 2007 21:45:58 -   1.245
--- src/backend/access/transam/xact.c   27 Jul 2007 09:09:08 -
***
*** 820,827 
 * are fewer than CommitSiblings other backends with 
active
 * transactions.
 */
!   if (CommitDelay > 0 && enableFsync &&
!   CountActiveBackends() >= CommitSiblings)
pg_usleep(CommitDelay);
  
  			XLogFlush(recptr);

--- 820,826 
 * are fewer than CommitSiblings other backends with 
active
 * transactions.
 */
!   if (CommitDelay > 0 && enableFsync)
pg_usleep(CommitDelay);
  
  			XLogFlush(recptr);
  

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-27 Thread Jignesh K. Shah
Using CLOG Buffers 32 and the commit sibling check patch I still see a 
drop at 1200-1300 users..




bash-3.00# ./4_lwlock_waits.d 18250

Lock Id                  Mode       Count
 XidGenLock  Shared   1
CLogControlLock  Shared   2
  ProcArrayLock  Shared   2
 XidGenLock   Exclusive   4
CLogControlLock   Exclusive  15
  WALInsertLock   Exclusive  18
   WALWriteLock   Exclusive  38
  ProcArrayLock   Exclusive  77

Lock Id                  Mode       Combined Time (ns)
     XidGenLock          Shared                  88700
  WALInsertLock          Exclusive            69556000
  ProcArrayLock          Shared               95656800
     XidGenLock          Exclusive           139634100
CLogControlLock          Exclusive           148822200
CLogControlLock          Shared                  16163
   WALWriteLock          Exclusive           332781800
  ProcArrayLock          Exclusive          5688265500

bash-3.00# ./4_lwlock_waits.d 18599

Lock Id                  Mode       Count
  ProcArrayLock  Shared   2
 XidGenLock   Exclusive   3
 XidGenLock  Shared   4
CLogControlLock  Shared   5
  WALInsertLock   Exclusive  10
CLogControlLock   Exclusive  21
   WALWriteLock   Exclusive  28
  ProcArrayLock   Exclusive  54

Lock Id                  Mode       Combined Time (ns)
     XidGenLock          Exclusive             5688800
  WALInsertLock          Exclusive            11424700
CLogControlLock          Shared               55589100
  ProcArrayLock          Shared              135220400
   WALWriteLock          Exclusive           177906900
     XidGenLock          Shared              524146500
CLogControlLock          Exclusive           524563900
  ProcArrayLock          Exclusive          5828744500

bash-3.00#
bash-3.00# ./6_lwlock_stack.d 4 18599

Lock Id                  Mode       Count
  ProcArrayLock  Shared   1
  ProcArrayLock   Exclusive  52

Lock Id                  Mode       Combined Time (ns)
  ProcArrayLock  Shared 41428300
  ProcArrayLock   Exclusive   3858386500

Lock Id   Combined Time (ns)


 postgres`LWLockAcquire+0x1f0
 postgres`GetSnapshotData+0x120
 postgres`GetTransactionSnapshot+0x80
 postgres`PortalStart+0x198
 postgres`exec_bind_message+0x84c
 postgres`PostgresMain+0x17f8
 postgres`BackendRun+0x2f8
 postgres`ServerLoop+0x680
 postgres`PostmasterMain+0xda8
 postgres`main+0x3d0
 postgres`_start+0x17c
 Shared 41428300

 postgres`LWLockAcquire+0x1f0
 postgres`CommitTransaction+0x104
 postgres`CommitTransactionCommand+0xbc
 postgres`finish_xact_command+0x78
 postgres`exec_execute_message+0x42c
 postgres`PostgresMain+0x1838
 postgres`BackendRun+0x2f8
 postgres`ServerLoop+0x680
 postgres`PostmasterMain+0xda8
 postgres`main+0x3d0
 postgres`_start+0x17c
 Exclusive3858386500


-Jignesh




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Jignesh K. Shah

BEAUTIFUL!!!

Using the patch I can now go up to 1300 users without dropping.. But now 
the drop repeats at 1300-1350 users..


I corrected the lock descriptions based on what I got from lwlock.h and 
retried the whole scalability run with profiling again.. This time it 
looks like the ProcArrayLock



bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 7056

Lock Id                  Mode       Count
 XidGenLock   Exclusive   1
CLogControlLock  Shared   2
 XidGenLock  Shared   2
   WALWriteLock   Exclusive   4
  WALInsertLock   Exclusive   8
CLogControlLock   Exclusive   9
  ProcArrayLock   Exclusive   9

Lock Id   Combined Time (ns)
   WALWriteLock  2842300
 XidGenLock  4951000
CLogControlLock 11151800
  WALInsertLock 13035600
  ProcArrayLock 2004

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 7056

Lock Id                  Mode       Count
 XidGenLock   Exclusive   1
   WALWriteLock   Exclusive   2
 XidGenLock  Shared   2
CLogControlLock  Shared   3
CLogControlLock   Exclusive   8
  WALInsertLock   Exclusive   9
  ProcArrayLock   Exclusive  22

Lock Id   Combined Time (ns)
 XidGenLock  4093300
   WALWriteLock  4914800
  WALInsertLock  7389100
  ProcArrayLock 10248200
CLogControlLock 11989400

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 7056

Lock Id                  Mode       Count
   WALWriteLock   Exclusive   1
 XidGenLock  Shared   2
CLogControlLock  Shared   3
CLogControlLock   Exclusive   7
  WALInsertLock   Exclusive   7
  ProcArrayLock   Exclusive  31

Lock Id   Combined Time (ns)
   WALWriteLock   319100
 XidGenLock  5388700
  WALInsertLock  9901400
CLogControlLock 13465000
  ProcArrayLock 42979700

bash-3.00# echo 900 users
900 users
bash-3.00# ./4_lwlock_waits.d 7056

Lock Id                  Mode       Count
CLogControlLock  Shared   1
 XidGenLock   Exclusive   1
   WALWriteLock   Exclusive   2
CLogControlLock   Exclusive   6
  WALInsertLock   Exclusive   9
  ProcArrayLock   Exclusive  25

Lock Id   Combined Time (ns)
 XidGenLock  3197700
   WALWriteLock  3887100
CLogControlLock 15774500
  WALInsertLock 38268700
  ProcArrayLock162073100

bash-3.00# ./6_lwlock_stack.d 4 7056

Lock Id                  Mode       Count
  ProcArrayLock  Shared   1
  ProcArrayLock   Exclusive  67

Lock Id   Combined Time (ns)
  ProcArrayLock216773800

Lock Id   Combined Time (ns)


 postgres`LWLockAcquire+0x1c8
 postgres`GetSnapshotData+0x118
 postgres`GetTransactionSnapshot+0x5c
 postgres`PortalStart+0x150
 postgres`exec_bind_message+0x81c
 postgres`PostgresMain+0x12b8
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 postgres`main+0x394
 postgres`_start+0x108
 2779000

 postgres`LWLockAcquire+0x1c8
 postgres`CommitTransaction+0xe0
 postgres`CommitTransactionCommand+0x90
 postgres`finish_xact_command+0x60
 postgres`exec_execute_message+0x3d8
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 postgres`main+0x394
 postgres`_start+0x108
   213994800

bash-3.00# echo 1000 users - SLIGHT DROP
1000 users
bash-3.00# ./4_lwlock_waits.d 7056

Lock Id                  Mode       Count
   WALWriteLock   Exclusive   1
CLogControlLock   Exclusive   2
 XidGenLock  Shared   2
CLogControlLock  Shared   3
  WALInsertLock   Exclusive   4
  ProcArrayLock   Exclusive  26

Lock Id   Combined Time (ns)
   WALWriteLock  1807400
 XidGenLock  2024000
 

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Simon Riggs
On Thu, 2007-07-26 at 15:44 -0400, Jignesh K. Shah wrote:
 BEAUTIFUL!!!
 
 Using the Patch I can now go upto 1300 users without dropping.. But now 
 it still repeats at 1300-1350 users..

OK, can you try again with 16 and 32 buffers please?  We need to know
how many is enough and whether this number needs to be variable via a
parameter, or just slightly larger by default. Thanks.
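
If it does end up as a parameter, the wiring would be a guc.c entry along
these lines (a sketch only: "clog_buffers" and NumClogBuffers are made-up
names, and the NUM_CLOG_BUFFERS uses would have to read the variable
instead of the constant):

/* Hypothetical entry for the ConfigureNamesInt[] table in
 * src/backend/utils/misc/guc.c.  PGC_POSTMASTER because the SLRU area
 * is sized when shared memory is created. */
{
	{"clog_buffers", PGC_POSTMASTER, RESOURCES_MEM,
		gettext_noop("Sets the number of shared buffers used for the CLOG."),
		NULL
	},
	&NumClogBuffers,
	8, 4, 1024, NULL, NULL
},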

 I corrected the Lock Descriptions based on what I got from lwlock.h and 
 retried the whole scalability again with profiling again.. This time it 
 looks like the ProcArrayLock

That's what I would expect with that many users.

 Lock Id                  Mode       Count
   XidGenLock   Exclusive   1
  CLogControlLock  Shared   2
   XidGenLock  Shared   2
 WALWriteLock   Exclusive   4
WALInsertLock   Exclusive   8
  CLogControlLock   Exclusive   9
ProcArrayLock   Exclusive   9

...but as Tom mentioned, we need to do longer runs now so these counts
get to somewhere in the hundreds so we have some statistical validity.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Robert Lor

Tom Lane wrote:


That path would be taking CLogControlLock ... so you're off by at least
one.  Compare the script to src/include/storage/lwlock.h.



Indeed, the indexing was off by one due to the removal of BufMappingLock 
in src/include/storage/lwlock.h between 8.1 and 8.2, which was not 
reflected in the DTrace script.
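
Such scripts typically hard-code the LWLockId enum as a lookup table, so
any enum change shifts the labels; a sketch of the pattern (the first
entries are per my reading of the 8.2 lwlock.h, so double-check against
your tree):

/* D sketch: translate numeric lock ids to names.  8.2 dropped
 * BufMappingLock from the enum, so an 8.1-era table labels every id
 * after that slot off by one. */
string locknames[int];

dtrace:::BEGIN
{
	locknames[0] = "BufFreelistLock";
	locknames[1] = "ShmemIndexLock";
	locknames[2] = "OidGenLock";
	/* ... continue in enum order for the exact server version ... */
}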


Thanks,
Robert




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 For  600-850 Users: that potential mislabeled CheckPointStartLock  or 
 LockID==12 comes from various sources where the top source (while system 
 is still doing great) comes from:

   postgres`LWLockAcquire+0x1c8
   postgres`SimpleLruReadPage_ReadOnly+0xc
   postgres`TransactionIdGetStatus+0x14

That path would be taking CLogControlLock ... so you're off by at least
one.  Compare the script to src/include/storage/lwlock.h.

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Jignesh K. Shah

I will look for runs with longer samples..

Now the script could have mislabeled lock names.. Anyway, digging into 
the one that seems to increase over time, I did stack profiles on how 
that increases... and here are some numbers..
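
For those curious, the stack profile needs only a few lines of D; a
minimal sketch in the spirit of 6_lwlock_stack.d, assuming the 8.2-era
lwlock-startwait/lwlock-endwait probes with (lockid, mode) arguments (the
real script may differ):

#!/usr/sbin/dtrace -qs
/* Usage sketch: ./lwlock_stack.d <lockid> <pid>
 * Sums wait time per user stack for one lock id, so the hot call path
 * into LWLockAcquire() stands out. */
postgresql$2:::lwlock-startwait
/arg0 == $1/
{
	self->ts = timestamp;
}

postgresql$2:::lwlock-endwait
/self->ts/
{
	@waits[ustack()] = sum(timestamp - self->ts);  /* ns waited, by stack */
	self->ts = 0;
}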



For 600-850 users, that potentially mislabeled CheckPointStartLock or 
LockID==12 comes from various sources, where the top source (while the 
system is still doing great) is:




 postgres`LWLockAcquire+0x1c8
 postgres`SimpleLruReadPage_ReadOnly+0xc
 postgres`TransactionIdGetStatus+0x14
 postgres`TransactionLogFetch+0x58
 postgres`TransactionIdDidCommit+0x4
 postgres`HeapTupleSatisfiesSnapshot+0x234
 postgres`heap_release_fetch+0x1a8
 postgres`index_getnext+0xf4
 postgres`IndexNext+0x7c
 postgres`ExecScan+0x8c
 postgres`ExecProcNode+0xb4
 postgres`ExecutePlan+0xdc
 postgres`ExecutorRun+0xb0
 postgres`PortalRunSelect+0x9c
 postgres`PortalRun+0x244
 postgres`exec_execute_message+0x3a0
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 8202100

 postgres`LWLockAcquire+0x1c8
 postgres`TransactionIdSetStatus+0x1c
 postgres`RecordTransactionCommit+0x2a8
 postgres`CommitTransaction+0xc8
 postgres`CommitTransactionCommand+0x90
 postgres`finish_xact_command+0x60
 postgres`exec_execute_message+0x3d8
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 postgres`main+0x394
 postgres`_start+0x108
30822100


However at 900 users, where the big drop in throughput occurs, it gives 
a different top consumer of time:



 postgres`LWLockAcquire+0x1c8
 postgres`TransactionIdSetStatus+0x1c
 postgres`RecordTransactionCommit+0x2a8
 postgres`CommitTransaction+0xc8
 postgres`CommitTransactionCommand+0x90
 postgres`finish_xact_command+0x60
 postgres`exec_execute_message+0x3d8
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 postgres`main+0x394
 postgres`_start+0x108
   406601300

 postgres`LWLockAcquire+0x1c8
 postgres`SimpleLruReadPage+0x1ac
 postgres`TransactionIdGetStatus+0x14
 postgres`TransactionLogFetch+0x58
 postgres`TransactionIdDidCommit+0x4
 postgres`HeapTupleSatisfiesUpdate+0x360
 postgres`heap_lock_tuple+0x27c
 postgres`ExecutePlan+0x33c
 postgres`ExecutorRun+0xb0
 postgres`PortalRunSelect+0x9c
 postgres`PortalRun+0x244
 postgres`exec_execute_message+0x3a0
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
 postgres`main+0x394
 postgres`_start+0x108
   444523100

 postgres`LWLockAcquire+0x1c8
 postgres`SimpleLruReadPage+0x1ac
 postgres`TransactionIdGetStatus+0x14
 postgres`TransactionLogFetch+0x58
 postgres`TransactionIdDidCommit+0x4
 postgres`HeapTupleSatisfiesSnapshot+0x234
 postgres`heap_release_fetch+0x1a8
 postgres`index_getnext+0xf4
 postgres`IndexNext+0x7c
 postgres`ExecScan+0x8c
 postgres`ExecProcNode+0xb4
 postgres`ExecutePlan+0xdc
 postgres`ExecutorRun+0xb0
 postgres`PortalRunSelect+0x9c
 postgres`PortalRun+0x244
 postgres`exec_execute_message+0x3a0
 postgres`PostgresMain+0x1300
 postgres`BackendRun+0x278
 postgres`ServerLoop+0x63c
 postgres`PostmasterMain+0xc40
  166130



Maybe you all will understand more than I do about what it is doing here.. 
Looks to me like IndexNext has a problem at a high number of users.. but I 
could be wrong..


-Jignesh




Tom Lane wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:
  
The count is only for a 10-second snapshot.. Plus remember there are 
about 1000 users running so the connection  being profiled only gets 
0.01 of the period on CPU..  And the count is for that CONNECTION only.



OK, that explains the low absolute levels of the counts, but if the
counts are for a regular backend then there's still a lot of bogosity
here.  Backends won't be taking the CheckpointLock at all, nor do they
take CheckpointStartLock in exclusive mode.  The bgwriter would do that
but it'd not be taking most of these other locks.  So I think the script
is mislabeling the locks somehow.

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Jignesh K. Shah
The count is only for a 10-second snapshot.. Plus remember there are 
about 1000 users running so the connection  being profiled only gets 
0.01 of the period on CPU..  And the count is for that CONNECTION only.


Anyway, using the lock wait script shows the real picture as you 
requested. Here the combined time means time spent waiting for the lock.
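
For anyone reproducing this, a minimal sketch of such a wait script,
assuming the 8.2-era lwlock-startwait/lwlock-endwait probes (the actual
4_lwlock_waits.d may differ):

#!/usr/sbin/dtrace -qs
/* Usage sketch: ./lwlock_waits.d <pid>
 * Counts waits and sums wait time per (lock id, mode) for one backend. */
postgresql$1:::lwlock-startwait
{
	self->ts = timestamp;
}

postgresql$1:::lwlock-endwait
/self->ts/
{
	@cnt[arg0, arg1]  = count();                    /* waits per lock/mode */
	@time[arg0, arg1] = sum(timestamp - self->ts);  /* combined wait (ns) */
	self->ts = 0;
}

dtrace:::END
{
	printa("Lock Id %d  Mode %d  Count %@d\n", @cnt);
	printa("Lock Id %d  Mode %d  Combined Time (ns) %@d\n", @time);
}

(The real script additionally maps lock ids and modes to their names.)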


bash-3.00# echo 500 users
500 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock   Exclusive   1
CheckpointStartLock  Shared   3
 OidGenLock  Shared   4
  WALInsertLock   Exclusive   7
  FreeSpaceLock   Exclusive   8
 XidGenLock   Exclusive  15
CheckpointStartLock   Exclusive  16

Lock Id   Combined Time (ns)
 XidGenLock  3825000
 OidGenLock  5307100
  WALInsertLock  6317800
  FreeSpaceLock  7244100
CheckpointStartLock 22199200

bash-3.00# echo 600 users
600 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock   Exclusive   1
  WALInsertLock   Exclusive   1
CheckpointStartLock  Shared   4
CheckpointStartLock   Exclusive  11
 XidGenLock   Exclusive  21

Lock Id   Combined Time (ns)
 OidGenLock  1728200
  WALInsertLock  2040700
 XidGenLock 19878500
CheckpointStartLock 24156500

bash-3.00# echo 700 users
700 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock  Shared   1
 XidGenLock  Shared   1
CheckpointStartLock  Shared   2
  WALInsertLock   Exclusive   4
CheckpointStartLock   Exclusive   6
  FreeSpaceLock   Exclusive   6
 XidGenLock   Exclusive  13

Lock Id   Combined Time (ns)
 OidGenLock  173
  WALInsertLock  7253400
  FreeSpaceLock 10977700
CheckpointStartLock 13356800
 XidGenLock 38220500

bash-3.00# echo 800 users
800 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
CheckpointStartLock  Shared   1
  WALInsertLock   Exclusive   2
 XidGenLock  Shared   2
CheckpointStartLock   Exclusive   5
  FreeSpaceLock   Exclusive   8
 XidGenLock   Exclusive  12

Lock Id   Combined Time (ns)
  WALInsertLock  3746800
  FreeSpaceLock  7628900
CheckpointStartLock 11297500
 XidGenLock 16649000

bash-3.00# echo 900 users  - BIG DROP IN THROUGHPUT OCCURS...
900 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock   Exclusive   1
 OidGenLock  Shared   1
 XidGenLock  Shared   1
  FreeSpaceLock   Exclusive   2
  WALInsertLock   Exclusive   2
CheckpointStartLock  Shared   6
 XidGenLock   Exclusive  12
CheckpointStartLock   Exclusive 121

Lock Id   Combined Time (ns)
 OidGenLock  1968100
  FreeSpaceLock  2076300
  WALInsertLock  2190400
 XidGenLock 20259400
CheckpointStartLock   1407294300

bash-3.00# echo 950 users
950 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock   Exclusive   1
 OidGenLock  Shared   2
CheckpointStartLock  Shared   3
  WALInsertLock   Exclusive   4
  FreeSpaceLock   Exclusive   5
 XidGenLock   Exclusive  11
CheckpointStartLock   Exclusive  50

Lock Id   Combined Time (ns)
  WALInsertLock  5577100
  FreeSpaceLock  9115900
 XidGenLock 13765800
 OidGenLock 50155500
CheckpointStartLock759872200

bash-3.00# echo 1000 users
1000 users
bash-3.00# ./4_lwlock_waits.d 20764

Lock Id                  Mode       Count
 OidGenLock  Shared   1
  WALInsertLock   Exclusive   1
 XidGenLock   Exclusive   5
CheckpointStartLock  Shared   6
CheckpointStartLock   

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Jignesh K. Shah

Will try 16 and 32 CLOG buffers tomorrow:

But here is the locks data again, with increased profiling time (about 
2 minutes) for the connection, with about 2000 users:


bash-3.00# time ./4_lwlock_waits.d 13583
^C

Lock Id                  Mode       Count
  ProcArrayLock  Shared   5
 XidGenLock   Exclusive  13
CLogControlLock  Shared  14
 XidGenLock  Shared  15
CLogControlLock   Exclusive  21
  WALInsertLock   Exclusive  77
   WALWriteLock   Exclusive 175
  ProcArrayLock   Exclusive 275

Lock Id   Combined Time (ns)
 XidGenLock194966200
  WALInsertLock517955000
CLogControlLock679665100
   WALWriteLock   2838716200
  ProcArrayLock  44181002600


Top Wait time   seems to come from the following code path for 
ProcArrayLock:


Lock Id                  Mode       Count
  ProcArrayLock   Exclusive  21

Lock Id   Combined Time (ns)
  ProcArrayLock   5255937500

Lock Id   Combined Time (ns)


 postgres`LWLockAcquire+0x1f0
 postgres`CommitTransaction+0x104
 postgres`CommitTransactionCommand+0xbc
 postgres`finish_xact_command+0x78
 postgres`exec_execute_message+0x42c
 postgres`PostgresMain+0x1838
 postgres`BackendRun+0x2f8
 postgres`ServerLoop+0x680
 postgres`PostmasterMain+0xda8
 postgres`main+0x3d0
 postgres`_start+0x17c
  5255937500



Regards,
Jignesh


Simon Riggs wrote:

On Thu, 2007-07-26 at 15:44 -0400, Jignesh K. Shah wrote:
  

BEAUTIFUL!!!

Using the Patch I can now go upto 1300 users without dropping.. But now 
it still repeats at 1300-1350 users..



OK, can you try again with 16 and 32 buffers please?  We need to know
how many is enough and whether this number needs to be variable via a
parameter, or just slightly larger by default. Thanks.

  
I corrected the Lock Descriptions based on what I got from lwlock.h and 
retried the whole scalability again with profiling again.. This time it 
looks like the ProcArrayLock



That's what I would expect with that many users.

  

 Lock Id                  Mode       Count
  XidGenLock   Exclusive   1
 CLogControlLock  Shared   2
  XidGenLock  Shared   2
WALWriteLock   Exclusive   4
   WALInsertLock   Exclusive   8
 CLogControlLock   Exclusive   9
   ProcArrayLock   Exclusive   9



...but as Tom mentioned, we need to do longer runs now so these counts
get to somewhere in the hundreds so we have some statistical validity.

  




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Simon Riggs
On Thu, 2007-07-26 at 10:29 -0400, Jignesh K. Shah wrote:
 The count is only for a 10-second snapshot.. Plus remember there are 
 about 1000 users running so the connection  being profiled only gets 
 0.01 of the period on CPU..  And the count is for that CONNECTION only.

Is that for one process, or all processes aggregated in some way?

  CheckpointStartLock  Shared   6
  CheckpointStartLock   Exclusive 102

That's definitely whacked. Surely we didn't start 102 checkpoints yet
attempt to commit 6 times?
   
-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 The count is only for a 10-second snapshot.. Plus remember there are 
 about 1000 users running so the connection  being profiled only gets 
 0.01 of the period on CPU..  And the count is for that CONNECTION only.

OK, that explains the low absolute levels of the counts, but if the
counts are for a regular backend then there's still a lot of bogosity
here.  Backends won't be taking the CheckpointLock at all, nor do they
take CheckpointStartLock in exclusive mode.  The bgwriter would do that
but it'd not be taking most of these other locks.  So I think the script
is mislabeling the locks somehow.

Also, elementary statistics should tell you that a sample taken as above
is going to have enormous amounts of noise.  You should be sampling over
a much longer period, say on the order of a minute of runtime, to have
numbers that are trustworthy.
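
In DTrace terms that just means letting the aggregations accumulate
longer before they are printed, e.g. with a timed exit (sketch):

/* stop after a minute so the aggregates cover a full minute of runtime */
profile:::tick-60s
{
	exit(0);
}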

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-26 Thread Simon Riggs
On Thu, 2007-07-26 at 11:27 -0400, Jignesh K. Shah wrote:

 However at 900 Users where the big drop in throughput occurs:
 It gives a different top consumer of time:


postgres`LWLockAcquire+0x1c8
   postgres`SimpleLruReadPage+0x1ac
   postgres`TransactionIdGetStatus+0x14
   postgres`TransactionLogFetch+0x58

TransactionIdGetStatus doesn't directly call SimpleLruReadPage().
Presumably the compiler has been rearranging things??

Looks like you're out of clog buffers. It seems like the clog buffers
aren't big enough to hold clog pages for long enough and the SELECT FOR
SHARE processing is leaving lots of additional read locks that are
increasing the number of clog requests for older xids.

Try the enclosed patch.
 
-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com
Index: src/include/access/clog.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/access/clog.h,v
retrieving revision 1.19
diff -c -r1.19 clog.h
*** src/include/access/clog.h	5 Jan 2007 22:19:50 -	1.19
--- src/include/access/clog.h	26 Jul 2007 15:44:58 -
***
*** 29,35 
  
  
  /* Number of SLRU buffers to use for clog */
! #define NUM_CLOG_BUFFERS	8
  
  
  extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
--- 29,35 
  
  
  /* Number of SLRU buffers to use for clog */
! #define NUM_CLOG_BUFFERS	64	
  
  
  extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-25 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 Here is how I got the numbers..
 I had about 1600 users login into postgresql. Then started the run with 
 500 users and using DTrace I started tracking Postgresql  Locking as 
 viewed from one user/connection. Echo statements indicate how many 
 users were active at that point and how was throughput performing. All 
 IO is done on /tmp which means on a RAM disk.

 bash-3.00# echo 500 users - baseline number
 500 users
 bash-3.00# ./3_lwlock_acquires.d 19178

  Lock Id                  Mode       Count
 FirstLockMgrLock   Exclusive   1
 RelCacheInitLock   Exclusive   2
   SInvalLock   Exclusive   2
WALInsertLock   Exclusive  10
   BufMappingLock   Exclusive  12
   CheckpointLock  Shared  29
  CheckpointStartLock  Shared  29
   OidGenLock   Exclusive  29
   XidGenLock   Exclusive  29
 FirstLockMgrLock  Shared  33
  CheckpointStartLock   Exclusive  78
FreeSpaceLock   Exclusive 114
   OidGenLock  Shared 126
   XidGenLock  Shared 152
ProcArrayLock  Shared 482

I don't think I believe these numbers.  For one thing, CheckpointLock
is simply not ever taken in shared mode.  The ratios of counts for
different locks seems pretty improbable too, eg there is no way on
earth that the LockMgrLocks are taken more often shared than
exclusive (I would expect no shared acquires at all in the sort of
test you are running).  Not to mention that the absolute number of
counts seems way too low.  So I think the counting tool is broken.

  Lock Id   Combined Time (ns)
   SInvalLock29800
 RelCacheInitLock30300
   BufMappingLock   168800
 FirstLockMgrLock   414300
FreeSpaceLock  1281700
ProcArrayLock  7869900
WALInsertLock 3200
  CheckpointStartLock 13494700
   OidGenLock 25719100
   XidGenLock 26443300
   CheckpointLock194267800

Combined time of what exactly?  It looks like this must be the total
duration the lock is held, at least assuming that the time for
CheckpointLock is correctly reported.  It'd be much more useful to see
the total time spent waiting to acquire the lock.

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-25 Thread Jignesh K. Shah

Here is how I got the numbers..
I had about 1600 users log in to postgresql. Then I started the run with 
500 users, and using DTrace I started tracking PostgreSQL locking as 
viewed from one user/connection. Echo statements indicate how many 
users were active at that point and how throughput was performing. All 
IO is done on /tmp, which means on a RAM disk.


bash-3.00# echo 500 users - baseline number
500 users
bash-3.00# ./3_lwlock_acquires.d 19178

Lock Id                  Mode       Count
   FirstLockMgrLock   Exclusive   1
   RelCacheInitLock   Exclusive   2
 SInvalLock   Exclusive   2
  WALInsertLock   Exclusive  10
 BufMappingLock   Exclusive  12
 CheckpointLock  Shared  29
CheckpointStartLock  Shared  29
 OidGenLock   Exclusive  29
 XidGenLock   Exclusive  29
   FirstLockMgrLock  Shared  33
CheckpointStartLock   Exclusive  78
  FreeSpaceLock   Exclusive 114
 OidGenLock  Shared 126
 XidGenLock  Shared 152
  ProcArrayLock  Shared 482

Lock Id   Combined Time (ns)
 SInvalLock29800
   RelCacheInitLock30300
 BufMappingLock   168800
   FirstLockMgrLock   414300
  FreeSpaceLock  1281700
  ProcArrayLock  7869900
  WALInsertLock 3200
CheckpointStartLock 13494700
 OidGenLock 25719100
 XidGenLock 26443300
 CheckpointLock194267800

bash-3.00# echo 600 users - Throughput rising
600 users
bash-3.00# ./3_lwlock_acquires.d 19178

Lock Id                  Mode       Count
   RelCacheInitLock   Exclusive   1
 SInvalLock   Exclusive   1
 BufMappingLock   Exclusive   2
CLogControlLock   Exclusive   2
  WALInsertLock   Exclusive  11
   FirstLockMgrLock  Shared  20
 CheckpointLock  Shared  24
CheckpointStartLock  Shared  24
 OidGenLock   Exclusive  24
 XidGenLock   Exclusive  24
CheckpointStartLock   Exclusive  72
  FreeSpaceLock   Exclusive 102
 OidGenLock  Shared 106
 XidGenLock  Shared 128
  ProcArrayLock  Shared 394

Lock Id   Combined Time (ns)
 SInvalLock15600
   RelCacheInitLock15700
 BufMappingLock31000
CLogControlLock41500
   FirstLockMgrLock   289000
  FreeSpaceLock  3045400
CheckpointStartLock  7371800
  WALInsertLock  9383200
  ProcArrayLock 10457900
 OidGenLock 20005900
 XidGenLock 20331500
 CheckpointLock187067900

bash-3.00# echo 700 users  - Throughput rising
700 users
bash-3.00# ./3_lwlock_acquires.d 19178

Lock Id                  Mode       Count
   RelCacheInitLock   Exclusive   1
 SInvalLock   Exclusive   1
 BufMappingLock   Exclusive   2
  WALInsertLock   Exclusive  17
 CheckpointLock  Shared  33
CheckpointStartLock  Shared  33
 OidGenLock   Exclusive  33
 XidGenLock   Exclusive  33
   FirstLockMgrLock  Shared  81
CheckpointStartLock   Exclusive  87
  FreeSpaceLock   Exclusive 124
 OidGenLock  Shared 125
 XidGenLock  Shared 150
  ProcArrayLock  Shared 500

Lock Id   Combined Time (ns)
   RelCacheInitLock15100
 SInvalLock15400
 BufMappingLock47400
   FirstLockMgrLock  3021000
  FreeSpaceLock  3794300
  WALInsertLock  7567300
 XidGenLock 18427400
  ProcArrayLock 20884000
CheckpointStartLock 24084900
 OidGenLock 26399500
 CheckpointLock256549800

bash-3.00# echo 800 users  - Throughput rising
800 users
bash-3.00# ./3_lwlock_acquires.d 19178

Lock Id                  Mode       Count
 BufMappingLock   Exclusive   1
   RelCacheInitLock   Exclusive   1
 SInvalLock   Exclusive   1
   WALWriteLock   Exclusive   1
  WALInsertLock 

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 EState is about 8300 bytes,

What?

(gdb) p sizeof(EState)
$1 = 112

This is on a 32-bit machine, but even on 64-bit it wouldn't be more than
double that.

 Would it be worth a special case in the palloc system to avoid having to
 repeatedly issue external memory allocation calls?

The appropriate hack would be to change the AllocSetContextCreate
initial-size parameter for the containing context.  But I really have
no idea what you're on about.

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Simon Riggs
On Mon, 2007-07-23 at 10:11 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  EState is about 8300 bytes,
 
 What?
 
 (gdb) p sizeof(EState)
 $1 = 112
 
 This is on a 32-bit machine, but even on 64-bit it wouldn't be more than
 double that.
 
  Would it be worth a special case in the palloc system to avoid having to
  repeatedly issue external memory allocation calls?
 
 The appropriate hack would be to change the AllocSetContextCreate
 initial-size parameter for the containing context.  But I really have
 no idea what you're on about.

I looked at this last May and my notes say ExecutorState. I guess that
was wrong, but my analysis showed there was a single malloc of 8228
bytes happening once per query during my tests. 

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I looked at this last May and my notes say ExecutorState. I guess that
 was wrong, but my analysis showed there was a single malloc of 8228
 bytes happening once per query during my tests. 

Well, if you can track down where it's coming from, we could certainly
hack the containing context's parameters.  But EState's not it.

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Simon Riggs
On Mon, 2007-07-23 at 10:54 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I looked at this last May and my notes say ExecutorState. I guess that
  was wrong, but my analysis showed there was a single malloc of 8228
  bytes happening once per query during my tests. 
 
 Well, if you can track down where it's coming from, we could certainly
 hack the containing context's parameters.  But EState's not it.

Well, I discover there is an allocation of 8232 (inflation...) made once
per statement by a memory context called... ExecutorState. Still not
sure exactly which allocation this is, but it's definitely once per
statement on pgbench, which should narrow it down. Plan, query etc?

I don't see a way to hack the allocation, since the max chunk size is
8K.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Well, I discover there is an allocation of 8232 (inflation...) made once
 per statement by a memory context called... ExecutorState. Still not
 sure exactly which allocation this is, but its definitely once per
 statement on pgbench, which should narrow it down. Plan, query etc?

Are you working with stock sources?  The only allocation exceeding 1K
that I can see during pgbench is BTScanOpaqueData, which is 6600 bytes.
(Checked by setting a conditional breakpoint on AllocSetAlloc.)  The
path that allocates a single-chunk block is never taken at all.
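
For anyone repeating that check, a gdb recipe along these lines works
(AllocSetAlloc's second parameter is the request size):

$ gdb -p <backend pid>
(gdb) break AllocSetAlloc if size > 1024
(gdb) commands
> silent
> bt 5
> cont
> end
(gdb) cont

Each large request then prints a short backtrace showing who asked for it.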

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Simon Riggs
On Mon, 2007-07-23 at 16:48 +0100, Simon Riggs wrote:
 On Mon, 2007-07-23 at 10:54 -0400, Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   I looked at this last May and my notes say ExecutorState. I guess that
   was wrong, but my analysis showed there was a single malloc of 8228
   bytes happening once per query during my tests. 
  
  Well, if you can track down where it's coming from, we could certainly
  hack the containing context's parameters.  But EState's not it.
 
 Well, I discover there is an allocation of 8232 (inflation...) made once
 per statement by a memory context called... ExecutorState. Still not
 sure exactly which allocation this is, but its definitely once per
 statement on pgbench, which should narrow it down. Plan, query etc?
 
 I don't see a way to hack the allocation, since the max chunk size is
 8K.

It is the allocation of BTScanOpaqueData called from btrescan() in
nbtree.c

currPos and markPos are defined as BTScanPosData, which contains an array
of BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which
seems wasteful since markPos is only ever used during merge joins. Most
of that space isn't even used during merge joins either; we just do that
to slightly optimise the speed of the restore during merge joins.

Seems like we should allocate the memory when we do the first mark.
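
A rough sketch of that idea, with markPos turned into a pointer that only
the first btmarkpos() call populates (names follow nbtree, but this is a
sketch, not an actual patch):

/* Sketch: make so->markPos a BTScanPosData pointer rather than an inline
 * struct and allocate it lazily on the first mark.  Scans that never
 * mark (everything but merge joins) then avoid about half of the
 * 8232-byte allocation. */
static void
btmarkpos_sketch(BTScanOpaque so)
{
	if (so->markPos == NULL)
		so->markPos = (BTScanPosData *) palloc(sizeof(BTScanPosData));
	memcpy(so->markPos, &so->currPos, sizeof(BTScanPosData));
}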

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Simon Riggs
On Mon, 2007-07-23 at 12:35 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Well, I discover there is an allocation of 8232 (inflation...) made once
  per statement by a memory context called... ExecutorState. Still not
  sure exactly which allocation this is, but its definitely once per
  statement on pgbench, which should narrow it down. Plan, query etc?
 
 Are you working with stock sources?  The only allocation exceeding 1K
 that I can see during pgbench is BTScanOpaqueData, which is 6600 bytes.
 (Checked by setting a conditional breakpoint on AllocSetAlloc.)  The
 path that allocates a single-chunk block is never taken at all.

I do have the bitmap patch currently applied, but it doesn't touch that
part of the code.

(gdb) p size
$1 = 8232

(gdb) p sizeof(int)
$2 = 4

(gdb) p sizeof(BTScanPosData)
$3 = 4104

Since my notes say I got 8228 last year, seems reasonable.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 currPos and markPos are defined as BTScanPosData, which is an array of
 BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which
 seems wasteful since markPos is only ever used during merge joins. Most
 of that space isn't even used during merge joins either, we just do that
 to slightly optimise the speed of the restore during merge joins.

Ah.  I was seeing it as 6600 bytes on HPPA and 6608 on x86_64, but
I forgot that both of those architectures have MAXALIGN = 8.  On a
MAXALIGN = 4 machine, MaxIndexTuplesPerPage will be significantly
larger, leading to larger BTScanPosData.

Not sure it's worth fooling with, given that these days almost everyone
who's seriously concerned about performance is probably using 64bit
hardware.  One less malloc cycle per indexscan is never going to be a
measurable savings anyway...

regards, tom lane



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-23 Thread Simon Riggs
On Mon, 2007-07-23 at 14:19 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  currPos and markPos are defined as BTScanPosData, which is an array of
  BTScanPosItems. That makes BTScanOpaqueData up to 8232 bytes, which
  seems wasteful since markPos is only ever used during merge joins. Most
  of that space isn't even used during merge joins either, we just do that
  to slightly optimise the speed of the restore during merge joins.
 
 Ah.  I was seeing it as 6600 bytes on HPPA and 6608 on x86_64, but
 I forgot that both of those architectures have MAXALIGN = 8.  On a
 MAXALIGN = 4 machine, MaxIndexTuplesPerPage will be significantly
 larger, leading to larger BTScanPosData.
 
 Not sure it's worth fooling with, given that these days almost everyone
 who's seriously concerned about performance is probably using 64bit
 hardware.  One less malloc cycle per indexscan is never going to be a
 measurable savings anyway...

Oh sure, I was thinking of avoiding Solaris' mutex by avoiding malloc()
completely.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 Josh Berkus [EMAIL PROTECTED] writes:

 That's an interesting thought.  Let me check lock counts and see if this is 
 possibly the case.

 AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
 space entirely

I assume you've checked the server logs and are sure that you aren't in fact
getting errors. I could, for example, envision a situation where a fraction of
the transactions are getting some error and those transactions are therefore
not being counted against the txn/s result.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Greg Smith

On Thu, 19 Jul 2007, Josh Berkus wrote:

It's a TPCC-like workload, so heavy single-row updates, and the 
updates/inserts are what's being measured.


There's so much going on with a TPC-C kind of workload.  Has anyone ever 
looked into quantifying scaling for more fundamental operations?  There 
are so many places a complicated workload could get caught up that 
starting there is hard.  I've found it's helpful to see the breaking 
points for simpler operations, then compare how things change as each new 
transaction element is introduced.


As an example, take a look at the MySQL SysBench tool:
http://sysbench.sourceforge.net/docs/

Specifically their oltp tests.  Note how you can get a handle on how 
simple selects scale, then simple inserts, then updates, and so on.  The 
only thing I've thought of they missed is testing a trivial operation that 
doesn't even touch the buffer cache ('SELECT 1'?) that would let you 
quantify just general connection scaling issues.
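
That trivial case is easy to approximate with stock pgbench and a custom
script file (sketch; the database name is a placeholder and -c/-t are to
taste):

$ echo "SELECT 1;" > select1.sql
$ pgbench -n -f select1.sql -c 100 -t 1000 testdb
  # -n skips vacuuming; -f runs the custom script as each transaction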


It seems to me that you could narrow the list of possible causes here much 
more quickly if you had a good handle on the upper concurrency of 
lower-level operations.


[Note:  it's possible to run SysBench against a PG database, but the code 
is very immature.  Last time I tried there were plenty of crashes and 
there seemed to be some transaction wrapping issues that caused deadlocks 
with some tests.]


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Josh Berkus

Greg,

There's so much going on with a TPC-C kind of workload.  Has anyone ever 
looked into quantifying scaling for more fundamental operations?  There 
are so many places a complicated workload could get caught up that 
starting there is hard.  I've found it's helpful to see the breaking 
points for simpler operations, then compare how things change as each 
new transaction element is introduced.


... eagerly awaiting Michael Doilson's PgUnitTest 

--Josh



Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah

Awww Josh,

I was just enjoying the chat on the picket fence! :-)

Anyway the workload is mixed (reads, writes) with simple to medium 
queries. The workload is known to scale well. But in order to provide 
substantial input I am still trying to eliminate things that can 
bottleneck. Currently I have eliminated CPU (plenty free), RAM 
(memory is 48GB RAM in this server for a 32-bit postgresql instance), 
and IO storage (used the free RAM for the /tmp database to eliminate IO), 
and am still trying to eliminate any network bottlenecks to say for sure 
we have a problem in PostgreSQL. But yes, till that final thing is 
confirmed (the network, which can very well be the case) it could be a 
problem somewhere else. However the thing that worries me is more the big 
drop, instead of throughput remaining constant out there..


Anyway, more on this within a day or two once I add more network NICs 
between the systems to eliminate network problems (even though stats 
don't show them as problems right now) and also reduce malloc lock 
penalties if any.


As for other questions:

max_locks_per_transaction is set to the default (10 I believe); increasing 
it still seems to degrade the overall throughput number.


max_connections is set to 1500 for now, till I get decent scaling up to 
1400-1500 users.


There are no hard failures reported anywhere. The log min duration output 
does show that queries are now slowing down and taking longer.


The OS is not swapping, and I have also eliminated IO by putting the whole 
database on /tmp.


So while I finish adding more network connections between the two 
systems (need to get cards) do enjoy the following URL :-)


http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html

Of course all postgresql.conf still remains from the old test so no 
flames on that one again :-)


Regards,
Jignesh




Josh Berkus wrote:

Tom,

  

Well, if the load is a lot of short writing transactions then you'd
expect the throughput to depend on how fast stuff can be pushed down to
WAL.  What have you got wal_buffers set to?  Are you using a commit
delay?  What's the I/O system anyway (any BB write cache on the WAL
disk?) and what wal sync method are you using?



You know, I think Jignesh needs to be on this list so I can stop relaying 
questions on a workload I didn't design.  Let me get him.


  




Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
I forgot to add one more piece of information.. I also tried the same 
test with 64-bit postgresql with 6GB shared_buffers, and the results are 
the same; it drops around the same point, which to me sounds like a 
bottleneck..


More later

-Jignesh


Jignesh K. Shah wrote:

Awww Josh,

I was just enjoying the chat on the picket fence! :-)

Anyway the workload is mixed (reads,writes) with simple to medium 
queries. The workload is known to scale well. But inorder to provide 
substantial input I am still trying to eliminate things that can 
bottleneck.  Currently I have eliminated CPU (plenty free) , RAM 
(memory  is 48GB RAM in this server for  a 32-bit postgresql 
instance), IO Storage (used the free ram to do /tmp database to 
eliminate IO)  and am still trying to eliminate any network 
bottlenecks to say for sure we have a problem in PostgreSQL. But yes 
till that final thing is confirmed (network which can very well be the 
case) it could be a problem somewhere else. However the thing that 
worries me is more of the big drop instead of remaining constant out 
there..


Anyway more on this within a day or two once I add more network nics 
between the systems to eliminate network problems (even though stats 
dont show them as problems right now) and also reduce malloc lock 
penalties if any.


As for other questions:

max_locks_per_transactions is set to default (10 I believe) increasing 
it still seems to degrade overall throughput number.


max_connections is set to 1500 for now till I get decent scaling till 
1400-1500 users.


There are no hard failures reported anywhere. Log min durations does 
show that queries are now slowing down and taking longer.


OS is not swapping and also eliminated IO  by putting the whole 
database on /tmp


So while I finish adding more network connections between the two 
systems (need to get cards) do enjoy the following URL :-)


http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070703-00073.html 



Of course all postgresql.conf still remains from the old test so no 
flames on that one again :-)


Regards,
Jignesh




Josh Berkus wrote:

Tom,

 

Well, if the load is a lot of short writing transactions then you'd
expect the throughput to depend on how fast stuff can be pushed down to
WAL.  What have you got wal_buffers set to?  Are you using a commit
delay?  What's the I/O system anyway (any BB write cache on the WAL
disk?) and what wal sync method are you using?



You know, I think Jignesh needs to me on this list so I can stop 
relaying questions on a workload I didn't design.  Let me get him.


  


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 There are no hard failures reported anywhere. Log min durations does 
 show that queries are now slowing down and taking longer.
 OS is not swapping and also eliminated IO  by putting the whole database 
 on /tmp

Hmm.  Do you see any evidence of a context swap storm (ie, a drastic
increase in the context swaps/second reading reported by vmstat)?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
Yes I did see increase in context switches and CPU migrations at that 
point using mpstat.


Regards,
Jignesh


Tom Lane wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:
  
There are no hard failures reported anywhere. Log min durations does 
show that queries are now slowing down and taking longer.
OS is not swapping and also eliminated IO  by putting the whole database 
on /tmp



Hmm.  Do you see any evidence of a context swap storm (ie, a drastic
increase in the context swaps/second reading reported by vmstat)?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match
  


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah


Tom Lane wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:
Yes I did see increase in context switches and CPU migrations at that 
point using mpstat.


So follow that up --- try to determine which lock is being contended
for.  There's some very crude code in the sources that you can enable
with -DLWLOCK_STATS, but probably DTrace would be a better tool.

regards, tom lane


Using plockstat -A -s 5 -p $pid

on bgwriter:  doesnt report anything

On one of the many connections:

This one is hard to read easily
By default, plockstat monitors all lock con-
tention events, gathers  frequency  and  timing  data  about
those  events, and displays the data in decreasing frequency
order, so that the most common events appear first.


^Cbash-3.00# plockstat -A -s 5  -p 6401
^C
Mutex hold

---
Count nsec Lock Caller
  59   186888 0x10059e280  libumem.so.1`process_free+0x12c

 nsec  Time Distribution --- count Stack
16384 || 1 libumem.so.1`process_free+0x12c
32768 |@   |14 postgres`AllocSetDelete+0x98
65536 |@@  | 5 
postgres`MemoryContextDelete+0x78

   131072 || 0 postgres`CommitTransaction+0x240
   262144 |@@@ |39
---
Count nsec Lock Caller
 53012226 0x10059e280  
libumem.so.1`umem_cache_alloc+0x200


 nsec  Time Distribution --- count Stack
 4096 |@@@ |   338 
libumem.so.1`umem_cache_alloc+0x200

 8192 |@   |24 libumem.so.1`umem_alloc+0x5c
16384 |@   |37 libumem.so.1`malloc+0x40
32768 |@   |   131 postgres`AllocSetAlloc+0x1c4
---
Count nsec Lock Caller
 32410214 0x100578030  libumem.so.1`vmem_xfree+0x164

 nsec  Time Distribution --- count Stack
 4096 |@@  |   192 libumem.so.1`vmem_xfree+0x164
 8192 ||56 libumem.so.1`process_free+0x12c
16384 |@   |26 postgres`AllocSetDelete+0x98
32768 |@@@ |50 
postgres`MemoryContextDelete+0x78

---
Count nsec Lock Caller
 16113585 0x10059e280  libumem.so.1`process_free+0x12c

 nsec  Time Distribution --- count Stack
 4096 |@   |   118 libumem.so.1`process_free+0x12c
 8192 || 4 postgres`AllocSetDelete+0x98
16384 |@   |10 
postgres`MemoryContextDelete+0x78

32768 |@@@ |24 postgres`PortalDrop+0x160
65536 || 3
   131072 || 0
   262144 || 2
---
Count nsec Lock Caller
 326 6081 libumem.so.1`vmem0+0xc38 libumem.so.1`vmem_xalloc+0x630

 nsec  Time Distribution --- count Stack
 4096 ||   170 libumem.so.1`vmem_xalloc+0x630
 8192 |@@@ |   155 libumem.so.1`vmem_alloc+0x1f8
16384 || 1 libumem.so.1`vmem_xalloc+0x524
   libumem.so.1`vmem_alloc+0x1f8
---
Count nsec Lock Caller
 326 5867 libumem.so.1`vmem0+0x30  libumem.so.1`vmem_alloc+0x248

 nsec  Time Distribution --- count Stack
 4096 |@   |   185 libumem.so.1`vmem_alloc+0x248
 8192 |@@  |   141 
libumem.so.1`vmem_sbrk_alloc+0x30

   libumem.so.1`vmem_xalloc+0x524
   libumem.so.1`vmem_alloc+0x1f8
---
Count nsec Lock Caller
 318 5873 0x100578030  libumem.so.1`vmem_alloc+0x1d0

 nsec  Time Distribution --- count Stack
 4096 |@   |   228 libumem.so.1`vmem_alloc+0x1d0
 8192 |@   |78 libumem.so.1`umem_alloc+0xec
16384 || 6 libumem.so.1`malloc+0x40
32768 || 6 postgres`AllocSetAlloc+0x1c4
---
Count nsec Lock Caller
 326 5591 

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 So follow that up --- try to determine which lock is being contended
 for.  There's some very crude code in the sources that you can enable
 with -DLWLOCK_STATS, but probably DTrace would be a better tool.

 Using plockstat -A -s 5 -p $pid

I don't know what that is, but it doesn't appear to have anything to do
with Postgres LWLocks or spinlocks, which are the locks I was thinking of.
Try asking Robert Lor about this --- IIRC he had some dtrace probes to
work with our locks.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah

sorry..

The are solaris mutex locks used by the postgresql process.

What its saying is that there are holds/waits in trying to get locks 
which are locked at Solaris user library levels called from the  
postgresql functions:
For example both the following functions are hitting on the same mutex 
lock  0x10059e280  in Solaris Library call:

postgres`AllocSetDelete+0x98
postgres`AllocSetAlloc+0x1c4


I need to enable the DTrace probes on my builds

-Jignesh

Tom Lane wrote:

Jignesh K. Shah [EMAIL PROTECTED] writes:
  

Tom Lane wrote:


So follow that up --- try to determine which lock is being contended
for.  There's some very crude code in the sources that you can enable
with -DLWLOCK_STATS, but probably DTrace would be a better tool.
  


  

Using plockstat -A -s 5 -p $pid



I don't know what that is, but it doesn't appear to have anything to do
with Postgres LWLocks or spinlocks, which are the locks I was thinking of.
Try asking Robert Lor about this --- IIRC he had some dtrace probes to
work with our locks.

regards, tom lane
  


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 What its saying is that there are holds/waits in trying to get locks 
 which are locked at Solaris user library levels called from the  
 postgresql functions:
 For example both the following functions are hitting on the same mutex 
 lock  0x10059e280  in Solaris Library call:
 postgres`AllocSetDelete+0x98
 postgres`AllocSetAlloc+0x1c4

That's a perfect example of the sort of useless overhead that I was
complaining of just now in pgsql-patches.  Having malloc/free use
an internal mutex is necessary in multi-threaded programs, but the
backend isn't multi-threaded.  And yet, apparently you can't turn
that off in Solaris.

(Fortunately, the palloc layer is probably insulating us from malloc's
performance enough that this isn't a huge deal.  But it's annoying.)

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Jignesh K. Shah
True you cant switch off the locks since libthread has been folded into 
libc in Solaris 10.


Anyway just to give you an idea of the increase in context switching at 
the break point here are the mpstat (taken at 10 second interval) on 
this 8-socket Sun Fire V890.


The low icsw (Involuntary Context Switches) is about 950-1000 user mark 
after which a context switch storm starts at users above 1000-1050 mark 
and drops in total throughput drops about 30% instantaneously.. I will 
try rebuilding the postgresql with dtrace probes to get more clues. 
(NOTE you will see 1 cpu (cpuid:22) doing more system work... thats the 
one doing handling the network interrupts)



CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 0   57   0   27   1086 4072   98 1749  4161  7763   47  13   0  40
 1   46   0   24226 4198   11 1826  4270  7547   45  13   0  42
 2   42   0   34   1048 4103   91 1682  4241  7797   46  13   0  41
 3   51   0   22216 4125   10 1734  4350  7399   45  13   0  43
 4   65   0   27196 40158 1706  4110  7292   44  15   0  41
 5   54   0   21216 4297   10 1702  4640  7708   45  13   0  42
 6   36   0   1666   47 4218   12 1713  4260  7685   47  11   0  42
 7   40   0  100   318  206 3699   10 1534  5850  6851   45  14   0  41
16   41   0   30875 3780   78 1509  4011  7604   45  13   0  42
17   39   0   24225 3970   12 1631  4080  7265   44  12   0  44
18   42   0   24995 3829   89 1519  4011  7343   45  12   0  43
19   39   0   31 788305 35888 1509  4000  6629   43  13   0  44
20   22   0   20196 39259 1577  4190  7364   44  12   0  44
21   38   0   31235 3792   13 1566  4070  7133   45  12   0  44
228   0  110  7053 7045 16418  728  8380  2917   16  50   0  33
23   62   0   29215 3985   10 1579  4490  7368   44  12   0  44
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 0   13   0   27   1236 4228  113 1820  4331  8084   49  13   0  38
 1   16   0   63266 4253   15 1875  4200  7754   47  14   0  39
 2   11   0   31   1108 4178   97 1741  4251  8095   48  14   0  38
 38   0   24206 42579 1818  4440  7807   47  13   0  40
 4   13   0   54286 4145   17 1774  4261  7732   46  16   0  38
 5   12   0   35236 4412   12 1775  4470  8249   48  13   0  39
 68   0   2438   15 4323   14 1760  4220  8016   49  11   0  39
 78   0  120   323  206 3801   15 1599  6350  7290   47  15   0  38
16   11   0   44   1075 3896   98 1582  3931  7997   47  15   0  39
17   15   0   29245 4120   14 1716  4160  7648   46  13   0  41
189   0   35   1135 3933  103 1594  3991  7714   47  13   0  40
198   0   34 832715 3702   12 1564  4030  7010   45  14   0  41
207   0   28276 3997   16 1624  4000  7676   46  13   0  41
218   0   28255 3997   15 1664  4020  7658   47  12   0  41
224   0   97  7741 7731 1586   11  704  9060  2933   17  51   0  32
23   13   0   28255 4144   15 1658  4370  7810   47  12   0  41
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 00   0  141   3156 9262  301 2812  3300 10905   49  16   0  35
 11   0  153   1996 9400  186 2808  3120 11066   48  16   0  37
 20   0  140   2568 8798  242 2592  3100 10111   47  15   0  38
 31   0  141   1896 8803  172 2592  3140 10171   47  15   0  39
 40   0  120   2146 9540  207 2801  3220 10531   46  17   0  36
 51   0  152   1806 8764  161 2564  3420  9904   47  15   0  38
 61   0  107   344  148 8180  181 2512  2900  9314   51  14   0  35
 70   0  665   443  204 8733  153 2574  4040  9892   43  21   0  37
160   0  113   2175 6446  201 1975  2650  7552   45  12   0  44
170   0  107   1535 6568  140 2021  2740  7586   44  11   0  45
180   0  121   2155 6072  201 1789  2761  7690   44  12   0  44
191   0  102 471425 6123  126 1829  2620  7185   43  12   0  45
200   0  102   1436 6451  129 1939  2620  7450   43  13   0  44
211   0  106   1505 6538  133 1997  2850  7425   44  11   0  44
220   0  494  5949 5876 3586   73 1040  3990  4058   26  39   0  34
230   0  102   1595 6393  142 1942  3240  7226   43  12   0  46
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 00   0  217   4417 10763  426 3234  3630 12449   47  18   
0  35
 10   0  210   3227 3  309 3273  3510 12527   46  17   
0  37
 21   0  212   3878 10306  370 2977  3540 11320   45  16   
0  38
 30   0  230   2767 10332  257 2947  3410 11901   43  16   
0  40
 40   0  234   3067 11324  290 3265  3520 

Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 Yes I did see increase in context switches and CPU migrations at that 
 point using mpstat.

So follow that up --- try to determine which lock is being contended
for.  There's some very crude code in the sources that you can enable
with -DLWLOCK_STATS, but probably DTrace would be a better tool.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread David Boreham

Tom Lane wrote:

Having malloc/free use
an internal mutex is necessary in multi-threaded programs, but the
backend isn't multi-threaded.  
  
Hmm...confused. I'm not following why then there is contention for the 
mutex.
Surely this has to be some other mutex that is in contention, not a heap 
lock ?


It'd be handy to see the call stack for the wait state -- if the thing 
is spending
a significant proportion of its time in contention it should be easy to 
get that with

a simple tool such as pstack or a debugger.



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-20 Thread Tom Lane
David Boreham [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Having malloc/free use
 an internal mutex is necessary in multi-threaded programs, but the
 backend isn't multi-threaded.  

 Hmm...confused. I'm not following why then there is contention for the 
 mutex.

There isn't any contention for that mutex; Jignesh's results merely show
that it was taken and released a lot of times.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Folks,

I've run into this a number of times with various PostgreSQL users, so we 
tested it at Sun.  What seems to be happening is that at some specific number 
of connections average throughput drops 30% and response time quadruples or 
worse.  The amount seems to vary per machine; I've seen it as variously 95, 
1050, 1700 or 2800 connections.  Tinkering with postgresql.conf parameters 
doesn't seem to affect this threshold.

As an example of this behavior:

Users   Txn/User  Resp. Time
50  105.38  0.01
100 113.05  0.01
150 114.05  0.01
200 113.51  0.01
250 113.38  0.01
300 112.14  0.01
350 112.26  0.01
400 111.43  0.01
450 110.72  0.01
500 110.44  0.01
550 109.36  0.01
600 107.01  0.02
650 105.71  0.02
700 106.95  0.02
750 107.69  0.02
800 106.78  0.02
850 108.59  0.02
900 106.03  0.02
950 106.13  0.02
100064.58   0.15
105052.32   0.23
110049.79   0.25

Tinkering with shared_buffers has had no effect on this threholding (the above 
was with 3gb to 6gb of shared_buffers).   Any ideas on where we should look 
for the source of the bottleneck?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Joshua D. Drake

Josh Berkus wrote:

Folks,




650 105.71  0.02
700 106.95  0.02
750 107.69  0.02
800 106.78  0.02
850 108.59  0.02
900 106.03  0.02
950 106.13  0.02
100064.58   0.15
105052.32   0.23
110049.79   0.25

Tinkering with shared_buffers has had no effect on this threholding (the above 
was with 3gb to 6gb of shared_buffers).   Any ideas on where we should look 
for the source of the bottleneck?


I have seen this as well. I always knocked it up to PG having to 
managing so many connections but there are some interesting evidences to 
review.


The amount of memory each connection takes up. Consider 4-11 meg per 
connection depending on various things like number of prepared queries.


Number of CPUs. Obviously 500 connections over 4 CPUS isn't the same as 
500 connections over 8 CPUS.


That number of connections generally means a higher velocity, a higher 
velocity means more checkpoint segments. Wrong settings with your 
checkpoint segments, bgwriter and checkpoint will cause you to start 
falling down.


I would also note that our experience is that PG falls down a little 
higher, more toward 2500 connections last time I checked, but this was 
likely on different hardware.


Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Alvaro Herrera
Josh Berkus wrote:
 Folks,
 
 I've run into this a number of times with various PostgreSQL users, so we 
 tested it at Sun.  What seems to be happening is that at some specific number 
 of connections average throughput drops 30% and response time quadruples or 
 worse.  The amount seems to vary per machine; I've seen it as variously 95, 
 1050, 1700 or 2800 connections.  Tinkering with postgresql.conf parameters 
 doesn't seem to affect this threshold.
 
 As an example of this behavior:
 
 Users Txn/User  Resp. Time
 50105.38  0.01
 100   113.05  0.01
 150   114.05  0.01
 200   113.51  0.01
 250   113.38  0.01
 300   112.14  0.01
 350   112.26  0.01
 400   111.43  0.01
 450   110.72  0.01
 500   110.44  0.01
 550   109.36  0.01
 600   107.01  0.02
 650   105.71  0.02
 700   106.95  0.02
 750   107.69  0.02
 800   106.78  0.02
 850   108.59  0.02
 900   106.03  0.02
 950   106.13  0.02
 1000  64.58   0.15
 1050  52.32   0.23
 1100  49.79   0.25
 
 Tinkering with shared_buffers has had no effect on this threholding (the 
 above 
 was with 3gb to 6gb of shared_buffers).   Any ideas on where we should look 
 for the source of the bottleneck?

Have you messed with max_connections and/or max_locks_per_transaction
while testing this?  The lock table is sized to max_locks_per_xact times
max_connections, and shared memory hash tables get slower when they are
full.  Of course, the saturation point would depend on the avg number of
locks acquired per user, which would explain why you are seeing a lower
number for some users and higher for others (simpler/more complex
queries).

This is just a guess though.  No profiling or measuring at all, really.

-- 
Alvaro Herrera  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
How amazing is that? I call it a night and come back to find that a bug has
been identified and patched while I sleep.(Robert Davidson)
   http://archives.postgresql.org/pgsql-sql/2006-03/msg00378.php

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Alvaro,

 Have you messed with max_connections and/or max_locks_per_transaction
 while testing this?  The lock table is sized to max_locks_per_xact times
 max_connections, and shared memory hash tables get slower when they are
 full.  Of course, the saturation point would depend on the avg number of
 locks acquired per user, which would explain why you are seeing a lower
 number for some users and higher for others (simpler/more complex
 queries).

That's an interesting thought.  Let me check lock counts and see if this is 
possibly the case.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Alvaro,
 Have you messed with max_connections and/or max_locks_per_transaction
 while testing this?  The lock table is sized to max_locks_per_xact times
 max_connections, and shared memory hash tables get slower when they are
 full.  Of course, the saturation point would depend on the avg number of
 locks acquired per user, which would explain why you are seeing a lower
 number for some users and higher for others (simpler/more complex
 queries).

 That's an interesting thought.  Let me check lock counts and see if this is 
 possibly the case.

AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
space entirely; and the fact that you can continue the curve upwards
says that you're not on the edge of running out.  However I agree that
it's worth experimenting with those two parameters to see if the curve
moves around at all.

Another resource that might be interesting is the number of open files.

Also, have you tried watching vmstat or local equivalent to confirm that
the machine's not starting to swap?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Alvaro Herrera
Tom Lane wrote:
 Josh Berkus [EMAIL PROTECTED] writes:
  Alvaro,
  Have you messed with max_connections and/or max_locks_per_transaction
  while testing this?  The lock table is sized to max_locks_per_xact times
  max_connections, and shared memory hash tables get slower when they are
  full.  Of course, the saturation point would depend on the avg number of
  locks acquired per user, which would explain why you are seeing a lower
  number for some users and higher for others (simpler/more complex
  queries).
 
  That's an interesting thought.  Let me check lock counts and see if this is 
  possibly the case.
 
 AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
 space entirely;

Well, if there still is shared memory available, the lock hash can
continue to grow, but it would slow down according to this comment in
ShmemInitHash:

 * max_size is the estimated maximum number of hashtable entries.  This is
 * not a hard limit, but the access efficiency will degrade if it is
 * exceeded substantially (since it's used to compute directory size and
 * the hash table buckets will get overfull).

For the lock hash tables this max_size is
(MaxBackends+max_prepared_xacts) * max_locks_per_xact.

So maybe this does not make much sense in normal operation, thus not
applicable to what Josh Berkus is reporting.

However I was talking to Josh Drake yesterday and he told me that
pg_dump was spending some significant amount of time in LOCK TABLE when
there are lots of tables (say 300k).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 AFAIK you'd get hard failures, not slowdowns, if you ran out of lock
 space entirely;

 Well, if there still is shared memory available, the lock hash can
 continue to grow, but it would slow down according to this comment in
 ShmemInitHash:

Right, but there's not an enormous amount of headroom in shared memory
beyond the intended size of the hash tables.  I'd think that you'd start
seeing hard failures not very far beyond the point at which performance
impacts became visible.  Of course this is all speculation; I quite
agree with varying the table-size parameters to see if it makes a
difference.

Josh, what sort of workload is being tested here --- read-mostly,
write-mostly, a mixture?

 However I was talking to Josh Drake yesterday and he told me that
 pg_dump was spending some significant amount of time in LOCK TABLE when
 there are lots of tables (say 300k).

I wouldn't be too surprised if there's some O(N^2) effects when a single
transaction holds that many locks, because of the linked-list proclock
data structures.  This would not be relevant to Josh's case though.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Joshua D. Drake

Alvaro Herrera wrote:

Tom Lane wrote:

Josh Berkus [EMAIL PROTECTED] writes:



So maybe this does not make much sense in normal operation, thus not
applicable to what Josh Berkus is reporting.

However I was talking to Josh Drake yesterday and he told me that
pg_dump was spending some significant amount of time in LOCK TABLE when
there are lots of tables (say 300k).


Less, 128k

Joshua D. Drake







---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Greg Smith

On Thu, 19 Jul 2007, Josh Berkus wrote:

What seems to be happening is that at some specific number of 
connections average throughput drops 30% and response time quadruples or 
worse.


Could you characterize what each connection is doing and how you're 
generating the load?  I don't know how productive speculating about the 
cause here will be until there's a test script available so other people 
can see where the tipping point is on their system.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Tom, all:

 Also, have you tried watching vmstat or local equivalent to confirm that
 the machine's not starting to swap?

We're not swapping.

 Josh, what sort of workload is being tested here --- read-mostly,
 write-mostly, a mixture?

It's a TPCC-like workload, so heavy single-row updates, and the 
updates/inserts are what's being measured.  For that matter, when I've seen 
this before it was with heavy-write workloads and we were measuring the 
number of updates/inserts and not the number of reads.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Josh, what sort of workload is being tested here --- read-mostly,
 write-mostly, a mixture?

 It's a TPCC-like workload, so heavy single-row updates, and the 
 updates/inserts are what's being measured.  For that matter, when I've seen 
 this before it was with heavy-write workloads and we were measuring the 
 number of updates/inserts and not the number of reads.

Well, if the load is a lot of short writing transactions then you'd
expect the throughput to depend on how fast stuff can be pushed down to
WAL.  What have you got wal_buffers set to?  Are you using a commit
delay?  What's the I/O system anyway (any BB write cache on the WAL
disk?) and what wal sync method are you using?

While I'm asking questions, exactly what were the data columns you
presented?  Txn/User doesn't make much sense to me, and I'm not sure
what response time you were measuring either.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] User concurrency thresholding: where do I look?

2007-07-19 Thread Josh Berkus
Tom,

 Well, if the load is a lot of short writing transactions then you'd
 expect the throughput to depend on how fast stuff can be pushed down to
 WAL.  What have you got wal_buffers set to?  Are you using a commit
 delay?  What's the I/O system anyway (any BB write cache on the WAL
 disk?) and what wal sync method are you using?

You know, I think Jignesh needs to me on this list so I can stop relaying 
questions on a workload I didn't design.  Let me get him.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend