Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2008-03-13 Thread Simon Riggs
On Wed, 2008-03-12 at 20:13 -0400, Bruce Momjian wrote:
> Is this a TODO?  Tom's reply was:

The general topic, yes. The caveats still apply.

> > Nonsense.  Main transaction exit also takes an exclusive lock, and is
> > far more likely to be exercised in typical workloads than a
> > subtransaction abort.
> > 
> > In any case: there has still not been any evidence presented by anyone
> > that optimizing XidCacheRemoveRunningXids will help one bit.  Given the
> > difficulty of measuring any benefit from the last couple of
> > optimizations in this general area, I'm thinking that such evidence
> > will be hard to come by.  And we have got way more than enough on our
> > plates already.  Can we let go of this for 8.3, please?
> 
-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2008-03-12 Thread Bruce Momjian

Is this a TODO?  Tom's reply was:

> Nonsense.  Main transaction exit also takes an exclusive lock, and is
> far more likely to be exercised in typical workloads than a
> subtransaction abort.
> 
> In any case: there has still not been any evidence presented by anyone
> that optimizing XidCacheRemoveRunningXids will help one bit.  Given the
> difficulty of measuring any benefit from the last couple of
> optimizations in this general area, I'm thinking that such evidence
> will be hard to come by.  And we have got way more than enough on our
> plates already.  Can we let go of this for 8.3, please?

---

Simon Riggs wrote:
> On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote:
> 
> > Anyway, given that there's this one nonobvious gotcha, there might be
> > others.  My recommendation is that we take this off the open-items list
> > for 8.2 and revisit it in the 8.3 cycle when there's more time.
> 
> Well, it's still 8.3, just...
> 
> As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
> and Scalability", XidCacheRemoveRunningXids() is now the only holder of
> an X lock during normal processing, so I would like to remove it. 
> Here's how:
> 
> Currently, we take the lock, remove the subxact and then shuffle down
> all the other subxactIds so that the subxact cache is contiguous.
> 
> I propose that we simply zero out the subxact entry without re-arranging
> the cache; this will be atomic, so we need not acquire an X lock. We
> then increment ndeletedxids. When we enter a new subxact into the cache,
> if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId
> that we can use, then decrement ndeletedxids. So ndeletedxids is just a
> hint, not an absolute requirement. nxids then becomes the number of
> cache entries and never goes down until EOXact. The subxact cache is no
> longer in order, but then it doesn't need to be either.
> 
> When we take a snapshot we will end up taking a copy of zeroed cache
> entries, so the snapshots will be slightly larger than previously.
> Though still no larger than the max. The size reduction was not so large
> as to make a significant difference across the whole array, so
> scalability is the main issue to resolve.
> 
> The snapshots will be valid with no change, since InvalidTransactionId
> will never match against any recorded Xid.
> 
> I would also like to make the size of the subxact cache configurable
> with a parameter such as subtransaction_cache_size = 64 (default), valid
> range 4-256.
> 

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2007-09-11 Thread Simon Riggs
On Tue, 2007-09-11 at 09:58 -0400, Tom Lane wrote:

>  Can we let go of this for 8.3, please?

OK, we've moved forward, so it's a good place to break.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2007-09-11 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
> and Scalability", XidCacheRemoveRunningXids() is now the only holder of
> an X lock during normal processing,

Nonsense.  Main transaction exit also takes an exclusive lock, and is
far more likely to be exercised in typical workloads than a
subtransaction abort.

In any case: there has still not been any evidence presented by anyone
that optimizing XidCacheRemoveRunningXids will help one bit.  Given the
difficulty of measuring any benefit from the last couple of
optimizations in this general area, I'm thinking that such evidence
will be hard to come by.  And we have got way more than enough on our
plates already.  Can we let go of this for 8.3, please?

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2007-09-11 Thread Simon Riggs
On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote:

> Anyway, given that there's this one nonobvious gotcha, there might be
> others.  My recommendation is that we take this off the open-items list
> for 8.2 and revisit it in the 8.3 cycle when there's more time.

Well, it's still 8.3, just...

As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
and Scalability", XidCacheRemoveRunningXids() is now the only holder of
an X lock during normal processing, so I would like to remove it. 
Here's how:

Currently, we take the lock, remove the subxact and then shuffle down
all the other subxactIds so that the subxact cache is contiguous.

I propose that we simply zero out the subxact entry without re-arranging
the cache; this will be atomic, so we need not acquire an X lock. We
then increment ndeletedxids. When we enter a new subxact into the cache,
if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId
that we can use, then decrement ndeletedxids. So ndeletedxids is just a
hint, not an absolute requirement. nxids then becomes the number of
cache entries and never goes down until EOXact. The subxact cache is no
longer in order, but then it doesn't need to be either.

When we take a snapshot we will end up taking a copy of zeroed cache
entries, so the snapshots will be slightly larger than previously.
Though still no larger than the max. The size reduction was not so large
as to make a significant difference across the whole array, so
scalability is the main issue to resolve.

The snapshots will be valid with no change, since InvalidTransactionId
will never match against any recorded Xid.
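
To make the proposal above concrete, here is a minimal single-process sketch in C. It is only an illustration: the struct, the function names and the fixed size of 64 are hypothetical, not PostgreSQL's actual PGPROC layout, and the sketch does not model the concurrent readers or memory-ordering questions that the lock-free claim depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define SUBXID_CACHE_SIZE 64    /* hypothetical; mirrors the proposed default */

typedef struct
{
    TransactionId xids[SUBXID_CACHE_SIZE];
    int         nxids;          /* slots handed out, including zeroed ones */
    int         ndeletedxids;   /* hint only: how many slots may be zeroed */
} SubxidCache;

/* Remove an XID by overwriting its slot; nothing is shuffled down. */
bool
subxid_cache_remove(SubxidCache *c, TransactionId xid)
{
    assert(xid != InvalidTransactionId);
    for (int i = 0; i < c->nxids; i++)
    {
        if (c->xids[i] == xid)
        {
            c->xids[i] = InvalidTransactionId;  /* one word-sized store */
            c->ndeletedxids++;
            return true;
        }
    }
    return false;
}

/* Add an XID, reusing a zeroed slot when the hint says one may exist. */
bool
subxid_cache_add(SubxidCache *c, TransactionId xid)
{
    assert(xid != InvalidTransactionId);
    if (c->ndeletedxids > 0)
    {
        for (int i = 0; i < c->nxids; i++)
        {
            if (c->xids[i] == InvalidTransactionId)
            {
                c->xids[i] = xid;
                c->ndeletedxids--;
                return true;
            }
        }
        c->ndeletedxids = 0;    /* the hint was stale */
    }
    if (c->nxids >= SUBXID_CACHE_SIZE)
        return false;           /* cache is full: overflow */
    c->xids[c->nxids++] = xid;
    return true;
}
```

After adding 100, 101 and 102 and then removing 101, nxids stays at 3 and slot 1 holds InvalidTransactionId; the next add reuses slot 1 rather than growing the array.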

I would also like to make the size of the subxact cache configurable
with a parameter such as subtransaction_cache_size = 64 (default), valid
range 4-256.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com




Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> --- and because the entries are surely added in increasing XID order,
>> such an array could be binary-searched.  

> If they're only added if they write to disk then isn't it possible to add them
> out of order? Start a child transaction, start a child of that one and write
> to disk, then exit the grandchild and write to disk in the first
> child?

No, because we enforce child XID > parent XID.  In the case above, the
child xact would be given an XID when the grandchild needs one --- see
recursion in AssignSubTransactionId().  The actually slightly shaky
assumption above is that children of the same parent xact must subcommit
in numerical order ... but as long as we have strict nesting of subxacts
I think this must be so.
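
The ordering argument above can be illustrated with a small sketch of lazy, recursive XID assignment. This is a simplification under stated assumptions: XactState, next_xid and assign_xid are hypothetical names, not the real AssignSubTransactionId() code, and XID wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef struct XactState
{
    struct XactState *parent;   /* NULL for the top-level transaction */
    TransactionId xid;          /* InvalidTransactionId until first needed */
} XactState;

/* Hypothetical stand-in for the shared XID counter. */
TransactionId next_xid = 100;

/* Assign an XID on demand, making sure every ancestor gets one first. */
TransactionId
assign_xid(XactState *s)
{
    assert(s != NULL);
    if (s->xid != InvalidTransactionId)
        return s->xid;          /* already assigned */
    if (s->parent != NULL)
        (void) assign_xid(s->parent);   /* parent receives the smaller XID */
    s->xid = next_xid++;
    return s->xid;
}
```

In Gregory's scenario the grandchild asks for an XID first; the recursion hands out XIDs to the top-level transaction and the child before the grandchild, so a child's XID is always greater than its parent's.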

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Gregory Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> --- and because the entries are surely added in increasing XID order,
> such an array could be binary-searched.  

If they're only added if they write to disk then isn't it possible to add them
out of order? Start a child transaction, start a child of that one and write
to disk, then exit the grandchild and write to disk in the first child? I'm
just going on your description, I'm not familiar with this part of the code at
all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I think Theo's problem is probably somewhere else, too --- apparently
>> it's not so much that TransactionIdIsCurrentTransactionId takes a long
>> time as that something is calling it lots of times with no check for
>> interrupt.

> Could it be something like heap_lock_tuple?  It calls MultiXactIdWait,
> which calls GetMultiXactIdMembers and TransactionIdIsCurrentTransactionId
> on each member.  (heap_update and heap_delete do the same thing.)  I
> must admit I didn't read Theo's description of his scenario, though.

He shows HeapTupleSatisfiesSnapshot as the next thing down the call
stack, so those scenarios don't seem quite right.  I'm wondering about a
CHECK_FOR_INTERRUPTS-free loop in either plperl or trigger handling,
myself.

Anyway, I was thinking some more about Theo's original suggestion that
the linked-list representation of childXids was too inefficient.  I'm
disinclined to use a hash as he suggests, but it strikes me that we
could very easily change the list into a dynamically extended array
--- and because the entries are surely added in increasing XID order,
such an array could be binary-searched.  This wouldn't be a win for
very small numbers of child XIDs, but for large numbers it would.
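
A rough sketch of the array idea, for illustration only: ChildXidArray and its functions are hypothetical names, and the plain `<` comparison assumes no XID wraparound, which real code would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

typedef struct
{
    TransactionId *xids;        /* sorted ascending, grown on demand */
    size_t      nxids;
    size_t      capacity;
} ChildXidArray;

/* Append an XID; callers must add them in increasing order. */
bool
child_xids_append(ChildXidArray *a, TransactionId xid)
{
    assert(a->nxids == 0 || a->xids[a->nxids - 1] < xid);
    if (a->nxids == a->capacity)
    {
        size_t      newcap = a->capacity ? a->capacity * 2 : 8;
        TransactionId *p = realloc(a->xids, newcap * sizeof(TransactionId));

        if (p == NULL)
            return false;       /* out of memory; array left unchanged */
        a->xids = p;
        a->capacity = newcap;
    }
    a->xids[a->nxids++] = xid;
    return true;
}

/* Binary search: O(log n) rather than walking a linked list. */
bool
child_xids_contains(const ChildXidArray *a, TransactionId xid)
{
    size_t      lo = 0;
    size_t      hi = a->nxids;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (a->xids[mid] == xid)
            return true;
        if (a->xids[mid] < xid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}
```

With a few million child XIDs a lookup takes a little over twenty comparisons, whereas a list walk may touch every entry; for a handful of entries the two are equivalent in practice.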

OTOH, there are probably enough other inefficiencies in handling large
numbers of subxact XIDs that speeding up TransactionIdIsCurrentTransactionId
might be a useless exercise.  It would be good to profile a test case
before spending much effort here.

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Alvaro Herrera
Tom Lane wrote:
> I wrote:
> > I see a bug though, which is that RecordSubTransactionAbort() calls
> > GetCurrentTransactionId() before having verified that it needs to do
> > anything.  This means that we'll generate and then discard an XID
> > uselessly in a failed subxact that didn't touch disk.
> 
> Well, it would be a bug except that RecordSubTransactionAbort isn't
> called unless the current subxact has an XID.  Perhaps a comment would
> be appropriate but there's nothing to fix here.
> 
> I think Theo's problem is probably somewhere else, too --- apparently
> it's not so much that TransactionIdIsCurrentTransactionId takes a long
> time as that something is calling it lots of times with no check for
> interrupt.

Could it be something like heap_lock_tuple?  It calls MultiXactIdWait,
which calls GetMultiXactIdMembers and TransactionIdIsCurrentTransactionId
on each member.  (heap_update and heap_delete do the same thing.)  I
must admit I didn't read Theo's description of his scenario, though.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Tom Lane
I wrote:
> I see a bug though, which is that RecordSubTransactionAbort() calls
> GetCurrentTransactionId() before having verified that it needs to do
> anything.  This means that we'll generate and then discard an XID
> uselessly in a failed subxact that didn't touch disk.

Well, it would be a bug except that RecordSubTransactionAbort isn't
called unless the current subxact has an XID.  Perhaps a comment would
be appropriate but there's nothing to fix here.

I think Theo's problem is probably somewhere else, too --- apparently
it's not so much that TransactionIdIsCurrentTransactionId takes a long
time as that something is calling it lots of times with no check for
interrupt.

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Tom Lane
I wrote:
> Yeah, I was just looking at that.  Removing useless entries from the
> child-xid list would presumably help him.  Considering we're not even
> formally in beta yet, I'm probably being too conservative to recommend
> we not touch it.

Actually ... wait a minute.  We do not assign an XID to a subtransaction
at all unless it writes a tuple to disk (see GetCurrentTransactionId
and its callers).  So this whole "optimization" idea is redundant.

I see a bug though, which is that RecordSubTransactionAbort() calls
GetCurrentTransactionId() before having verified that it needs to do
anything.  This means that we'll generate and then discard an XID
uselessly in a failed subxact that didn't touch disk.  Worth fixing,
but it doesn't look like this is Theo's problem.

Unless I'm missing something, Theo's problem must involve having done
tuple updates in 4.6M different subtransactions.

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> writes:
>> Anyway, given that there's this one nonobvious gotcha, there might be
>> others.  My recommendation is that we take this off the open-items list
>> for 8.2 and revisit it in the 8.3 cycle when there's more time.

> I wonder if Theo's recent reported problem with 4.3M child xids changes the
> calculus on this. 

Yeah, I was just looking at that.  Removing useless entries from the
child-xid list would presumably help him.  Considering we're not even
formally in beta yet, I'm probably being too conservative to recommend
we not touch it.

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-14 Thread Gregory Stark
Tom Lane <[EMAIL PROTECTED]> writes:

> Anyway, given that there's this one nonobvious gotcha, there might be
> others.  My recommendation is that we take this off the open-items list
> for 8.2 and revisit it in the 8.3 cycle when there's more time.

I wonder if Theo's recent reported problem with 4.3M child xids changes the
calculus on this. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-13 Thread Tom Lane
I wrote:
> ... it seems like in the
> case where RecordSubTransactionCommit detects that the subxact has not
> stored its XID anywhere, we could immediately remove the XID from
> the PGPROC array, just as if it had aborted.  This would avoid chewing
> subxid slots for cases such as exception blocks in plpgsql that are
> not modifying the database, but just catching computational errors.

(and later realized that Alvaro had had the same idea awhile back, but
I don't have his message at hand).

I looked into this a bit more; it seems like basically it should only
take addition of

    else
        XidCacheRemoveRunningXids(xid, 0, NULL);

to the bottom of RecordSubTransactionCommit(), plus suitable adjustment
of the comments in both routines.  However, there's a problem: if we
delete a second-level subxact's XID from PGPROC, and later its upper
subtransaction aborts, XidCacheRemoveRunningXids will emit scary
warnings when it doesn't find the sub-subxact in PGPROC.  This could
doubtless be fixed with sufficient jiggery-pokery --- simply removing
the debug warnings would be a brute-force answer, but I'd like to find
something a bit less brute-force.  Maybe drop the sub-subxact from its
parent's list immediately, instead of carrying it forward?

Anyway, given that there's this one nonobvious gotcha, there might be
others.  My recommendation is that we take this off the open-items list
for 8.2 and revisit it in the 8.3 cycle when there's more time.

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-03 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> OTOH I think we only need to store live Xids and those finished that
> wrote a WAL record; we can drop subaborted and subcommitted if they
> didn't.

While reviewing this thread, I see Alvaro already had the idea I just
came to...

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-09-03 Thread Tom Lane
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> I added a subxid array to Snapshot, and running subxids are gathered from
> the PGPROC->subxids caches. There are two overflow cases: any of the
> PGPROC->subxids caches has overflowed, or the total number of subxids
> exceeds the pre-allocated buffer. On overflow, we cannot avoid calling
> SubTransGetTopmostTransaction.

Applied after some editorialization (you really need to pay more
attention to keeping comments in sync with code ;-))

I cannot measure any consistent speed difference in plain pgbench
scenarios with and without the patch, so at least as a rough
approximation the extra cycles in GetSnapshotData aren't hurting.
And I confirm that the test case you posted before no longer exhibits
a context-swap storm.

This change makes it even more obvious than before that we really want
to stay out of the subxids-overflowed regime.  I don't especially want
to make those arrays even bigger, but I wonder if there isn't more we
can do to use them efficiently.  In particular, it seems like in the
case where RecordSubTransactionCommit detects that the subxact has not
stored its XID anywhere, we could immediately remove the XID from
the PGPROC array, just as if it had aborted.  This would avoid chewing
subxid slots for cases such as exception blocks in plpgsql that are
not modifying the database, but just catching computational errors.
Comments?

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-28 Thread ITAGAKI Takahiro

Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Is there anything to do for 8.2 here?

I'm working on Tom's idea. It is not a feature and does not change
the on-disk structures, so I hope it meets the 8.2 deadline...

Tom Lane <[EMAIL PROTECTED]> wrote:
> I'm wondering about doing something similar to what
> TransactionIdIsInProgress does, ie, make use of the PGPROC lists
> of live subtransactions.  Suppose that GetSnapshotData copies not
> only top xids but live subxids into the snapshot, and adds a flag
> indicating whether the subxids are complete (ie, none of the subxid
> lists have overflowed).  Then if the flag is set, tqual.c doesn't
> need to do SubTransGetTopmostTransaction() before searching the
> list.

I added a subxid array to Snapshot, and running subxids are gathered from
the PGPROC->subxids caches. There are two overflow cases: any of the
PGPROC->subxids caches has overflowed, or the total number of subxids
exceeds the pre-allocated buffer. On overflow, we cannot avoid calling
SubTransGetTopmostTransaction.

I tested the patch and did not see any context-switch storm coming
from pg_subtrans, but there may be some bugs in the visibility checking.
It would be very nice if you could review or test the patch.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



snapshot_subtrans.patch
Description: Binary data



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-25 Thread Bruce Momjian

Is there anything to do for 8.2 here?

---

ITAGAKI Takahiro wrote:
> This is some additional information.
> 
> I wrote:
> > If we want to resolve the problem fundamentally, we might have to
> > improve SubTrans using a better buffer management algorithm or so.
> 
> The above may be wrong. I checked each lwlock of pg_subtrans's buffers.
> All lwlocks are uniformly acquired and I could not see any differences
> among buffers. So the cause seems not to be a buffer management algorithm,
> but just a lack of SLRU buffer pages.
> 
> NUM_SUBTRANS_BUFFERS is defined as 32 in HEAD. If we increase it,
> we can avoid the old transaction problem for a certain time.
> However, it doesn't help much on high-load -- for example, on a workload
> with 2000 tps, we will use up 1000 pg_subtrans pages in 15 minutes.
> I suppose it is not enough for online and batch/maintenance mix.
> 
> Also, the simple linear scan used by SLRU will likely cause another
> performance issue if we greatly increase the number of buffers, because
> a sequential scan does not work well over many buffers.
> 
> 
> I am hoping for some fix in an upper layer (snapshots, hint bits or
> something), as being discussed in the recent thread.
> 
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
> 
> 
> 

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread ITAGAKI Takahiro
This is some additional information.

I wrote:
> If we want to resolve the problem fundamentally, we might have to
> improve SubTrans using a better buffer management algorithm or so.

The above may be wrong. I checked each lwlock of pg_subtrans's buffers.
All lwlocks are uniformly acquired and I could not see any differences
among buffers. So the cause seems not to be a buffer management algorithm,
but just a lack of SLRU buffer pages.

NUM_SUBTRANS_BUFFERS is defined as 32 in HEAD. If we increase it,
we can avoid the old transaction problem for a certain time.
However, it doesn't help much on high-load -- for example, on a workload
with 2000 tps, we will use up 1000 pg_subtrans pages in 15 minutes.
I suppose it is not enough for online and batch/maintenance mix.
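
The page-consumption figure can be checked with some back-of-envelope arithmetic. The sketch below assumes the default 8192-byte block size, one 4-byte parent XID stored per transaction in pg_subtrans, and one XID consumed per transaction; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ 8192     /* assumed default block size */
/* one 4-byte parent XID per transaction: 2048 entries per page */
#define SUBTRANS_XACTS_PER_PAGE ((uint64_t) (BLCKSZ / sizeof(uint32_t)))

/* pg_subtrans pages touched after running at `tps` for `seconds`. */
uint64_t
subtrans_pages_consumed(uint64_t tps, uint64_t seconds)
{
    uint64_t    xids = tps * seconds;

    return (xids + SUBTRANS_XACTS_PER_PAGE - 1) / SUBTRANS_XACTS_PER_PAGE;
}

/* How many whole seconds of XIDs fit in `nbuffers` buffered pages. */
uint64_t
seconds_covered_by_buffers(uint64_t nbuffers, uint64_t tps)
{
    assert(tps > 0);
    return nbuffers * SUBTRANS_XACTS_PER_PAGE / tps;
}
```

Under these assumptions, 2000 tps for 15 minutes is 1.8 million XIDs, or 879 pages, which is the same order as the roughly 1000 pages quoted above. The 32 buffers cover 65536 XIDs, about 32 seconds of activity at that rate, so a snapshot older than that forces reads outside the buffered range.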

Also, the simple linear scan used by SLRU will likely cause another
performance issue if we greatly increase the number of buffers, because
a sequential scan does not work well over many buffers.


I am hoping for some fix in an upper layer (snapshots, hint bits or
something), as being discussed in the recent thread.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I was thinking about when it is most appropriate to insert or remove
> an Xid from the cache.  We can do without any excl-locking because 1) we
> already assume the storing of an Xid to be atomic, and 2) no one can be
> interested in querying for an Xid before the originating transaction has
> had the chance to write a tuple with that Xid anyway.

Actually ... that fails if GetSnapshotData is going to copy subtrans
XIDs.  So this area needs more thought.

> On the third hand, are we going to sh-acquire the ProcArray lock while a
> GetSnapshotData copies all subxact Xids of all running transactions?
> ProcArrayLock will become more of a contention point than it already is.

Yeah, but sharelock is better than exclusive lock ...

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> I'm wondering about doing something similar to what
> >> TransactionIdIsInProgress does, ie, make use of the PGPROC lists
> >> of live subtransactions.
> 
> > Well, that sounds awfully more expensive than setting
> > local-to-my-database Xmins as well as global (all databases) Xmins :-)
> 
> Only when you've got a lot of subtransactions.  The point of this
> discussion is to optimize for the few-or-none case.  In any case,
> the problem with the local/global bit was that you'd be expending
> foreground-process cycles without any foreground-process return.
> Being able to use a snapshot without consulting pg_subtrans will
> certainly buy back some cycles...

I can buy that.  Some idle thoughts:

I was thinking about when it is most appropriate to insert or remove
an Xid from the cache.  We can do without any excl-locking because 1) we
already assume the storing of an Xid to be atomic, and 2) no one can be
interested in querying for an Xid before the originating transaction has
had the chance to write a tuple with that Xid anyway.

OTOH I think we only need to store live Xids and those finished that
wrote a WAL record; we can drop subaborted and subcommitted if they
didn't.

On the third hand, are we going to sh-acquire the ProcArray lock while a
GetSnapshotData copies all subxact Xids of all running transactions?
ProcArrayLock will become more of a contention point than it already is.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I'm wondering about doing something similar to what
>> TransactionIdIsInProgress does, ie, make use of the PGPROC lists
>> of live subtransactions.

> Well, that sounds awfully more expensive than setting
> local-to-my-database Xmins as well as global (all databases) Xmins :-)

Only when you've got a lot of subtransactions.  The point of this
discussion is to optimize for the few-or-none case.  In any case,
the problem with the local/global bit was that you'd be expending
foreground-process cycles without any foreground-process return.
Being able to use a snapshot without consulting pg_subtrans will
certainly buy back some cycles...

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread Alvaro Herrera
Tom Lane wrote:

> I'm wondering about doing something similar to what
> TransactionIdIsInProgress does, ie, make use of the PGPROC lists
> of live subtransactions.  Suppose that GetSnapshotData copies not
> only top xids but live subxids into the snapshot, and adds a flag
> indicating whether the subxids are complete (ie, none of the subxid
> lists have overflowed).  Then if the flag is set, tqual.c doesn't
> need to do SubTransGetTopmostTransaction() before searching the
> list.

Well, that sounds awfully more expensive than setting
local-to-my-database Xmins as well as global (all databases) Xmins :-)

On the other hand, ISTM as soon as one cache overflows, you have to go
check pg_subtrans which means the entire optimization buys nothing in
that case.  It would be nice if the optimization degraded more
gracefully.  I don't have any concrete suggestion though.  The changes
proposed in the other CS-storm thread by the NTT person may help.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-07 Thread Tom Lane
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> The invokers of SubTrans module are two SubTransGetTopmostTransaction()
> in HeapTupleSatisfiesSnapshot(). When I disabled the calls, CSStorm did
> not occur. SubTransGetTopmostTransaction returns the argument without
> change when we don't use SAVEPOINTs.

> If we optimize for non-subtransactions, we can avoid locking SubTrans
> when checking the visibility of tuples inserted by top transactions.

Only for top transactions still in progress, so I doubt that would
help much.

I'm wondering about doing something similar to what
TransactionIdIsInProgress does, ie, make use of the PGPROC lists
of live subtransactions.  Suppose that GetSnapshotData copies not
only top xids but live subxids into the snapshot, and adds a flag
indicating whether the subxids are complete (ie, none of the subxid
lists have overflowed).  Then if the flag is set, tqual.c doesn't
need to do SubTransGetTopmostTransaction() before searching the
list.
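
The idea above can be sketched as a toy model. This is only an illustration in Python, not PostgreSQL's C code: the names Backend, Snapshot, subxids_complete, MAX_CACHED_SUBXIDS and the topmost_lookup callback are all hypothetical stand-ins for the PGPROC subxid lists, the snapshot, the proposed flag, and SubTransGetTopmostTransaction().

```python
from dataclasses import dataclass, field

# Toy model only: names echo the discussion, not PostgreSQL's real structs.
MAX_CACHED_SUBXIDS = 4  # hypothetical per-backend subxid cache size


@dataclass
class Backend:
    top_xid: int
    subxids: list = field(default_factory=list)
    overflowed: bool = False

    def start_subxact(self, xid):
        # Once the per-backend cache is full, later subxids are not recorded.
        if len(self.subxids) < MAX_CACHED_SUBXIDS:
            self.subxids.append(xid)
        else:
            self.overflowed = True


@dataclass
class Snapshot:
    xids: set
    subxids_complete: bool


def get_snapshot_data(backends):
    """Copy top xids and cached subxids; note whether any cache overflowed."""
    xids = set()
    complete = True
    for b in backends:
        xids.add(b.top_xid)
        xids.update(b.subxids)
        if b.overflowed:
            complete = False
    return Snapshot(xids, complete)


def xid_in_snapshot(xid, snap, topmost_lookup):
    """Return True if xid was still running when the snapshot was taken."""
    if not snap.subxids_complete:
        # Some subxids are missing, so map to the top-level xid first.
        xid = topmost_lookup(xid)
    return xid in snap.xids
```

In this model the lookup callback is never invoked while every backend's cache is intact, which is the saving being proposed. A single overflowed backend clears the flag for the whole snapshot and brings the lookup back for every check.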

regards, tom lane



Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2006-08-06 Thread ITAGAKI Takahiro

Tom Lane <[EMAIL PROTECTED]> wrote:
> > It does not solve, even if it increases the number of NUM_SUBTRANS_BUFFERS.
> > The problem was only postponed.
> 
> Can you provide a reproducible test case for this?

This is the reproducible test case:
- Occurs on both 8.1.4 and HEAD.
- On an SMP machine. I used dual Opterons.
  CSStorm becomes worse on dual Xeons with hyper-threading.
- Tuning parameters are default. All data is cached in shared buffers.
  (shared_buffers=32MB; the pgbench data (scale=1) is less than 15MB.)
- Using custom pgbench scripts. One client doing UPDATE with indexscan
  and multiple clients doing SELECT with seqscan/indexscan.

$ pgbench -i
$ pgbench -n -c  1 -t 10 -f cs_update.sql    &
$ pgbench -n -c 50 -t 10 -f cs_indexscan.sql &
$ pgbench -n -c 35 -t 10 -f cs_seqscan.sql   &
(The scripts are attached at the end of this message.)

Under the above workload, context switches are 2000-1/sec and CPU usage is
user=100%. Then start a long-running open transaction on another connection.

$ psql
# begin; -- Long open transaction

After 30-60 seconds, context switches exceed 5/sec (12 on Xeons)
and CPU usage becomes user=66% / sys=21% / idle=13%.
If we increase the frequency of UPDATEs, the delay becomes shorter.


This is an artificial workload, but I can see the same condition in
TPC-W, even though it is a benchmark. TPC-W requires full-text search,
which is implemented using "LIKE %foo%" in my implementation (and in DBT-1).
It also requires periodic aggregations. These might behave as long
transactions.


The cause seems to be lock contention. The number of lock acquisitions on
SubtransControlLock and SubTransBuffer increases significantly
in comparison with BufMappingLock.

# Before starting a long transaction.
 kind |       lwlock        | sh_call  | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
   13 | SubtransControlLock |    28716 |       2 |      54 |       0
   22 | BufMappingLock      | 11637884 |       0 |    2492 |       0
   27 | SubTransBuffer      |        0 |       0 |      11 |       0

# After
 kind |       lwlock        | sh_call  | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
   13 | SubtransControlLock |  4139111 |   65059 | 3926691 |  390838
   22 | BufMappingLock      | 32348073 |       0 |    2509 |       0
   27 | SubTransBuffer      |   939646 |  960341 | 1419152 |      61
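
One rough way to read the "After" counters is waits per acquisition. The short Python sketch below does that arithmetic. The figures are copied as they appear in the table above, and the ratio itself is only an assumed indicator of contention, not a metric defined by the lwlock instrumentation.

```python
# Counters copied from the "After" table: (sh_call, sh_wait, ex_call, ex_wait).
after = {
    "SubtransControlLock": (4139111, 65059, 3926691, 390838),
    "BufMappingLock": (32348073, 0, 2509, 0),
    "SubTransBuffer": (939646, 960341, 1419152, 61),
}


def wait_ratio(calls, waits):
    """Waits per acquisition; 0.0 when the lock was never requested."""
    return waits / calls if calls else 0.0


for name, (sh_call, sh_wait, ex_call, ex_wait) in after.items():
    print(f"{name:20s} shared={wait_ratio(sh_call, sh_wait):.4f} "
          f"exclusive={wait_ratio(ex_call, ex_wait):.4f}")
```

By this measure BufMappingLock never waits despite far more acquisitions, while roughly one in ten exclusive acquisitions of SubtransControlLock has to wait.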



The invokers of SubTrans module are two SubTransGetTopmostTransaction()
in HeapTupleSatisfiesSnapshot(). When I disabled the calls, CSStorm did
not occur. SubTransGetTopmostTransaction returns the argument without
change when we don't use SAVEPOINTs.

If we optimize for the non-subtransaction case, we can avoid locking SubTrans
when checking the visibility of tuples inserted by top transactions.
If we want to resolve the problem fundamentally, we might have to
improve SubTrans using a better buffer management algorithm or similar.
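
To make the cost concrete, here is a simplified Python model of a topmost-transaction lookup. It is a sketch under assumptions, not the real SubTransGetTopmostTransaction(): SubTransModel, its parent map and its lookups counter are hypothetical, and each get_parent() call stands in for one access that would take the SubTrans locks.

```python
INVALID_XID = 0  # stand-in for InvalidTransactionId


class SubTransModel:
    """Toy stand-in for pg_subtrans: maps each subxact xid to its parent xid."""

    def __init__(self):
        self.parent = {}
        self.lookups = 0  # simulated accesses that would need the lock

    def set_parent(self, xid, parent_xid):
        self.parent[xid] = parent_xid

    def get_parent(self, xid):
        self.lookups += 1
        return self.parent.get(xid, INVALID_XID)

    def get_topmost(self, xid):
        """Walk parent links until an xid with no recorded parent is reached."""
        while True:
            parent = self.get_parent(xid)
            if parent == INVALID_XID:
                return xid
            xid = parent
```

Even when the argument is a top-level xid and comes back unchanged, the model still performs one lookup to find that out. That lookup is what could be skipped when no SAVEPOINTs are in use.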

Do you have any ideas for avoiding this problem?



-- cs_update.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT pg_sleep(0.1);

-- cs_seqscan.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid::int8 = :aid; -- cast to force seqscan

-- cs_indexscan.sql
\set naccounts 10 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


