Re: [HACKERS] HOT is applied

2007-09-24 Thread Merlin Moncure
On 9/21/07, Merlin Moncure [EMAIL PROTECTED] wrote:
 Well, my first round of results is so far not showing the big gains I
 saw with hot in some of the earlier patches... so far, it looks to be
 approximately a wash, although with a reduced need to vacuum.

let me correct myself here.  I did some longer runs and came up with
the following results (these runs are still not really long enough; I
need to run some more tests).  The smaller, CPU-bound cases were
blowout victories for hot.  The largest run (I/O-bound) was very close,
but there were not enough transactions to force a vacuum, after which
hot would probably pull away by some degree.

one small aside: I am suspicious that the 8.3 improvements to stats
collector overhead are going to reap big benefits.  This merits some
extra investigation, I think.

hardware:
2xintel [EMAIL PROTECTED], 4 cores total
8gb ram
5x15krpm sas, raid 0 data xfs
5x15krpm sas, raid 0, wal xfs

fsync on, asynch commit, partial page writes, autovac on

non-hot build: cvs dated 9/14
hot build: cvs dated 9/21

merlin (results follow)

* without hot: *
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 2345.225742 (including connections establishing)
tps = 2345.264846 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 3038.119776 (including connections establishing)
tps = 3038.185492 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 25
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 3671.987348 (including connections establishing)
tps = 3672.083077 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 4240.424756 (including connections establishing)
tps = 4240.542851 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 1000
number of clients: 8
number of transactions per client: 10
number of transactions actually processed: 80/80
tps = 889.890173 (including connections establishing)
tps = 889.905128 (excluding connections establishing)

* with hot: *
transaction type: TPC-B (sort of)
scaling factor: 1
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 3186.553423 (including connections establishing)
tps = 3186.622178 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 5123.153371 (including connections establishing)
tps = 5123.331343 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 25
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 4994.897937 (including connections establishing)
tps = 4995.075480 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 8
number of transactions per client: 25
number of transactions actually processed: 200/200
tps = 4675.778153 (including connections establishing)
tps = 4675.936096 (excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 1000
number of clients: 8
number of transactions per client: 10
number of transactions actually processed: 80/80
tps = 893.904762 (including connections establishing)
tps = 893.919032 (excluding connections establishing)

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT is applied

2007-09-23 Thread Robert Treat
On Friday 21 September 2007 13:02, Tom Lane wrote:
 I wrote:
  Dunno about more general, but your idea reduces the runtime of this
  example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
  it would be a net negative in more typical situations, especially if
  you've got a lot of open subtransactions.

 Actually ... the only way that TransactionIdIsCurrentTransactionId can
 take a meaningful amount of time is if you've got lots of
 subtransactions, and in that case your own subxids cache has certainly
 overflowed, which is likely to force TransactionIdIsInProgress into the
 slow answer path.  But if we use TransactionIdIsCurrentTransactionId
 to handle our own xids then we can just ignore MyProc altogether inside
 the loop, thus very possibly (if we are the only overflowed-subxids proc)
 saving us from going through the slow answer path.  So on reflection
 I can't see how this isn't a win.  I'll clean it up and apply it.


Just curious, but does this apply to subtransactions that are the result of
plpgsql try...exception blocks? And how many is "lots"? Presumably looping 50k
times in a single procedure would be enough?

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] HOT is applied

2007-09-23 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 Just curious, but does this apply to subtransactions that are the result of 
 plpgsql try...exception blocks?

Only if they changed the database; else they won't have XIDs.

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Bruce Momjian
Heikki Linnakangas wrote:
 Tom Lane wrote:
  I've committed the HOT patch.
 
 Thanks, much easier to work with it now that it's in.
 
   I'd still like to think about whether we
  can be smarter about when to invoke pruning, but that's a small enough
  issue that the patch can go in without it.
 
 Yeah. I'm doing some micro-benchmarking, and the attached test case is
 much slower with HOT. It's spending a lot of time trying to prune, only
 to find out that it can't.
 
 Instead of/in addition to avoiding pruning when it doesn't help, maybe
 we could make HeapTupleSatisfiesVacuum cheaper.
 
 I'm going to continue testing, this is just a heads-up that HOT as
 committed seriously hurts performance in some cases. (though one can
 argue that this test case isn't a very realistic one.)

This might be a simplistic question but if the page is +90% full and
there is a long-lived transaction, isn't Postgres going to try pruning
on each page read access?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Bruce Momjian [EMAIL PROTECTED] wrote:


 This might be a simplistic question but if the page is +90% full and
 there is a long-lived transaction, isn't Postgres going to try pruning
 on each page read access?


The way it stands today, yes. That's one reason why we are seeing
the performance drop in this particular case.

I liked the earlier proposed idea of storing the xid in the page header
to quickly tell us whether it's worth pruning the page.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Bruce Momjian wrote:
 This might be a simplistic question but if the page is +90% full and
 there is a long-lived transaction, isn't Postgres going to try pruning
 on each page read access?

Yes :(. That's why we earlier talked about storing the xid of the oldest
deleted tuple on the page in the page header. That way we could skip the
fruitless pruning attempts until that xid < OldestXmin.
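The check described here can be sketched as a tiny standalone model. The
field and helper names below are hypothetical (this is the proposal, not
the committed PostgreSQL code); the modulo-2^32 xid ordering mimics how
PostgreSQL compares transaction ids:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Modulo-2^32 ordering, as PostgreSQL's TransactionIdPrecedes does. */
bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    return (int32_t) (id1 - id2) < 0;
}

/*
 * Hypothetical check: with the xid of the oldest deleted tuple stored
 * in the page header, a prune attempt is worthwhile only once that xid
 * has fallen behind the global OldestXmin horizon.
 */
bool
prune_worthwhile(TransactionId page_prune_xid, TransactionId OldestXmin)
{
    if (page_prune_xid == InvalidTransactionId)
        return false;           /* page was never marked prunable */
    return xid_precedes(page_prune_xid, OldestXmin);
}
```

With a long-lived transaction holding back OldestXmin, this test stays
false and the page is skipped cheaply instead of being scanned on every
read.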

Another approach is to try to make HeapTupleSatisfiesVacuum cheaper, so
that the fruitless pruning attempts wouldn't hurt that much.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd still like to think about whether we
 can be smarter about when to invoke pruning, but that's a small enough
 issue that the patch can go in without it.
 
 Yeah. I'm doing some micro-benchmarking, and the attached test case is
 much slower with HOT. It's spending a lot of time trying to prune, only
 to find out that it can't.
 
 Not sure if that's an appropriate description or not.  oprofile
 (on a dual Xeon running Fedora 6) shows me this:
 
 ...
 samples  %        symbol name
 1070003  29.8708  LWLockAcquire
 1015097  28.3380  LWLockRelease
 283514    7.9147  heap_page_prune
 ...
 so basically it's all about the locking.  Maybe the problem is that with
 HOT we lock the buffer too often?  heap_page_prune_opt is designed to
 not take the buffer lock unless there's a good probability of needing
 to prune, but maybe that's not working as intended.

If you look at the callgraph, you'll see that those
LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
TransactionIdIsInProgress, which keeps thrashing the ProcArrayLock. An
"if (TransactionIdIsCurrentTransactionId(xid)) return true;" check in
TransactionIdIsInProgress would speed that up, but I wonder if there's a
more general solution to make HeapTupleSatisfiesVacuum cheaper. For
example, we could cache the in-progress status of tuples.

 Shouldn't we be able to prune rows that have been inserted and deleted
 by the same transaction?  I'd have hoped to see this example use only
 one heap page ...

Well maybe, but that's a separate issue. Wouldn't we need the snapshot
bookkeeping we've discussed in the past, to notice that there's no
snapshot in our own backend that needs to see the tuples?

Nevertheless, the fruitless pruning attempts would still hurt us in
slightly more complex scenarios, even if we fixed the above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Merlin Moncure wrote:
 On 9/20/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
 Yeah. I'm doing some micro-benchmarking, and the attached test case is
 much slower with HOT. It's spending a lot of time trying to prune, only
 to find out that it can't.

 Instead of/in addition to avoiding pruning when it doesn't help, maybe
 we could make HeapTupleSatisfiesVacuum cheaper.

 I'm going to continue testing, this is just a heads-up that HOT as
 committed seriously hurts performance in some cases. (though one can
 argue that this test case isn't a very realistic one.)
 
 well, I ran your test on my box and here are the results:
 pre hot:
 run 1: 3617.641 ms
 run 2: 5195.215 ms
 run 3: 6760.449 ms
 after vacuum:
 run 1: 4171.362 ms
 run 2: 5513.317 ms
 run 3: 6884.125 ms
 post hot:
 run 1: Time: 7286.292 ms
 run 2: Time: 7477.089 ms
 run 3: Time: 7701.229 ms
 
 those results aren't exactly terrible,

Your runtimes seem to be increasing as you repeat the test. Did you
remove the DROP TABLE from the beginning? On my laptop, post hot takes
~2x as long as pre hot, even when repeated, which matches the results of
your first runs.

 and this case is highly artificial.

Yeah.

I'm going through a bunch of CPU test cases a colleague of mine wrote.
Some of the test cases show much worse performance with HOT.
Unfortunately they're quite complex and overlapping, so instead of
posting them as is, I'm going through the ones that HOT performs badly
on, and trying to reduce them to simpler test cases like the one above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:


 so this example is getting past the heuristic tests in
 heap_page_prune_opt almost every time.  Why is that?  Too tired to poke
 at it more tonight.


I guess you already know the answer now, but anyway: since we are
updating a single tuple in a single transaction, each update is preceded
by a sequential scan. All pages but the last are completely full and
marked prunable, so HOT would try to (unsuccessfully) prune every page
(except maybe the last).

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:


 It shouldn't, though --- the hint bit should get cleared on the first
 try.  I think I probably broke something in the last round of revisions
 to heap_page_prune_opt, but haven't looked yet ...



We set the hint bit (prunable) again when we see a RECENTLY_DEAD
or DELETE_IN_PROGRESS tuple. This is correct in normal circumstances
because otherwise we would never be able to prune the page.

On second thought, maybe we can choose not to set the bit again. In
the worst case, the next update on the page would be a COLD update.
That would set the bit again, and we would retry pruning on the next
visit to the page.

This scheme would work as long as we don't put in a mechanism to
update the FSM after pruning. But if we choose to update the FSM (I am
inclined to do this, to avoid repeated relation extension because
of COLD updates), we would need to improve this. Otherwise a
page full of only DEAD tuples might never be pruned, and vacuum
would be required to reclaim that space.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Bruce Momjian wrote:
 This might be a simplistic question but if the page is +90% full and
 there is a long-lived transaction, isn't Postgres going to try pruning
 on each page read access?

 Yes :(

It shouldn't, though --- the hint bit should get cleared on the first
try.  I think I probably broke something in the last round of revisions
to heap_page_prune_opt, but haven't looked yet ...

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
 I don't much like the idea of adding an xid to the page header --- for
 one thing, *which* xid would you put there, and what would you test it
 against?

I was thinking that you would put the smallest in-progress xmax on the
page there, and you would test it against OldestXmin. If all
transactions commit, there surely isn't anything to prune until that xid
falls beyond the OldestXmin horizon. If an inserting transaction aborts,
we could prune the aborted tuple earlier, but optimizing for aborts
doesn't seem that important.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 If you look at the callgraph, you'll see that those
 LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
 TransactionIdIsInProgress, which keeps thrashing the ProcArrayLock. An
 "if (TransactionIdIsCurrentTransactionId(xid)) return true;" check in
 TransactionIdIsInProgress would speed that up, but I wonder if there's a
 more general solution to make HeapTupleSatisfiesVacuum cheaper. For
 example, we could cache the in-progress status of tuples.

Dunno about more general, but your idea reduces the runtime of this
example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
it would be a net negative in more typical situations, especially if
you've got a lot of open subtransactions.

regards, tom lane

*** src/backend/storage/ipc/procarray.c.origSat Sep  8 16:31:15 2007
--- src/backend/storage/ipc/procarray.c Fri Sep 21 11:08:34 2007
***
*** 341,346 
--- 341,353 
return false;
}
  
+   /*
+* Also, we can detect our own transaction without any access to shared
+* memory.
+*/
+   if (TransactionIdIsCurrentTransactionId(xid))
+   return true;
+ 
/* Get workspace to remember main XIDs in */
	xids = (TransactionId *) palloc(sizeof(TransactionId) *
									arrayP->maxProcs);
  



Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:
 
 so this example is getting past the heuristic tests in
 heap_page_prune_opt almost every time.  Why is that?  Too tired to poke
 at it more tonight.
 
 I guess you already know the answer now, but anyway: since we are
 updating a single tuple in a single transaction, each update is preceded
 by a sequential scan. All pages but the last are completely full and
 marked prunable, so HOT would try to (unsuccessfully) prune every page
 (except maybe the last).

Hmm ... the problem really is that heap_page_prune turns the hint back
on when it sees a DELETE_IN_PROGRESS tuple.  Maybe that's a bad idea.

I don't much like the idea of adding an xid to the page header --- for
one thing, *which* xid would you put there, and what would you test it
against?

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
 Bruce Momjian wrote:
 This might be a simplistic question but if the page is +90% full and
 there is a long-lived transaction, isn't Postgres going to try pruning
 on each page read access?
 
 Yes :(
 
 It shouldn't, though --- the hint bit should get cleared on the first
 try.

Think so? That would mean that you only get a single chance to prune
after an update. Not sure how big a problem that is, but I do feel that
would make HOT a lot less effective in some usage patterns.

The idea of having a prunable xmin in the page header sounds more and
more attractive to me...

  I think I probably broke something in the last round of revisions
 to heap_page_prune_opt, but haven't looked yet ...

Pavan's patch was like that as well; you don't clear the flag (or
rather, you set it again while pruning) until there's nothing left to
prune on the page.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
 If you look at the callgraph, you'll see that those
 LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum -
 TransactionIdIsInProgress, which keeps trashing the ProcArrayLock. A
 if(TransactionIdIsCurrentTransactionId(xid)) return true; check in
 TransactionIdIsInProgress would speed that up, but I wonder if there's a
 more general solution to make HeapTupleSatisfiesVacuum cheaper. For
 example, we could cache the in-progress status of tuples.
 
 Dunno about more general, but your idea reduces the runtime of this
 example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
 it would be a net negative in more typical situations, especially if
 you've got a lot of open subtransactions.

Yeah. I played with this a bit more, and came up with a couple of other
micro-optimizations:

1. Instead of pallocing and pfreeing a new array in
TransactionIdIsInProgress, we could just malloc the array once and reuse
it. That palloc/pfree alone was consuming around 8% of CPU time in the
test case.
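A minimal sketch of that allocate-once idea (the function name and shape
are illustrative, not the actual PostgreSQL code): keep the workspace in
a static pointer and grow it only when the requested size increases,
instead of palloc/pfree on every call.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/*
 * Hypothetical helper: return a workspace array big enough for
 * maxProcs xids, reusing the previous allocation when possible.
 * In a backend this would live for the life of the process.
 */
TransactionId *
get_xid_workspace(int maxProcs)
{
    static TransactionId *workspace = NULL;
    static int workspace_size = 0;

    if (workspace == NULL || workspace_size < maxProcs)
    {
        free(workspace);        /* free(NULL) is a no-op */
        workspace = malloc(sizeof(TransactionId) * maxProcs);
        workspace_size = maxProcs;
    }
    return workspace;
}
```

Repeated calls with the same (or smaller) size return the same buffer,
so the per-call allocator cost disappears from the hot path.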

2. About 30% of CPU time is spent in HeapTupleSatisfiesMVCC ->
TransactionIdDidAbort. This is in fact not related to HOT, but it caught
my eye while looking at the profile. We have this piece of code in
HeapTupleSatisfiesMVCC:

    /* deleting subtransaction aborted? */
    /* FIXME -- is this correct w.r.t. the cmax of the tuple? */
    if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
    {
        SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
                    InvalidTransactionId);
        return true;
    }

    Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));

We've already checked that the xmin is our own transaction id, so we
check if the xmax is an aborted subtransaction of our own transaction. A
TransactionIdDidAbort call seems like an awfully expensive way to check
that. We could call TransactionIdIsCurrentTransactionId instead, which
doesn't need to access any shared memory structures (but might be
expensive if you have a lot of active subxacts, as you pointed out). Or
we could keep a list of aborted subtransactions in backend-private
memory, and just check against that. In this particular case, a simple
"if (xmin == xmax)" check would be effective as well, since we know that
the xmin is not aborted at that point.
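The backend-private-list alternative could look roughly like this.
PostgreSQL has no such list; the array, sizes, and function names here
are all hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical backend-private record of aborted subtransactions;
 * it would be reset at top-level transaction end. */
enum { MAX_ABORTED_SUBXIDS = 64 };

TransactionId aborted_subxids[MAX_ABORTED_SUBXIDS];
int n_aborted_subxids = 0;

/* Called when one of our own subtransactions aborts. */
void
record_aborted_subxact(TransactionId xid)
{
    if (n_aborted_subxids < MAX_ABORTED_SUBXIDS)
        aborted_subxids[n_aborted_subxids++] = xid;
}

/*
 * Cheap stand-in for TransactionIdDidAbort in this one case: a linear
 * scan of a small private array, touching no shared memory at all.
 */
bool
subxact_known_aborted(TransactionId xid)
{
    for (int i = 0; i < n_aborted_subxids; i++)
        if (aborted_subxids[i] == xid)
            return true;
    return false;
}
```

The point is only the memory-locality trade: a handful of private words
versus a clog/shared-memory lookup per tuple.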

3. One more general alternative to the little patch I sent earlier would
be to build an array of in-progress xids in TransactionIdIsInProgress,
like we do in GetSnapshotData, and use that for the in-progress checks
in HeapTupleSatisfiesVacuum. That should work, if we build the cached
array when we lock the page for pruning, and use it until we unlock. If
a transaction commits/aborts after we build the cached array, we might
return DELETE/INSERT_IN_PROGRESS for a tuple that is in fact already
DEAD, but that's ok. Since we're holding a lock on the page, it's not
possible for a new transaction to start and subsequently put its xid on
the page where we would bump into it. I believe this would have roughly
the same performance effect in this test case as calling
TransactionIdIsCurrentTransactionId from TransactionIdIsInProgress.
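The lookup side of that cache could be as simple as the sketch below
(hypothetical names; the real snapshot machinery is more involved):
snapshot the in-progress xids once while the page is locked, sort them,
and answer every per-tuple check with a binary search instead of another
trip through ProcArrayLock.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/* Comparator for sorting/searching plain TransactionIds. */
int
xid_cmp(const void *a, const void *b)
{
    TransactionId xa = *(const TransactionId *) a;
    TransactionId xb = *(const TransactionId *) b;
    return (xa > xb) - (xa < xb);
}

/*
 * Check one xid against a cached, sorted array of in-progress xids.
 * The cache is built once per pruning attempt (while the page lock is
 * held), so a slightly stale answer is acceptable, as argued above.
 */
bool
cached_xid_in_progress(const TransactionId *cache, int n, TransactionId xid)
{
    return bsearch(&xid, cache, n, sizeof(TransactionId), xid_cmp) != NULL;
}
```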

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
I wrote:
 Dunno about more general, but your idea reduces the runtime of this
 example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
 it would be a net negative in more typical situations, especially if
 you've got a lot of open subtransactions.

Actually ... the only way that TransactionIdIsCurrentTransactionId can
take a meaningful amount of time is if you've got lots of
subtransactions, and in that case your own subxids cache has certainly
overflowed, which is likely to force TransactionIdIsInProgress into the
slow answer path.  But if we use TransactionIdIsCurrentTransactionId
to handle our own xids then we can just ignore MyProc altogether inside
the loop, thus very possibly (if we are the only overflowed-subxids proc)
saving us from going through the slow answer path.  So on reflection
I can't see how this isn't a win.  I'll clean it up and apply it.

I'm also starting to come around to liking the page-header-xid field
a bit more.  I suggest that it could replace the "page is prunable"
flag bit altogether --- to mark the page prunable, you must store
some appropriate xid into the header field.  This would avoid a useless
prune attempt immediately after a page is marked prunable.

A possible problem with it, if we treat it as a non-WAL-logged hint,
is that corruption of the value could lead to a page being marked with
an xid that's way in the future, keeping it from getting pruned.
However, if the page header gets corrupted this is the least of your
worries, so it doesn't seem like an enormous objection.

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Merlin Moncure
On 9/21/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
 Merlin Moncure wrote:
  pre hot:
  run 1: 3617.641 ms
  run 2: 5195.215 ms
  run 3: 6760.449 ms
  after vacuum:
  run 1: 4171.362 ms
  run 2: 5513.317 ms
  run 3: 6884.125 ms
  post hot:
  run 1: Time: 7286.292 ms
  run 2: Time: 7477.089 ms
  run 3: Time: 7701.229 ms
 
  those results aren't exactly terrible, and this case is highly artificial.

 Your runtimes seem to be increasing as you repeat the test. Did you
 remove the DROP TABLE from the beginning? On my laptop, post hot takes
 ~2x as long as pre hot, even when repeated, which matches the results of
 your first runs.

correct.

Well, my first round of results is so far not showing the big gains I
saw with hot in some of the earlier patches... so far, it looks to be
approximately a wash, although with a reduced need to vacuum.
I'll test some more when things settle down.

merlin



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Merlin Moncure wrote:
 Well, my first round of results is so far not showing the big gains I
 saw with hot in some of the earlier patches... so far, it looks to be
 approximately a wash, although with a reduced need to vacuum.
 I'll test some more when things settle down.

Oh... Which version did you test earlier? What are you testing, pgbench?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
 Actually ... the only way that TransactionIdIsCurrentTransactionId can
 take a meaningful amount of time is if you've got lots of
 subtransactions, and in that case your own subxids cache has certainly
 overflowed, which is likely to force TransactionIdIsInProgress into the
 slow answer path.  But if we use TransactionIdIsCurrentTransactionId
 to handle our own xids then we can just ignore MyProc altogether inside
 the loop, thus very possibly (if we are the only overflowed-subxids proc)
 saving us from going through the slow answer path.  So on reflection
 I can't see how this isn't a win.  I'll clean it up and apply it.

Good point.

I noted that most callers of TransactionIdIsInProgress in tqual.c
already call TransactionIdIsCurrentTransactionId before
TransactionIdIsInProgress. In those cases we could just skip the test
for our own xids altogether, if it's worth the code mangling to tell
TransactionIdIsInProgress whether it's safe to skip it or not.

 I'm also starting to come around to liking the page-header-xid field
 a bit more.  I suggest that it could replace the page is prunable
 flag bit altogether --- to mark the page prunable, you must store
 some appropriate xid into the header field.  This would avoid a useless
 prune attempt immediately after a page is marked prunable.

Yeah, we don't need the flag if we have an xid.

 A possible problem with it, if we treat it as a non-WAL-logged hint,
 is that corruption of the value could lead to a page being marked with
 an xid that's way in the future, keeping it from getting pruned.
 However, if the page header gets corrupted this is the least of your
 worries, so it doesn't seem like an enormous objection.

Agreed. Next attempt to prune or vacuum the page will fix it anyway.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Yeah. I played with this a bit more, and came up with a couple of other
 micro-optimizations:

 1. Instead of pallocing and pfreeing a new array in
 TransactionIdIsInProgress, we could just malloc the array once and reuse
 it. That palloc/pfree alone was consuming around 8% of CPU time in the
 test case.

Good idea --- not only faster, but we can get rid of all the goto's and
the locked flag, if we're willing to have a couple more LWLockRelease
calls here.  I'll incorporate this in the patch I'm working up.

 We've already checked that the xmin is our own transaction id, so we
 check if the xmax is an aborted subtransaction of our own transaction. A
 TransactionIdDidAbort call seems like an awfully expensive way to check
 that. We could call TransactionIdIsCurrentTransactionId instead, which
 doesn't need to access any shared memory structures (but might be
 expensive if you have a lot of active subxacts, as you pointed out).

I like that idea too ...

 3. One more general alternative to the little patch I sent earlier would
 be to build an array of in-progress xids in TransactionIdIsInProgress,
 like we do in GetSnapshotData, and use that for the in-progress checks
 in HeapTupleSatisfiesVacuum. That should work, if we build the cached
 array when we lock the page for pruning, and use it until we unlock.

This seems like it uglifies a whole lot of APIs, though, for what's
probably not going to be a lot of gain.

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:



 I'm also starting to come around to liking the page-header-xid field
 a bit more.  I suggest that it could replace the page is prunable
 flag bit altogether --- to mark the page prunable, you must store
 some appropriate xid into the header field.  This would avoid a useless
 prune attempt immediately after a page is marked prunable.


Agree. I was thinking of taking the minimum of the following:

1. xmin of an INSERT_IN_PROGRESS tuple
2. xmax of a DELETE_IN_PROGRESS tuple
3. xmax of a RECENTLY_DEAD tuple

When we attempt a prune, we can set the xid to the minimum of the above
three. If none of the above tuples exist in the page, the xid can be set
to FrozenXid. A page need not be pruned if its xid is set to FrozenXid or
is greater than OldestXmin.

In addition, I guess we can even set the xid when we insert, delete or
update a tuple on the page, unless it's already set to a lower value.
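As a toy sketch of that rule (simplified xid comparison with no wraparound handling, so not the committed logic; names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int TransactionId;
#define FrozenTransactionId ((TransactionId) 2)

/*
 * The page header xid starts out "frozen", meaning nothing to prune;
 * each candidate xid observed on the page (per the three cases above)
 * lowers it toward the oldest candidate.
 */
static void
note_prunable_xid(TransactionId *page_xid, TransactionId xid)
{
    if (*page_xid == FrozenTransactionId || xid < *page_xid)
        *page_xid = xid;
}

static bool
page_needs_prune(TransactionId page_xid, TransactionId OldestXmin)
{
    /* prune only if some candidate xid is older than OldestXmin */
    return page_xid != FrozenTransactionId && page_xid < OldestXmin;
}
```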

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 We've already checked that the xmin is our own transaction id, so we
 check if the xmax is an aborted subtransaction of our own transaction. A
 TransactionIdDidAbort call seems like an awfully expensive way to check
 that. We could call TransactionIdIsCurrentTransactionId instead, which
 doesn't need to access any shared memory structures (but might be
 expensive if you have a lot of active subxacts, as you pointed out).

Applied --- seems to buy another 50% savings in the example with all the
pruning activity.  I'll get after the pruning activity itself in a bit,
but it seems worth squeezing the tqual functions first while we have
this example that stresses them so much.

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I noted that most callers of TransactionIdIsInProgress in tqual.c
 already call TransactionIdIsCurrentTransactionId before
 TransactionIdIsInProgress. In those cases we could just skip the test
 for our own xids altogether, if it's worth code mangling to tell
 TransactionIdIsInProgress whether it's safe to skip it or not.

At least in this example, it seems it wouldn't buy anything.
HeapTupleSatisfiesVacuum doesn't call
TransactionIdIsCurrentTransactionId, and while HeapTupleSatisfiesMVCC
does, it falls out at that point and doesn't get to
TransactionIdIsInProgress.  So I'm not inclined to contort the API
unless we can come up with an example where it's important.

I'm off to work on the page-header-xid idea ...

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-20 Thread Heikki Linnakangas
Tom Lane wrote:
 I've committed the HOT patch.

Thanks, much easier to work with it now that it's in.

  I'd still like to think about whether we
 can be smarter about when to invoke pruning, but that's a small enough
 issue that the patch can go in without it.

Yeah. I'm doing some micro-benchmarking, and the attached test case is
much slower with HOT. It's spending a lot of time trying to prune, only
to find out that it can't.

Instead of/in addition to avoiding pruning when it doesn't help, maybe
we could make HeapTupleSatisfiesVacuum cheaper.

I'm going to continue testing, this is just a heads-up that HOT as
committed seriously hurts performance in some cases. (though one can
argue that this test case isn't a very realistic one.)

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
DROP TABLE IF EXISTS testtable;
CREATE TABLE testtable (key integer);

-- Note that there are no indexes, so updates have to do a seq scan.

CREATE OR REPLACE FUNCTION testfunc(data int, key1 int) RETURNS int AS $$
DECLARE
BEGIN
  FOR cnt IN 1..10000 LOOP
UPDATE testtable SET key = data WHERE key = key1;
  END LOOP;
  RETURN 1;
END;
$$ LANGUAGE plpgsql;

INSERT INTO testtable VALUES (1);
BEGIN;
SELECT testfunc(1,1);
COMMIT;



Re: [HACKERS] HOT is applied

2007-09-20 Thread Merlin Moncure
On 9/20/07, Heikki Linnakangas [EMAIL PROTECTED] wrote:
 Tom Lane wrote:
  I've committed the HOT patch.

 Thanks, much easier to work with it now that it's in.

   I'd still like to think about whether we
  can be smarter about when to invoke pruning, but that's a small enough
  issue that the patch can go in without it.

 Yeah. I'm doing some micro-benchmarking, and the attached test case is
 much slower with HOT. It's spending a lot of time trying to prune, only
 to find out that it can't.

 Instead of/in addition to avoiding pruning when it doesn't help, maybe
 we could make HeapTupleSatisfiesVacuum cheaper.

 I'm going to continue testing, this is just a heads-up that HOT as
 committed seriously hurts performance in some cases. (though one can
 argue that this test case isn't a very realistic one.)

well, I ran your test on my box and here are the results:
pre hot:
run 1: 3617.641 ms
run 2: 5195.215 ms
run 3: 6760.449 ms
after vacuum:
run 1: 4171.362 ms
run 2: 5513.317 ms
run 3: 6884.125 ms
post hot:
run 1: Time: 7286.292 ms
run 2: Time: 7477.089 ms
run 3: Time: 7701.229 ms

those results aren't exactly terrible, and this case is highly artificial.

merlin



Re: [HACKERS] HOT is applied

2007-09-20 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd still like to think about whether we
 can be smarter about when to invoke pruning, but that's a small enough
 issue that the patch can go in without it.

 Yeah. I'm doing some micro-benchmarking, and the attached test case is
 much slower with HOT. It's spending a lot of time trying to prune, only
 to find out that it can't.

Not sure if that's an appropriate description or not.  oprofile
(on a dual Xeon running Fedora 6) shows me this:

CPU: P4 / Xeon with 2 hyper-threads, speed 2792.99 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped)
with a unit mask of 0x01 (mandatory) count 10
samples  %        symbol name
1070003  29.8708  LWLockAcquire
1015097  28.3380  LWLockRelease
283514    7.9147  heap_page_prune
252270    7.0425  AllocSetAlloc
148981    4.1590  HeapTupleSatisfiesMVCC
146442    4.0882  TransactionIdIsInProgress
92673 2.5871  AllocSetFree
84966 2.3720  HeapTupleSatisfiesVacuum
83097 2.3198  TransactionIdGetStatus
80737 2.2539  SimpleLruReadPage_ReadOnly
52803 1.4741  TransactionLogFetch
43406 1.2117  heapgetpage
42536 1.1875  HeapTupleHeaderGetCmin
34842 0.9727  TransactionIdIsCurrentTransactionId
27761 0.7750  TransactionIdDidAbort
24833 0.6933  MemoryContextAlloc
16446 0.4591  TransactionIdPrecedes
16089 0.4491  HeapTupleHeaderGetCmax
12919 0.3607  hash_search_with_hash_value
11857 0.3310  pfree
4964  0.1386  heap_page_prune_opt
3683  0.1028  hash_any
3215  0.0898  LWLockConditionalAcquire
3086  0.0862  PinBuffer
2573  0.0718  UnpinBuffer
2009  0.0561  ConditionalLockBufferForCleanup
1854  0.0518  ReadBuffer_common
934   0.0261  XLogInsert
784   0.0219  heapgettup_pagemode

so basically it's all about the locking.  Maybe the problem is that with
HOT we lock the buffer too often?  heap_page_prune_opt is designed to
not take the buffer lock unless there's a good probability of needing
to prune, but maybe that's not working as intended.

[ pokes at it a bit more... ]  Actually the disturbing part is that
pruning doesn't seem to be working at all: after the test finishes,
I see

regression=# vacuum verbose testtable;
INFO:  vacuuming "public.testtable"
INFO:  "testtable": removed 1 row versions in 44 pages
INFO:  "testtable": found 1 removable, 1 nonremovable row versions in 45 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 9955 unused item pointers.
45 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Shouldn't we be able to prune rows that have been inserted and deleted
by the same transaction?  I'd have hoped to see this example use only
one heap page ...

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-20 Thread Tom Lane
I wrote:
 ... so basically it's all about the locking.  Maybe the problem is that with
 HOT we lock the buffer too often?  heap_page_prune_opt is designed to
 not take the buffer lock unless there's a good probability of needing
 to prune, but maybe that's not working as intended.

Indeed it seems it's not; gprof shows

                0.00    0.00     111/1389276     index_getnext <cycle 7> [89]
                0.05   49.52 1389165/1389276     heapgetpage [6]
[8]     50.9    0.05   49.52 1389276             heap_page_prune_opt [8]
                7.17   42.31 1366696/1366696     heap_page_prune [9]
                0.01    0.03 1366696/1366696     ConditionalLockBufferForCleanup [106]
                0.01    0.00     278/2780795     PageGetHeapFreeSpace [177]

so this example is getting past the heuristic tests in
heap_page_prune_opt almost every time.  Why is that?  Too tired to poke
at it more tonight.

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-20 Thread Pavan Deolasee
On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:


 Shouldn't we be able to prune rows that have been inserted and deleted
 by the same transaction?  I'd have hoped to see this example use only
 one heap page ...


Not before the transaction commits? In the test, we update a single tuple
10000 times in the same transaction, so there is no opportunity to prune.

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-20 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:
 Shouldn't we be able to prune rows that have been inserted and deleted
 by the same transaction?  I'd have hoped to see this example use only
 one heap page ...
 
 Not before the transaction commits? In the test, we update a single tuple
 10000 times in the same transaction, so there is no opportunity to prune.

[ looks a bit more ... ]  Hm, the test I was thinking of was this one
at the end of HeapTupleSatisfiesVacuum:

if (TransactionIdEquals(HeapTupleHeaderGetXmin(tuple),
HeapTupleHeaderGetXmax(tuple)))
{
/*
 * Inserter also deleted it, so it was never visible to anyone else.
 * However, we can only remove it early if it's not an updated tuple;
 * else its parent tuple is linking to it via t_ctid, and this tuple
 * mustn't go away before the parent does.
 */
if (!(tuple->t_infomask & HEAP_UPDATED))
return HEAPTUPLE_DEAD;
}

but control never gets that far because neither xmin nor xmax is
committed yet.  We could fix that, probably, by considering the
xmin=xmax case in the xmin-in-progress case further up; but the
HEAP_UPDATED exclusion is still a problem.  Still, it seems like this
is leaving some money on the table when you think about pruning a HOT
chain.  Can we improve on it easily?
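For illustration, the condition being proposed for the xmin-in-progress path can be written as a toy predicate (simplified stand-in types; a sketch of the idea, not the tqual.c code, though HEAP_UPDATED's value mirrors the real infomask bit):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int TransactionId;
#define HEAP_UPDATED 0x2000

typedef struct
{
    TransactionId xmin;
    TransactionId xmax;
    unsigned int  infomask;
} FakeTupleHeader;

/*
 * While xmin is still in progress, a tuple whose xmax equals its own
 * xmin, and that is not itself the result of an UPDATE, was inserted
 * and deleted by one transaction and can never be visible to anyone
 * else, so it could be reported DEAD early.
 */
static bool
dead_while_xmin_in_progress(const FakeTupleHeader *tup)
{
    return tup->xmin == tup->xmax && !(tup->infomask & HEAP_UPDATED);
}
```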

regards, tom lane



Re: [HACKERS] HOT is applied

2007-09-20 Thread Pavan Deolasee
On 9/21/07, Tom Lane [EMAIL PROTECTED] wrote:



 but control never gets that far because neither xmin nor xmax is
 committed yet.  We could fix that, probably, by considering the
 xmin=xmax case in the xmin-in-progress case further up; but the
 HEAP_UPDATED exclusion is still a problem.  Still, it seems like this
 is leaving some money on the table when you think about pruning a HOT
 chain.  Can we improve on it easily?


Maybe we can, but it would get a bit tricky. There might be a transaction
looking at the first tuple in the chain, waiting for this
inserting-and-deleting transaction to finish. If the waiting transaction
is running in READ COMMITTED mode, it would then follow the update chain.
Removing any intermediate tuples without fixing the previous tuple's
xmax/ctid (or redirected line pointer) would be tricky.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com