On Mon, Apr 4, 2011 at 9:25 AM, Merlin Moncure <mmonc...@gmail.com> wrote: > On Sun, Apr 3, 2011 at 6:40 PM, Merlin Moncure <mmonc...@gmail.com> wrote: >> On Sat, Apr 2, 2011 at 8:37 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >>> Merlin Moncure <mmonc...@gmail.com> writes: >>>> On Thu, Mar 31, 2011 at 5:38 PM, Merlin Moncure <mmonc...@gmail.com> wrote: >>>>> working on exanding the cache to # xid > 1. >>> >>>> patch attached. this is essentially my original idea except it's >>>> injected directly in to tqual.c as a kind of a expansion of the 'last >>>> seen' concept. Because the cache loops are inlined and very tight (4 >>>> int compares), it's actually cheaper than jumping out of line into >>>> clog to test this_int = that_int; >>> >>> This seems like a mess. Why is the criterion for caching commit states >>> different from whether the hint bit can be set? And why are you >> >> The hint bits are always set. The page is flagged dirty if and only >> if you have to dip below the process local cache to get it. The >> rationale for this is simple: it caps your downside because unlike >> hint bit removal or BM_UNTIDY, you only have to dip into clog and pay >> the i/o once per page -- so that your downside is limited to pre-cache >> behavior (minus the overhead of maintaining the cache). > > I might have missed part of the thrust of your question. In the > patch, the commit status is only stored if the LSN for the xid is > known safe (instead of checking it just before setting the bit as > SetHintBits does). This is indeed different, and it was done so as to > avoid having to a. repeatedly check XLogNeedsFlush or b. cache the xid > LSN as transam.c does. Perhaps this is weak sauce, so I think the > first thing to do is have another go at doing it at the transam level. > Stay tuned...
Ok, after having done some more testing, I don't think this is gonna work without at least some changes to tqual.c. It's exactly the stuff that was bugging you, or some variant thereof, that needs to happen with penalizing some cases. Letting transam.c handle 100% of the cache management has the following issues: *) No way to signal tqual.c to not dirty page if we got a cache hit -- the i/o mitigation strategy rests on being able to do this. *) transam.c cache implementation should also cache transaction LSN position, which is expensive memory wise for less benefit *) Lots of non-inline calls. Jumping out of HeapTupleSatisifiesMVCC into non-inline function is expensive, at least on my test machine. Since there are more cases now where you fall through thei hint bit check, you are elevating significantly the amount of extra calls you make out of line which eats up all the savings you get -- sometimes more. In particular: *) both TransactionIdIsCurrentTransactionId and TransactionIdIsInProgress are called *before* TransactionIdDidCommit. so, even if you are going to get a cache 'hit', you still have to pay for jumping out to these functions and running them. TransactionIdIsInProgress could be made inline possibly if you go put it under transam.c cache umbrella, it still has material cost in that case. *) Setting commit cache if and only if your LSN is known safe is important innovation and must be preserved -- if you are getting lots of 'commit cache hit' tuples, it saves a lot of time not having to constantly recheck the LSN flush status of the same transactions over and over. Here's the times I'm seeing on my workstation (all timings are somewhat noisy, accurate within 10ms or so). test: create table v as select generate_series(1,500000) v; select count(*) from v; stock: run 1: 3000 ms (setting hint bits) run 2+ 665 ms (hint bit optimized) transam.c cache non inline run 1+ 1000ms transam.c cache inline (made inline stub that proxies to existing function on cache miss) run 1+ 800ms tqual.c cache, checked at TransactionIdDidCommit run 1+ 790ms (1-2 less 'if' statements) tqual.c cache, checked at TransactionIdDidCommit, no LSN flush check run 1+ 740ms tqual.c cache, checked right after hint bit, no LSN flush check run 1+ 670ms The last case is near hint bit performance, because the check is happening more or less at the same place, and the check itself is a small number of int math operations (and can be optimized further still). I haven't yet timed the degenerate case where the commit cache is receiving tons of misses and is constantly flushing out pages. I'm not expecting this to be bad, but need to mention this -- I have some ideas about mitigation if it's a problem. So the question is, what is the way forwards? Follows is the commit cache code I'm hooking into various places, with some fixes from previous patch. The actual implementation of the cache structure I'm not stuck on, other than it has to be completely inline (no dynahash) and very tight. Where it gets called from is much more important. merlin #define COMMIT_CACHE_BUCKET_COUNT 4 #define COMMIT_CACHE_BUCKET_BLOCKSZ BLCKSZ #define COMMIT_CACHE_ROLLUPSZ 100 #define COMMIT_CACHE_HIT_THRESHOLD 5 typedef struct { int XidBucket; int HitCount; bool Clear; } CommitCacheBucketHeader; /* I know static definition here is lousy, but it's convenient for testing purposes */ static CommitCacheBucketHeader CommitCacheBucketList[COMMIT_CACHE_BUCKET_COUNT] = {{-1, 0, false}, {-1, 0, false}, {-1, 0, false}, {-1, 0, false}}; static char CommitCacheData[COMMIT_CACHE_BUCKET_COUNT][COMMIT_CACHE_BUCKET_BLOCKSZ]; static int CommitCacheBucketRollupList[COMMIT_CACHE_ROLLUPSZ]; static int CommitCacheMissCount = 0; /* qsort comparison function */ static int int_cmp(const void *p1, const void *p2) { int v1 = *((const int *) p1); int v2 = *((const int *) p2); if (v1 < v2) return -1; if (v1 > v2) return 1; return 0; } static void RollUpCommitCache() { TransactionId LastBucket = CommitCacheBucketRollupList[0]; int BucketIdx; int HitCount = 0; int Bucket; int BucketMinHits = COMMIT_CACHE_ROLLUPSZ + 1; int BucketMinIdx; int XidBucket; qsort(CommitCacheBucketRollupList, COMMIT_CACHE_ROLLUPSZ, sizeof(int), int_cmp); for (BucketIdx = 0; BucketIdx < COMMIT_CACHE_ROLLUPSZ; BucketIdx++) { if(CommitCacheBucketRollupList[BucketIdx] == LastBucket) HitCount++; else { if (HitCount >= COMMIT_CACHE_HIT_THRESHOLD) { CheckLastXidBlock: for (Bucket = 0; Bucket < COMMIT_CACHE_BUCKET_COUNT; Bucket++) { if (CommitCacheBucketList[Bucket].HitCount < BucketMinHits) { BucketMinIdx = Bucket; BucketMinHits = CommitCacheBucketList[Bucket].HitCount; } } XidBucket = CommitCacheBucketRollupList[BucketIdx]; if (HitCount > BucketMinHits) { if(XidBucket != CommitCacheBucketList[BucketMinIdx].XidBucket) { CommitCacheBucketList[BucketMinIdx].XidBucket = XidBucket; CommitCacheBucketList[BucketMinIdx].HitCount = HitCount; CommitCacheBucketList[BucketMinIdx].Clear = true; } } } HitCount = 1; LastBucket = CommitCacheBucketRollupList[BucketIdx]; } } if(HitCount >= COMMIT_CACHE_HIT_THRESHOLD) { BucketIdx--; goto CheckLastXidBlock; } for (Bucket = 0; Bucket < COMMIT_CACHE_BUCKET_COUNT; Bucket++) { if(CommitCacheBucketList[BucketMinIdx].Clear) memset(CommitCacheData[Bucket], 0, COMMIT_CACHE_BUCKET_BLOCKSZ); CommitCacheBucketList[BucketMinIdx].Clear = 0; CommitCacheBucketList[BucketMinIdx].HitCount = 0; } CommitCacheMissCount = 0; return; } static inline bool IsXidInCommitCacheI(TransactionId xid) { int XidBucket = xid / COMMIT_CACHE_BUCKET_BLOCKSZ; int Bucket; for(Bucket = 0; Bucket < COMMIT_CACHE_BUCKET_COUNT; Bucket++) { if(CommitCacheBucketList[Bucket].XidBucket == XidBucket) { int ByteOffset = xid / BITS_PER_BYTE; int BitOffset = xid % BITS_PER_BYTE; if (CommitCacheData[Bucket][ByteOffset] & (1 << BitOffset)) { CommitCacheBucketList[Bucket].HitCount++; return true; } break; } } CommitCacheBucketRollupList[CommitCacheMissCount++] = XidBucket; if (CommitCacheMissCount == COMMIT_CACHE_ROLLUPSZ) { RollUpCommitCache(); } return false; } static inline void SetXidInCommitCacheI(TransactionId xid) { int XidBucket = xid / COMMIT_CACHE_BUCKET_BLOCKSZ; int Bucket; for(Bucket = 0; Bucket < COMMIT_CACHE_BUCKET_COUNT; Bucket++) { if(XidBucket == CommitCacheBucketList[Bucket].XidBucket) { int ByteOffset = xid / BITS_PER_BYTE; int BitOffset = xid % BITS_PER_BYTE; CommitCacheData[Bucket][ByteOffset] |= (1 << BitOffset); break; } } return; } static inline bool TransactionIdDidCommitCacheI(TransactionId transactionId) { return IsXidInCommitCacheI(transactionId); } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers