Re: [HACKERS] Spinlock performance improvement proposal
Tom Lane wrote:
> Neil Padgett <[EMAIL PROTECTED]> writes:
> > Well. Currently the runs are the typical pg_bench runs.
>
> With what parameters?  If you don't initialize the pg_bench database
> with scale proportional to the number of clients you intend to use,
> then you'll naturally get huge lock contention.  For example, if you
> use scale=1, there's only one branch in the database.  Since every
> transaction wants to update the branch's balance, every transaction
> has to write-lock that single row, and so everybody serializes on
> that one lock.  Under these conditions it's not surprising to see
> lots of lock waits and lots of useless runs of the deadlock
> detector ...

The results you saw with the large number of useless runs of the
deadlock detector had a scale factor of 2. With a scale factor of 2,
the performance fall-off began at about 100 clients. So, I reran the
512-client profiling run with a scale factor of 12. (2:100 as 10:500 --
so 12 might be an appropriate scale factor, with some cushion.) This
does, of course, reduce the contention. However, the throughput is
still only about twice as much, which sounds good but is still a small
fraction of the throughput realized on the same machine with a small
number of clients. (This is the uniprocessor machine.)

The new profile looks like this (uniprocessor machine):

Flat profile:

Each sample counts as 1 samples.
  %    cumulative     self                 self      total
 time    samples     samples    calls    T1/call   T1/call   name
 9.44    10753.00    10753.00                                pg_fsync

(I'd attribute this to the slow disk in the machine -- scale 12 yields
a lot of tuples.)
 6.63    18303.01     7550.00                                s_lock_sleep
 6.56    25773.01     7470.00                                s_lock
 5.88    32473.01     6700.00                                heapgettup
 5.28    38487.02     6014.00                                HeapTupleSatisfiesSnapshot
 4.83    43995.02     5508.00                                hash_destroy
 2.77    47156.02     3161.00                                load_file
 1.90    49322.02     2166.00                                XLogInsert
 1.86    51436.02     2114.00                                _bt_compare
 1.82    53514.02     2078.00                                AllocSetAlloc
 1.72    55473.02     1959.00                                LockBuffer
 1.50    57180.02     1707.00                                init_ps_display
 1.40    58775.03     1595.00                                DirectFunctionCall9
 1.26    60211.03     1436.00                                hash_search
 1.14    61511.03     1300.00                                GetSnapshotData
 1.11    62780.03     1269.00                                SpinAcquire
 1.10    64028.03     1248.00                                LockAcquire
 1.04    70148.03     1190.00                                heap_fetch
 0.91    71182.03     1034.00                                _bt_orderkeys
 0.89    72201.03     1019.00                                LockRelease
 0.75    73058.03      857.00                                InitBufferPoolAccess
 .
 .
 .

I reran the benchmarks on the SMP machine with a scale of 12 instead of
2. The numbers still show a clear performance drop-off at approximately
100 clients, albeit not as sharp. (But still quite pronounced.) In
terms of raw performance, the numbers are comparable. The scale factor
certainly helped -- but it still seems that we might have a problem
here.

Thoughts?

Neil

-- 
Neil Padgett
Red Hat Canada Ltd.              E-Mail:  [EMAIL PROTECTED]
2323 Yonge Street, Suite #300
Toronto, ON  M4P 2C9

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Spinlock performance improvement proposal
 ...                                      SpinAcquire
 1.61    59733.03     1840.00             LockBuffer
 1.60    61560.03     1827.00             FunctionCall2
 1.56    63339.03     1779.00             tag_hash
 1.46    65007.03     1668.00             set_ps_display
 1.20    66372.03     1365.00             SearchCatCache
 1.14    67666.03     1294.00             LockAcquire
 .
 .
 .

Our current suspicion isn't that the lock implementation is the only
problem (though there is certainly room for improvement), or perhaps
isn't even the main problem. For example, there has been some
suggestion that some component of the database may be causing large
lock contention. My opinion is that rather than guessing and taking
stabs in the dark, we need to take a more reasoned approach to these
things. IMHO, the next step should be to apply instrumentation (likely
via some neat macros) to all lock acquires and releases. Then it will
be possible to determine which components are the greatest consumers of
locks, and whether this is a component problem or a systemic one (i.e.,
some component vs. simply the lock implementation itself).

Neil
Re: [HACKERS] Spinlock performance improvement proposal
Tom Lane wrote:
> Neil Padgett <[EMAIL PROTECTED]> writes:
> > Initial results (top five -- if you would like a complete profile,
> > let me know):
> >
> > Each sample counts as 1 samples.
> >   %    cumulative     self                 self      total
> >  time    samples     samples    calls    T1/call   T1/call   name
> > 26.57    42255.02    42255.02                                FindLockCycleRecurse
>
> Yipes. It would be interesting to know more about the locking pattern
> of your benchmark --- are there long waits-for chains, or not? The
> present deadlock detector was certainly written with an eye to getting
> it right rather than making it fast, but I wonder whether this shows a
> performance problem in the detector, or just too many executions
> because you're waiting too long to get locks.

However, this seems to be a red herring. Removing the deadlock detector
had no effect: benchmarking showed that removing it yielded no
improvement in transaction processing rate on uniprocessor or SMP
systems. Instead, it seems that the deadlock detector simply amounts to
something to do for the blocked backend while it waits for lock
acquisition.

> Do you have any idea about the typical lock-acquisition delay in this
> benchmark? Our docs advise trying to set DEADLOCK_TIMEOUT higher than
> the typical acquisition delay, so that the deadlock detector does not
> run unnecessarily.

Well. Currently the runs are the typical pg_bench runs. This was useful
since it was a handy benchmark that was already done, and I was hoping
it might be useful for comparison since it seems to be popular. More
benchmarks of different types would of course be useful, though. I
think the large time consumed by the deadlock detector in the profile
is simply due to too many executions while waiting to acquire contended
locks. But I agree that DEADLOCK_TIMEOUT seems to have been set too
low, since it appears from the profile output that the deadlock
detector was running unnecessarily. The deadlock detector isn't causing
the SMP performance hit right now, though, since throughput is the same
with it in place or with it removed completely.
I therefore didn't make any attempt to tune DEADLOCK_TIMEOUT. As I
mentioned before, the deadlock detector apparently just gives the
backend something to do while it waits for a lock. My thinking is that
the unnecessary deadlock-detector runs have no effect on performance
because shared-memory access is causing some level of serialization:
one CPU (or two, or three, but not all) is doing useful work, while the
others are idle (that is to say, doing no useful work). Whether they
idle by spinning or idle by running the deadlock detector, the net
throughput is still the same. (This might also indicate that improving
the lock design won't help here.) Of course, another possibility is
that the backends spend so long spinning simply because they do spin
(rather than sleep), and this wastes so much CPU time that the backends
doing useful work take longer to get things done. Either is just
speculation right now, without any data to back things up.

> > For example, there has been some suggestion that perhaps some
> > component of the database is causing large lock contention.
>
> My thought as well. I would certainly recommend that you use more
> than one test case while looking at these things.

Yes. That is another suggestion for a next step. Several cases might
serve to better expose the path causing the slowdown. I think that
several test cases of varying usage patterns, coupled with hold-time
instrumentation (which can tell which routine acquired a lock and how
long it held it, and yield waits-for data in the analysis), are the
right way to go about attacking SMP performance. Any other thoughts?

Neil
[HACKERS] CVS commit messages
Question: what has changed with the CVS repository lately? I notice
that all of the commit messages I've read lately on pgsql-committers
seem to come from Marc Fournier. Has Marc just been committing all the
recent changes, or are all commit messages, regardless of committer,
showing up as from Marc?

Neil
Re: [HACKERS] Guide to PostgreSQL source tree
On Sun, 19 Aug 2001, Tom Lane wrote:
> One thing that I find absolutely essential for dealing with any large
> project is a full-text indexer (I use Glimpse, but I think there are
> others out there). Being able to quickly look at every use of a
> particular identifier goes a long way towards answering questions.

Agreed -- you can't find your way around PostgreSQL without such a
program. Personally, I use Source Navigator, which you can grab at
http://sources.redhat.com/sourcenav/ . The really useful thing about
Source Navigator is that it parses the source into functions,
variables, etc. rather than just indexing it all as text. This means
that when you are looking at a source file, you can do neat things like
click on a function call and then see the declaration and a
cross-reference tree. Very handy.

Neil
Re: [HACKERS] int8 sequences --- small implementation problem
Tom Lane wrote:
> [clip]
> This would work, I think, but my goodness it's an ugly solution.
> Has any hacker got a better one?
>
> regards, tom lane

How about:

#ifdef INT64_IS_BUSTED
#define int64aligned(name)  int32 name##_; int64 name
#else
#define int64aligned(name)  int64 name
#endif

typedef struct FormData_pg_sequence
{
    NameData    sequence_name;
    int64aligned(last_value);
    int64aligned(increment_by);
    int64aligned(max_value);
    int64aligned(min_value);
    int64aligned(cache_value);
    int64aligned(log_cnt);
    char        is_cycled;
    char        is_called;
} FormData_pg_sequence;

Neil
Re: [HACKERS] Re: AW: Re: OID wraparound: summary and proposal
mlw wrote:
> The way I see it there are 4 options for the OID:
> [snip]
> (2) Allow the ability to have tables without OIDs.

This is a source of debate. I think Tom Lane has already committed some
patches to allow for this, so you should be able to try it from the
latest CVS. (Tom?)

Neil