On 2014-10-10 17:18:46 +0530, Amit Kapila wrote:
> On Fri, Oct 10, 2014 at 1:27 PM, Andres Freund <and...@2ndquadrant.com>
> wrote:
> > > Observations
> > > ----------------------
> > > a. The patch performs really well (increase up to ~40%) in case all the
> > > data fits in shared buffers (scale factor -100).
> > > b. In case the data doesn't fit in shared buffers, but fits in RAM
> > > (scale factor -3000), there is a performance increase up to 16 client
> > > count; however, after that it starts dipping (in the above config up to
> > > ~4.4%).
> >
> > Hm. Interesting. I don't see that dip on x86.
>
> Is it possible that the implementation of some atomic operation is
> costlier for a particular architecture?
Yes, sure. And IIRC POWER improved atomics performance considerably for
POWER8...

> I have tried again for scale factor 3000 and could see the dip, and this
> time I have even tried with 175 client count; the dip is approximately
> 5%, which is slightly more than at 160 client count.

FWIW, the profile always looks like

-  48.61%  postgres  postgres           [.] s_lock
   - s_lock
      + 96.67% StrategyGetBuffer
      + 1.19% UnpinBuffer
      + 0.90% PinBuffer
      + 0.70% hash_search_with_hash_value
+   3.11%  postgres  postgres           [.] GetSnapshotData
+   2.47%  postgres  postgres           [.] StrategyGetBuffer
+   1.93%  postgres  [kernel.kallsyms]  [k] copy_user_generic_string
+   1.28%  postgres  postgres           [.] hash_search_with_hash_value
-   1.27%  postgres  postgres           [.] LWLockAttemptLock
   - LWLockAttemptLock
      - 97.78% LWLockAcquire
         + 38.76% ReadBuffer_common
         + 28.62% _bt_getbuf
         + 8.59% _bt_relandgetbuf
         + 6.25% GetSnapshotData
         + 5.93% VirtualXactLockTableInsert
         + 3.95% VirtualXactLockTableCleanup
         + 2.35% index_fetch_heap
         + 1.66% StartBufferIO
         + 1.56% LockReleaseAll
         + 1.55% _bt_next
         + 0.78% LockAcquireExtended
      + 1.47% _bt_next
      + 0.75% _bt_relandgetbuf

to me. Now that's with the client count 496, but it's similar with lower
counts.

BTW, that profile *clearly* indicates we should make StrategyGetBuffer()
smarter.

> Patch_ver/Client_count      175
> HEAD                     248374
> PATCH                    235669

> > > Now probably these shouldn't matter much in case the backend needs to
> > > wait for another Exclusive locker, but I am not sure what else could be
> > > the reason for the dip in case we need to have Exclusive LWLocks.
> >
> > Any chance to get a profile?
>
> Here it goes..
>
> Lwlock_contention patches - client_count=128
> ----------------------------------------------------------------------
>
> +   7.95%  postgres  postgres           [.] GetSnapshotData
> +   3.58%  postgres  postgres           [.] AllocSetAlloc
> +   2.51%  postgres  postgres           [.] _bt_compare
> +   2.44%  postgres  postgres           [.] hash_search_with_hash_value
> +   2.33%  postgres  [kernel.kallsyms]  [k] .__copy_tofrom_user
> +   2.24%  postgres  postgres           [.] AllocSetFreeIndex
> +   1.75%  postgres  postgres           [.] pg_atomic_fetch_add_u32_impl

Uh. Huh? Normally that'll be inlined. That's compiled with gcc? What were
the compiler settings you used?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
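
[Sketch for readers following the StrategyGetBuffer()/atomics points above.
This is not PostgreSQL source; nbuffers, clock_hand, buffer_lock and the
clock_tick_* functions are invented for illustration. It contrasts a
spinlock-protected clock-hand advance, roughly the pattern that shows up as
s_lock time under StrategyGetBuffer() in the first profile, with the same
advance done via gcc's __atomic fetch-add. At -O2 the builtin compiles to a
single locked instruction and inlines away, so a standalone
pg_atomic_fetch_add_u32_impl symbol in a profile usually suggests an
unoptimized build or a compare-and-swap fallback path, which is presumably
what the compiler-settings question is getting at.]

#include <stdint.h>
#include <stdio.h>
#include <pthread.h>

/* Illustrative only -- invented stand-ins for clock-sweep state. */
static uint32_t nbuffers = 16384;
static uint32_t clock_hand = 0;
static pthread_spinlock_t buffer_lock;

/* Spinlock-protected advance: every caller serializes on buffer_lock,
 * so with hundreds of clients the waiting shows up as spinlock time. */
static uint32_t
clock_tick_locked(void)
{
    uint32_t victim;

    pthread_spin_lock(&buffer_lock);
    victim = clock_hand;
    if (++clock_hand >= nbuffers)
        clock_hand = 0;
    pthread_spin_unlock(&buffer_lock);
    return victim;
}

/* Lock-free advance: one atomic fetch-add, wrapped modulo nbuffers.
 * With gcc -O2 this becomes a single "lock xadd" on x86 and is inlined,
 * so no separate fetch_add symbol should remain visible in a profile. */
static uint32_t
clock_tick_atomic(void)
{
    uint32_t old = __atomic_fetch_add(&clock_hand, 1, __ATOMIC_SEQ_CST);

    return old % nbuffers;
}

int
main(void)
{
    pthread_spin_init(&buffer_lock, PTHREAD_PROCESS_PRIVATE);
    printf("locked: %u, atomic: %u\n",
           (unsigned) clock_tick_locked(), (unsigned) clock_tick_atomic());
    return 0;
}

[Build with "gcc -O2 -pthread sketch.c". The atomic variant is only one way
StrategyGetBuffer() could be made "smarter"; it is not claimed to be the
approach taken in the patches discussed here.]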