I have made few more changes with the new patch.

1. Ran pgindent.
2. Instead of an atomic state variable to make only one process cache
the snapshot in shared memory, I have used conditional try lwlock.
With this, we have a small and reliable code.
3. Performance benchmarking

Machine - cthulhu
[mithun.cy@cthulhu bin]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                128
On-line CPU(s) list:   0-127
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             8
NUMA node(s):          8
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 47
Model name:            Intel(R) Xeon(R) CPU E7- 8830  @ 2.13GHz
Stepping:              2
CPU MHz:               1197.000
BogoMIPS:              4266.63
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              24576K
NUMA node0 CPU(s):     0,65-71,96-103
NUMA node1 CPU(s):     72-79,104-111
NUMA node2 CPU(s):     80-87,112-119
NUMA node3 CPU(s):     88-95,120-127
NUMA node4 CPU(s):     1-8,33-40
NUMA node5 CPU(s):     9-16,41-48
NUMA node6 CPU(s):     17-24,49-56
NUMA node7 CPU(s):     25-32,57-64

Server configuration:
./postgres -c shared_buffers=8GB -N 300 -c min_wal_size=15GB -c
max_wal_size=20GB -c checkpoint_timeout=900 -c
maintenance_work_mem=1GB -c checkpoint_completion_target=0.9 -c
wal_buffers=256MB &

pgbench configuration:
scale_factor = 300
./pgbench -c $threads -j $threads -T $time_for_reading -M prepared -S  postgres

The machine has 64 cores with this patch I can see server starts
improvement after 64 clients. I have tested up to 256 clients. Which
shows performance improvement nearly max 39%.

Alternatively, I thought instead of storing the snapshot in a shared
memory each backend can hold on to its previously computed snapshot
until next commit/rollback happens in the system. We can have a global
counter value associated with the snapshot when ever it is computed.
Upon any new end of the transaction, the global counter will be
incremented. So when a process wants a new snapshot it can compare
these 2 values to check if it can use previously computed snapshot.
This makes code significantly simple. With the first approach, one
process has to compute and store the snapshot for every end of the
transaction and others can reuse the cached the snapshot.  In the
second approach, every process has to re-compute the snapshot. So I am
keeping with the same approach.

On Mon, Jul 10, 2017 at 10:13 AM, Mithun Cy <mithun...@enterprisedb.com> wrote:
> On Fri, Apr 8, 2016 at 12:13 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> I think that we really shouldn't do anything about this patch until
>> after the CLOG stuff is settled, which it isn't yet.  So I'm going to
>> mark this Returned with Feedback; let's reconsider it for 9.7.
> I am updating a rebased patch have tried to benchmark again could see
> good improvement in the pgbench read-only case at very high clients on
> our cthulhu (8 nodes, 128 hyper thread machines) and power2 (4 nodes,
> 192 hyper threads) machine. There is some issue with base code
> benchmarking which is somehow not consistent so once I could figure
> out what is the issue with that I will update
> --
> Thanks and Regards
> Mithun C Y
> EnterpriseDB: http://www.enterprisedb.com

Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachment: cache_the_snapshot_performance.ods
Description: application/vnd.oasis.opendocument.spreadsheet

Attachment: Cache_data_in_GetSnapshotData_POC_03.patch
Description: Binary data

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to