Hello! PostgreSQL shows very poor results in YCSB Workload A (50% SELECT and 50% UPDATE of a random row by PK) when benchmarking with a large number of clients using a Zipfian distribution. MySQL also shows a decline, but it is not as significant as in PostgreSQL. MongoDB shows no decline at all. If pgbench had a Zipfian random number generator, everyone would be able to research this topic without using YCSB. This is why I am currently working on a random_zipfian function.
The bottleneck of the algorithm I use is that it calculates the zeta function, which has linear complexity (https://en.wikipedia.org/wiki/Riemann_zeta_function). This may cause problems when generating a huge amount of big numbers.

That's why I added caching for the zeta value. It works well when random_zipfian is called with the same parameters within a script. For example:

    …
    \set a random_zipfian(1, 100, 1.2)
    \set b random_zipfian(1, 100, 1.2)
    …

Otherwise, the second call overrides the cache of the first, and caching does not make any sense:

    …
    \set a random_zipfian(1, 100, 1.2)
    \set b random_zipfian(1, 200, 1.4)
    …

So I have a question: should I implement caching of zeta values for calls with different parameters, or not?

P.S. I am attaching the patch and a script that is an analogue of YCSB Workload A. Run the benchmark with the command:

    $ pgbench -f ycsb_read_zipf.sql -f ycsb_update_zipf.sql

On scale = 10 (1 million rows) it gives the following results on a machine with 144 cores (with synchronous_commit=off):

    nclients  tps
    1         8842.401870
    2         18358.140869
    4         45999.378785
    8         88713.743199
    16        170166.998212
    32        290069.221493
    64        178128.030553
    128       88712.825602
    256       38364.937573
    512       13512.765878
    1000      6188.136736
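To make the caching question above concrete, here is a minimal sketch in Python (not the C code from the patch). It assumes the "zeta value" is the finite sum of 1/i^s for i = 1..n, which is what makes the computation linear in n. The names zeta_partial, ZetaCache, and the capacity parameter are hypothetical and exist only for illustration. The sketch compares a single-entry cache with a small cache keyed by (n, s) when a script alternates between two different parameter sets.

```python
from collections import OrderedDict


def zeta_partial(n, s):
    """Finite zeta sum: 1/1**s + 1/2**s + ... + 1/n**s. Costs O(n)."""
    return sum(1.0 / i ** s for i in range(1, n + 1))


class ZetaCache:
    """Small LRU cache keyed by (n, s). Hypothetical, for illustration only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.computations = 0  # counts how often the O(n) sum was recomputed

    def get(self, n, s):
        key = (n, s)
        if key in self.entries:
            # Cache hit: mark the entry as most recently used.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: pay the linear cost once and remember the result.
        value = zeta_partial(n, s)
        self.computations += 1
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return value


def run_script(cache, iterations):
    """Mimic a script that calls random_zipfian with two parameter sets."""
    for _ in range(iterations):
        cache.get(100, 1.2)
        cache.get(200, 1.4)
    return cache.computations


single_entry = run_script(ZetaCache(capacity=1), 1000)
two_entries = run_script(ZetaCache(capacity=2), 1000)
print(single_entry, two_entries)
```

Under these assumptions, the single-entry cache recomputes the sum on every call, while a cache with as many entries as there are distinct (n, s) pairs in the script computes each sum only once.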
--
Thanks and Regards,
Alik Khilazhev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
-- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers