Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-10-01 Thread Daniel Gustafsson
> On 06 Sep 2017, at 08:42, Fabien COELHO wrote: > > Hello Alik, > > Applies, compiles, works for me. > > Some minor comments and suggestions. > > Two typos: > - "usinng" -> "using" > - "a rejection method used" -> "a rejection method is used" > > I'm not sure of

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-09-06 Thread Fabien COELHO
Hello Alik, Applies, compiles, works for me. Some minor comments and suggestions. Two typos: - "usinng" -> "using" - "a rejection method used" -> "a rejection method is used" I'm not sure of "least_recently_used_i", this naming style is not used in pgbench. "least_recently_used" would be

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-09-02 Thread Alik Khilazhev
Hello Fabien, Thank you for the detailed review. I hope I have fixed all the issues you mentioned in your letter. pgbench-zipf-08v.patch Description: Binary data -- Thanks and Regards, Alik Khilazhev Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-23 Thread Fabien COELHO
Hello Alik, I am attaching patch v7. Patch generates multiple warnings with "git apply", apparently because of end-of-line spaces, and fails: pgbench-zipf-07v.patch:52: trailing whitespace. { pgbench-zipf-07v.patch:53: trailing whitespace. "random_zipfian", 3,

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-22 Thread Alik Khilazhev
Hello, Fabien I am attaching patch v7. > Yes, I agree. a >= 1 does not make much sense... If you want uniform you > should use random(), not call random_zipfian with a = 1. Basically it > suggests that too large values of "a" should be rejected. Not sure where to > put the limit,

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-13 Thread Fabien COELHO
Hello Alik, > Now “a” does not have upper bound, that’s why on using iterative algorithm with a >= 1 program will stuck on infinite loop because of following line of code: > double b = pow(2.0, s - 1.0); > Because after overflow “b” becomes “+Inf”. Yep, overflow can happen. So should upper

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-13 Thread Alik Khilazhev
Hello Fabien, > > I think that this method should be used for a>1, and the other very rough one > can be kept for parameter a in [0, 1), a case which does not make much sense > to a mathematician as it diverges if unbounded. Now “a” does not have upper bound, that’s why on using iterative

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-06 Thread Alik Khilazhev
Hello Fabien, > On 5 Aug 2017, at 12:15, Fabien COELHO wrote: > > > Hello Alik, > > I've done some math investigations, which consisted in spending one hour with > Christian, a statistician colleague of mine. He took an old book out of a > shelf, opened it to page 550

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-05 Thread Fabien COELHO
Hello Alik, >> So I would be in favor of expanding the documentation but not restricting the parameter beyond avoiding value 1.0. > I have removed restriction and expanded documentation in attaching patch v5. I've done some math investigations, which consisted in spending one hour with

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-05 Thread Fabien COELHO
Hello Peter, I think that it would also be nice if there was an option to make functions like random_zipfian() actually return a value that has undergone perfect hashing. When this option is used, any given value that the function returns would actually be taken from a random mapping to
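The idea of passing the drawn value through a mapping, so that the most frequent ranks do not land on adjacent keys, can be sketched as follows. This is not a perfect hash and not what the thread proposes to implement: it swaps in a simple affine permutation modulo n, which is a bijection on [0, n) whenever the multiplier is coprime with n. The names `scatter_rank`, `mult` and `shift` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Greatest common divisor, used to check that the mapping is a bijection. */
static uint64_t
gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0)
    {
        uint64_t t = a % b;

        a = b;
        b = t;
    }
    return a;
}

/*
 * Map rank x in [0, n) to a scattered key in [0, n).
 * The mapping x -> (mult * x + shift) mod n is one-to-one only when
 * gcd(mult, n) == 1, so other multipliers are rejected.
 * Returns false on invalid input and leaves *out untouched.
 */
static bool
scatter_rank(uint32_t x, uint32_t n, uint32_t mult, uint32_t shift,
             uint32_t *out)
{
    if (n == 0 || x >= n || gcd_u64(mult, n) != 1)
        return false;

    /* 32-bit operands widened to 64 bits, so the product cannot overflow */
    *out = (uint32_t) (((uint64_t) mult * x + shift) % n);
    return true;
}
```

An affine map keeps a regular stride between hot keys, so it only illustrates the bijection requirement; a mapping that looks random would need a stronger permutation.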

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-04 Thread Peter Geoghegan
On Fri, Jul 21, 2017 at 4:51 AM, Alik Khilazhev wrote: > (Latest version of pgbench Zipfian patch) While I'm +1 on this idea, I think that it would also be nice if there was an option to make functions like random_zipfian() actually return a value that has undergone

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-21 Thread Alik Khilazhev
Hello! I realized that I was sending emails as HTML and latest patch is not visible in the archive now. That’s why I am attaching it again. I am sorry for that. pgbench-zipf-05v.patch Description: Binary data — Thanks and Regards, Alik Khilazhev Postgres Professional:

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-21 Thread Alik Khilazhev
Hmmm. On second thought, maybe one or the other is enough, either restrict the parameter to values where the approximation is good, or put out a clear documentation about when the approximation is not very good, but it may be still useful even if not precise. So I would be in favor of expanding the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Alik Khilazhev
> I think that developing a test would be much simpler with the improved tap > test infrastructure, so I would suggest to wait to know the result of the > corresponding patch. Ok, I will wait then. > Also, could you record the patch to CF 2017-09? > https://commitfest.postgresql.org/14/

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Fabien COELHO
Hello Alik, About the maths: As already said, I'm not at ease with a random_zipfian function which does not display a (good) zipfian distribution. At the minimum the documentation should be clear about the approximations implied depending on the parameter value. I add one more sentence to

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Alik Khilazhev
Hello Fabien, I am attaching patch v4. On 19 Jul 2017, at 17:21, Fabien COELHO wrote: About the maths: As already said, I'm not at ease with a random_zipfian function which does not display a (good) zipfian distribution. At the minimum the documentation should be clear about

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-19 Thread Fabien COELHO
Hello Alik, > I am attaching patch v3. Among other things I fixed small typo in description of random_exponential function in pgbench.sgml file. Ok. Probably this typo should be committed separately and independently. A few comments about v3: Patch applies cleanly, compiles, works. About

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Fabien COELHO
Hello, Is this bias expected from the drawing method, say because it is approximated and the approximation is weak at some points, or is there an issue with its implementation, says some shift which gets smoothed down for higher indexes? I have checked paper where such implementation was

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Alik Khilazhev
> On 17 Jul 2017, at 13:51, Fabien COELHO wrote: > > > Is this bias expected from the drawing method, say because it is approximated > and the approximation is weak at some points, or is there an issue with its > implementation, says some shift which gets smoothed down

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Fabien COELHO
Ok, so you did not get the large bias for i=3. Strange. I got large bias for i=3 and theta > 1 even with a million outcomes, Ok. So this is similar to what I got. Is this bias expected from the drawing method, say because it is approximated and the approximation is weak at some points, or

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Alik Khilazhev
Hello Fabien, On 14 Jul 2017, at 17:51, Fabien COELHO wrote: Ok, so you did not get the large bias for i=3. Strange. I got large bias for i=3 and theta > 1 even with a million outcomes, but for theta < 1 (I have tested on theta = 0.1 and 0.3) it showed quite good results. I am

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Peter Geoghegan
On Fri, Jul 14, 2017 at 6:37 AM, Alik Khilazhev wrote: > I am attaching results of tests for 32 and 128 clients that were running for > 10 minutes, and TPS remains 305 and 115 ktps respectively. > > Tests was executed with configuration set for YCSB. And there is very

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Fabien COELHO
> Algorithm works with theta less than 1. The only problem here is that theta can not be 1, because of next line of code > cell->alpha = 1. / (1 - theta); > That’s why I put such restriction. Now I see 2 possible solutions for that: > 1) Exclude 1, and allow everything in range (0;+∞). Yep. > 2)
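The restriction discussed here comes down to a division by zero when theta equals 1. A minimal sketch of the first option from the message (exclude 1, allow the rest of (0;+∞)) might look like the code below. `ZipfCell` and `zipf_cell_init` are hypothetical stand-ins, not the patch's actual definitions; only the `alpha` assignment is taken from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the patch's per-parameter cell. */
typedef struct ZipfCell
{
    double theta;
    double alpha;
} ZipfCell;

/*
 * Initialise the cell for a given theta.
 * Rejects theta <= 0, NaN, and exactly 1.0, the value for which
 * 1 / (1 - theta) would divide by zero.
 */
static bool
zipf_cell_init(ZipfCell *cell, double theta)
{
    if (!(theta > 0.0) || theta == 1.0)
        return false;

    cell->theta = theta;
    cell->alpha = 1.0 / (1.0 - theta);  /* the line quoted in the thread */
    return true;
}
```

Values of theta very close to 1 still pass this check and produce a very large `alpha`, so an exact comparison avoids only the singular point itself.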

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Alik Khilazhev
> On 13 Jul 2017, at 23:13, Peter Geoghegan wrote: > > I just figured out that "result.txt" is only a 60 second pgbench run. > Is the same true for other tests? Yes, other tests ran 60 seconds too. > > It would be much more interesting to see runs that lasted 10 minutes > or

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Alik Khilazhev
> On 13 Jul 2017, at 19:14, Fabien COELHO wrote: > > Documentation says that the closer theta is from 0 the flatter the > distribution > but the implementation requires at least 1, including strange error messages: > > zipfian parameter must be greater than 1.00 (not

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 12:49 PM, Peter Geoghegan wrote: > To reiterate what I say above: > > The number of leaf pages with dead items is 20 with this most recent > run (128 clients, patched + unpatched). The leftmost internal page one > level up from the leaf level contains 289

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 10:02 AM, Peter Geoghegan wrote: > The number of leaf pages at the left hand side of the leaf level seems > to be ~50 less than the unpatched 128 client case was the first time > around, which seems like a significant difference. I wonder why. Maybe >

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 4:38 AM, Alik Khilazhev wrote: > I am attaching results of test for 32 and 128 clients for original and > patched(_bt_doinsert) variants. Thanks. The number of leaf pages at the left hand side of the leaf level seems to be ~50 less than the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Fabien COELHO
Hello Alik, A few comments about the patch v2. Patch applies and compiles. Documentation says that the closer theta is from 0 the flatter the distribution but the implementation requires at least 1, including strange error messages: zipfian parameter must be greater than 1.00 (not

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Alik Khilazhev
On 13 Jul 2017, at 00:20, Peter Geoghegan wrote: Actually, I mean that I wonder how much of a difference it would make if this entire block was commented out within _bt_doinsert(): if (checkUnique != UNIQUE_CHECK_NO) { … } I am attaching results of test for 32 and 128 clients for

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 2:17 PM, Peter Geoghegan wrote: > I'd be interested in seeing the difference it makes if Postgres is > built with the call to _bt_check_unique() commented out within > nbtinsert.c. Actually, I mean that I wonder how much of a difference it would make if this

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 1:55 PM, Alvaro Herrera wrote: > Not to mention work done with a "buffer cleanup lock" held -- which is > compounded by the fact that acquiring such a lock is prone to starvation > if there are many scanners of that index. I've seen a case where

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alvaro Herrera
Peter Geoghegan wrote: > Now, that might not seem like that much of a difference, but if you > consider how duplicates are handled in the B-Tree code, and how unique > index enforcement works, I think it could be. It could lead to heavy > buffer lock contention, because we sometimes do a lot of

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 12:30 PM, Peter Geoghegan wrote: > On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev > wrote: >> I am attaching results of query that you sent. It shows that there is >> nothing have changed after executing tests. > > But something

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev wrote: > I am attaching results of query that you sent. It shows that there is > nothing have changed after executing tests. But something did change! In the case where performance was good, all internal pages on the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alik Khilazhev
Hello! I want to say that our company is already engaged in the search for the causes of the problem and their solution. And also we have few experimental patches that increases performance for 1000 clients by several times. In addition, I have fixed threadsafety issues and implemented

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alik Khilazhev
On 7 Jul 2017, at 21:53, Peter Geoghegan wrote: Is it possible for you to instrument the number of B-Tree page accesses using custom instrumentation for pgbench_accounts_pkey? If that seems like too much work, then it would still be interesting to see what the B-Tree keyspace looks like

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-10 Thread Fabien COELHO
Hello Alik, Your description is not very precise. What version of Postgres is used? If there is a decline, compared to which version? Is there a link to these results? Benchmark have been done in master v10. I am attaching image with results: . Ok, thanks. More precision would be

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-10 Thread Amit Kapila
On Mon, Jul 10, 2017 at 12:19 PM, Alik Khilazhev wrote: > Hello, Fabien! > > Your description is not very precise. What version of Postgres is used? If > there is a decline, compared to which version? Is there a link to these > results? > > > Benchmark have been done

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-10 Thread Alik Khilazhev
Hello, Fabien! > Your description is not very precise. What version of Postgres is used? If > there is a decline, compared to which version? Is there a link to these > results? Benchmark have been done in master v10. I am attaching image with results: . > Indeed, the function computation is

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 12:59 PM, Alvaro Herrera wrote: > Hmm, this seems potentially very useful. Care to upload it to > https://wiki.postgresql.org/wiki/Category:Snippets ? Sure. I've added it here, under "index maintenance":

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Alvaro Herrera
Peter Geoghegan wrote: > Here is the query: > > with recursive index_details as ( > select > 'pgbench_accounts_pkey'::text idx > ), [...] Hmm, this seems potentially very useful. Care to upload it to https://wiki.postgresql.org/wiki/Category:Snippets ? -- Álvaro Herrera

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 12:45 AM, Alik Khilazhev wrote: > On scale = 10(1 million rows) it gives following results on machine with 144 > cores(with synchronous_commit=off): > nclients tps > 1 8842.401870 > 2

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 5:17 AM, Robert Haas wrote: > How is that possible? In a Zipfian distribution, no matter how big > the table is, almost all of the updates will be concentrated on a > handful of rows - and updates to any given row are necessarily > serialized, or so

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Robert Haas
On Fri, Jul 7, 2017 at 3:45 AM, Alik Khilazhev wrote: > PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% > UPDATE of random row by PK) on benchmarking with big number of clients using > Zipfian distribution. MySQL also has decline but it is

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Fabien COELHO
Hello Alik, PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% UPDATE of random row by PK) on benchmarking with big number of clients using Zipfian distribution. MySQL also has decline but it is not significant as it is in PostgreSQL. MongoDB does not have decline at