Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-10-01 Thread Daniel Gustafsson
> On 06 Sep 2017, at 08:42, Fabien COELHO wrote: > > Hello Alik, > > Applies, compiles, works for me. > > Some minor comments and suggestions. > > Two typos: > - "usinng" -> "using" > - "a rejection method used" -> "a rejection method is used" > > I'm not sure of "least_recently_used_i", this

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-09-05 Thread Fabien COELHO
Hello Alik, Applies, compiles, works for me. Some minor comments and suggestions. Two typos: - "usinng" -> "using" - "a rejection method used" -> "a rejection method is used" I'm not sure of "least_recently_used_i", this naming style is not used in pgbench. "least_recently_used" would be o

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-09-02 Thread Alik Khilazhev
Hello Fabien, Thank you for detailed review. I hope I have fixed all the issues you mentioned in your letter. pgbench-zipf-08v.patch Description: Binary data — Thanks and Regards, Alik Khilazhev Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsq

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-23 Thread Fabien COELHO
Hello Alik, I am attaching patch v7. Patch generates multiple warnings with "git apply", apparently because of end-of-line spaces, and fails: pgbench-zipf-07v.patch:52: trailing whitespace. { pgbench-zipf-07v.patch:53: trailing whitespace. "random_zipfian", 3, P

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-22 Thread Alik Khilazhev
Hello, Fabien I am attaching patch v7. > Yes, I agree. a >= 1 does not make much sense... If you want uniform you > should use random(), not call random_zipfian with a = 1. Basically it > suggests that too large values of "a" should be rejected. Not sure where to > put the limit, thoug

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-13 Thread Fabien COELHO
Hello Alik, Now “a” does not have upper bound, that’s why on using iterative algorithm with a >= 1 program will stuck on infinite loop because of following line of code: double b = pow(2.0, s - 1.0); Because after overflow “b” becomes “+Inf”. Yep, overflow can happen. So should upper b

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-13 Thread Alik Khilazhev
Hello Fabien, > > I think that this method should be used for a>1, and the other very rough one > can be kept for parameter a in [0, 1), a case which does not make much sense > to a mathematician as it diverges if unbounded. Now “a” does not have upper bound, that’s why on using iterative algo

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-06 Thread Alik Khilazhev
Hello Fabien, > On 5 Aug 2017, at 12:15, Fabien COELHO wrote: > > > Hello Alik, > > I've done some math investigations, which consisted in spending one hour with > Christian, a statistician colleague of mine. He took an old book out of a > shelf, opened it to page 550 (roughly in the middle

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-08-05 Thread Peter Geoghegan
On Fri, Aug 4, 2017 at 3:30 PM, Peter Geoghegan wrote: > Yura Sokolov of Postgres Pro performed this benchmark at my request. > He took the 9.5 commit immediately proceeding 2ed5b87f9 as a baseline. I attach a simple patch that comments out the release of the buffer pin for logged tables where an

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-05 Thread Fabien COELHO
Hello Alik, So I would be in favor of expanding the documentation but not restricting the parameter beyond avoiding value 1.0. I have removed restriction and expanded documentation in attaching patch v5. I've done some math investigations, which consisted in spending one hour with Christia

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-05 Thread Fabien COELHO
Hello Peter, I think that it would also be nice if there was an option to make functions like random_zipfian() actually return a value that has undergone perfect hashing. When this option is used, any given value that the function returns would actually be taken from a random mapping to some

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-08-04 Thread Peter Geoghegan
On Fri, Jul 21, 2017 at 4:51 AM, Alik Khilazhev wrote: > (Latest version of pgbench Zipfian patch) While I'm +1 on this idea, I think that it would also be nice if there was an option to make functions like random_zipfian() actually return a value that has undergone perfect hashing. When this opt

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-08-04 Thread Peter Geoghegan
On Mon, Jul 31, 2017 at 10:54 AM, Peter Geoghegan wrote: > Let's wait to see what difference it makes if Alik's zipfian > distribution pgbench test case uses unlogged tables. That may gives us a > good sense of the problem for cases with contention/concurrency. Yura Sokolov of Postgres Pro perfor

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-31 Thread Peter Geoghegan
Robert Haas wrote: On Mon, Jul 31, 2017 at 1:54 PM, Peter Geoghegan wrote: That is hard to justify. I don't think that failing to set LP_DEAD hints is the cost that must be paid to realize a benefit elsewhere, though. I don't see much problem with having both benefits consistently. It's actual

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-31 Thread Robert Haas
On Mon, Jul 31, 2017 at 1:54 PM, Peter Geoghegan wrote: > That is hard to justify. I don't think that failing to set LP_DEAD hints > is the cost that must be paid to realize a benefit elsewhere, though. I > don't see much problem with having both benefits consistently. It's > actually very unlikel

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-31 Thread Peter Geoghegan
Robert Haas wrote: On Thu, Jul 27, 2017 at 10:05 PM, Peter Geoghegan wrote: I really don't know if that would be worthwhile. It would certainly fix the regression shown in my test case, but that might not go far enough. I strongly suspect that there are more complicated workloads where LP_DEAD

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-31 Thread Robert Haas
On Thu, Jul 27, 2017 at 10:05 PM, Peter Geoghegan wrote: > I really don't know if that would be worthwhile. It would certainly fix > the regression shown in my test case, but that might not go far enough. > I strongly suspect that there are more complicated workloads where > LP_DEAD cleanup from S

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-28 Thread Peter Geoghegan
Amit Kapila wrote: Isn't it possible to confirm if the problem is due to commit 2ed5b87f9? Basically, if we have unlogged tables, then it won't release the pin. So if the commit in question is the culprit, then the same workload should not lead to bloat. That's a great idea. Alik? -- Peter

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-28 Thread Amit Kapila
On Wed, Jul 26, 2017 at 3:32 AM, Peter Geoghegan wrote: > On Fri, Jul 14, 2017 at 5:06 PM, Peter Geoghegan wrote: >> I think that what this probably comes down to, more than anything >> else, is that you have leftmost hot/bloated leaf pages like this: >> >> >> idx | level | l_i

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-27 Thread Peter Geoghegan
Peter Geoghegan wrote: In Alik's workload, there are two queries: One UPDATE, one SELECT. Even though the bloated index was a unique index, and so still gets _bt_check_unique() item killing, the regression is still going to block LP_DEAD cleanup by the SELECTs, which seems like it might be quit

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-27 Thread Peter Geoghegan
Robert Haas wrote: We now see that no update ever kills items within _bt_killitems(), because our own update to the index leaf page itself nullifies our ability to kill anything, by changing the page LSN from the one stashed in the index scan state variable. Fortunately, we are not really "self-

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-27 Thread Robert Haas
On Tue, Jul 25, 2017 at 11:02 PM, Peter Geoghegan wrote: > While the benchmark Alik came up with is non-trivial to reproduce, I > can show a consistent regression for a simple case with only one > active backend. That's not good. > We now see that no update ever kills items within _bt_killitems(

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-25 Thread Peter Geoghegan
On Tue, Jul 25, 2017 at 8:02 PM, Peter Geoghegan wrote: > Setup: > > Initialize pgbench (any scale factor). > create index on pgbench_accounts (aid); That "create index" was meant to be on "abalance", to make the UPDATE queries always HOT-unsafe. (You'll want to *also* create this index once the

Re: LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-25 Thread Peter Geoghegan
On Tue, Jul 25, 2017 at 3:02 PM, Peter Geoghegan wrote: > I've been thinking about this a lot, because this really does look > like a pathological case to me. I think that this workload is very > sensitive to how effective kill_prior_tuples/LP_DEAD hinting is. Or at > least, I can imagine that mec

LP_DEAD hinting and not holding on to a buffer pin on leaf page (Was: [HACKERS] [WIP] Zipfian distribution in pgbench)

2017-07-25 Thread Peter Geoghegan
On Fri, Jul 14, 2017 at 5:06 PM, Peter Geoghegan wrote: > I think that what this probably comes down to, more than anything > else, is that you have leftmost hot/bloated leaf pages like this: > > > idx | level | l_item | blkno | btpo_prev | > btpo_next | btpo_flags | type | live

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-21 Thread Alik Khilazhev
Hello! I realized that I was sending emails as HTML and latest patch is not visible in the archive now. That’s why I am attaching it again. I am sorry for that. pgbench-zipf-05v.patch Description: Binary data — Thanks and Regards, Alik Khilazhev Postgres Professional: http://www.postgrespro.

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-21 Thread Alik Khilazhev
Hmmm. On second thought, maybe one or the other is enough, either restrict the parameter to values where the approximation is good, or put out a clear documentation about when the approximation is not very good, but it may be still useful even if not precise.So I would be in favor of expanding the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Alik Khilazhev
> I think that developping a test would be much simpler with the improved tap > test infrastructure, so I would suggest to wait to know the result of the > corresponding patch. Ok, I will wait then. > Also, could you recod the patch to CF 2017-09? > https://commitfest.postgresql.org/14/

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Fabien COELHO
Hello Alik, About the maths: As already said, I'm not at ease with a random_zipfian function which does not display a (good) zipfian distribution. At the minimum the documentation should be clear about the approximations implied depending on the parameter value. I add one more sentence to d

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-20 Thread Alik Khilazhev
Hello Fabien,I am attaching patch v4. On 19 Jul 2017, at 17:21, Fabien COELHO wrote:About the maths: As already said, I'm not at ease with a random_zipfian function which does not display a (good) zipfian distribution. At the minimum the documentation should be clear about the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-19 Thread Fabien COELHO
Hello Alik, I am attaching patch v3. Among other things I fixed small typo in description of random_exponential function in pgbench.sgml file. Ok. Probably this typo should be committed separatly and independently. A few comments about v3: Patch applies cleanly, compiles, works. About t

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Fabien COELHO
Hello, Is this bias expected from the drawing method, say because it is approximated and the approximation is weak at some points, or is there an issue with its implementation, says some shift which gets smoothed down for higher indexes? I have checked paper where such implementation was pr

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Alik Khilazhev
> On 17 Jul 2017, at 13:51, Fabien COELHO wrote: > > > Is this bias expected from the drawing method, say because it is approximated > and the approximation is weak at some points, or is there an issue with its > implementation, says some shift which gets smoothed down for higher indexes? >

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Fabien COELHO
Ok, so you did not get the large bias for i=3. Strange. I got large bias for i=3 and theta > 1 even with a million outcomes, Ok. So this is similar to what I got. Is this bias expected from the drawing method, say because it is approximated and the approximation is weak at some points, or

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-17 Thread Alik Khilazhev
Hello Fabien,On 14 Jul 2017, at 17:51, Fabien COELHO wrote:Ok, so you did not get the large bias for i=3. Strange.I got large bias for i=3 and theta > 1 even with a million outcomes, but for theta < 1 (I have tested on theta = 0.1 and 0.3) it showed quite good results.I am att

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Peter Geoghegan
On Fri, Jul 14, 2017 at 6:37 AM, Alik Khilazhev wrote: > I am attaching results of tests for 32 and 128 clients that were running for > 10 minutes, and TPS remains 305 and 115 ktps respectively. > > Tests was executed with configuration set for YCSB. And there is very > aggressively autovacuum,

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Fabien COELHO
Algorithm works with theta less than 1. The only problem here is that theta can not be 1, because of next line of code cell->alpha = 1. / (1 - theta); That’s why I put such restriction. Now I see 2 possible solutions for that: 1) Exclude 1, and allow everything in range (0;+∞). Yep. 2) Or

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Alik Khilazhev
> On 13 Jul 2017, at 23:13, Peter Geoghegan wrote: > > I just figured out that "result.txt" is only a 60 second pgbench run. > Is the same true for other tests? Yes, other tests ran 60 seconds too. > > It would be much more interesting to see runs that lasted 10 minutes > or more. I am atta

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-14 Thread Alik Khilazhev
> On 13 Jul 2017, at 19:14, Fabien COELHO wrote: > > Documentation says that the closer theta is from 0 the flatter the > distribution > but the implementation requires at least 1, including strange error messages: > > zipfian parameter must be greater than 1.00 (not 1.00) > > Could

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 12:49 PM, Peter Geoghegan wrote: > To reiterate what I say above: > > The number of leaf pages with dead items is 20 with this most recent > run (128 clients, patched + unpatched). The leftmost internal page one > level up from the leaf level contains 289 items. Whereas las

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 10:02 AM, Peter Geoghegan wrote: > The number of leaf pages at the left hand side of the leaf level seems > to be ~50 less than the unpatched 128 client case was the first time > around, which seems like a significant difference. I wonder why. Maybe > autovacuum ran at the

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Peter Geoghegan
On Thu, Jul 13, 2017 at 4:38 AM, Alik Khilazhev wrote: > I am attaching results of test for 32 and 128 clients for original and > patched(_bt_doinsert) variants. Thanks. The number of leaf pages at the left hand side of the leaf level seems to be ~50 less than the unpatched 128 client case was t

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Fabien COELHO
Hello Alik, A few comments about the patch v2. Patch applies and compiles. Documentation says that the closer theta is from 0 the flatter the distribution but the implementation requires at least 1, including strange error messages: zipfian parameter must be greater than 1.00 (not 1.000

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-13 Thread Alik Khilazhev
On 13 Jul 2017, at 00:20, Peter Geoghegan wrote:Actually, I mean that I wonder how much of a difference it would makeif this entire block was commented out within _bt_doinsert():if (checkUnique != UNIQUE_CHECK_NO){    …}I am attaching results of test for 32 and 128 clients for origin

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 2:17 PM, Peter Geoghegan wrote: > I'd be interested in seeing the difference it makes if Postgres is > built with the call to _bt_check_unique() commented out within > nbtinsert.c. Actually, I mean that I wonder how much of a difference it would make if this entire block w

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 1:55 PM, Alvaro Herrera wrote: > Not to mention work done with a "buffer cleanup lock" held -- which is > compounded by the fact that acquiring such a lock is prone to starvation > if there are many scanners of that index. I've seen a case where a very > hot table is scann

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alvaro Herrera
Peter Geoghegan wrote: > Now, that might not seem like that much of a difference, but if you > consider how duplicates are handled in the B-Tree code, and how unique > index enforcement works, I think it could be. It could lead to heavy > buffer lock contention, because we sometimes do a lot of wo

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 12:30 PM, Peter Geoghegan wrote: > On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev > wrote: >> I am attaching results of query that you sent. It shows that there is >> nothing have changed after executing tests. > > But something did change! In the case where performance w

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Peter Geoghegan
On Wed, Jul 12, 2017 at 4:28 AM, Alik Khilazhev wrote: > I am attaching results of query that you sent. It shows that there is > nothing have changed after executing tests. But something did change! In the case where performance was good, all internal pages on the level above the leaf level have

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alik Khilazhev
Hello! I want to say that our company is already engaged in the search for the causes of the problem and their solution. And also we have few experimental patches that increases performance for 1000 clients by several times. In addition, I have fixed threadsafety issues and implemented per-thr

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-12 Thread Alik Khilazhev
On 7 Jul 2017, at 21:53, Peter Geoghegan wrote:Is it possible for you to instrument the number of B-Tree pageaccesses using custom instrumentation for pgbench_accounts_pkey?If that seems like too much work, then it would still be interestingto see what the B-Tree keyspace looks like

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-10 Thread Fabien COELHO
Hello Alik, Your description is not very precise. What version of Postgres is used? If there is a decline, compared to which version? Is there a link to these results? Benchmark have been done in master v10. I am attaching image with results: . Ok, thanks. More precision would be helpful,

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-10 Thread Amit Kapila
On Mon, Jul 10, 2017 at 12:19 PM, Alik Khilazhev wrote: > Hello, Fabien! > > Your description is not very precise. What version of Postgres is used? If > there is a decline, compared to which version? Is there a link to these > results? > > > Benchmark have been done in master v10. I am attaching

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-09 Thread Alik Khilazhev
Hello, Fabien! > Your description is not very precise. What version of Postgres is used? If > there is a decline, compared to which version? Is there a link to these > results? Benchmark have been done in master v10. I am attaching image with results: . > Indeed, the function computation is ov

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 12:59 PM, Alvaro Herrera wrote: > Hmm, this seems potentially very useful. Care to upload it to > https://wiki.postgresql.org/wiki/Category:Snippets ? Sure. I've added it here, under "index maintenance": https://wiki.postgresql.org/wiki/Index_Maintenance#Summarize_keyspac

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Alvaro Herrera
Peter Geoghegan wrote: > Here is the query: > > with recursive index_details as ( > select > 'pgbench_accounts_pkey'::text idx > ), [...] Hmm, this seems potentially very useful. Care to upload it to https://wiki.postgresql.org/wiki/Category:Snippets ? -- Álvaro Herrerah

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 12:45 AM, Alik Khilazhev wrote: > On scale = 10(1 million rows) it gives following results on machine with 144 > cores(with synchronous_commit=off): > nclientstps > 1 8842.401870 > 2 18358.140869 > 4

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 5:17 AM, Robert Haas wrote: > How is that possible? In a Zipfian distribution, no matter how big > the table is, almost all of the updates will be concentrated on a > handful of rows - and updates to any given row are necessarily > serialized, or so I would think. Maybe Mo

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Robert Haas
On Fri, Jul 7, 2017 at 3:45 AM, Alik Khilazhev wrote: > PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% > UPDATE of random row by PK) on benchmarking with big number of clients using > Zipfian distribution. MySQL also has decline but it is not significant as it > is in

Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Fabien COELHO
Hello Alik, PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% UPDATE of random row by PK) on benchmarking with big number of clients using Zipfian distribution. MySQL also has decline but it is not significant as it is in PostgreSQL. MongoDB does not have decline at al

[HACKERS] [WIP] Zipfian distribution in pgbench

2017-07-07 Thread Alik Khilazhev
Hello! PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% UPDATE of random row by PK) on benchmarking with big number of clients using Zipfian distribution. MySQL also has decline but it is not significant as it is in PostgreSQL. MongoDB does not have decline at all. And i