Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:32 PM, Andres Freund wrote: > LGTM, pushed. Closing CF entry. Yay! Only 110 to go. Thanks Andres! -- Peter Geoghegan

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 20:25:24 -0700, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 8:09 PM, Peter Geoghegan wrote: > > I was thinking of using rint(), which is what you get if you call > > round(float8) from SQL. > > Attached patch does it that way. Note that there are float/int cast >

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:09 PM, Peter Geoghegan wrote: > I was thinking of using rint(), which is what you get if you call > round(float8) from SQL. Attached patch does it that way. Note that there are float/int cast regression tests that ensure that rint() behaves consistently on

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 8:08 PM, Andres Freund wrote: >> round() is from C99, apparently. I'll investigate a fix. > > Just replacing it with a floor(val + 0.5) ought to do the trick, right? I was thinking of using rint(), which is what you get if you call round(float8) from

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 19:43:45 -0700, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: > > On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: > >> WFM. I have all the information I need to produce the next revision now. > > > > I might as

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: > On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: >> WFM. I have all the information I need to produce the next revision now. > > I might as well post this one first. I'll have 0002 for you in a short

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 3:15 PM, Peter Geoghegan wrote: >> WFM. I have all the information I need to produce the next revision now. > > I might as well post this one first. I'll have 0002 for you in a short while. Attached is 0002 -- the amcheck enhancement itself. As requested by

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 2:59 PM, Peter Geoghegan wrote: > WFM. I have all the information I need to produce the next revision now. I might as well post this one first. I'll have 0002 for you in a short while. -- Peter Geoghegan

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Sat, Mar 31, 2018 at 2:56 PM, Andres Freund wrote: >> So you're asking for something like bt_index_check_heap() + >> bt_index_parent_check_heap()? Or, are you talking about function >> overloading? > > The latter. That addresses my concerns about dropping the function and >

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Andres Freund
On 2018-03-31 11:27:14 -0700, Peter Geoghegan wrote: > On Fri, Mar 30, 2018 at 7:04 PM, Andres Freund wrote: > > I'm just saying that there should be two functions here, rather than > > dropping the old definition, and creating s new one with a default argument. > > So

Re: [HACKERS] A design for amcheck heapam verification

2018-03-31 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 7:04 PM, Andres Freund wrote: > I'm just saying that there should be two functions here, rather than dropping > the old definition, and creating s new one with a default argument. So you're asking for something like bt_index_check_heap() +

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 6:55 PM, Peter Geoghegan wrote: > On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund wrote: >>> What would the upcasting you have in mind look like? >> >> Just use UINT64CONST()? Let's try not to introduce further code that'll >> need to get

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Andres Freund
On March 30, 2018 6:55:50 PM PDT, Peter Geoghegan wrote: >On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund >wrote: >>> What would the upcasting you have in mind look like? >> >> Just use UINT64CONST()? Let's try not to introduce further code >that'll >> need to

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Fri, Mar 30, 2018 at 6:20 PM, Andres Freund wrote: >> What would the upcasting you have in mind look like? > > Just use UINT64CONST()? Let's try not to introduce further code that'll > need to get painfully fixed. What I have right now is based on imitating the style that

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Andres Freund
On 2018-03-29 19:42:42 -0700, Peter Geoghegan wrote: > >> + /* > >> + * Aim for two bytes per element; this is sufficient to get a false > >> + * positive rate below 1%, independent of the size of the bitset or > >> total > >> + * number of elements. Also, if rounding down the

Re: [HACKERS] A design for amcheck heapam verification

2018-03-30 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 7:42 PM, Peter Geoghegan wrote: >> We should be able to get this into v11... > > That's a relief. Thanks. I have a new revision lined up. I won't send anything to the list until you clear up what you meant in those few cases where it seemed unclear. I also

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 6:18 PM, Andres Freund wrote: >> This commit adds a test harness extension module, test_bloomfilter. It >> can be used to get a sense of how the Bloom filter implementation >> performs under varying conditions. > > Maybe add a short paragraph

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Andres Freund
Hi, On 2018-03-26 20:20:57 -0700, Peter Geoghegan wrote: > From ede1ba731dc818172a94adbb6331323c1f2b1170 Mon Sep 17 00:00:00 2001 > From: Peter Geoghegan > Date: Thu, 24 Aug 2017 20:58:21 -0700 > Subject: [PATCH v7 1/2] Add Bloom filter data structure implementation. > > A Bloom

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Peter Geoghegan
On Thu, Mar 29, 2018 at 2:28 AM, Simon Riggs wrote: > I understand we are adding a check to verify heap against index and > this will take longer than before. When it runs does it report > progress of the run via pg_stat_activity, so we can monitor how long > it will take?

Re: [HACKERS] A design for amcheck heapam verification

2018-03-29 Thread Simon Riggs
On 29 March 2018 at 01:49, Peter Geoghegan wrote: >>> IndexBuildHeapRangeScan() doesn't mention anything about CIC's heap >>> ShareUpdateExclusiveLock (it just says SharedLock), because that lock >>> strength doesn't have anything to do with IndexBuildHeapRangeScan() >>> when it

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Peter Geoghegan
On Wed, Mar 28, 2018 at 5:47 AM, Pavan Deolasee wrote: > Mostly a nitpick, but I guess we should leave a comment after > IndexBuildHeapScan() saying heap_endscan() is not necessary since > IndexBuildHeapScan() does that internally. I stumbled upon that while > looking

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Peter Geoghegan
On Wed, Mar 28, 2018 at 5:04 AM, Pavan Deolasee wrote: > Hmm Ok. It still sounds backwards to me, but then English is not my first or > even second language. "A heap scan later verifies the presence in the heap > of all index tuples fingerprinted" sounds as if we expect

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Pavan Deolasee
On Wed, Mar 28, 2018 at 2:48 AM, Peter Geoghegan wrote: > > I don't think so. The transaction involved is still an ordinary user > transaction. > > Mostly a nitpick, but I guess we should leave a comment after IndexBuildHeapScan() saying heap_endscan() is not necessary since

Re: [HACKERS] A design for amcheck heapam verification

2018-03-28 Thread Pavan Deolasee
On Wed, Mar 28, 2018 at 2:48 AM, Peter Geoghegan wrote: > On Tue, Mar 27, 2018 at 6:48 AM, Pavan Deolasee > wrote: > > + * When index-to-heap verification is requested, a Bloom filter is used > to > > + * fingerprint all tuples in the target index, as the

Re: [HACKERS] A design for amcheck heapam verification

2018-03-27 Thread Peter Geoghegan
On Tue, Mar 27, 2018 at 6:48 AM, Pavan Deolasee wrote: > + * When index-to-heap verification is requested, a Bloom filter is used to > + * fingerprint all tuples in the target index, as the index is traversed to > + * verify its structure. A heap scan later verifies the

Re: [HACKERS] A design for amcheck heapam verification

2018-03-27 Thread Pavan Deolasee
On Tue, Mar 27, 2018 at 8:50 AM, Peter Geoghegan wrote: > On Fri, Mar 23, 2018 at 7:13 AM, Andrey Borodin > wrote: > > I've just flipped patch to WoA. But if above issues will be fixed I > think that patch is ready for committer. > > Attached is v7, which has

Re: [HACKERS] A design for amcheck heapam verification

2018-03-26 Thread Peter Geoghegan
On Fri, Mar 23, 2018 at 7:13 AM, Andrey Borodin wrote: > I've just flipped patch to WoA. But if above issues will be fixed I think > that patch is ready for committer. Attached is v7, which has the small tweaks that you suggested. Thank you for the review. I hope that

Re: [HACKERS] A design for amcheck heapam verification

2018-03-23 Thread Andrey Borodin
Hi! > 8 февр. 2018 г., в 22:45, Peter Geoghegan написал(а): > > On Thu, Feb 8, 2018 at 6:05 AM, Andrey Borodin wrote: >> I do not see a reason behind hashing the seed. > > It made some sense when I was XOR'ing it to mix. A uniform > distribution of bits

Re: [HACKERS] A design for amcheck heapam verification

2018-02-08 Thread Peter Geoghegan
On Thu, Feb 8, 2018 at 6:05 AM, Andrey Borodin wrote: > I do not see a reason behind hashing the seed. It made some sense when I was XOR'ing it to mix. A uniform distribution of bits seemed desirable then, since random() won't use the most significant bit -- it generates

Re: [HACKERS] A design for amcheck heapam verification

2018-02-08 Thread Andrey Borodin
Hi, Peter! > 8 февр. 2018 г., в 4:56, Peter Geoghegan написал(а): > > * Faster modulo operations. > > * Removed sdbmhash(). Thanks! I definitely like how Bloom filter is implemented now. I could not understand meaning of this, but apparently this will not harm + /* +

Re: [HACKERS] A design for amcheck heapam verification

2018-02-07 Thread Peter Geoghegan
On Mon, Feb 5, 2018 at 12:55 PM, Peter Geoghegan wrote: > Anyway, parallel CREATE INDEX added a new "scan" argument to > IndexBuildHeapScan(), which caused this patch to bitrot. At a minimum, > an additional NULL argument should be passed by amcheck. However, I > have a better idea.

Re: [HACKERS] A design for amcheck heapam verification

2018-02-05 Thread Peter Geoghegan
On Mon, Jan 22, 2018 at 6:07 PM, Michael Paquier wrote: > Yep. I have provided the feedback I wanted for 0001 (no API change in > the bloom facility by the way :( ), but I still wanted to look at 0002 > in depths. I don't see a point in adding complexity that no caller

Re: [HACKERS] A design for amcheck heapam verification

2018-01-22 Thread Michael Paquier
On Mon, Jan 22, 2018 at 03:22:05PM -0800, Peter Geoghegan wrote: > Michael said he'd do more review. I generally feel this is close, though. Yep. I have provided the feedback I wanted for 0001 (no API change in the bloom facility by the way :( ), but I still wanted to look at 0002 in depths. --

Re: [HACKERS] A design for amcheck heapam verification

2018-01-22 Thread Peter Geoghegan
Hi, On Fri, Jan 12, 2018 at 1:41 AM, Andrey Borodin wrote: > I've looked into the code and here's my review. > > The new functionality works and is useful right now. I believe it should be > shipped in the Postgres box (currently, I install it with packet managers). > The

Re: [HACKERS] A design for amcheck heapam verification

2018-01-11 Thread Andrey Borodin
Hello! I like heapam verification functionality and use it right now. So, I'm planning to provide review for this patch, probably, this week. From my current use I have some thoughts on interface. Here's what I get. # select bt_index_check('messagefiltervalue_group_id_59490523e6ee451f',true);

Re: [HACKERS] A design for amcheck heapam verification

2017-12-13 Thread Peter Geoghegan
On Mon, Dec 11, 2017 at 9:35 PM, Michael Paquier wrote: > + /* > +* Generate random seed, or use caller's. Seed should always be a > positive > +* value less than or equal to PG_INT32_MAX, to ensure that any random > seed > +* can be recreated through

Re: [HACKERS] A design for amcheck heapam verification

2017-12-11 Thread Michael Paquier
On Fri, Dec 8, 2017 at 4:37 AM, Peter Geoghegan wrote: > Here, we repeat the same test 3 times, varying only the seed value > used for each run. Thanks for the updated version! Looking at 0001, I have run a coverage test and can see all code paths covered, which is nice. It is also

Re: [HACKERS] A design for amcheck heapam verification

2017-12-07 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:54 PM, Peter Geoghegan wrote: > On Tue, Nov 28, 2017 at 9:50 PM, Michael Paquier > wrote: >>> Would that address your concern? There would be an SQL interface, but >>> it would be trivial. >> >> That's exactly what I think you

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Michael Paquier
On Wed, Nov 29, 2017 at 2:54 PM, Peter Geoghegan wrote: > My understanding of your earlier remarks, rightly or wrongly, was that > you wanted me to adopt the Bloom filter to actually be usable from SQL > in some kind of general way. As opposed to what I just said -- adding > a stub

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:50 PM, Michael Paquier wrote: >> Would that address your concern? There would be an SQL interface, but >> it would be trivial. > > That's exactly what I think you should do, and mentioned so upthread. > A SQL interface can also show a good

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Peter Geoghegan
On Tue, Nov 28, 2017 at 9:38 PM, Michael Paquier wrote: > My apologies for slacking here. I would still welcome some regression > tests to stress the bloom API you are proposing! For now I am moving > this patch to next CF. I still don't think that regression tests as

Re: [HACKERS] A design for amcheck heapam verification

2017-11-28 Thread Michael Paquier
On Sat, Oct 21, 2017 at 9:34 AM, Peter Geoghegan wrote: > I should point out that I shipped virtually the same code yesterday, > as v1.1 of the Github version of amcheck (also known as amcheck_next). > Early adopters will be able to use this new "heapallindexed" > functionality in