Re: [HACKERS] A design for amcheck heapam verification

2017-10-20 Thread Peter Geoghegan
On Thu, Oct 5, 2017 at 7:00 PM, Peter Geoghegan wrote: > v3 of the patch series, attached, does it that way -- it adds a > bloom_create(). The new bloom_create() function still allocates its > own memory, but does so while using a FLEXIBLE_ARRAY_MEMBER. A > separate bloom_init()
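
The single-allocation layout described above can be sketched in standard C. This is only an illustration under stated assumptions, not the patch's code: the names `bloom_sketch` and `bloom_sketch_create` are hypothetical, `malloc()` stands in for PostgreSQL's memory-context allocator, and a C99 `[]` member stands in for the `FLEXIBLE_ARRAY_MEMBER` macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical filter layout: header and bitset live in one allocation. */
typedef struct bloom_sketch
{
    uint64_t      nbits;    /* number of bits in the bitset */
    unsigned char bitset[]; /* C99 flexible array member */
} bloom_sketch;

/* "create" rather than "init": this function allocates the memory it returns. */
static bloom_sketch *
bloom_sketch_create(uint64_t nbits)
{
    size_t        nbytes = (size_t) ((nbits + 7) / 8);
    bloom_sketch *filter;

    assert(nbits > 0);

    /* One allocation covers the fixed header plus the variable-size bitset. */
    filter = malloc(offsetof(bloom_sketch, bitset) + nbytes);
    if (filter == NULL)
        return NULL;

    filter->nbits = nbits;
    memset(filter->bitset, 0, nbytes);
    return filter;
}
```

Because the bitset is part of the struct rather than reached through an internal pointer, the whole filter is one contiguous, position-independent block, which is the property the thread mentions as useful for a later shared-memory variant.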

Re: [HACKERS] A design for amcheck heapam verification

2017-10-05 Thread Peter Geoghegan
On Fri, Sep 29, 2017 at 10:54 AM, Peter Geoghegan wrote: >> Something that allocates new memory as the patch's bloom_init() >> function does I'd tend to call 'make' or 'create' or 'new' or >> something, rather than 'init'. > > I tend to agree. I'll adopt that style in the next

Re: [HACKERS] A design for amcheck heapam verification

2017-09-29 Thread Robert Haas
On Fri, Sep 29, 2017 at 1:57 PM, Peter Geoghegan wrote: > On Fri, Sep 29, 2017 at 10:29 AM, Robert Haas wrote: >> I am also wondering whether this patch should consider >> 81c5e46c490e2426db243eada186995da5bb0ba7 as a way of obtaining >> multiple hash values.

Re: [HACKERS] A design for amcheck heapam verification

2017-09-29 Thread Peter Geoghegan
On Fri, Sep 29, 2017 at 10:29 AM, Robert Haas wrote: > I am also wondering whether this patch should consider > 81c5e46c490e2426db243eada186995da5bb0ba7 as a way of obtaining > multiple hash values. I suppose that's probably inferior to what is > already being done on

Re: [HACKERS] A design for amcheck heapam verification

2017-09-29 Thread Peter Geoghegan
On Thu, Sep 28, 2017 at 8:34 PM, Thomas Munro wrote: > FWIW I think if I were attacking that problem the first thing I'd > probably try would be getting rid of that internal pointer > filter->bitset in favour of a FLEXIBLE_ARRAY_MEMBER and then making > the

Re: [HACKERS] A design for amcheck heapam verification

2017-09-29 Thread Robert Haas
On Thu, Sep 28, 2017 at 11:34 PM, Thomas Munro wrote: > FWIW I think if I were attacking that problem the first thing I'd > probably try would be getting rid of that internal pointer > filter->bitset in favour of a FLEXIBLE_ARRAY_MEMBER and then making > the

Re: [HACKERS] A design for amcheck heapam verification

2017-09-28 Thread Thomas Munro
On Fri, Sep 29, 2017 at 4:17 PM, Michael Paquier wrote: >> As for DSM, I think that that can come later, and can be written by >> somebody closer to that problem. There can be more than one >> initialization function. > > I don't completely disagree with that, there

Re: [HACKERS] A design for amcheck heapam verification

2017-09-28 Thread Michael Paquier
On Thu, Sep 28, 2017 at 3:32 AM, Peter Geoghegan wrote: > On Wed, Sep 27, 2017 at 1:45 AM, Michael Paquier > wrote: >> I have signed up as a reviewer of this patch, and I have looked at the >> bloom filter implementation for now. This is the kind of

Re: [HACKERS] A design for amcheck heapam verification

2017-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2017 at 1:45 AM, Michael Paquier wrote: > I have signed up as a reviewer of this patch, and I have looked at the > bloom filter implementation for now. This is the kind of facility that > people have asked for on this list for many years. > > One first

Re: [HACKERS] A design for amcheck heapam verification

2017-09-27 Thread Michael Paquier
On Thu, Sep 7, 2017 at 11:26 AM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: >> On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera >> wrote: >>> Eh, if you want to optimize it for the case where debug output

Re: [HACKERS] A design for amcheck heapam verification

2017-09-16 Thread Peter Geoghegan
On Wed, Sep 6, 2017 at 7:26 PM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: >> On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera >> wrote: >>> Eh, if you want to optimize it for the case where debug output

Re: [HACKERS] A design for amcheck heapam verification

2017-09-06 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 9:29 AM, Peter Geoghegan wrote: > On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera > wrote: >> Eh, if you want to optimize it for the case where debug output is not >> enabled, make sure to use ereport() not elog(). ereport() >>

Re: [HACKERS] A design for amcheck heapam verification

2017-08-30 Thread Peter Geoghegan
On Wed, Aug 30, 2017 at 5:02 AM, Alvaro Herrera wrote: > Eh, if you want to optimize it for the case where debug output is not > enabled, make sure to use ereport() not elog(). ereport() > short-circuits evaluation of arguments, whereas elog() does not. I should do
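
The difference Alvaro describes comes down to where the level test sits relative to argument evaluation. The following is a minimal sketch of that general mechanism, not PostgreSQL's actual `ereport()` or `elog()` definitions: `log_func`, `LOG_MACRO`, `min_level`, and `expensive_detail` are all hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdio.h>

static int min_level = 2;        /* messages below this level are dropped */
static int expensive_calls = 0;  /* counts how often the argument is built */

/* Stand-in for an argument that is costly to compute. */
static const char *
expensive_detail(void)
{
    expensive_calls++;
    return "detail";
}

/*
 * Function-style logging: the caller evaluates every argument before the
 * call, so the cost is paid even when the message is then discarded.
 */
static void
log_func(int level, const char *msg)
{
    assert(msg != NULL);
    if (level >= min_level)
        fprintf(stderr, "%s\n", msg);
}

/*
 * Macro-style logging: the level test wraps the argument expression, so the
 * expression is never evaluated when the level is disabled.
 */
#define LOG_MACRO(level, msg_expr) \
    do { \
        if ((level) >= min_level) \
            fprintf(stderr, "%s\n", (msg_expr)); \
    } while (0)
```

With `min_level` set to 2, `log_func(1, expensive_detail())` still calls `expensive_detail()`, while `LOG_MACRO(1, expensive_detail())` does not.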

Re: [HACKERS] A design for amcheck heapam verification

2017-08-30 Thread Alvaro Herrera
Peter Geoghegan wrote: > > Your patch brings us one step closer to that goal. (The book says > > that this approach is good for sparse bitsets, but your comment says > > that we expect something near 50%. That's irrelevant anyway since a > > future centralised popcount() implementation would do

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
On Tue, Aug 29, 2017 at 7:22 PM, Thomas Munro wrote: > Indeed. Thank you for working on this! To summarise a couple of > ideas that Peter and I discussed off-list a while back: (1) While > building the hash table for a hash join we could build a Bloom filter >

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Thomas Munro
On Wed, Aug 30, 2017 at 1:00 PM, Peter Geoghegan wrote: > On Tue, Aug 29, 2017 at 4:34 PM, Thomas Munro > wrote: >> Some drive-by comments on the lib patch: > > I was hoping that you'd look at this, since you'll probably want to > use a bloom filter

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
On Tue, Aug 29, 2017 at 4:34 PM, Thomas Munro wrote: > Some drive-by comments on the lib patch: I was hoping that you'd look at this, since you'll probably want to use a bloom filter for parallel hash join at some point. I've tried to keep this one as simple as

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Michael Paquier
On Wed, Aug 30, 2017 at 8:34 AM, Thomas Munro wrote: > It'd be nice to replace both with fls() or flsl(), though it's > annoying to have to think about long vs int64 etc. We already use > fls() in two places and supply an implementation in src/port/fls.c for >

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Thomas Munro
On Wed, Aug 30, 2017 at 7:58 AM, Peter Geoghegan wrote: > On Thu, May 11, 2017 at 4:30 PM, Peter Geoghegan wrote: >> I spent only a few hours writing a rough prototype, and came up with >> something that does an IndexBuildHeapScan() scan following the >> existing

Re: [HACKERS] A design for amcheck heapam verification

2017-08-29 Thread Peter Geoghegan
On Thu, May 11, 2017 at 4:30 PM, Peter Geoghegan wrote: > I spent only a few hours writing a rough prototype, and came up with > something that does an IndexBuildHeapScan() scan following the > existing index verification steps. Its amcheck callback does an > index_form_tuple()

Re: [HACKERS] A design for amcheck heapam verification

2017-05-11 Thread Peter Geoghegan
On Mon, May 1, 2017 at 6:39 PM, Peter Geoghegan wrote: > On Mon, May 1, 2017 at 6:20 PM, Tom Lane wrote: >> Maybe you can fix this by assuming that your own session's advertised xmin >> is a safe upper bound on everybody else's RecentGlobalXmin. But I'm not >>

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Robert Haas
On Mon, May 1, 2017 at 9:20 PM, Tom Lane wrote: > ISTM if you want to do that you have an inherent race condition. > That is, no matter what you do, the moment after you look the currently > oldest open transaction could commit, allowing some other session's > view of

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 6:20 PM, Tom Lane wrote: > Maybe you can fix this by assuming that your own session's advertised xmin > is a safe upper bound on everybody else's RecentGlobalXmin. But I'm not > sure if that rule does what you want. That's what you might ultimately

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Tom Lane
Peter Geoghegan writes: > If it's not clear what I mean: existing code that cares about > RecentGlobalXmin is using it as a *conservative* point before which > every snapshot sees every transaction as committed/aborted (and > therefore nobody can care if that other backend hot

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 4:28 PM, Peter Geoghegan wrote: > Anyone have an opinion on any of this? Offhand, I think that calling > GetOldestXmin() once per index when its "amcheck whole index scan" > finishes would be safe, and yet provide appreciably better test > coverage than only

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 2:10 PM, Peter Geoghegan wrote: > Actually, I guess amcheck would need to use its own scan's snapshot > xmin instead. This is true because it cares about visibility in a way > that's "backwards" relative to existing code that tests something > against

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Greg Stark
On 1 May 2017 at 20:46, Robert Haas wrote: > One problem is that Bloom filters assume you can get > n independent hash functions for a given value, which we have not got. > That problem would need to be solved somehow. If you only have one > hash function, the size of the
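
One common workaround for the lack of n independent hash functions is double hashing, where k bit positions are derived from two base hash values. The sketch below illustrates that general technique only; it is not taken from the patch or from this thread. The FNV-1a hash and the names `fnv1a64` and `k_bit_positions` are assumptions chosen to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a, used here only as a convenient stand-in hash function. */
static uint64_t
fnv1a64(const unsigned char *data, size_t len)
{
    uint64_t h = UINT64_C(14695981039346656037);
    size_t   i;

    for (i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= UINT64_C(1099511628211);
    }
    return h;
}

/*
 * Fill out[0..k-1] with bit positions in [0, nbits) for one element.
 * Position i is (h1 + i * h2) mod nbits, where h1 and h2 are the two
 * halves of a single 64-bit hash value.
 */
static void
k_bit_positions(const unsigned char *elem, size_t len,
                int k, uint64_t nbits, uint64_t *out)
{
    uint64_t h = fnv1a64(elem, len);
    uint32_t h1 = (uint32_t) h;
    uint32_t h2 = (uint32_t) (h >> 32) | 1; /* odd, so the stride is never zero */
    int      i;

    assert(k > 0 && nbits > 0);

    for (i = 0; i < k; i++)
        out[i] = (h1 + (uint64_t) i * h2) % nbits;
}
```

Setting those k bits on insert and testing them on lookup gives the usual Bloom filter behaviour from a single underlying hash computation per element.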

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Fri, Apr 28, 2017 at 6:02 PM, Peter Geoghegan wrote: > - Is committed, and committed before RecentGlobalXmin. Actually, I guess amcheck would need to use its own scan's snapshot xmin instead. This is true because it cares about visibility in a way that's "backwards" relative to

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Peter Geoghegan
On Mon, May 1, 2017 at 12:46 PM, Robert Haas wrote: > Bloom filters are one of those things that come up on this mailing > list incredibly frequently but rarely get used in committed code; thus > far, contrib/bloom is the only example we've got, and not for lack of > other

Re: [HACKERS] A design for amcheck heapam verification

2017-05-01 Thread Robert Haas
On Fri, Apr 28, 2017 at 9:02 PM, Peter Geoghegan wrote: > I'd like to hear feedback on the general idea, and what the > user-visible interface ought to look like. The non-deterministic false > negatives may need to be considered by the user visible interface, > which is the main

[HACKERS] A design for amcheck heapam verification

2017-04-28 Thread Peter Geoghegan
It seems like a good next step for amcheck would be to add functionality that verifies that heap tuples have matching index tuples, and that heap pages are generally sane. I've been thinking about a design for this for a while now, and would like to present some tentative ideas before I start