On Fri, Nov 3, 2017 at 1:34 PM, Tom Lane wrote:
> Jeff Janes writes:
> > With this v3 patch (assuming this is the one you just committed
> > as ec42a1dcb30de235b252f6d4) am now getting make check failures.
>
> There's a followup commit already :-(
>
> regards, tom lane
>
Jeff Janes writes:
> With this v3 patch (assuming this is the one you just committed
> as ec42a1dcb30de235b252f6d4) am now getting make check failures.
There's a followup commit already :-(
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgre
Hi Alvaro,
With this v3 patch (assuming this is the one you just committed
as ec42a1dcb30de235b252f6d4) am now getting make check failures.
brin_summarize_range is returning unexpected values.
CentOS6,
PostgreSQL 11devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7
20120313 (Red Hat 4.4.7
Alvaro Herrera wrote:
> I think your argument is sensible for some uses (where people run manual
> VACUUM after loading data) but not others (where people just use manual
> VACUUM in place of autovacuuming -- because they don't trust autovac, or
> the schedule isn't convenient, or whatever other r
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> Do we still need the complication in brinsummarize to discriminate
> >> against the last partial range? Now that the lock consideration
> >> is gone, I think that might be a wart.
>
> > In the case of VACUUM, it's not desirable to
Alvaro Herrera writes:
> Tom Lane wrote:
>> Do we still need the complication in brinsummarize to discriminate
>> against the last partial range? Now that the lock consideration
>> is gone, I think that might be a wart.
> In the case of VACUUM, it's not desirable to create a summarization for
>
Tom Lane wrote:
> Alvaro Herrera writes:
>
> > Yeah, I think this approach results in better code. The attached patch
> > implements that, and it passes the test for me (incl. calling
> > brin_summarize_new_values concurrently with vacuum, when running the
> > insert; the former does include the
Alvaro Herrera writes:
> Alvaro Herrera wrote:
>> Maybe a solution is to call RelationGetNumberOfBlocks() after the
>> placeholder tuple has been inserted, for the case where we would be
>> scanning past end of relation; passing InvalidBlockNumber as stop point
>> would indicate to do things that
Alvaro Herrera wrote:
> Maybe a solution is to call RelationGetNumberOfBlocks() after the
> placeholder tuple has been inserted, for the case where we would be
> scanning past end of relation; passing InvalidBlockNumber as stop point
> would indicate to do things that way. I'll try with that appr
Tom Lane wrote:
> Alvaro Herrera writes:
> > Rather than remove the capability, I'd be inclined to make
> > brin_summarize_new_values summarize the final partial range, and have
> > VACUUM not do it. Would that be too inconsistent?
>
> That doesn't really get you out of the problem that this is
Hi,
On 11/02/2017 06:45 PM, Alvaro Herrera wrote:
> Tomas Vondra wrote:
>
>> Unfortunately, I think we still have a problem ... I've been wondering
>> if we end up producing correct indexes, so I've done a simple test.
>
> Here's a proposed patch that should fix this problem (and does, in my
> t
Alvaro Herrera writes:
> Tom Lane wrote:
>> So what would happen if we just don't summarize partial ranges?
> Index scan would always have to read all the heap pages for that partial
> range. Maybe not a big issue, but when you finish loading a table, it'd
> be good to have a mechanism to summar
Tom Lane wrote:
> Alvaro Herrera writes:
> > Tom Lane wrote:
> >> If VACUUM and brin_summarize_new_values both ignore the partial
> >> range, then what else would request this? Can't we just decree
> >> that we don't summarize the partial range, period?
>
> > brin_summarize_range() can do it.
>
Alvaro Herrera writes:
> Tom Lane wrote:
>> If VACUUM and brin_summarize_new_values both ignore the partial
>> range, then what else would request this? Can't we just decree
>> that we don't summarize the partial range, period?
> brin_summarize_range() can do it.
So what would happen if we just
Tom Lane wrote:
> Alvaro Herrera writes:
> > 2. when summarization is requested on the partial range at the end of a
> > table, we acquire extension lock on the rel, then compute relation size
> > and run summarization with the lock held. This guarantees that we don't
> > miss any pages. This i
Alvaro Herrera writes:
> 1. in VACUUM or brin_summarize_new_values, we only process fully loaded
> ranges, and ignore the partial range at end of table.
OK.
> 2. when summarization is requested on the partial range at the end of a
> table, we acquire extension lock on the rel, then compute relat
Tomas Vondra wrote:
> Unfortunately, I think we still have a problem ... I've been wondering
> if we end up producing correct indexes, so I've done a simple test.
Here's a proposed patch that should fix this problem (and does, in my
testing). Would you please give it a try?
This patch changes t
Alvaro Herrera writes:
> Tom Lane wrote:
>> Where are we on this --- do you want me to push the brin_doupdate
>> fix I proposed, or were you intending to merge that into a
>> larger patch?
> Please push your fixes, I'll post my proposed patch for the other bug
> afterwards; they are unrelated pro
Tom Lane wrote:
> Alvaro Herrera writes:
> >> Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
> >> But I see this misbehavior too. Looking ...
>
> > Turns out that this is related to concurrent growth of the table while
> > the summarization process is scanning -- so new
Alvaro Herrera writes:
>> Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
>> But I see this misbehavior too. Looking ...
> Turns out that this is related to concurrent growth of the table while
> the summarization process is scanning -- so new pages have appeared at
> the
Alvaro Herrera wrote:
> Tomas Vondra wrote:
>
> > FWIW I can reproduce this on 9.5, and I don't even need to run the
> > UPDATE part. That is, INSERT + VACUUM running concurrently is enough to
> > produce broken BRIN indexes :-(
>
> Hmm, I'm pretty sure we stress-tested brin in pretty much the sa
Tomas Vondra wrote:
> FWIW I can reproduce this on 9.5, and I don't even need to run the
> UPDATE part. That is, INSERT + VACUUM running concurrently is enough to
> produce broken BRIN indexes :-(
Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
But I see this misbehavior t
On 10/31/2017 11:44 PM, Tomas Vondra wrote:
> ...
> Unfortunately, I think we still have a problem ... I've been wondering
> if we end up producing correct indexes, so I've done a simple test.
>
> 1) create the table as before
>
> 2) let the insert + vacuum run for some time, to see if there ar
Hi,
On 10/31/2017 08:46 PM, Tom Lane wrote:
> I wrote:
>> maybe
>> we just have some run-of-the-mill bugs to find, like the off-the-end
>> bug I spotted in brin_doupdate. There's apparently at least one
>> more, but given the error message it must be something like not
>> checking for a page to h
I wrote:
> maybe
> we just have some run-of-the-mill bugs to find, like the off-the-end
> bug I spotted in brin_doupdate. There's apparently at least one
> more, but given the error message it must be something like not
> checking for a page to have turned into a revmap page. Shouldn't
> be too h
Alvaro Herrera writes:
> Tom Lane wrote:
>> I really don't understand how any of this "let's release the buffer
>> lock and then take it back later" logic is supposed to work reliably.
> So summarize_range first inserts the placeholder tuple, which is there
> purposefully for other processes to u
Tom Lane wrote:
> I really don't understand how any of this "let's release the buffer
> lock and then take it back later" logic is supposed to work reliably.
So summarize_range first inserts the placeholder tuple, which is there
purposefully for other processes to update concurrently; then, witho
Tom Lane wrote:
> So in a few more runs this morning using Alvaro's simplified test case,
> I have seen the following behaviors not previously reported:
> 1. Crashes in PageIndexTupleOverwrite, which has the same "invalid index
> offnum: %u" error report as PageIndexTupleDeleteNoCompact. I note t
So in a few more runs this morning using Alvaro's simplified test case,
I have seen the following behaviors not previously reported:
1. Crashes in PageIndexTupleOverwrite, which has the same "invalid index
offnum: %u" error report as PageIndexTupleDeleteNoCompact. I note the
same message appears
Michael Paquier writes:
> On Tue, Oct 31, 2017 at 4:56 AM, Tom Lane wrote:
>> Yeah, we're still missing an understanding of why we didn't see it
>> before; the inadequate locking was surely there before.
> Because 24992c6d has added a check on the offset number by using
> PageIndexTupleDeleteNoC
On Tue, Oct 31, 2017 at 4:56 AM, Tom Lane wrote:
> Alvaro Herrera writes:
>> Tom Lane wrote:
>>> So: I put the blame on the fact that summarize_range() thinks that
>>> the tuple offset it has for the placeholder tuple is guaranteed to
>>> hold good, even across possibly-long intervals where it's
Alvaro Herrera writes:
> Tom Lane wrote:
>> So: I put the blame on the fact that summarize_range() thinks that
>> the tuple offset it has for the placeholder tuple is guaranteed to
>> hold good, even across possibly-long intervals where it's holding
>> no lock on the containing buffer.
> Yeah, I
Tom Lane wrote:
> So: I put the blame on the fact that summarize_range() thinks that
> the tuple offset it has for the placeholder tuple is guaranteed to
> hold good, even across possibly-long intervals where it's holding
> no lock on the containing buffer.
Yeah, I think this is a pretty reasonab
Thanks everyone for the analysis downthread. Here's a test case that
provokes the crash faster. Initialize with
create table brin_test (a text);
create index on brin_test using brin (a) with (pages_per_range = 1);
Then in one psql, run this:
select brin_summarize_new_values('brin_test_a_idx') \
I wrote:
> Hmm. The index offnum being complained of is one past the end of the
> lp array. I think I see what about that commit changed the behavior:
> the old code for PageIndexDeleteNoCompact never changed the length
> of the lp array, except in the corner case where the page is becoming
> com
Tomas Vondra writes:
> On 10/30/2017 09:03 PM, Tom Lane wrote:
>> [ scratches head ... ] Not sure how that could've introduced any
>> problems? Will look.
> Not sure, but I can confirm Michael's findings - I've been unable to
> reproduce the issue on 1a4be103a5 even after 20 minutes, and on
> 2
On 10/30/2017 09:03 PM, Tom Lane wrote:
> Michael Paquier writes:
>> b1328d78f is close enough, but per what I see that's actually not
>> true. I have been able to reproduce the problem, which shows up within
>> a window of 2-4 minutes. Still sometimes it can take way longer, and I
>> spotted on
Michael Paquier writes:
> b1328d78f is close enough, but per what I see that's actually not
> true. I have been able to reproduce the problem, which shows up within
> a window of 2-4 minutes. Still sometimes it can take way longer, and I
> spotted one test where it took 15 minutes to show up... So
On Mon, Oct 30, 2017 at 9:42 AM, Alvaro Herrera wrote:
> Tomas Vondra wrote:
>> FWIW I can reproduce this on REL_10_STABLE, but not on REL9_6_STABLE. So
>> it seems to be due to something that changed in the last release.
>
> I've been trying to reproduce it for half an hour with no success (I
> t
Tomas Vondra wrote:
> FWIW I can reproduce this on REL_10_STABLE, but not on REL9_6_STABLE. So
> it seems to be due to something that changed in the last release.
I've been trying to reproduce it for half an hour with no success (I
think my laptop is just too slow). I bet this is related to the
a
FWIW I can reproduce this on REL_10_STABLE, but not on REL9_6_STABLE. So
it seems to be due to something that changed in the last release.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsq
41 matches
Mail list logo