Re: [HACKERS] a raft of parallelism-related bug fixes

Andres Freund Mon, 08 Feb 2016 14:29:17 -0800

Hi Robert,

On 2016-02-08 13:45:37 -0500, Robert Haas wrote:
> > I realize that this stuff has all been brewing long, and that there's
> > still a lot to do. So you gotta keep moving. And I'm not sure that
> > there's anything wrong or if there's any actually better approach. But
> > pushing an unreviewed, complex patch, that originated in a thread
> > orginally about different relatively small/mundane items, for a
> > contentious issue, a few days after the initial post. Hm. Not sure how
> > you'd react if you weren't the author.
> 
> Probably not very well.  Do you want me to revert it?


No. I want(ed) to express that I am not comfortable with how this got
in. My aim wasn't to generate a flurry of responses with everybody
piling on, or anything like that. But it's unfortunately hard to
avoid. I wish I knew a way, besides only sending private mails. Which I
don't think is a great approach either.

I do agree that we need something to tackle this problem, and that this
quite possibly is the least bad way to do this. And certainly the only
one that's been implemented and posted with any degree of completeness.

But even given the last paragraph, posting a complex new patch in a
somewhat related thread, and then pushing it 5 days later is pretty darn
quick.


> I mean, look.  [explanation why we need the infrastructure].  Do I really
> think anybody was going to spend the time to understand deadlock.c
> well enough to verify my changes?  No, I don't.  What I think would
> have happened is that the patch would have sat around like an
> albatross around my neck - totally ignored by everyone - until the end
> of the last CF, and then the discussion would have gone one of three
> ways:

Yes, believe me, I really get that. It's awfully hard to get substantial
review for pieces of code that require a lot of context.

But I think posting this patch in a new thread, posting a message that
you're intending to commit unless somebody protests with a substantial
arguments and/or a timeline of review, and then waiting a few days, are
something that should be done for a major piece of new infrastructure,
especially when it's somewhat controversial.

This doesn't just affect parallel execution, it affects one of least
understood parts of postgres code. And where hard to find bugs, likely
to only trigger in production, are to be expected.


> And, by the way, the patch, aside from the deadlock.c portion, was
> posted back in October, admittedly without much fanfare, but nobody
> reviewed that or any other patch on this thread.

I think it's unrealistic to expect random patches without a commitest
entry, posted somewhere deep in a thread, to get a review when there's
so many open commitfest entries that haven't gotten feedback, and which
we are supposed to look at.


> If I'd waited for those reviews to come in, parallel query would not
> be committed now, nor probably in 9.6, nor probably in 9.7 or 9.8
> either.  The whole project would just be impossible on its face.

Yes, that's a problem. But you're not the only one facing it, and you've
argued hard against such an approach in some other cases.


> I think it's myopic to say "well, but this patch might have bugs".
> Very true.  But also, all the other parallelism patches that are
> already committed or that are still under review but which can't be
> properly tested without this patch might have bugs, too, so you've got
> to weigh the risk that this patch might get better if I wait longer to
> commit it against the possibility that not having committed it reduces
> the chances of finding bugs elsewhere.  I don't want it to seem like
> I'm forcing this down the community's throat - I don't have a right to
> do that, and I will certainly revert this patch if that is the
> consensus.  But that is not what I think best myself.  What I think
> would be better is to (1) make an effort to get the buildfarm testing
> which this patch enables up and running as soon as possible and (2)
> for somebody to read over the committed code and raise any issues that
> they find.  Or, for that matter, to read over the committed code for
> any of the *other* parallelism patches and raise any issues that they
> find with *that* code.  There's certainly scads of code here and this
> is far from the only bit that might have bugs.

I think you are, and *you have to*, walk a very thin line here. I agree
that realistically there's just nobody with the bandwidth to keep up
with a fully loaded Robert. Not without ignoring their own stuff at
least. And I think the importance of what you're building means we need
to be flexible.  But I think that thin line in turn means that you have
to be *doubly* careful about communication. I.e. post new infrastructure
to new threads, "warn" that you're intending to commit something
potentially needing debate/review, etc.


> Oh: another thing that I would like to do is commit the isolation
> tests I wrote for the deadlock detector a while back, which nobody has
> reviewed either, though Tom and Alvaro seemed reasonably positive
> about the concept.

I think adding new regression tests should have a barrier to commit
that's about two magnitudes lower than something like group locks. I
mean the worst that they could so is to flap around for some reason, or
take a bit too long. So please please go ahead.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] a raft of parallelism-related bug fixes

Reply via email to