On 06/04/16 03:29, Robert Haas wrote:

I don't understand why it seems to be considered OK for logical slots to
just vanish on failover. The only other things I can think of where that's
considered OK are unlogged tables (because that's the point and we have
failover-safe ones too) and the old hash indexes nobody's quite willing to
remove yet.

First, it wasn't until 9.3 that physical standbys could follow
timeline switches, but that doesn't mean that streaming replication
was useless in 9.0 - 9.2, or that warm standby was useless in earlier
versions.  Logical decoding isn't useless without that capability
either.  Would it be nice if we did have that capability?  Of course.

Except that in 9.0 - 9.2 there was a working workaround for that; there is no such thing for logical decoding.

Second, I'm not sure whether it was a good design decision to make
logical slots a special kind of object that sit off to the side,
neither configuration (like postgresql.conf) nor WAL-protected data
(like pg_clog and the data files themselves), but it was certainly a
very deliberate decision.  I sort of expected them to be WAL-logged,
but Andres argued (not unconvincingly) that we'd want to have slots on
standbys, and making them WAL-logged would preclude that.

I do think it was a good design decision. We just need to make them failoverable a bit differently, and the failover slots patch IMHO isn't the right way either, as I said in another reply in this thread.

Review and test responses have been pretty underwhelming for pglogical, and
quite a bit seem to have boiled down to "this should live as an extension,
we don't need it in core". It often feels like we can't win: if we seek to
get it into core we're told it's not wanted/needed, but if we try to focus
on solving issues in core to make it work better and let it live as an
extension we're told we shouldn't bother until it's in core.

To be honest, I was shocked that pglogical and pglogical_output didn't
go into this release.  I assumed that you and other folks at
2ndQuadrant were going to make a big push to get that done.  I did
take a brief look at one of them - pglogical, I think - a week or two
ago but there were unaddressed review comments that had been pending
for months and there were a lot of fairly obvious things that needed
to be done before it could be seriously considered as a core
submission.  Like, for example, rewriting the documentation heavily
and making it look like the rest of our docs, and putting it in SGML
format.  The code seemed to need quite a bit of cleanup, too.  Now,
logical replication is a sufficiently important feature that if the
only way it's going to get into core is if I work on it myself, or get
other people at EnterpriseDB to do so, then I'll try to make that
happen.  But I was assuming that that was your/2ndQuadrant's patch,
that you were going to get it in shape, and that me poking my nose
into it wasn't going to be particularly welcome.  Maybe I've misread
the whole dynamic here.

I guess you did. I, and I think Craig as well, hoped for some feedback from the community on the general ideas in terms of protocol, node setup (I mean catalogs) and general architecture. That didn't really happen, and without any of that I didn't feel confident trying to get it right within the last month of the dev cycle, especially given the size of the patch and the fact that we also had other patches we were working on that had a realistically higher chance of getting in. Not sure how Craig feels about it. Converting the documentation, renaming some params in function names, etc. (those unaddressed comments) seemed secondary to me.

(As a side note, I was also 2 weeks without a properly working laptop around FOSDEM time, which affected my responses to -hackers on the topic, especially to Steve Singer, who did a good job of reviewing the usability at the time. But even if I had had one, it would not have saved the patch.)

In general I think a project of this size requires more attention from a committer to help shepherd it, and neither Craig nor I are that. I am glad that Andres said he plans to give some time in the next cycle to logical replication, because that should be a big help.

That being said, if we get a logical replication system into core that
doesn't do DDL, doesn't do multi-master, doesn't know squat about
sequences, and rolls over and dies if a timeline switch happens, I
would consider that a huge step forward and I think a lot of other
people would, too.

I agree, with the exception of working HA. I would consider it very sad if we got logical replication in core without any provision for continuity of service. Doing that is relatively trivial in comparison to the logical replication itself, however.

