On 26/05/17 16:51, Alvaro Herrera wrote:
> Erik Rijkers wrote:
>
>> I wouldn't say that problems (re)appeared at a certain point; my impression
>> is rather that logical replication has become better and better. But I kept
>> getting the odd failure, without a clear cause, but always (eventually)
>> repeatable on other machines. I did the 1-minute pgbench-derail version
>> exactly because of the earlier problems with snapbuild: I wanted a test that
>> does a lot of starting and stopping of publication and subscription.
>
> I think it is pretty unlikely that the logical replication plumbing is
> the buggy place. You're just seeing it now because we didn't have any
> mechanism as convenient to consume logical decoding output. In other
> words, I strongly suspect that the hypothetical bugs are in the logical
> decoding side (and snapbuild sounds the most promising candidate) rather
> than logical replication per se.
>
Well, that was true for the previous issues Erik found as well (mostly
the snapshot builder was problematic). But that does not mean there are
no issues elsewhere.

We could do with some more output from the tests (do they log any
intermediate states of those md5 checksums, maybe row counts, etc.?),
a description of the problems, errors from the logs, and so on. I, for
example, don't get any issues from a test similar to the one described
in this thread, so without more info it might be hard to reproduce and
fix whatever the underlying issue is.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers