So, because we are talking very asynchronously, I want to put the proposal out there clearly for everyone. Because a serialization error is a known (and frankly expected) event on busy systems, we should treat it very differently from all other errors that a KID may encounter. Specifically, we need to try again, without reporting a serious problem back to the client via listen/notify.
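
To make that concrete, here is a rough sketch of the behavior being proposed. It is Python for brevity (Bucardo itself is Perl), and every name in it is made up for illustration: `SerializationFailure` stands in for a PostgreSQL serialization failure (SQLSTATE 40001), `run_kid`, `sync_once`, and `notify` are hypothetical, and the channel name `sync_serialization_retry` is not an existing Bucardo notice. It assumes a fixed sleep and no retry limit, which is only one of the possible choices.

```python
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a serialization failure (SQLSTATE 40001)."""

    def __init__(self, table=None):
        super().__init__("could not serialize access")
        self.table = table  # table being worked on when the error hit, if known


def run_kid(sync_once, notify, sleep_seconds=0.5, sleep=time.sleep):
    """Run one sync, retrying indefinitely on serialization failures only.

    sync_once: callable that performs the sync and may raise.
    notify:    callable(channel, payload) used to inform listeners.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return sync_once()
        except SerializationFailure as err:
            # Expected on busy systems: send a "will retry" notice
            # instead of reporting a serious failure.
            notify(
                "sync_serialization_retry",  # hypothetical channel name
                {"attempt": attempt, "sleep": sleep_seconds, "table": err.table},
            )
            sleep(sleep_seconds)
        # Any other exception is not caught here, so it propagates
        # and is reported through the normal error path.
```

The `sleep` and `notify` arguments are passed in only so the loop can be exercised without a database or real delays; a listener would see one retry notice per failed attempt and nothing that looks like a fatal error.
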
We should keep using the sleep setting, to be sure, but should it give up after X tries? Slowly increase the sleep over time? I'm strongly inclined to do neither of those, but thought I should throw it out there.

In addition to trying again (whether via cleanup and a goto KID, or asking the controller to start up a new kid), we should have a new notify that is fired to let listeners know that yeah, the sync failed, but it's only a serialization error and we will try again. The payload should tell how long we are sleeping, and perhaps some other information (e.g. which table it was on when this occurred). By "listener" I basically mean the bucardo program.

Of course, it would be nice to find a good way to cause serialization errors on demand for the test suite; I seem to recall trying to do so once and failing, but I'm sure it is possible somehow.

-- 
Greg Sabino Mullane [email protected]
End Point Corporation
PGP Key: 0x14964AC8
_______________________________________________ Bucardo-general mailing list [email protected] https://mail.endcrypt.com/mailman/listinfo/bucardo-general
