I think this discussion was triggered by one we had on ZOOKEEPER-1413. The patch Thawan proposed there contains a method that reads txn logs and simply logs an error if an exception occurs while reading the log. I raised the question of whether we should do more than log an error message, and that started the discussion about the txn log. It seems to be out of scope for 1413, so we thought it would be good to have this discussion separately.
Here are a few thoughts about the issue. We can't really tolerate arbitrary corruption of the txn log, because it could imply that we lose quorum for a txn that has already been processed and for which a response has been returned to the client. In the case where a faulty server only partially writes a txn to the txn log because it crashes, the logged txn is corrupt, but we don't really have an issue: the server has not acked the txn, so if there is a quorum for that txn, the faulty server is not part of it. I believe we can do something about cases like this, but taking care of txn log integrity more generally sounds like a hard problem.

-Flavio

On Jun 1, 2013, at 4:29 PM, Camille Fournier <[email protected]> wrote:

> I think it's an interesting idea certainly worth discussing. Do you have
> any proposals for how we might modify? What should we think about wrt
> migration/backwards compatibility?
>
> C
>
>
> On Fri, May 31, 2013 at 8:26 PM, Thawan Kooburat <[email protected]> wrote:
>
>> Hi,
>>
>> I just want to start a discussion about the usage of txnlog. Here is the
>> list of features that need to look up information from txnlog. These
>> features need to ensure the integrity of txnlog, and having an efficient
>> lookup is good for performance as well.
>>
>> ZOOKEEPER-1413 – The leader uses txnlog to synchronize with the
>> learners. It needs to read txnlog in a sequential manner starting from a
>> given zxid.
>> ZOOKEEPER-22 – The design proposal mentioned that the leader should look up
>> txnlog to respond to the client if a request is accepted by the client or
>> not. The server needs to look up a txn by sessionId and cxid.
>> ZOOKEEPER-1416 – The server needs to be able to tell the list of deleted
>> nodes starting from a given zxid. One possible implementation is to walk
>> txnlog starting from a given zxid and look for delete txns.
>>
>> Do we need to change the way we store txnlog so that we can ensure
>> integrity and more efficient lookup?
>>
>> --
>> Thawan Kooburat
>>
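
To make the partial-write case above concrete, here is a minimal sketch in Python of a reader that tells a torn final record apart from a bad record in the middle of the log. Everything in it is an assumption made for illustration: the record layout (length, CRC32, payload) and the names encode_record, read_log, and LogCorruptionError are hypothetical, and this is not ZooKeeper's on-disk format or code.

```python
import struct
import zlib

# Hypothetical record layout (not ZooKeeper's): 4-byte payload length,
# 4-byte CRC32 of the payload, then the payload itself.
HEADER = struct.Struct(">II")


class LogCorruptionError(Exception):
    """A record that is not the last one in the log failed its checksum."""


def encode_record(payload: bytes) -> bytes:
    """Serialize one record in the hypothetical layout above."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def read_log(data: bytes):
    """Return (records, torn_tail).

    torn_tail is True when the final record is incomplete or fails its
    checksum. That matches a crash during a write: the server never acked
    that txn, so dropping it is safe. A checksum failure on any earlier
    record raises instead, because that txn may have been acked.
    """
    records = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < HEADER.size:
            return records, True  # header cut short
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(data):
            return records, True  # payload cut short
        payload = data[start:end]
        if zlib.crc32(payload) != crc:
            if end == len(data):
                return records, True  # damaged final record
            raise LogCorruptionError(f"bad checksum at offset {offset}")
        records.append(payload)
        offset = end
    return records, False
```

With log = encode_record(b"a") + encode_record(b"bb"), calling read_log(log[:-1]) returns the first record and reports a torn tail, while flipping a byte in the first payload raises LogCorruptionError. Note the limit of the sketch: a corrupted length field in the middle of the log can look like a torn tail, which is one reason the general integrity problem is harder than this case.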
