xkrogen commented on PR #4560: URL: https://github.com/apache/hadoop/pull/4560#issuecomment-1220049310
Thanks for trying to tackle this issue! @shvachko and I discussed this potential problem long ago but had not observed it in practice; I guess using cross-DC JNs makes it much worse.

I don't feel that a `CacheMissException` is correct here. The situation where the NN requests edits newer than what the JNs have is expected to be common, especially when the transaction rate is low, since the NN will then constantly poll the JNs for new edits by sending `sinceTxId = highestWrittenTxId + 1`. I see you're trying to handle this by special-casing `sinceTxId == getHighestWrittenTxId() + 1`, but that seems pretty hacky/brittle.

My initial thought is that we should return a special-case value (maybe `-1`) when `sinceTxId > highestWrittenTxId`. On the NN side, if some responses have `txnCount > 0` and others have `txnCount < 0`, we would use only the responses with `txnCount > 0`. The main issue I see with this is that `AsyncLoggerSet#waitForWriteQuorum()` isn't set up to handle this kind of situation; it returns as soon as there is a quorum of non-error responses.

As an alternative, we could create a new exception, distinct from `CacheMissException`, such as `NewerTxnIdException`, which the JN throws when `startTxId > highestWrittenTxId`. Since it's an exception, `waitForWriteQuorum()` will try to throw away the JNs that threw it. If only some JNs throw the exception, we still get a valid result from `waitForWriteQuorum()`. If too many JNs throw it, we can catch the exception on the NN side and swallow it, treating it as a normal/expected situation. I think this would avoid us having to special-case `startTxId + 1` on the JN side. WDYT?
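To make the exception-based alternative a bit more concrete, here is a rough, self-contained Java sketch. It does not use any real Hadoop classes: `ToyJournal`, `countEditsFrom`, `fetchEditCount`, and the wrapper class `NewerTxnIdSketch` are hypothetical stand-ins, and `NewerTxnIdException` is only the name proposed above. How the toy NN combines the successful responses (it takes the smallest count) is an assumption made for illustration, not what `waitForWriteQuorum()` actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NewerTxnIdSketch {

    /** Proposed (hypothetical) exception: the caller asked for edits this JN has not written yet. */
    static class NewerTxnIdException extends Exception {
        NewerTxnIdException(long startTxId, long highestWrittenTxId) {
            super("startTxId " + startTxId + " > highestWrittenTxId " + highestWrittenTxId);
        }
    }

    /** Toy stand-in for a JournalNode; it only tracks its highest written txid. */
    static class ToyJournal {
        private final long highestWrittenTxId;

        ToyJournal(long highestWrittenTxId) {
            this.highestWrittenTxId = highestWrittenTxId;
        }

        /** Number of edits available from startTxId; no special case for highestWrittenTxId + 1. */
        long countEditsFrom(long startTxId) throws NewerTxnIdException {
            if (startTxId > highestWrittenTxId) {
                throw new NewerTxnIdException(startTxId, highestWrittenTxId);
            }
            return highestWrittenTxId - startTxId + 1;
        }
    }

    /**
     * Toy NN side. Journals that throw are discarded like any other failed response.
     * If fewer than a quorum succeed, the failure is swallowed and 0 is returned,
     * meaning "nothing new to read", which is the expected state for an idle namespace.
     */
    static long fetchEditCount(List<ToyJournal> journals, long startTxId) {
        int quorum = journals.size() / 2 + 1;
        List<Long> counts = new ArrayList<>();
        for (ToyJournal jn : journals) {
            try {
                counts.add(jn.countEditsFrom(startTxId));
            } catch (NewerTxnIdException e) {
                // This JN is behind the requested txid; drop its response.
            }
        }
        if (counts.size() < quorum) {
            return 0L; // too many JNs threw: treat as normal, not as an error
        }
        // Assumption for this sketch: be conservative and use the smallest count.
        return Collections.min(counts);
    }

    public static void main(String[] args) {
        List<ToyJournal> oneLagging = Arrays.asList(
            new ToyJournal(10), new ToyJournal(10), new ToyJournal(5));
        List<ToyJournal> caughtUp = Arrays.asList(
            new ToyJournal(10), new ToyJournal(10), new ToyJournal(10));

        System.out.println("one lagging JN, startTxId=8:  " + fetchEditCount(oneLagging, 8));
        System.out.println("NN caught up,   startTxId=11: " + fetchEditCount(caughtUp, 11));
    }
}
```

In the first call only the lagging JN throws, so a quorum of valid responses remains. In the second call every JN throws, and the NN side swallows that rather than surfacing an error.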
