[ https://issues.apache.org/jira/browse/HDFS-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912440#comment-16912440 ]

Erik Krogen commented on HDFS-14726:
------------------------------------
Thanks [~vagarychen]! I looked more closely at this and noticed two things.
First, I think we need to backport HDFS-13145 to branch-2 as well, since that
issue can cause standby/observer nodes to crash when in-progress edit tailing
is available.
Second, I think your current if-statement is slightly wrong. You have this:
{code}
if (onlyDurableTxns && inProgressOk
    && committedTxnId != UNDEFINED_COMMITTED_ID) {
{code}
This block attempts to limit the returned range to committed transactions when
the caller requests {{onlyDurableTxns}}. If the JNs don't keep track of
committed txns, I think we should do the safe thing and assume that none of
them are durable. However, I believe your current patch is optimistic and
assumes all transactions are durable in this case. I think we can leave this
whole block unmodified:
{code}
if (onlyDurableTxns && inProgressOk) {
  endTxId = Math.min(endTxId, committedTxnId);
  if (endTxId < remoteLog.getStartTxId()) {
    LOG.warn("Found endTxId (" + endTxId + ") that is less than " +
        "the startTxId (" + remoteLog.getStartTxId() +
        ") - setting it to startTxId.");
    endTxId = remoteLog.getStartTxId();
  }
}
{code}
In the case of an undefined committed txn ID, {{endTxId}} will end up being -1
from {{Math.min()}}, then be reset to {{remoteLog.getStartTxId()}}, which is
safe since we know the previous finalized segments contain only committed
txns.
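
To make the arithmetic concrete, here is a minimal standalone sketch of that clamp. It is not the actual patch code: the class name {{DurableTxnClamp}} and the method {{clampEndTxId}} are hypothetical, {{startTxId}} stands in for {{remoteLog.getStartTxId()}}, and the value -1 for the undefined committed txn ID is assumed from the description above rather than taken from the patch.

```java
public class DurableTxnClamp {
    // Assumption: sentinel meaning "the JN did not report a committed txn ID".
    // The value -1 follows the comment above; the real constant is in the patch.
    public static final long UNDEFINED_COMMITTED_ID = -1L;

    // Hypothetical helper mirroring the unmodified block: cap endTxId at the
    // committed txn ID, but never let it fall below the segment's startTxId.
    public static long clampEndTxId(long startTxId, long endTxId,
            long committedTxnId, boolean onlyDurableTxns, boolean inProgressOk) {
        if (onlyDurableTxns && inProgressOk) {
            endTxId = Math.min(endTxId, committedTxnId);
            if (endTxId < startTxId) {
                endTxId = startTxId;
            }
        }
        return endTxId;
    }

    public static void main(String[] args) {
        // JN reports a committed txn ID inside the segment: capped at 120.
        System.out.println(clampEndTxId(100, 150, 120, true, true));
        // JN reports no committed txn ID: min() yields -1, reset to startTxId.
        System.out.println(clampEndTxId(100, 150, UNDEFINED_COMMITTED_ID, true, true));
        // Caller did not ask for durable txns only: endTxId is left untouched.
        System.out.println(clampEndTxId(100, 150, UNDEFINED_COMMITTED_ID, false, true));
    }
}
```

Under these assumptions the three calls print 120, 100, and 150, so the undefined case collapses to {{startTxId}} without a separate check against the sentinel.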
> Fix JN incompatibility issue in branch-2 due to backport of HDFS-10519
> ----------------------------------------------------------------------
>
> Key: HDFS-14726
> URL: https://issues.apache.org/jira/browse/HDFS-14726
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: journal-node
> Affects Versions: 2.10.0
> Reporter: Chen Liang
> Assignee: Chen Liang
> Priority: Blocker
> Attachments: HDFS-14726-branch-2.001.patch,
> HDFS-14726-branch-2.002.patch
>
>
> HDFS-10519 has been backported to branch-2. However, HDFS-10519 introduced an
> incompatibility between NN and JN due to the new protobuf field
> {{committedTxnId}} in {{HdfsServer.proto}}. This field was introduced as a
> required field, so if JN and NN are not on the same version, they will run
> into a missing-field exception. Although we can currently work around this by
> making sure the JN always gets upgraded before the NN, we can potentially fix
> this incompatibility by changing the field to optional.