[ 
https://issues.apache.org/jira/browse/HDFS-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912440#comment-16912440
 ] 

Erik Krogen commented on HDFS-14726:
------------------------------------

Thanks [~vagarychen]! I actually looked more closely at this and noticed two 
things. First, I think we need to backport HDFS-13145 to branch-2 as well, 
since it can cause standby/observer nodes to crash when in-progress edit 
tailing is available.

Second, I think your current if-statement is slightly wrong. You have this:
{code}
        if (onlyDurableTxns && inProgressOk
            && committedTxnId != UNDEFINED_COMMITTED_ID) {
{code}
This code block is attempting to limit to only committed transactions if the 
caller requested {{onlyDurableTxns}}. If the JNs don't keep track of committed 
txns, I think we should do the safe thing of assuming that none of them are 
durable. However, I believe your current patch is optimistic and assumes all 
transactions are durable in this case. I think we can leave this whole block 
unmodified:
{code}
        if (onlyDurableTxns && inProgressOk) {
          endTxId = Math.min(endTxId, committedTxnId);
          if (endTxId < remoteLog.getStartTxId()) {
            LOG.warn("Found endTxId (" + endTxId + ") that is less than " +
                "the startTxId (" + remoteLog.getStartTxId() +
                ") - setting it to startTxId.");
            endTxId = remoteLog.getStartTxId();
          }
        }
{code}
In the case of an undefined committed txn ID, {{endTxId}} will end up being -1 
from {{Math.min()}}, and the inner check will then reset it to 
{{remoteLog.getStartTxId()}}, which is safe since we know the previous 
finalized segments contain only committed txns.
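
To make the arithmetic concrete, here is a minimal standalone sketch of the clamping logic from the block above. It is not the actual patch: the class name {{DurableEndTxId}}, the method {{clampEndTxId}}, and the plain {{startTxId}} parameter (standing in for {{remoteLog.getStartTxId()}}) are hypothetical, and the sentinel value -1 for {{UNDEFINED_COMMITTED_ID}} is assumed from the description above rather than taken from the code base.

```java
// Illustrative sketch only; names and the sentinel value are assumptions.
final class DurableEndTxId {
    // Assumed sentinel for "JN did not report a committed txn ID".
    static final long UNDEFINED_COMMITTED_ID = -1L;

    /**
     * Mirrors the unmodified if-block: when only durable txns are wanted,
     * cap endTxId at committedTxnId, but never below the segment's start.
     */
    static long clampEndTxId(long startTxId, long endTxId, long committedTxnId,
                             boolean onlyDurableTxns, boolean inProgressOk) {
        if (onlyDurableTxns && inProgressOk) {
            // With an undefined committed ID this yields the sentinel (-1).
            endTxId = Math.min(endTxId, committedTxnId);
            if (endTxId < startTxId) {
                // Fall back to startTxId, as in the original block.
                endTxId = startTxId;
            }
        }
        return endTxId;
    }
}
```

Under these assumptions, a known committed ID of 120 caps a segment spanning 100..150 at 120, while an undefined committed ID pulls the end back to 100, the pessimistic behavior described above.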

> Fix JN incompatibility issue in branch-2 due to backport of HDFS-10519
> ----------------------------------------------------------------------
>
>                 Key: HDFS-14726
>                 URL: https://issues.apache.org/jira/browse/HDFS-14726
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: journal-node
>    Affects Versions: 2.10.0
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Blocker
>         Attachments: HDFS-14726-branch-2.001.patch, 
> HDFS-14726-branch-2.002.patch
>
>
> HDFS-10519 has been backported to branch-2. However, HDFS-10519 introduced an 
> incompatibility issue between NN and JN due to the new protobuf field 
> {{committedTxnId}} in {{HdfsServer.proto}}. This field was introduced as a 
> required field, so if JN and NN are not on the same version, they will run 
> into a missing field exception. Although we can currently get around this by 
> making sure JN always gets upgraded properly before NN, we can potentially 
> fix this incompatibility by changing the field to optional. 


