[ 
https://issues.apache.org/jira/browse/HDFS-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16922707#comment-16922707
 ] 

Erik Krogen edited comment on HDFS-14806 at 9/4/19 5:51 PM:
------------------------------------------------------------

I think what you've shared will technically work, but it isn't really what we 
want. Calling the RPC mechanism in a loop will load all of the edits, but all 
we want to do at this point is verify that the edits exist. The current code 
achieves this by opening the stream without reading anything from it. For 
RPCs, however, the data has already been sent by that point. I suppose this is 
an infrequent operation, so poor performance might be acceptable. But sending 
many (possibly thousands of) RPCs as opposed to a single stream open/close 
seems like overkill for the goal here.
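
To put a rough number on that, here is a small sketch of how many RPCs a loop 
would need to cover a given gap. It assumes each call returns at most 
{{dfs.ha.tail-edits.qjm.rpc.max-txns}} transactions (5000 by default, per the 
issue description). The class and method names are made up for illustration 
and are not part of HDFS.

```java
/** Hypothetical helper, not HDFS code: estimates the cost of looping over the RPC path. */
class RpcLoopCost {
  /**
   * Number of RPC calls needed to cover txnGap transactions when each call
   * returns at most maxTxnsPerRpc of them (ceiling division).
   */
  static long rpcCallsNeeded(long txnGap, long maxTxnsPerRpc) {
    if (txnGap < 0 || maxTxnsPerRpc <= 0) {
      throw new IllegalArgumentException("gap must be >= 0 and max must be > 0");
    }
    return (txnGap + maxTxnsPerRpc - 1) / maxTxnsPerRpc;
  }

  public static void main(String[] args) {
    // A bootstrapping NN that is 10 million transactions behind, at 5000 per call.
    System.out.println(rpcCallsNeeded(10_000_000L, 5000L)); // prints 2000
  }
}
```

A single stream open/close stays at one operation no matter how large the gap 
is, which is the contrast I have in mind.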

The QJM already has a flag {{inProgressTailingEnabled}} 
("dfs.ha.tail-edits.in-progress") which, if set to false, will force the QJM to 
use the streaming path:
{code:title=QuorumJournalManager}
  public void selectInputStreams(Collection<EditLogInputStream> streams,
      long fromTxnId, boolean inProgressOk,
      boolean onlyDurableTxns) throws IOException {
    // Some calls will use inProgressOK to get in-progress edits even if
    // the cache used for RPC calls is not enabled; fall back to using the
    // streaming mechanism to serve such requests
    if (inProgressOk && inProgressTailingEnabled) {
      // ...
    }
    selectStreamingInputStreams(streams, fromTxnId, inProgressOk,
        onlyDurableTxns);
  }
{code}
Since {{BootstrapStandby}} creates a QJM that only it uses, I think we can just 
have it set its configuration of {{dfs.ha.tail-edits.in-progress}} to false, 
forcing it not to use RPCs.

Attached a v002 patch to demonstrate my idea.
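
As a rough illustration of the idea (this is not the contents of the attached 
patch), the sketch below copies the caller's settings, forces 
{{dfs.ha.tail-edits.in-progress}} to false in the copy only, and shows the 
path selection flipping from RPC to streaming. It uses a plain {{Map}} as a 
stand-in for the Hadoop configuration object so that it runs on its own. The 
class name, method names, and the "missing key means false" default are all 
assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in, not HDFS code: models which path selectInputStreams would take. */
class TailingPathSketch {
  static final String IN_PROGRESS_KEY = "dfs.ha.tail-edits.in-progress";

  /** Copies the caller's settings and forces in-progress tailing off in the copy only. */
  static Map<String, String> forBootstrap(Map<String, String> conf) {
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(IN_PROGRESS_KEY, "false");
    return copy;
  }

  /** Mirrors the condition (inProgressOk && inProgressTailingEnabled) from the excerpt. */
  static String selectPath(Map<String, String> conf, boolean inProgressOk) {
    // Treating a missing key as "false" is an assumption made for this sketch.
    boolean inProgressTailingEnabled =
        Boolean.parseBoolean(conf.getOrDefault(IN_PROGRESS_KEY, "false"));
    return (inProgressOk && inProgressTailingEnabled) ? "rpc" : "streaming";
  }

  public static void main(String[] args) {
    Map<String, String> clusterConf = new HashMap<>();
    clusterConf.put(IN_PROGRESS_KEY, "true");
    System.out.println(selectPath(clusterConf, true));               // rpc
    System.out.println(selectPath(forBootstrap(clusterConf), true)); // streaming
    System.out.println(clusterConf.get(IN_PROGRESS_KEY));            // true
  }
}
```

Working on a copy matters here: the override should apply only to the QJM that 
{{BootstrapStandby}} creates, not to the settings anything else reads.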

{quote}
The config {{dfs.ha.tail-edits.qjm.rpc.max-txns}} was introduced in HDFS-13609. 
Erik Krogen would you mind sharing some thoughts about making this config 
exposed?
{quote}
I think we didn't expose it since it is deeply technical and 
implementation-specific. I would prefer not to expose it unless we see a strong 
reason to.


> Bootstrap standby may fail if used in-progress tailing
> ------------------------------------------------------
>
>                 Key: HDFS-14806
>                 URL: https://issues.apache.org/jira/browse/HDFS-14806
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.3.0
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-14806.001.patch, HDFS-14806.002.patch
>
>
> One issue we came across is that bootstrap standby can fail when in-progress 
> tailing is enabled.
> When in-progress tailing is enabled, Bootstrap uses the RPC mechanism to get 
> edits. There is a config {{dfs.ha.tail-edits.qjm.rpc.max-txns}} that sets an 
> upper bound on how many transactions can be included in one RPC call. The 
> default is 5000, meaning the bootstrapping NN (say NN1) can pull at most 5000 
> edits from the JNs. However, as part of bootstrap, NN1 queries another NN (say 
> NN2) for NN2's current transaction ID, and NN2 may return a state that is more 
> than 5000 transactions ahead of NN1's current image. But NN1 can only see 5000 
> more transactions from the JNs. At this point NN1 panics, because the txid 
> returned by the JNs is behind NN2's returned state, and bootstrap then fails.
> Essentially, bootstrap standby can fail if both of the following conditions 
> are met:
>  # in-progress tailing is enabled AND
>  # the bootstrapping NN is too far (>5000 txid) behind 
> Increasing the value of {{dfs.ha.tail-edits.qjm.rpc.max-txns}} to some very 
> large value allowed bootstrap to continue, but this is hardly the ideal 
> solution.


