Todd Lipcon created HDFS-3901:
---------------------------------
Summary: QJM: send 'heartbeat' messages to JNs even when they are
out-of-sync
Key: HDFS-3901
URL: https://issues.apache.org/jira/browse/HDFS-3901
Project: Hadoop HDFS
Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Currently, if one of the JNs has fallen out of sync with the writer (e.g., because
it went down), it is marked as out-of-sync until the next log roll. While it is
marked, the writer stops sending it any RPCs, so the JN's metrics no longer
reflect up-to-date information on how far behind it is.
This patch will introduce a heartbeat() RPC that has no effect except to update
the JN's view of the latest committed txid. When the writer is talking to an
out-of-sync logger, it will send these heartbeat messages once a second.
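
As a rough illustration of the idea above (not the actual patch), the following
sketch shows a writer-side channel that sends a heartbeat to an out-of-sync JN
at most once a second, and a JN that uses the heartbeat only to update its view
of the committed txid. All class, method, and field names here
(SketchJournalNode, SketchLoggerChannel, maybeHeartbeat, lagInTxns, and so on)
are hypothetical and chosen for illustration; the real RPC plumbing is omitted
and time is passed in explicitly to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a JournalNode; not the real Hadoop class. */
class SketchJournalNode {
    // Highest txid this JN has actually written to its local edit log.
    private final AtomicLong lastWrittenTxId = new AtomicLong(0);
    // This JN's view of the writer's latest committed txid.
    private final AtomicLong committedTxId = new AtomicLong(0);

    /** Simulates edits having been journaled up to the given txid. */
    void recordWrite(long txId) {
        lastWrittenTxId.accumulateAndGet(txId, Math::max);
    }

    /** Heartbeat: no effect other than updating the committed txid view. */
    void heartbeat(long writerCommittedTxId) {
        committedTxId.accumulateAndGet(writerCommittedTxId, Math::max);
    }

    /** Metric: how many transactions this JN is behind the writer. */
    long lagInTxns() {
        return Math.max(0, committedTxId.get() - lastWrittenTxId.get());
    }
}

/** Hypothetical writer-side channel to one JN. */
class SketchLoggerChannel {
    // Assumed interval, matching the "once a second" in the description.
    static final long HEARTBEAT_INTERVAL_MS = 1000;

    private final SketchJournalNode jn;
    private boolean outOfSync = false;
    private boolean heartbeatSent = false;
    private long lastHeartbeatMs = 0;

    SketchLoggerChannel(SketchJournalNode jn) {
        this.jn = jn;
    }

    void markOutOfSync() {
        outOfSync = true;
    }

    /**
     * Sends a heartbeat if the JN is out of sync and at least one interval
     * has elapsed since the previous heartbeat. Returns true if one was sent.
     */
    boolean maybeHeartbeat(long committedTxId, long nowMs) {
        if (!outOfSync) {
            return false; // in-sync JNs learn the committed txid from edits
        }
        if (heartbeatSent && nowMs - lastHeartbeatMs < HEARTBEAT_INTERVAL_MS) {
            return false; // rate limit: at most one heartbeat per interval
        }
        jn.heartbeat(committedTxId);
        heartbeatSent = true;
        lastHeartbeatMs = nowMs;
        return true;
    }
}
```

In this sketch the JN's lag metric keeps moving while the JN is out of sync,
because each heartbeat carries the writer's committed txid even though no edits
are delivered.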
In a future patch we can extend the heartbeat functionality so that NNs
periodically check their connections to JNs if no edits arrive, such that a
fenced NN won't accidentally continue to serve reads indefinitely.