Erik Krogen created HDFS-13125:
----------------------------------

             Summary: Improve efficiency of JN -> Standby Pipeline Under 
Frequent Edit Tailing
                 Key: HDFS-13125
                 URL: https://issues.apache.org/jira/browse/HDFS-13125
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: journal-node, namenode
            Reporter: Erik Krogen
            Assignee: Erik Krogen


The current edit tailing pipeline is designed for
* High resiliency
* High throughput
and was _not_ designed for low latency.

It was designed under the assumption that each edit log segment would typically 
be read all at once, e.g. on startup or the SbNN tailing the entire thing after 
it is finalized. The ObserverNode should be reading constantly from the 
JournalNodes' in-progress edit logs with low latency, to reduce the lag time 
from when a transaction is committed on the ANN and when it is visible on the 
ObserverNode.

Due to the critical nature of this pipeline to the health of HDFS, it would be 
better not to redesign it altogether. Based on some experiments it seems if we 
mitigate the following issues, lag times are reduced to low levels (low 
hundreds of milliseconds even under very high write load):
* The overhead of creating a new HTTP connection for each time new edits are 
fetched. This makes sense when you're expecting to tail an entire segment; it 
does not when you may only be fetching a small number of edits. We can mitigate 
this by allowing edits to be tailed via an RPC call, or by adding a connection 
pool for the existing connections to the journal.
* The overhead of transmitting a whole file at once. Right now when an edit 
segment is requested, the JN sends the entire segment, and on the SbNN it will 
ignore edits up to the ones it wants. How to solve this one may be more tricky, 
but one suggestion would be to keep recently logged edits in memory, avoiding 
the need to serve them from file at all, allowing the JN to quickly serve only 
the required edits.

We can implement these as optimizations on top of the existing logic, with 
fallbacks to the current slow-but-resilient pipeline.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Reply via email to