[ 
https://issues.apache.org/jira/browse/HDFS-12943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16723272#comment-16723272
 ] 

Chen Liang edited comment on HDFS-12943 at 12/17/18 10:45 PM:
--------------------------------------------------------------

Hi [~brahmareddy],

Thanks for testing! The timeout issue seems interesting. To start with, some 
performance degradation is expected *from CLI*, because the CLI creates a new 
DFSClient for each command, and a fresh DFSClient has to fetch the status of 
the NameNodes every time. But if the same DFSClient is reused, this would not 
be an issue. I have never seen the second-call issue. Here is an output from 
our cluster (log output part omitted), and I think you are right about 
lowering dfs.ha.tail-edits.period, we had similar numbers here:
{code:java}
$time hdfs --loglevel debug dfs 
-Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -mkdir /TestsORF1
real    0m2.254s
user    0m3.608s
sys     0m0.331s
$time hdfs --loglevel debug dfs 
-Ddfs.client.failover.proxy.provider.***=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
 -mkdir /TestsORF2
real    0m2.159s
user    0m3.855s
sys     0m0.330s{code}
Curious, how many NNs did you have in the testing? And were there any errors 
in the NN logs?
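
For reference, a sketch of setting this persistently in hdfs-site.xml instead 
of passing {{-D}} on every command (the nameservice name {{mycluster}} is a 
placeholder, not from this thread, and the tail-edits period value is only an 
example):
{code:xml}
<!-- Client side: use ObserverReadProxyProvider for the nameservice,
     so every DFSClient can direct reads to Observer nodes -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider</value>
</property>

<!-- NameNode side: tail edits more frequently so reads from the
     Observer stay close to the Active's state -->
<property>
  <name>dfs.ha.tail-edits.period</name>
  <value>0ms</value>
</property>
{code}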



> Consistent Reads from Standby Node
> ----------------------------------
>
>                 Key: HDFS-12943
>                 URL: https://issues.apache.org/jira/browse/HDFS-12943
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>            Reporter: Konstantin Shvachko
>            Priority: Major
>         Attachments: ConsistentReadsFromStandbyNode.pdf, 
> ConsistentReadsFromStandbyNode.pdf, HDFS-12943-001.patch, 
> HDFS-12943-002.patch, TestPlan-ConsistentReadsFromStandbyNode.pdf
>
>
> StandbyNode in HDFS is a replica of the active NameNode. The states of the 
> NameNodes are coordinated via the journal. It is natural to consider 
> StandbyNode as a read-only replica. As with any replicated distributed system 
> the problem of stale reads should be resolved. Our main goal is to provide 
> reads from standby in a consistent way in order to enable a wide range of 
> existing applications running on top of HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
