[
https://issues.apache.org/jira/browse/HDFS-14058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689993#comment-16689993
]
Chen Liang commented on HDFS-14058:
-----------------------------------
The tests I've run include the following. Please note that these tests were done
without several recent changes such as HDFS-14035 and HDFS-14017, using some
hacky code changes and workarounds instead. Although the required changes have
since been formalized in those Jiras, the tests have not all been re-run with
them. Posting here for the record.
The tests were done on a setup of 100+ datanodes, 1 Active NameNode and 1
Observer NameNode, with no other standby nodes. The cluster has a light HDFS
workload, has YARN deployed, and has security (Kerberos) enabled. The purpose
here was not to evaluate performance gains, but only to prove the
functionality. In all the tests below, the Observer node's audit log was
checked to verify that the reads actually went to the Observer node.
1. basic hdfs IO
- From hdfs command:
-- create/delete directory
-- basic file put/get/delete
- From a simple Java program. I wrote some code which creates a DFSClient
instance and performs some basic operations against it:
-- create/delete directory
-- get/renew delegation token
One observation: from the command line, depending on the relative order of ANN
and ONN in the config, a failover may happen on every single call, with an
exception printed. I believe this is because every command-line invocation
creates a new DFSClient instance, which may start by sending a write to the
Observer, causing a failover. With a reused DFSClient (e.g. a Java program that
creates and reuses the same DFSClient instance), this issue does not occur.
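For reference, the command-line checks above amount to something like the
following (the paths are illustrative, not the ones actually used):

```shell
# Create and delete a directory
hdfs dfs -mkdir /tmp/obsr-test
hdfs dfs -rmdir /tmp/obsr-test

# Basic file put/get/delete
hdfs dfs -mkdir /tmp/obsr-test
hdfs dfs -put localfile.txt /tmp/obsr-test/
hdfs dfs -get /tmp/obsr-test/localfile.txt copy.txt
hdfs dfs -rm /tmp/obsr-test/localfile.txt

# Fetch a delegation token, roughly what the Java program's
# get/renew delegation token check exercises
hdfs fetchdt --renewer "$USER" /tmp/token.dt
```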
2. simple MR job: a simple wordcount job from mapreduce-examples jar, on a very
small input.
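The wordcount run is the standard example job; a sketch of the invocation (jar
name/version and paths are illustrative):

```shell
# Simple wordcount on a very small input directory
hadoop jar hadoop-mapreduce-examples-*.jar wordcount /tmp/wc-in /tmp/wc-out
```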
3. SliveTest: ran Slive from the hadoop-mapreduce-client-jobclient jar, without
parameters (so it uses the defaults). I ran Slive 3 times each with Observer
reads enabled and disabled, and saw roughly the same ops/sec.
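A sketch of the Slive invocation (the exact jar file name varies by version;
Slive ships in the jobclient tests jar):

```shell
# Run Slive with all default parameters
hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar SliveTest
```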
4. DFSIO: ran the DFSIO read test several times from the
hadoop-mapreduce-client-jobclient jar, but only with a very small input size
(10 files of 1KB each).
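A sketch of the TestDFSIO runs (jar name is illustrative; a write run has to
precede the read run to lay down the input files, and older releases spell the
size flag `-fileSize` instead of `-size`):

```shell
# Write 10 files of 1KB each, then read them back
hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO -write -nrFiles 10 -size 1KB
hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO -read -nrFiles 10 -size 1KB
```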
5. TeraGen/Sort/Validate: ran TeraGen/Sort/Validate from
hadoop-mapreduce-examples jar with 1TB of data. TeraSort used 1800+ mappers and
500 reducers. All three jobs finished successfully.
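A sketch of the TeraGen/Sort/Validate sequence (paths are illustrative; 1TB
corresponds to 10 billion 100-byte rows):

```shell
# Generate 1TB of input, sort it, then validate the sorted output
hadoop jar hadoop-mapreduce-examples-*.jar teragen 10000000000 /tmp/tera-in
hadoop jar hadoop-mapreduce-examples-*.jar terasort /tmp/tera-in /tmp/tera-out
hadoop jar hadoop-mapreduce-examples-*.jar teravalidate /tmp/tera-out /tmp/tera-report
```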
> Test reads from standby on a secure cluster with IP failover
> ------------------------------------------------------------
>
> Key: HDFS-14058
> URL: https://issues.apache.org/jira/browse/HDFS-14058
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: test
> Reporter: Konstantin Shvachko
> Assignee: Chen Liang
> Priority: Major
>
> Run standard HDFS tests to verify reading from ObserverNode on a secure HA
> cluster with {{IPFailoverProxyProvider}}.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)