[
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357832#comment-15357832
]
Bob Hansen commented on HDFS-10441:
-----------------------------------
Small issues: all of these are small, but should probably be fixed before
landing:
* Do we pass information in the data pointer for the NN failover event? If so,
document it in events.h
* Comment in retry_policy.h for FixedDelayWithFailover still references static
values of 3 and 2
* We should include blocks of comments in retry_policy.h describing the
behavior in human terms because the backoff behavior is less than obvious
* status.cc: find the right value for kSnapshotException
* filesystem.cc: nn_.Connect() call: get rid of commented-out code
* rpc_connection.cc: HandleRpcResponse should push req back to the head of the
queue; alternatively, don't dequeue it if we got a standby exception.
* If HandleRpcResponse gets a kStandbyException, will CommsError be called
twice (once in HandleRpcResponse and again in OnRecvComplete)?
* rpc_engine.cc: let's use both namenodes if servers.size() >= 2 rather than
just bailing out.
* rpc_engine.h: IsCurrentActive/IsCurrentStandby are dangerous as designed:
they invite race conditions because we acquire the lock, check, release the
lock, and only then take action. Just before we take action, someone else
could change the value.
* rpc_engine.cc: Remove RpcEngine::Start instead of deprecating it.
* Don't forget to file bugs to handle more than two namenodes.
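
On the IsCurrentActive/IsCurrentStandby point above, here is a minimal sketch
of the difference between check-then-act and doing both under one lock. It is
not taken from the patch: HaState, NodeRole, IfActive and SetRole are all
hypothetical names used only for illustration, and the real rpc_engine.h state
is certainly richer than a single enum.

```cpp
#include <functional>
#include <mutex>

// Hypothetical stand-in for the HA state held by the RPC engine.
// None of these names come from the patch under review.
enum class NodeRole { kActive, kStandby };

class HaState {
 public:
  // Racy pattern: the lock is released before the caller acts on the
  // answer, so the role may already have changed by then.
  bool IsActive() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return role_ == NodeRole::kActive;
  }

  // Safer pattern: the check and the action happen under a single lock
  // acquisition. Returns true if the action ran.
  // Note: the action must not call back into this object, because
  // std::mutex is not recursive and that would deadlock.
  bool IfActive(const std::function<void()>& action) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (role_ != NodeRole::kActive) return false;
    action();
    return true;
  }

  void SetRole(NodeRole role) {
    std::lock_guard<std::mutex> lock(mutex_);
    role_ = role;
  }

 private:
  mutable std::mutex mutex_;
  NodeRole role_ = NodeRole::kActive;
};
```

A caller would write state.IfActive([&] { /* send request */ }) rather than
if (state.IsActive()) { /* send request */ }. The trade-off is that the action
runs while the lock is held, so it should be short.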
Minor issues: it would be nice to see these fixed, but they aren't blockers:
* status.h: is having both is_server_exception_ and exception_class_ redundant?
* hdfs_configuration.c: We have a (faster) split function in uri.cc; let's
refactor that into a Util method
* HdfsConfiguration::LookupNameService: if the URI parsing fails, we should
just ignore that URI as malformed, not bail out of the entire function. There
may be a well-formed URI in a later value.
* HdfsConfiguration: I'm a little uncomfortable using the URI parser to break
apart host:port. If the user enters "foo:bar@baz", it will interpret the part
before the "@" as credentials and silently drop everything before the baz.
Just using split(':') and converting the port to int if it exists is solid
enough.
* status.cc: I don't think the java exception name should go in the
(user-visible) output message. A string describing the error ("Invalid
Argument") would be nice, though.
* filesystem.cc: why do we call InitRpc before checking if there's an
io_service_?
* rpc_engine.h: Are ha_persisted_info_ and ha_enabled_ redundant?
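
To illustrate the two HdfsConfiguration points above (split on ':' instead of
using the URI parser, and skip malformed entries instead of bailing out), here
is a rough sketch. HostPort, ParseHostPort and ParseAll are hypothetical names,
not functions from the patch. It assumes C++17 and does not attempt to handle
IPv6 literals.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type; not part of the patch.
struct HostPort {
  std::string host;
  std::optional<uint16_t> port;  // empty when no port was given
};

// Splits "host" or "host:port" at the first ':'.
// Returns std::nullopt for malformed input instead of guessing.
std::optional<HostPort> ParseHostPort(const std::string& s) {
  const auto colon = s.find(':');
  HostPort result;
  result.host = s.substr(0, colon);  // whole string when there is no ':'
  if (result.host.empty()) return std::nullopt;
  if (colon == std::string::npos) return result;

  const std::string port_str = s.substr(colon + 1);
  if (port_str.empty() || port_str.size() > 5) return std::nullopt;
  unsigned long value = 0;
  for (char c : port_str) {
    if (c < '0' || c > '9') return std::nullopt;  // rejects "bar@baz"
    value = value * 10 + static_cast<unsigned long>(c - '0');
  }
  if (value > 65535) return std::nullopt;
  result.port = static_cast<uint16_t>(value);
  return result;
}

// Keeps every well-formed entry and skips malformed ones, so one bad
// value does not hide a good one that appears later.
std::vector<HostPort> ParseAll(const std::vector<std::string>& values) {
  std::vector<HostPort> out;
  for (const auto& v : values) {
    if (auto hp = ParseHostPort(v)) out.push_back(*hp);
  }
  return out;
}
```

With this sketch "foo:bar@baz" is rejected outright rather than being silently
reduced to "baz", and ParseAll still returns any valid entries that follow it.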
> libhdfs++: HA namenode support
> ------------------------------
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: James Clampffer
> Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch,
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch,
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch,
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch,
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch,
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.