[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357832#comment-15357832
 ] 

Bob Hansen commented on HDFS-10441:
-----------------------------------

Small issues: these should probably be fixed before landing:
* Do we pass information in the data pointer for the NN failover event?  If so, 
document it in events.h
* Comment in retry_policy.h for FixedDelayWithFailover still references static 
values of 3 and 2
* We should include blocks of comments in retry_policy.h describing the 
behavior in human terms because the backoff behavior is less than obvious
* status.cc: find the right value for kSnapshotException
* filesystem.cc: nn_.Connect() call: get rid of commented-out code
* rpc_connection.cc: HandleRpcResponse should push req back to the head of the 
queue; alternatively, don't dequeue it if we got a standby exception.
* If HandleRpcResponse gets a kStandbyException, will CommsError be called 
twice (once in HandleRpcResponse and again in OnRecvComplete)?
* rpc_engine.cc: let's use both namenodes if servers.size() >= 2 rather than 
just bailing out.
* rpc_engine.h: IsCurrentActive/IsCurrentStandby are dangerous as designed: 
they invite race conditions because we acquire the lock, check, release the 
lock, and only then take action.  Just before we take action, someone else 
could change the value.
* rpc_engine.cc: Remove RpcEngine::Start instead of deprecating it.
* Don't forget to file bugs to handle more than two namenodes.
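
To illustrate the IsCurrentActive/IsCurrentStandby concern, here is a minimal 
sketch of the check-then-act race and one way to avoid it.  HaState, 
IfCurrentActive, and FailoverFrom are hypothetical names made up for this 
example; they are not the actual RpcEngine interface, and the real fix may 
look quite different.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the HA bookkeeping; not the real RpcEngine.
class HaState {
 public:
  HaState(std::string active, std::string standby)
      : active_(std::move(active)), standby_(std::move(standby)) {}

  // Racy pattern: the lock is released on return, so the answer may already
  // be stale by the time the caller acts on it.
  bool IsCurrentActive(const std::string &node) {
    std::lock_guard<std::mutex> guard(lock_);
    return node == active_;
  }

  // Safer pattern: the check and the action share one lock acquisition.
  // Returns true if the action ran.  The action must not call back into
  // this object, or it will deadlock on the non-recursive mutex.
  bool IfCurrentActive(const std::string &node,
                       const std::function<void()> &action) {
    std::lock_guard<std::mutex> guard(lock_);
    if (node != active_) return false;
    action();
    return true;
  }

  // Swap roles only if the caller's view of the active node is still
  // current; a second caller reporting the same failure becomes a no-op.
  bool FailoverFrom(const std::string &failed) {
    std::lock_guard<std::mutex> guard(lock_);
    if (failed != active_) return false;
    std::swap(active_, standby_);
    return true;
  }

 private:
  std::mutex lock_;
  std::string active_;
  std::string standby_;
};
```

With something shaped like FailoverFrom, two connections that both saw the 
same namenode fail cannot flip the active/standby pair twice, because the 
comparison and the swap happen under the same lock.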


Minor issues: it would be nice to see these fixed, but they aren't blockers:
* status.h: is having both is_server_exception_  and exception_class_ redundant?
* hdfs_configuration.cc: We have a (faster) split function in uri.cc; let's 
refactor that into a Util method
* HdfsConfiguration::LookupNameService: if the URI parsing failed, we should 
just ignore the URI as malformed, not bail out of the entire function.  There 
may be a well-formed URI in a later value.
* HdfsConfiguration: I'm a little uncomfortable using the URI parser to break 
apart host:port.  If the user enters "foo:bar@baz", it will interpret "foo:bar" 
as user:password credentials and silently drop everything before the "baz".  
Just using split(':') and converting the port to int if it exists is solid 
enough.
* status.cc: I don't think the java exception name should go in the 
(user-visible) output message.  A string describing the error ("Invalid 
Argument") would be nice, though.
* filesystem.cc: why do we call InitRpc before checking if there's an 
io_service_?
* rpc_engine.h: Are ha_persisted_info_ and ha_enabled_ redundant?
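
For the host:port point above, a rough sketch of the split-and-convert 
approach.  HostPort and ParseHostPort are hypothetical names for 
illustration only, and the sketch assumes C++17 for std::optional, which 
the project may not be able to use.  It handles plain "host" and 
"host:port" and nothing else (no IPv6 literals).

```cpp
#include <optional>
#include <string>

// Hypothetical result type; not part of libhdfs++.
struct HostPort {
  std::string host;
  std::optional<int> port;  // empty when no port was given
};

// Split "host" or "host:port" on ':'.  Returns nullopt for malformed input
// instead of silently discarding part of it.
std::optional<HostPort> ParseHostPort(const std::string &s) {
  if (s.empty()) return std::nullopt;

  const auto colon = s.find(':');
  if (colon == std::string::npos) return HostPort{s, std::nullopt};

  // More than one ':' is not a plain host:port pair.
  if (s.find(':', colon + 1) != std::string::npos) return std::nullopt;

  const std::string host = s.substr(0, colon);
  const std::string port_str = s.substr(colon + 1);
  if (host.empty() || port_str.empty() || port_str.size() > 5) {
    return std::nullopt;
  }

  // Accept decimal digits only; the length check above rules out overflow.
  int port = 0;
  for (char c : port_str) {
    if (c < '0' || c > '9') return std::nullopt;
    port = port * 10 + (c - '0');
  }
  if (port > 65535) return std::nullopt;

  return HostPort{host, port};
}
```

Here "foo:bar@baz" is rejected outright because "bar@baz" is not a 
number, so the caller can skip that value and move on to the next one.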


> libhdfs++: HA namenode support
> ------------------------------
>
>                 Key: HDFS-10441
>                 URL: https://issues.apache.org/jira/browse/HDFS-10441
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
