Github user liuml07 commented on the issue:
https://github.com/apache/spark/pull/15648
Sorry for coming late. The change is very reasonable. Glad it's merged.
Steve:
> Where my knowledge of HDFS-HA fails is what happens then: does the RPC
> client try another NN, or does it just fail? Maybe @liuml07 could assist there.
So in general, `setSafeMode(true)` (i.e. with the `isChecked` flag set) is a
READ operation, while `setSafeMode(false)` is an unchecked operation:
1. The standby NN throws a StandbyException for READ operations unless it's
configured to allow stale reads via the config key `dfs.ha.allow.stale.reads`.
When the client side catches the StandbyException, the DFSClient's internal
failover proxy automatically retries multiple times (say, 10) against the
other NN (hopefully the active one). So the `setSafeMode(true)` request will
eventually reach the active NN.
2. In contrast, `setSafeMode(false)` may go to a standby NameNode, which will
return its own safe mode status instead of throwing a StandbyException. That
return value can fool a client that expects to hear whether the active NN is
out of safe mode (see the sketch below).
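
For concreteness, here's a minimal sketch of the two call shapes, assuming
Hadoop 2.x's `DistributedFileSystem.setSafeMode(SafeModeAction, boolean)` API,
where the second argument is `isChecked`; everything else (the app skeleton)
is just illustrative:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.hdfs.DistributedFileSystem
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction

object SafeModeCheck {
  def main(args: Array[String]): Unit = {
    // Assumes fs.defaultFS points at an HA logical nameservice
    // (see the config sketch further down).
    val dfs = FileSystem.get(new Configuration())
      .asInstanceOf[DistributedFileSystem]

    // isChecked = true: treated as a READ, so a standby NN throws
    // StandbyException and the client's failover proxy retries against the
    // other NN. The answer reflects the active NN's safe mode state.
    val activeInSafeMode = dfs.setSafeMode(SafeModeAction.SAFEMODE_GET, true)

    // isChecked = false: unchecked, so whichever NN the proxy happens to
    // reach (possibly a standby) answers directly, and the result may not
    // describe the active NN at all.
    val possiblyStale = dfs.setSafeMode(SafeModeAction.SAFEMODE_GET, false)

    println(s"checked=$activeInSafeMode unchecked=$possiblyStale")
  }
}
```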
By the way, the above applies only if the client is addressing HDFS by its
logical service (nameservice) name rather than a specific NameNode host.
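
For reference, a hypothetical HA client configuration using a logical
nameservice; `mycluster`, the host names, and the ports are placeholders, but
the keys are the standard HDFS-HA ones. It's the failover proxy provider
wired up here that performs the retry described above:

```scala
import org.apache.hadoop.conf.Configuration

val conf = new Configuration()
conf.set("fs.defaultFS", "hdfs://mycluster") // logical URI, no single host
conf.set("dfs.nameservices", "mycluster")
conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2")
conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020")
conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020")
// Retries a StandbyException-throwing call against the other NameNode.
conf.set("dfs.client.failover.proxy.provider.mycluster",
  "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")
```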