[
https://issues.apache.org/jira/browse/HDFS-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793857#comment-16793857
]
Ayush Saxena commented on HDFS-14316:
-------------------------------------
Thanx [~elgoiri] for the patch.
Had a quick look at this.
* I guess we are retrying here for all encountered exceptions? Maybe we should
restrict retrying to certain cases and let the call fail for genuine failures
such as AccessControlException, which is expected to fail for all subclusters.
*
{code:java}
final List<RemoteLocation> locations = new ArrayList<>();
for (RemoteLocation loc : rpcServer.getLocationsForPath(src, true)) {
  if (!loc.equals(createLocation)) {
    locations.add(loc);
  }
}
{code}
I guess this isn't working as intended if the namenode is in standby state and
isn't able to give the block locations, thus throwing the exception at:
{code:java}
createLocation = rpcServer.getCreateLocation(src);
{code}
Then createLocation stays null, so the loop above never matches any entry and
effectively filters out nothing. Got the log from the UT too:
{noformat}
2019-03-15 23:57:47,751 [IPC Server handler 6 on default port 38833] ERROR
router.RouterClientProtocol (RouterClientProtocol.java:create(253)) - Cannot
create /HASH_ALL-failsubcluster/dir100/file5.txt in null: No namenode available
to invoke getBlockLocations [/HASH_ALL-failsubcluster/dir100/file5.txt, 0,
1]{noformat}
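Both points could be handled together: skip retrying for exceptions that would fail on every subcluster, and guard the filtering loop against a null createLocation. A minimal, self-contained sketch follows; the class, the stand-in RemoteLocation/AccessControlException types, and the method names are hypothetical illustrations, not the actual RouterClientProtocol code:
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CreateRetrySketch {

  // Stand-in for org.apache.hadoop.security.AccessControlException.
  static class AccessControlException extends IOException {}

  // Stand-in for RemoteLocation; only equality matters here.
  record RemoteLocation(String nameservice, String dest) {}

  /**
   * AccessControlException fails identically on all subclusters,
   * so retrying it against another destination is pointless.
   */
  static boolean isRetriable(IOException e) {
    return !(e instanceof AccessControlException);
  }

  /**
   * Build the list of alternative locations, skipping the one that already
   * failed. If createLocation is null (e.g., getCreateLocation threw because
   * the namenode was in standby), every location stays a retry candidate
   * instead of the loop silently filtering nothing.
   */
  static List<RemoteLocation> retryCandidates(
      List<RemoteLocation> all, RemoteLocation createLocation) {
    List<RemoteLocation> locations = new ArrayList<>();
    for (RemoteLocation loc : all) {
      if (createLocation == null || !loc.equals(createLocation)) {
        locations.add(loc);
      }
    }
    return locations;
  }

  public static void main(String[] args) {
    List<RemoteLocation> all = List.of(
        new RemoteLocation("ns0", "/a"), new RemoteLocation("ns1", "/a"));
    // Normal case: the failed location is excluded from the retry list.
    System.out.println(retryCandidates(all, new RemoteLocation("ns0", "/a")).size());
    // Standby case: createLocation stayed null; all locations remain candidates.
    System.out.println(retryCandidates(all, null).size());
    // Access errors are not worth retrying elsewhere.
    System.out.println(isRetriable(new AccessControlException()));
  }
}
{code}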
> RBF: Support unavailable subclusters for mount points with multiple
> destinations
> --------------------------------------------------------------------------------
>
> Key: HDFS-14316
> URL: https://issues.apache.org/jira/browse/HDFS-14316
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Íñigo Goiri
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-14316-HDFS-13891.000.patch,
> HDFS-14316-HDFS-13891.001.patch, HDFS-14316-HDFS-13891.002.patch,
> HDFS-14316-HDFS-13891.003.patch, HDFS-14316-HDFS-13891.004.patch,
> HDFS-14316-HDFS-13891.005.patch, HDFS-14316-HDFS-13891.006.patch,
> HDFS-14316-HDFS-13891.007.patch
>
>
> Currently mount points with multiple destinations (e.g., HASH_ALL) fail
> writes when the destination subcluster is down. We need an option to allow
> writing in other subclusters when one is down.