[ https://issues.apache.org/jira/browse/HDFS-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793857#comment-16793857 ]

Ayush Saxena commented on HDFS-14316:
-------------------------------------

Thanks [~elgoiri] for the patch.

Had a quick look at this:
 * I guess we are retrying here for all exceptions encountered? Maybe we 
should restrict retrying to certain cases and let it fail for genuine 
ones like AccessControlException, which is supposed to fail for all subclusters.
 * 
{code:java}
      final List<RemoteLocation> locations = new ArrayList<>();
      for (RemoteLocation loc : rpcServer.getLocationsForPath(src, true)) {
        if (!loc.equals(createLocation)) {
          locations.add(loc);
        }
{code}
I guess this isn't working as intended if the namenode is in 
StandbyState and isn't able to give the block locations, thus throwing the 
exception at:
{code:java}
      createLocation = rpcServer.getCreateLocation(src);
{code}
The createLocation stays null, so in the loop above we end up comparing 
every location against null, filtering out nothing and effectively doing 
nothing. Got the log from the UT too:

{noformat}
2019-03-15 23:57:47,751 [IPC Server handler 6 on default port 38833] ERROR 
router.RouterClientProtocol (RouterClientProtocol.java:create(253)) - Cannot 
create /HASH_ALL-failsubcluster/dir100/file5.txt in null: No namenode available 
to invoke getBlockLocations [/HASH_ALL-failsubcluster/dir100/file5.txt, 0, 
1]{noformat}
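The two points above could be handled along these lines. This is just a rough 
stand-alone sketch, not the actual patch: {{RemoteLocation}}, {{shouldRetry}} 
and {{retryLocations}} here are simplified hypothetical stand-ins for the real 
RBF code.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the two suggestions above: (1) only retry exceptions that
 * could succeed in another subcluster, and (2) guard against a null
 * createLocation when getCreateLocation() threw before assigning it.
 * These classes/methods are simplified stand-ins, not the real RBF code.
 */
public class CreateRetrySketch {

  /** Hypothetical stand-in for the real RemoteLocation class. */
  static class RemoteLocation {
    final String nsId;
    RemoteLocation(String nsId) { this.nsId = nsId; }
    @Override public boolean equals(Object o) {
      return o instanceof RemoteLocation
          && ((RemoteLocation) o).nsId.equals(this.nsId);
    }
    @Override public int hashCode() { return nsId.hashCode(); }
  }

  /** Don't retry exceptions that would fail in every subcluster. */
  static boolean shouldRetry(Exception e) {
    // AccessControlException (matched by name here to stay self-contained)
    // fails for all subclusters, so retrying elsewhere is pointless.
    return !e.getClass().getSimpleName().equals("AccessControlException");
  }

  /**
   * Locations left to try. If createLocation is null (getCreateLocation()
   * threw before assigning it, e.g. the namenode was standby), filter
   * nothing and retry all locations instead of silently doing no work.
   */
  static List<RemoteLocation> retryLocations(
      List<RemoteLocation> all, RemoteLocation createLocation) {
    final List<RemoteLocation> locations = new ArrayList<>();
    for (RemoteLocation loc : all) {
      if (createLocation == null || !loc.equals(createLocation)) {
        locations.add(loc);
      }
    }
    return locations;
  }
}
```

With the null guard, the retry loop still covers all destinations when the 
original create location was never determined, instead of filtering against 
null and doing nothing.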
 

 

> RBF: Support unavailable subclusters for mount points with multiple 
> destinations
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-14316
>                 URL: https://issues.apache.org/jira/browse/HDFS-14316
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Íñigo Goiri
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-14316-HDFS-13891.000.patch, 
> HDFS-14316-HDFS-13891.001.patch, HDFS-14316-HDFS-13891.002.patch, 
> HDFS-14316-HDFS-13891.003.patch, HDFS-14316-HDFS-13891.004.patch, 
> HDFS-14316-HDFS-13891.005.patch, HDFS-14316-HDFS-13891.006.patch, 
> HDFS-14316-HDFS-13891.007.patch
>
>
> Currently mount points with multiple destinations (e.g., HASH_ALL) fail 
> writes when the destination subcluster is down. We need an option to allow 
> writing in other subclusters when one is down.


