carp84 commented on a change in pull request #205: HBASE-20902 when WAL sync
failed, we should bypass the failed DN that previously used
URL: https://github.com/apache/hbase/pull/205#discussion_r282750188
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/io/asyncfs/FanOutOneBlockAsyncDFSOutputHelper.java
##########
@@ -751,6 +752,24 @@ private static FanOutOneBlockAsyncDFSOutput
createOutput(DistributedFileSystem d
int createMaxRetries = conf.getInt(ASYNC_DFS_OUTPUT_CREATE_MAX_RETRIES,
DEFAULT_ASYNC_DFS_OUTPUT_CREATE_MAX_RETRIES);
DatanodeInfo[] excludesNodes = EMPTY_DN_ARRAY;
+ if (oldPath != null) {
+ String oldPathStr = oldPath.toUri().getPath();
+ long len = namenode.getFileInfo(oldPathStr).getLen();
+ for(LocatedBlock block : namenode.getBlockLocations(oldPathStr, Math.max(0, len - 1), len)
+ .getLocatedBlocks()) {
+ for(DatanodeInfo dn : block.getLocations()) {
+ excludesNodes = ArrayUtils.add(excludesNodes, dn);
Review comment:
This always adds nodes to the exclude list but never checks whether a DN has
recovered and removes it, right? So it's possible that one day all DNs end up
excluded and the OutputStream fails with `could only be replicated to
0 nodes`?
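
One possible way to address this concern, as a minimal sketch under stated
assumptions: keep excluded DNs in a structure that expires entries after a TTL
and caps their total number, so a recovered DN becomes eligible again and the
list can never cover the whole cluster. `ExpiringExcludeList`, its
constructor parameters, and the use of plain strings as DN identifiers are
hypothetical and are not part of this patch or of the HBase/HDFS API.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration only; not part of HBase or this patch.
final class ExpiringExcludeList {
  private final long ttlMillis;      // how long a DN stays excluded
  private final int maxExcluded;     // upper bound on excluded DNs
  private final LongSupplier clock;  // injectable time source, in milliseconds
  // Insertion-ordered map: DN identifier -> time it was excluded (oldest first).
  private final LinkedHashMap<String, Long> excludedAt = new LinkedHashMap<>();

  ExpiringExcludeList(long ttlMillis, int maxExcluded, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.maxExcluded = maxExcluded;
    this.clock = clock;
  }

  // Record a DN as bad. Re-excluding a DN refreshes its timestamp.
  synchronized void exclude(String datanode) {
    long now = clock.getAsLong();
    evictExpired(now);
    excludedAt.remove(datanode);
    excludedAt.put(datanode, now);
    // Enforce the cap by dropping the oldest entries first.
    Iterator<String> it = excludedAt.keySet().iterator();
    while (excludedAt.size() > maxExcluded && it.hasNext()) {
      it.next();
      it.remove();
    }
  }

  // Snapshot of the DNs that are still excluded right now.
  synchronized Set<String> current() {
    evictExpired(clock.getAsLong());
    return new LinkedHashSet<>(excludedAt.keySet());
  }

  // Drop every entry whose age has reached the TTL.
  private void evictExpired(long now) {
    excludedAt.values().removeIf(t -> now - t >= ttlMillis);
  }
}
```

With something like this, `createOutput` could ask for `current()` on each
attempt instead of accumulating DNs indefinitely; the TTL and cap values would
need to be chosen relative to cluster size and typical DN recovery time.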