[
https://issues.apache.org/jira/browse/HDFS-16382?focusedWorklogId=698237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-698237
]
ASF GitHub Bot logged work on HDFS-16382:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 18/Dec/21 12:30
Start Date: 18/Dec/21 12:30
Worklog Time Spent: 10m
Work Description: taiyang-li commented on a change in pull request #3797:
URL: https://github.com/apache/hadoop/pull/3797#discussion_r771820322
##########
File path:
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java
##########
@@ -1204,16 +1207,38 @@ public void setBalancerBandwidth(long bandwidth) throws IOException {
@Override
public ContentSummary getContentSummary(String path) throws IOException {
+ return getContentSummary(path, new HashMap<String, Set<String>>());
+ }
+
+ public ContentSummary getContentSummary(String path, Map<String, Set<String>> excludeNamespace) throws IOException {
rpcServer.checkOperation(NameNode.OperationCategory.READ);
// Get the summaries from regular files
final Collection<ContentSummary> summaries = new ArrayList<>();
final List<RemoteLocation> locations =
rpcServer.getLocationsForPath(path, false, false);
+ Map<String, Set<String>> currentExcludePathNsMap = new HashMap<>();
+ Set<String> curExcludeNamespace = new HashSet<>();
+ String destPath = subclusterResolver.getDestinationForPath(path).getDefaultLocation().getDest();
+ List<String> parentExistLocations = excludeNamespace.keySet().stream().filter(s -> destPath.startsWith(s))
+ .collect(Collectors.toList());
+ boolean parentAlreadyComputed = parentExistLocations.size() > 0;
+ List<RemoteLocation> filterLoctions =
+ locations.stream().filter(remoteLocation -> excludeNamespace.isEmpty() || !parentAlreadyComputed ||
+ !isParentPathNamespaceComputed(remoteLocation, excludeNamespace, parentExistLocations))
+ .collect(Collectors.toList());
+ filterLoctions.forEach(remoteLocation -> {
+ curExcludeNamespace.add(remoteLocation.getNameserviceId());
+ });
+ if (excludeNamespace.get(destPath) != null) {
+ excludeNamespace.get(destPath).addAll(curExcludeNamespace);
+ } else {
+ excludeNamespace.put(destPath, curExcludeNamespace);
+ }
final RemoteMethod method = new RemoteMethod("getContentSummary",
new Class<?>[] {String.class}, new RemoteParam());
final List<RemoteResult<RemoteLocation, ContentSummary>> results =
- rpcClient.invokeConcurrent(locations, method,
+ rpcClient.invokeConcurrent(filterLoctions, method,
Review comment:
Typo: `filterLoctions` should be spelled `filterLocations`.
Issue Time Tracking
-------------------
Worklog Id: (was: 698237)
Time Spent: 0.5h (was: 20m)
> RBF: getContentSummary RPC compute sub-directory repeatedly
> -----------------------------------------------------------
>
> Key: HDFS-16382
> URL: https://issues.apache.org/jira/browse/HDFS-16382
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: rbf
> Affects Versions: 3.3.1
> Reporter: zhanghaobo
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> The Router getContentSummary RPC counts a sub-directory repeatedly when a
> directory and its ancestor directory are both mounted in the form of
> their original src paths (the destination path equals the mount path).
> For example, suppose we have the mount table entries below:
> /A---ns1---/A
> /A/B---ns1,ns2---/A/B
> We put a file test.txt into directory /A/B in namespace ns1, then execute `hdfs
> dfs -count hdfs://router:8888/A`. The result is wrong because
> /A/B/test.txt is counted more than once.
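
The double counting described above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions, not the Router code or the change in the pull request: the class `ContentSummaryDedupSketch`, the in-memory `MOUNTS` and `FILES` maps, and the methods `naiveCount` and `dedupCount` are all hypothetical names introduced for illustration. It assumes, as in the example, that every destination path is identical to its mount path and that ancestor mount points are visited before their descendants.

```java
import java.util.*;

// Hypothetical, simplified model of the issue; not the actual Router implementation.
class ContentSummaryDedupSketch {
  // Assumed mount table: mount point -> namespaces (destination path == mount path).
  static final Map<String, List<String>> MOUNTS = new LinkedHashMap<>();
  // Assumed namespace contents: namespace -> file paths stored there.
  static final Map<String, List<String>> FILES = new HashMap<>();
  static {
    MOUNTS.put("/A", Arrays.asList("ns1"));
    MOUNTS.put("/A/B", Arrays.asList("ns1", "ns2"));
    FILES.put("ns1", Arrays.asList("/A/B/test.txt"));
    FILES.put("ns2", Collections.<String>emptyList());
  }

  // True if path equals dir or lies below it.
  static boolean isUnder(String path, String dir) {
    return path.equals(dir) || path.startsWith(dir + "/");
  }

  // Number of files below dir in a single namespace.
  static long countInNamespace(String ns, String dir) {
    return FILES.getOrDefault(ns, Collections.<String>emptyList())
        .stream().filter(f -> isUnder(f, dir)).count();
  }

  // Queries every namespace of every mount point under path, so a namespace
  // that serves both /A and /A/B is asked about /A/B twice.
  static long naiveCount(String path) {
    long total = 0;
    for (Map.Entry<String, List<String>> e : MOUNTS.entrySet()) {
      if (!isUnder(e.getKey(), path)) {
        continue;
      }
      for (String ns : e.getValue()) {
        total += countInNamespace(ns, e.getKey());
      }
    }
    return total;
  }

  // Remembers which namespaces were already queried for each path and skips
  // a (namespace, path) pair when an ancestor path was queried in that namespace.
  static long dedupCount(String path) {
    Map<String, Set<String>> computed = new HashMap<>();
    long total = 0;
    for (Map.Entry<String, List<String>> e : MOUNTS.entrySet()) {
      final String mount = e.getKey();
      if (!isUnder(mount, path)) {
        continue;
      }
      for (String ns : e.getValue()) {
        boolean covered = computed.entrySet().stream()
            .anyMatch(c -> isUnder(mount, c.getKey()) && c.getValue().contains(ns));
        if (!covered) {
          total += countInNamespace(ns, mount);
        }
      }
      computed.computeIfAbsent(mount, k -> new HashSet<>()).addAll(e.getValue());
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println("naive count for /A: " + naiveCount("/A"));
    System.out.println("dedup count for /A: " + dedupCount("/A"));
  }
}
```

In this model the naive walk reports two files for /A because ns1 is queried once for /A and again for /A/B, while the deduplicating walk reports one. The skip is only valid under the stated assumption that destination paths match mount paths; with a remapped destination the ancestor query would not cover the child mount.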