[
https://issues.apache.org/jira/browse/HDFS-17632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18053231#comment-18053231
]
ASF GitHub Bot commented on HDFS-17632:
---------------------------------------
kokonguyen191 commented on code in PR #8072:
URL: https://github.com/apache/hadoop/pull/8072#discussion_r2711211934
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java:
##########
@@ -1977,8 +1979,68 @@ public BatchedEntries<OpenFileEntry> listOpenFiles(long prevId)
   public BatchedEntries<OpenFileEntry> listOpenFiles(long prevId,
       EnumSet<OpenFilesIterator.OpenFilesType> openFilesTypes, String path)
       throws IOException {
-    rpcServer.checkOperation(NameNode.OperationCategory.READ, false);
-    return null;
+    rpcServer.checkOperation(NameNode.OperationCategory.READ, true);
+    List<RemoteLocation> locations = rpcServer.getLocationsForPath(path, false, false);
+    RemoteMethod method =
+        new RemoteMethod("listOpenFiles", new Class<?>[] {long.class, EnumSet.class, String.class},
+            prevId, openFilesTypes, new RemoteParam());
+    Map<RemoteLocation, BatchedEntries> results =
+        rpcClient.invokeConcurrent(locations, method, true, false, -1, BatchedEntries.class);
+
+    // Get the largest inodeIds for each namespace, and the smallest inodeId of them
+    // then ignore all entries above this id to keep a consistent prevId for the next listOpenFiles
+    long minOfMax = Long.MAX_VALUE;
+    for (BatchedEntries nsEntries : results.values()) {
+      // Only need to care about namespaces that still have more files to report
+      if (!nsEntries.hasMore()) {
+        continue;
+      }
+      long max = 0;
+      for (int i = 0; i < nsEntries.size(); i++) {
+        max = Math.max(max, ((OpenFileEntry) nsEntries.get(i)).getId());
+      }
+      minOfMax = Math.min(minOfMax, max);
Review Comment:
Hi @KeeProMise, `minOfMax` is calculated from inode IDs, not from file counts. With 2 namespaces that both have data, there are 2 possible situations:
1. Only files from one NS are returned: 2 files on NS1 have IDs 1 and 2, and 5 files on NS2 have IDs 3-7. Then `minOfMax=2`, and all files with IDs > 2 (i.e. all files from NS2) are ignored. The client receives 2 files and asks the router for more, starting with `prevId=2`.
2. A mix of both namespaces is returned: 2 files on NS1 have IDs 1 and 10, and files on NS2 have IDs 5-7 and 11-12. Then `minOfMax=10`. Both files from NS1 and the first 3 files from NS2 are included; the files with IDs 11 and 12 are not. The client asks for more entries starting with `prevId=10`, and that next batch will include the last 2 files.
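
The two situations can be reproduced with a small standalone sketch. This is not the PR code: `MinOfMaxSketch`, `NsBatch`, and `merge` are hypothetical names, plain `long` IDs stand in for `OpenFileEntry`, and both namespaces are assumed to report `hasMore=true`. The filtering and sorting step is only my reading of "ignore all entries above this id", since the quoted hunk stops before that part.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MinOfMaxSketch {

  /** Hypothetical stand-in for one namespace's batch of open-file inode IDs. */
  static final class NsBatch {
    final List<Long> ids;
    final boolean hasMore;

    NsBatch(List<Long> ids, boolean hasMore) {
      this.ids = ids;
      this.hasMore = hasMore;
    }
  }

  /** Smallest of the per-namespace maximum IDs, over namespaces that have more to report. */
  static long minOfMax(List<NsBatch> batches) {
    long minOfMax = Long.MAX_VALUE;
    for (NsBatch batch : batches) {
      if (!batch.hasMore) {
        continue; // a fully reported namespace does not constrain the cutoff
      }
      long max = 0;
      for (long id : batch.ids) {
        max = Math.max(max, id);
      }
      minOfMax = Math.min(minOfMax, max);
    }
    return minOfMax;
  }

  /** Assumed merge step: keep only IDs at or below the cutoff, sorted for display. */
  static List<Long> merge(List<NsBatch> batches) {
    long cutoff = minOfMax(batches);
    List<Long> kept = new ArrayList<>();
    for (NsBatch batch : batches) {
      for (long id : batch.ids) {
        if (id <= cutoff) {
          kept.add(id);
        }
      }
    }
    Collections.sort(kept);
    return kept;
  }

  public static void main(String[] args) {
    // Situation 1: NS1 has IDs 1 and 2, NS2 has IDs 3-7.
    List<NsBatch> s1 = List.of(
        new NsBatch(List.of(1L, 2L), true),
        new NsBatch(List.of(3L, 4L, 5L, 6L, 7L), true));
    System.out.println(minOfMax(s1) + " " + merge(s1)); // prints "2 [1, 2]"

    // Situation 2: NS1 has IDs 1 and 10, NS2 has IDs 5-7 and 11-12.
    List<NsBatch> s2 = List.of(
        new NsBatch(List.of(1L, 10L), true),
        new NsBatch(List.of(5L, 6L, 7L, 11L, 12L), true));
    System.out.println(minOfMax(s2) + " " + merge(s2)); // prints "10 [1, 5, 6, 7, 10]"
  }
}
```

In both cases the largest returned ID equals `minOfMax`, so the client can pass that value as the next `prevId` without skipping entries from any namespace that still has more to report.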
> RBF: Support listOpenFiles for routers
> --------------------------------------
>
> Key: HDFS-17632
> URL: https://issues.apache.org/jira/browse/HDFS-17632
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs, rbf
> Reporter: Felix N
> Assignee: Felix N
> Priority: Major
> Labels: pull-request-available
>
> {code:java}
> @Override
> public BatchedEntries<OpenFileEntry> listOpenFiles(long prevId,
> EnumSet<OpenFilesIterator.OpenFilesType> openFilesTypes, String path)
> throws IOException {
> rpcServer.checkOperation(NameNode.OperationCategory.READ, false);
> return null;
> } {code}