[ 
https://issues.apache.org/jira/browse/HDFS-12934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312149#comment-16312149
 ] 

Íñigo Goiri commented on HDFS-12934:
------------------------------------

Still working on testing it in our cluster (I need to backport {{QuotaUsage}} 
as we use 2.7.1) but meanwhile a few comments on a full pass on the patch.
Most of the comments are just regarding naming and readability, nothing major.

{{RouterQuotaLocalCache}}:
* I would rename it to {{RouterQuotaManager}}, then in {{Router}}, we could 
{{getQuotaManager()}} instead of {{getQuotaUsage()}}.
* We could call the field {{quotaUsageCache}} just {{cache}}.
* We could create {{isQuotaSet(RouterQuotaUsage)}} and use it here and in 
{{QuotaCacheUpdateService#isQuotaSet(MountTable mountTable)}}.
* Refactor {{getQuotaUsage(String path)}} to do the {{lastIndexOf}} and 
{{substring}} sequence only once. Basically compose the {{if}} that does the 
check in the cache and the new {{isQuotaSet()}}.
* What about simplifying names like {{putUsage()}} to just {{put()}}? The class 
gives the full context. In addition, {{getChildrenPaths()}} could just be 
{{get(String path)}} and {{getAllPaths()}} be {{getAll()}}
* Can {{getChildrenPaths()}} use subMap()?
* Create a function for (which is used twice):
{code}
final List<RemoteLocation> locations = new LinkedList<>();
List<String> childrenPaths = this.router.getChildrenPaths(path);
if (childrenPaths != null) {
  for (String childPath : childrenPaths) {
    locations.addAll(getLocationsForPath(childPath));
  }
}
{code}

{{QuotaCacheUpdateService}}:
* Rename to {{RouterQuotaUpdateService}}?
* We could have a field directly for the 
{{RouterQuotaLocalCache}}/{{RouterQuotaManager}} and no need for 
{{this.router.getQuotaUsage()}} all the time.
* {{periodicInvoke()}} can directly do {{List<MountTable> updateMountTables = 
getUpdateMountTables();}}, without {{List<MountTable> updateMountTables = 
null;}}
* We could shorten one of the comments in {{periodicInvoke()}} to {{Call 
RouterRpcServer#getQuotaUsage for getting current quota usage}} to fit it in 80 
characters.
* It may make sense to use have a direct way to go from {{QuotaUsage}} and 
{{RouterQuotaUsage}} to {{RouterQuotaUsage}} (line 90).
* Instead of writing all the mount table entries at the end of 
{{periodicInvoke()}}, could we do something that tracks if the 
{{RouterQuotaUsage}} has changed and update then? We are doing some of the 
check in {{getUpdateMountTables()}}
* Make {{new LinkedList<MountTable>();}} into {{new LinkedList<>();}}.
* I would have a simple method {{List<MountTable> getMountTableEntries()}} and 
use it in {{getUpdateMountTables()}}
* Change {{for (String k : stalePaths)}} by {{for (String stalePath : 
stalePaths)}}

{{Router}}:
* Rename {{getQuotaUsage()}} to {{getQuotaManager()}}
* Get rid of {{getChildrenPaths()}} and then when used in {{RouterRpcServer}}, 
just extract there.
* {{isQuotaServiceEnabled()}} is just used one, try to be consistent either 
using this or the {{null}} check with {{getQuotaUsage()}}/{{getQuotaManager()}}

{{RouterRpcServer}}:
* {{getQuotaUsage()}} is a little long, we could do something like 
{{getCreateLocation(Stirng path)}} called {{getQuotaLocations(String path)}} 
which would do the get and the filtering. Then we can differentiate the part of 
getting the subclusters, calling the method and aggregating the results.
* I would create an internal {{getQuotaUsage()}}. This would allow us to track 
the times that we call this method from the pure RPC server and those called 
from the Router quota management. I did something along those lines for the 
{{getDatanodeReport()}}.
* Instead of calling each location individually, you can use 
{{invokeConcurrent()}} and merge using the output map.
* When merging the quotas in {{getQuotaUsage()}}, a single subcluster with 
{{quota}} unset removes the whole thing for everybody? Should we sum the 
individual quotas?

{{RouterQuotaUsage}}:
* It has a function for {{isNsViolated()}} and it could use the same pattern as 
{{DirectoryWithQuotaFeature#verifyNamespaceQuota()}}; same for 
{{isSsViolated()}}. Not sure if we can fully reuse 
{{DirectoryWithQuotaFeature}}.

Lots of comments and some of them may not make sense (I did a lot of back and 
forward across classes) so feel free to ask for clarifications.

> RBF: Federation supports global quota
> -------------------------------------
>
>                 Key: HDFS-12934
>                 URL: https://issues.apache.org/jira/browse/HDFS-12934
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>              Labels: RBF
>         Attachments: HDFS-12934.001.patch, HDFS-12934.002.patch, 
> HDFS-12934.003.patch, HDFS-12934.004.patch, RBF support  global quota.pdf
>
>
> Now federation doesn't support set the global quota for each folder. 
> Currently the quota will be applied for each subcluster under the specified 
> folder via RPC call.
> It will be very useful for users that federation can support setting global 
> quota and exposing the command of this.
> In a federated environment, a folder can be spread across multiple 
> subclusters. For this reason, we plan to solve this by following way:
> # Set global quota across each subcluster. We don't allow each subcluster can 
> exceed maximun quota value.
> # We need to construct one <Path, QuotaUsage> cache map for storing the sum  
> quota usage of these subclusters under federation folder. Every time we want 
> to do WRITE operation under specified folder, we will get its quota usage 
> from cache and verify its quota. If quota exceeded, throw exception, 
> otherwise update its quota usage in cache when finishing operations.
> The quota will be set to mount table and as a new field in mount table. The 
> set/unset command will be like:
> {noformat}
>  hdfs dfsrouteradmin -setQuota -ns <nsQuota> -ss <ssQuota> <mount table>
>  hdfs dfsrouteradmin -clrQuota  <mount table>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to