[ 
https://issues.apache.org/jira/browse/HDFS-14403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812546#comment-16812546
 ] 

Erik Krogen commented on HDFS-14403:
------------------------------------

[~cgregori], thanks for getting us started with the v001 patch! Besides fixing 
the checkstyle and test failures (at least TestHdfsConfigFields and TestRPC 
look related, you should check the others also), I have a few comments:
* {{DecayRpcScheduler#decayCurrentCounts()}} -- you changed this from 
{{private}} to {{package-private}}, should it have a {{@VisibleForTesting}} 
annotation?
* For {{DecayRpcScheduler#getCallCost()}}, now seems like a good time to remove 
the {{throws InterruptedException}} (it doesn't) and to fix the Javadoc 
formatting by adding a blank line before the {{@param}} tag.
* The Javadoc change for {{computePriorityLevel()}} is a bit confusing; it made 
me think that the {{cost}} argument was the total cost across all users' 
operations (i.e. {{totalDecayedCallCost}}). Can you make it more clear that it 
is the total cost for a specific user?
* It doesn't seem that {{getPriorityLevelAndUpdate()}} is a useful function to 
have just for testing, when a test could call the two existing methods 
{{getPriorityLevel()}} and {{updateFromCostProvider()}}. Can we remove it?
* I think a default value of {{DFS_NAMENODE_LOCKCOSTPROVIDER_WRITEMULTIPLIER}} 
of higher than 1 probably makes sense, maybe 5 or 10?
* The two new methods on {{FSNamesystem}} that don't have Javadoc should get 
it. It would also be helpful if you could add Javadoc to the 
{{FSNamesystemLock#userLockTimes}} field explaining how it works.
* I'm not sure if I like the method name {{removeUserLockTimes()}}, maybe 
something like {{stopTrackingUserLockTimes()}} would be better?
* Within {{FSNamesystemLock}}, the lock hold times are stored within a 
{{List<AtomicLong>}}. I know this is how it's done within 
{{DecayRpcScheduler}}, but I'm not really a fan, as it's confusing upon first 
glance about what it actually contains. My personal preference would be to 
create a private inner class that has two {{AtomicLong}} fields, but let me 
know if you disagree.
* For the {{addCallQueueConfigurationCallback}} method, since we are on Java 8, 
we should take advantage of the built-in {{Consumer}} functional interface 
rather than using Guava.
* Within the callback created within {{NameNodeRpcServer}}, I'm thinking we 
should probably call {{namesystem.setUserLockTimeCollection(false)}} if the new 
{{scheduler}} does not have a {{LockCostProvider}} so that the collection will 
be disabled if the RPC scheduler is no longer using this.
* For the {{testUser(Read|Write)HoldTimes}} tests, I think the {{spy}} makes 
sense to mock out multiple users. But for the other two tests, with only one 
user, can we avoid the spy by just saying {{user1 = 
fsLock.getCurrentUsername()}} ?


[~starphin] - thanks for taking a look! Are you requesting information on what 
the total QPS for the abusive user (1000 entry listStatus) and the non-abusive 
users (1 entry listStatus or mkdir) was in each of the benchmarks? I think we 
should be able to supply this -- [~cgregori] we should be able to collect these 
from the Dynamometer results?
{quote}
Is it possible that scheduler with LockCostProvider schedules more low cost 
operations like listStatus on a directory with one subdirectories, which 
results in a lower queue time.
{quote}
Yes, I agree that this is why the queue time goes down, and it is part of the 
intended effect: the expensive operation is throttled, in favor of the lower 
cost operations.

> Cost-Based RPC FairCallQueue
> ----------------------------
>
>                 Key: HDFS-14403
>                 URL: https://issues.apache.org/jira/browse/HDFS-14403
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ipc, namenode
>            Reporter: Erik Krogen
>            Assignee: Christopher Gregorian
>            Priority: Major
>              Labels: qos, rpc
>         Attachments: CostBasedFairCallQueueDesign_v0.pdf, HDFS-14403.001.patch
>
>
> HADOOP-15016 initially described extensions to the Hadoop FairCallQueue 
> encompassing both cost-based analysis of incoming RPCs, as well as support 
> for reservations of RPC capacity for system/platform users. This JIRA intends 
> to track the former, as HADOOP-15016 was repurposed to more specifically 
> focus on the reservation portion of the work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to