[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2020-11-12 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230612#comment-17230612
 ] 

Yiqun Lin edited comment on HDFS-14090 at 11/12/20, 2:38 PM:
-

Hi [~fengnanli], three nits for the latest patch:

1. It would look better to rename dfs.federation.router.fairness.handler.count.NS to 
dfs.federation.router.fairness.handler.count.EXAMPLENAMESERVICE.

2. {noformat}
smaller or equal to the total number of router handlers; if the special
  *concurrent* is not specified, the sum of all configured values must be
  strictly smaller than the router handlers thus the left will be allocated
  to the concurrent calls.
{noformat}
Can we mention the related setting here: "strictly smaller than the router handlers 
(dfs.federation.router.handler.count)"...

3. Can you fix the related failed unit test?
|hadoop.hdfs.server.federation.router.TestRBFConfigFields|

Others look good to me.
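To make nits 1 and 2 concrete, here is a minimal sketch of how these keys relate to 
each other; the nameservice names (ns1, ns2) and the values are illustrative 
assumptions, not something taken from the patch:
{code:java}
// Hedged sketch only: shows the relationship between the global handler count
// and the per-nameservice fairness handler counts discussed in nits 1 and 2.
// The ns1/ns2 key suffixes and the numbers are assumed examples.
import org.apache.hadoop.conf.Configuration;

public class FairnessConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // Total router handlers (dfs.federation.router.handler.count).
    conf.setInt("dfs.federation.router.handler.count", 20);
    // Dedicated handlers per downstream nameservice; their sum (15) stays
    // strictly smaller than 20, so the remaining 5 serve concurrent calls.
    conf.setInt("dfs.federation.router.fairness.handler.count.ns1", 8);
    conf.setInt("dfs.federation.router.fairness.handler.count.ns2", 7);
    System.out.println(
        "total=" + conf.getInt("dfs.federation.router.handler.count", 0));
  }
}
{code}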


was (Author: linyiqun):
Hi [~fengnanli], two nits for the latest patch:
{noformat}
smaller or equal to the total number of router handlers; if the special
  *concurrent* is not specified, the sum of all configured values must be
  strictly smaller than the router handlers thus the left will be allocated
  to the concurrent calls.
{noformat}
Can we mention the related setting here: "strictly smaller than the router handlers 
(dfs.federation.router.handler.count)"...

Can you fix the related failed unit test?
|hadoop.hdfs.server.federation.router.TestRBFConfigFields|

Others look good to me.

> RBF: Improved isolation for downstream name nodes. {Static}
> ---
>
> Key: HDFS-14090
> URL: https://issues.apache.org/jira/browse/HDFS-14090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: CR Hota
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14090-HDFS-13891.001.patch, 
> HDFS-14090-HDFS-13891.002.patch, HDFS-14090-HDFS-13891.003.patch, 
> HDFS-14090-HDFS-13891.004.patch, HDFS-14090-HDFS-13891.005.patch, 
> HDFS-14090.006.patch, HDFS-14090.007.patch, HDFS-14090.008.patch, 
> HDFS-14090.009.patch, HDFS-14090.010.patch, HDFS-14090.011.patch, 
> HDFS-14090.012.patch, HDFS-14090.013.patch, HDFS-14090.014.patch, 
> HDFS-14090.015.patch, HDFS-14090.016.patch, HDFS-14090.017.patch, 
> HDFS-14090.018.patch, HDFS-14090.019.patch, HDFS-14090.020.patch, 
> HDFS-14090.021.patch, HDFS-14090.022.patch, HDFS-14090.023.patch, 
> HDFS-14090.024.patch, RBF_ Isolation design.pdf
>
>
> Router is a gateway to underlying name nodes. Gateway architectures should 
> help minimize the impact on clients connecting to healthy clusters vs. unhealthy 
> clusters.
> For example, if there are 2 name nodes downstream and one of them is 
> heavily loaded with calls, spiking RPC queue times, then due to back pressure the 
> same will start reflecting on the router. As a result, clients 
> connecting to healthy/faster name nodes will also slow down, as the same RPC queue 
> is maintained for all calls at the router layer. Essentially, the same IPC 
> thread pool is used by the router to connect to all name nodes.
> Currently the router uses one single RPC queue for all calls. Let's discuss how we 
> can change the architecture and add some throttling logic for 
> unhealthy/slow/overloaded name nodes.
> One way could be to read from the current call queue, immediately identify the 
> downstream name node, and maintain a separate queue for each underlying name 
> node. Another simpler way is to maintain some sort of rate limiter configured 
> for each name node and let routers drop/reject/send error requests after a 
> certain threshold. 
> This won’t be a simple change, as the router’s ‘Server’ layer would need redesign 
> and implementation. Currently this layer is the same as the name node's.
> Opening this ticket to discuss, design, and implement this feature.
>  






[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2020-11-12 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230612#comment-17230612
 ] 

Yiqun Lin edited comment on HDFS-14090 at 11/12/20, 1:07 PM:
-

Hi [~fengnanli], two nits for the latest patch:
{noformat}
smaller or equal to the total number of router handlers; if the special
  *concurrent* is not specified, the sum of all configured values must be
  strictly smaller than the router handlers thus the left will be allocated
  to the concurrent calls.
{noformat}
Can we mention the related setting here: "strictly smaller than the router handlers 
(dfs.federation.router.handler.count)"...

Can you fix the related failed unit test?
|hadoop.hdfs.server.federation.router.TestRBFConfigFields|

Others look good to me.


was (Author: linyiqun):
Hi [~fengnanli], two nits for the latest patch:
{noformat}
smaller or equal to the total number of router handlers; if the special
  *concurrent* is not specified, the sum of all configured values must be
  strictly smaller than the router handlers thus the left will be allocated
  to the concurrent calls.
{noformat}
Can we mention the related setting here: "strictly smaller than the router handlers 
(dfs.federation.router.handler.count)"...

Can you fix the related failed unit test?

Others look good to me.







[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2020-11-11 Thread Fengnan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230394#comment-17230394
 ] 

Fengnan Li edited comment on HDFS-14090 at 11/12/20, 7:07 AM:
--

Uploaded [^HDFS-14090.024.patch] to add configs for it.
I feel there is room to optimize how this config is specified and to make it less 
verbose (for example, supporting default values so we don't need to specify all 
nameservices), but I cannot come up with a clean way of doing this now. I will 
revisit this when I start tackling the dynamic allocations.
Thanks!
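As a hedged illustration of the fallback idea above (the ".default" key is 
hypothetical and not part of any patch in this thread):
{code:java}
// Hedged sketch: look up a per-nameservice handler count, falling back to a
// shared default when no per-ns value is configured. The ".default" suffix is
// a hypothetical key used only for illustration.
import org.apache.hadoop.conf.Configuration;

final class HandlerCountLookup {
  private static final String PREFIX =
      "dfs.federation.router.fairness.handler.count.";

  static int handlerCountFor(Configuration conf, String ns) {
    int fallback = conf.getInt(PREFIX + "default", 0); // hypothetical default key
    return conf.getInt(PREFIX + ns, fallback);
  }
}
{code}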


was (Author: fengnanli):
Uploaded [^HDFS-14090.024.patch] to add configs for it.
I feel like there should be more optimization about how this config is specified 
while making it less verbose (like specifying certain default values so we 
don't need to specify all nameservices), but I cannot come up with a clean way 
of doing this now. Will revisit when I start tackling the dynamic allocations.
Thanks!







[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes. {Static}

2020-11-05 Thread Yiqun Lin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227189#comment-17227189
 ] 

Yiqun Lin edited comment on HDFS-14090 at 11/6/20, 6:58 AM:


Hi [~fengnanli], some minor comments from me:


 1. I see we introduce CONCURRENT_NS for concurrent calls here; why not acquire 
a permit for the corresponding ns instead?

2. The current description of the setting in hdfs-rbf-default.xml could say more. At 
least, we need to mention:
 * The setting name for configuring the handler count for each ns, including the 
CONCURRENT_NS ns.
 * The sum of the dedicated handler counts should be less than the value of 
dfs.federation.router.handler.count.

3. It would be better to document this improvement in HDFSRouterFederation.md.

Comments #2 and #3 can be addressed in a follow-up JIRA :). A hedged sketch of the 
constraint from point 2 follows below.
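To make the constraint in point 2 concrete, here is a minimal sketch of the startup 
validation a router could perform; the key names follow the naming discussed in this 
thread, but the helper itself is an illustrative assumption, not the patch:
{code:java}
// Hedged sketch: verify that the per-nameservice dedicated handler counts sum
// to strictly less than dfs.federation.router.handler.count, so that some
// handlers remain available for concurrent calls. The fallback value passed to
// getInt below is illustrative only.
import org.apache.hadoop.conf.Configuration;

final class FairnessConfigCheck {
  static final String TOTAL_KEY = "dfs.federation.router.handler.count";
  static final String PREFIX = "dfs.federation.router.fairness.handler.count.";

  static void validate(Configuration conf, Iterable<String> nameservices) {
    int total = conf.getInt(TOTAL_KEY, 10);
    int dedicated = 0;
    for (String ns : nameservices) {
      dedicated += conf.getInt(PREFIX + ns, 0);
    }
    if (dedicated >= total) {
      throw new IllegalArgumentException("Sum of dedicated handler counts ("
          + dedicated + ") must be strictly smaller than " + TOTAL_KEY
          + " = " + total);
    }
  }
}
{code}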


was (Author: linyiqun):
Hi [~fengnanli], some minor comments from me:


 1. I see we introduce CONCURRENT_NS for concurrent calls here; why not acquire 
a permit for the corresponding ns instead?

2. The current description of the setting in hdfs-rbf-default.xml could say more. At 
least, we need to mention:
 * The setting name for configuring the handler count for each ns, including the 
CONCURRENT_NS ns.
 * The sum of the dedicated handler counts should be less than the value of 
dfs.federation.router.handler.count.

3. It would be better to document this improvement in HDFSRouterFederation.md.







[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes.

2019-08-15 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908758#comment-16908758
 ] 

CR Hota edited comment on HDFS-14090 at 8/16/19 5:58 AM:
-

[~xkrogen] [~elgoiri] Many thanks for the detailed reviews. Very helpful :) I 
have incorporated almost all the points you folks mentioned in 010.patch.

On a high level, the changes are:
 # "permit" is still the word being used.
 # One configuration controls the feature; {{NoFairnessPolicyController}} is a 
dummy whereas {{StaticFairnessPolicyController}} is the fairness implementation 
(a rough sketch follows at the end of this comment).
 # The whole start-up will fail if fairness class loading has issues. Test 
cases are changed accordingly to reflect that.
 # {{NoPermitAvailableException}} is renamed to 
{{PermitLimitExceededException}}.

 

To [~xkrogen]'s observations:
{quote}I was considering the scenario where there are two routers R1 and R2, 
and two NameNodes N1 and N2. Assume most clients need to access both N1 and N2. 
What happens in the situation when all of R1's N1-handlers are full (but 
N2-handlers mostly empty), and all of R2's N2-handlers are full (but 
N1-handlers mostly empty)? I'm not sure if this is a situation that is likely 
to arise, or if the system will easily self-heal based on the backoff behavior. 
Maybe worth thinking about a little--not a blocking concern for me, more of a 
thought experiment.
{quote}
 It should ideally not happen that all handlers of a specific router are busy 
while other handlers are completely free, since clients are expected to connect 
in random order. However, from the beginning the design focuses 
on getting the system to self-heal as much as possible, to eventually get 
similar traffic across all routers in a cluster.
{quote}The configuration for this seems like it will be really tricky to get 
right, particularly knowing how many fan-out handlers to allocate. I imagine as 
an administrator, my thought process would be like:
 I want 35% allocated to NN1 and 65% allocated to NN2, since NN2 is about 2x as 
loaded as NN1. This part is fairly intuitive.
 Then I encounter the fan-out configuration... What am I supposed to do with it?
 Are there perhaps any heuristics we can provide for reasonable values?
{quote}
Yes, configuration values are something users have to pay attention to, 
especially for concurrent calls. In the documentation sub-Jira HDFS-14558, I plan to 
write more about the concurrent calls and some points for users to focus on. 
Configurations may also need to be changed by users based on new use cases, 
load on downstream clusters, etc.

[~aajisaka] [~brahmareddy] [~linyiqun] [~hexiaoqiao] FYI.
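For readers following this thread, a rough sketch of the controller split mentioned 
in point 2 of the change list above; the method names and the semaphore-based 
bookkeeping are assumptions for illustration, not the patch's exact API:
{code:java}
// Hedged sketch of the "no fairness" vs. "static fairness" split described
// above; names and semantics are illustrative assumptions.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

interface FairnessPolicyController {
  boolean acquirePermit(String nsId);   // false => caller should back off
  void releasePermit(String nsId);
}

// Dummy controller: every acquire succeeds, matching a router without fairness.
class NoFairnessPolicyController implements FairnessPolicyController {
  public boolean acquirePermit(String nsId) { return true; }
  public void releasePermit(String nsId) { }
}

// Static controller: a fixed number of permits per nameservice, loaded from
// configuration at startup (startup fails if the configuration is invalid).
class StaticFairnessPolicyController implements FairnessPolicyController {
  private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();

  StaticFairnessPolicyController(Map<String, Integer> handlerCounts) {
    handlerCounts.forEach((ns, count) -> permits.put(ns, new Semaphore(count)));
  }

  public boolean acquirePermit(String nsId) {
    Semaphore s = permits.get(nsId);
    return s == null || s.tryAcquire();
  }

  public void releasePermit(String nsId) {
    Semaphore s = permits.get(nsId);
    if (s != null) {
      s.release();
    }
  }
}
{code}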


was (Author: crh):
[~xkrogen] [~elgoiri] Many thanks for the detailed reviews. Very helpful :) 
Have incorporated almost all the points you folks mentioned.

On a high level, changes are
 # "permit" is still the word being used.
 # One configuration controls the feature, {{NoFairnessPolicyController}} is 
dummy whereas {{StaticFairnessPolicyController}} is the fairness implementation.
 # The whole start-up will fail if fairness class loading has issues. Test 
cases are appropriately changed to reflect that.
 # {{NoPermitAvailableException}} is renamed to 
{{PermitLimitExceededException.}}

 

To [~xkrogen]'s observations:
{quote}I was considering the scenario where there are two routers R1 and R2, 
and two NameNodes N1 and N2. Assume most clients need to access both N1 and N2. 
What happens in the situation when all of R1's N1-handlers are full (but 
N2-handlers mostly empty), and all of R2's N2-handlers are full (but 
N1-handlers mostly empty)? I'm not sure if this is a situation that is likely 
to arise, or if the system will easily self-heal based on the backoff behavior. 
Maybe worth thinking about a little--not a blocking concern for me, more of a 
thought experiment.
{quote}
 It should ideally not happen that all handlers of a specific router are busy 
and other handlers are completely free, since clients are expected to use 
random order while connecting. However, from the beginning the design  focuses 
on getting the system to self-heal as much as possible to eventually get 
similar traffic across all routers in a cluster.
{quote}The configuration for this seems like it will be really tricky to get 
right, particularly knowing how many fan-out handlers to allocate. I imagine as 
an administrator, my thought process would be like:
 I want 35% allocated to NN1 and 65% allocated to NN2, since NN2 is about 2x as 
loaded as NN1. This part is fairly intuitive.
 Then I encounter the fan-out configuration... What am I supposed to do with it?
 Are there perhaps any heuristics we can provide for reasonable values?
{quote}
Yes, configuration values are something users have to pay attention to, 
especially for concurrent calls. In the documentation sub-Jira HDFS-14558, I plan to 
write more about the concurrent calls and some 

[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes.

2019-08-13 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906493#comment-16906493
 ] 

Erik Krogen edited comment on HDFS-14090 at 8/13/19 6:27 PM:
-

[~crh] thanks for keeping me honest here :) Finally got around to taking a look 
at this.

I read through the design again and had two additional concerns that came up:
 # I was considering the scenario where there are two routers R1 and R2, and 
two NameNodes N1 and N2. Assume most clients need to access both N1 and N2. 
What happens in the situation when all of R1's N1-handlers are full (but 
N2-handlers mostly empty), and all of R2's N2-handlers are full (but 
N1-handlers mostly empty)? I'm not sure if this is a situation that is likely 
to arise, or if the system will easily self-heal based on the backoff behavior. 
Maybe worth thinking about a little--not a blocking concern for me, more of a 
thought experiment.
 # The configuration for this seems like it will be really tricky to get right, 
particularly knowing how many fan-out handlers to allocate. I imagine as an 
administrator, my thought process would be like:
 ** I want 35% allocated to NN1 and 65% allocated to NN2, since NN2 is about 2x 
as loaded as NN1. This part is fairly intuitive.
 ** Then I encounter the fan-out configuration... What am I supposed to do with 
it?
 ** Are there perhaps any heuristics we can provide for reasonable values?

Regarding the terminology, I actually think that "permit" better conveys the 
concept to me personally; however, I think "quota" more closely matches similar 
terminology used throughout Hadoop (maybe this is bad – overloaded term?). My 
one concern with "permit" would be that it might imply that some number of 
permits are requested at a dynamic startup phase (e.g. a job requests permits 
at startup), rather than it being a constant allocation count. I don't have a 
strong preference here. I do agree that {{NoPermitAvailableException}} could 
use a better name to more readily indicate that it is an overloaded situation; 
{{PermitLimitExceededException}} might be better.

 
{quote} * Instead of having two configs, one for enabling and one for the 
implementation, we could have just the implementation and by default provide a 
dummy implementation that doesn't do fairness. Then we would rename the current 
DefaultFairnessPolicyController to something more descriptive (to reflect equal 
or linear or similar).
 * Coming back to PermitAllocationException, right now we are kind of logging 
and swallowing; what about failing the whole startup?{quote}
+1 on these two ideas from [~elgoiri]

I didn't do a thorough review, but from an initial look through the code, I am 
also impressed by how isolated the changes were able to be. It's great to see. 
I have some more isolated comments below.
 # This code seems to assume that the set of nameservices controlled by a 
router will never change. I haven't been following the router closely, but I 
thought that you could dynamically change the set of mount points. I see we're 
loading the nameservices from {{DFS_ROUTER_MONITOR_NAMENODE}} – is it actually 
accurate that the set of monitored NameNodes matches the set of mount points?
 # If someone actually has a nameservice called "concurrent" (used for the 
{{concurrentNS}}), this is going to cause problems. Given that this name will 
appear in user configurations, maybe it's nice for it to be an easily 
human-readable name, but we should add some logic to detect this collision and 
complain about it (a rough sketch of such a check follows after this list).
 # I think a {{WARN}} log on a permit allocation failure is a bit strong. This 
could really flood the logs when things get busy. I would suggest downgrading 
it to a {{DEBUG}}, or using the {{LogThrottlingHelper}} to limit how frequently 
this will be logged.
 # I think we need to document the nameservice-specific configurations in 
{{hdfs-rbf-default.xml}}, including the presence of the special "concurrent" 
nameservice.
 # Nits:
 ## You've sometimes used {{this.}} as a prefix for fields and sometimes not, 
can we make it more consistent?
 ## You should include diamond-types ( {{<>}} ) in your {{HashMap}} / 
{{HashSet}} instantiations
 ## Can we combine the log statements on L113 and L115 of 
{{DefaultFairnessPolicyController}}?
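A rough sketch of the collision check suggested in point 2 above; the constant name, 
class name, and exception choice are assumptions, not part of the patch:
{code:java}
// Hedged sketch: fail fast if a real nameservice uses the name reserved for
// the concurrent-call handler pool. "concurrent" mirrors the discussion above;
// the class and method names here are illustrative only.
import java.util.Set;

final class ConcurrentNsGuard {
  static final String CONCURRENT_NS = "concurrent";

  static void checkNoCollision(Set<String> configuredNameservices) {
    if (configuredNameservices.contains(CONCURRENT_NS)) {
      throw new IllegalArgumentException("Nameservice name '" + CONCURRENT_NS
          + "' is reserved for the concurrent-call handler pool;"
          + " please rename the nameservice.");
    }
  }
}
{code}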


was (Author: xkrogen):
[~crh] thanks for keeping me honest here :) Finally got around to taking a look 
at this.

I read through the design again and had two additional concerns that came up:
 # I was considering the scenario where there are two routers R1 and R2, and 
two NameNodes N1 and N2. Assume most clients need to access both N1 and N2. 
What happens in the situation when all of R1's N1-handlers are full (but 
N2-handlers mostly empty), and all of R2's N2-handlers are full (but 
N1-handlers mostly empty)? I'm not sure if this is a situation that is likely 
to arise, or if the system will easily self-heal based on 

[jira] [Comment Edited] (HDFS-14090) RBF: Improved isolation for downstream name nodes.

2019-07-23 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891583#comment-16891583
 ] 

He Xiaoqiao edited comment on HDFS-14090 at 7/24/19 4:00 AM:
-

Thanks [~crh] for your contribution; [^HDFS-14090.006.patch] almost looks good 
to me.
A minor comment about permit acquires and releases: 
{{RouterRpcClient#acquirePermit}} and {{RouterRpcClient#releasePermit}} should be 
invoked in pairs, and for the most part they are. However, some logic does not 
handle exceptions correctly, which may leave a {{Permit}} unreleased:
 1. In {{RouterRpcClient#invokeSequential}}, when #getNamenodesForNameservice 
throws an exception, we cannot release the permit as expected.
{code:java}
  public <T> T invokeSequential(
      final List<? extends RemoteLocationContext> locations,
      final RemoteMethod remoteMethod, Class<T> expectedResultClass,
      Object expectedResultValue) throws IOException {
    ..
    for (final RemoteLocationContext loc : locations) {
      String ns = loc.getNameserviceId();
      acquirePermit(ns, ugi, m);
      List<? extends FederationNamenodeContext> namenodes =
          getNamenodesForNameservice(ns); // if this throws, the permit is never released
      try {
        ..
      } catch (..) {
        ..
      } finally {
        releasePermit(ns, ugi, m);
      }
    }
    ..
  }
{code}
2. In {{RouterRpcClient#invokeConcurrent}}, the same issue can also occur after 
the second invocation of {{acquirePermit}}.
 One minor suggestion: the whole code segment between {{acquirePermit}} and 
{{releasePermit}} should be enclosed in a {{try/finally}} statement to ensure 
that the acquired permit is released in every case.
Thanks [~crh] again; please let me know if there is something I missed.
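A hedged sketch of the suggested pattern, written as a fragment in the same style as 
the excerpt above (ugi, m, and the elided bodies are carried over from that excerpt, 
not new API):
{code:java}
// Hedged sketch: release the permit in a finally block so that any exception
// thrown after acquirePermit (including one from getNamenodesForNameservice)
// still returns the permit.
for (final RemoteLocationContext loc : locations) {
  String ns = loc.getNameserviceId();
  acquirePermit(ns, ugi, m);
  try {
    List<? extends FederationNamenodeContext> namenodes =
        getNamenodesForNameservice(ns);
    // .. invoke the method against the namenodes ..
  } finally {
    releasePermit(ns, ugi, m);
  }
}
{code}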


was (Author: hexiaoqiao):
Thanks [~crh] for your contribution; [^HDFS-14090.006.patch] almost looks good 
to me.
A minor comment about permit acquires and releases: 
{{RouterRpcClient#acquirePermit}} and {{RouterRpcClient#releasePermit}} should be 
invoked in pairs, and for the most part they are. However, some logic does not 
handle exceptions correctly, which may leave a {{Permit}} unreleased:
 1. In {{RouterRpcClient#invokeSequential}}, when #getNamenodesForNameservice 
throws an exception, we cannot release the permit as expected.
{code:java}
  public <T> T invokeSequential(
      final List<? extends RemoteLocationContext> locations,
      final RemoteMethod remoteMethod, Class<T> expectedResultClass,
      Object expectedResultValue) throws IOException {
    ..
    for (final RemoteLocationContext loc : locations) {
      String ns = loc.getNameserviceId();
      acquirePermit(ns, ugi, m);
      List<? extends FederationNamenodeContext> namenodes =
          getNamenodesForNameservice(ns);
      try {
        ..
      } catch (..) {
        ..
      } finally {
        releasePermit(ns, ugi, m);
      }
    }
    ..
  }
{code}
2. In {{RouterRpcClient#invokeConcurrent}}, the same issue can also occur after 
the second invocation of {{acquirePermit}}.
 One minor suggestion: the whole code segment between {{acquirePermit}} and 
{{releasePermit}} should be enclosed in a {{try/finally}} statement to ensure 
that the acquired permit is released in every case.
Thanks [~crh] again; please let me know if there is something I missed.
