[ 
https://issues.apache.org/jira/browse/HDFS-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279198#comment-16279198
 ] 

Chen Liang edited comment on HDFS-12745 at 12/5/17 9:19 PM:
------------------------------------------------------------

Thanks [~msingh] for working on this! I think this is a very important 
improvement. I have some comments below, all about {{PipelineManager}}:

1. This is more of a question: will PipelineManager be accessed by multiple 
threads? If so, do we need some protection on activePipelines?

2. Is there any situation where a pipeline should be removed from the 
activePipelines list? Also, would it be better to make the list private rather 
than protected?

3. It looks like activePipelines may contain pipelines with different types and 
factors. If so, I think there are two corner cases in findOpenPipeline() 
(please correct me if I'm wrong).
Case a. Say we have three pipelines (only looking at factors here):
\[A(factor=1), B(1), C(3)\]
(1). The current index is on A and we look for a factor=1 pipeline: we return 
A, and the current index moves to B.
(2). Now we look for a factor=3 pipeline: we skip B, move to C, and return C. 
The current index wraps around to A.
(3). We look for a factor=1 pipeline again: A has factor=1, so we return A, and 
the current index moves to B.
Now we have returned A twice but never B. If we keep repeating (2) and (3), all 
factor=1 container requests go to A and never to B.
Case b. If we have, say, 100 pipelines but only one with factor=1, and all 
requests are for factor=1, then every lookup may have to skip the other 99 
factor=3 pipelines just to reach the only one that matches.
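
To make case a concrete, here is a minimal simulation of a single cursor shared 
across pipelines of mixed factors. It models only the behavior described in 
this comment, not the actual code in the patch; the class SharedCursorDemo, the 
Pipe helper, and the find() method are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model: one round-robin cursor shared by pipelines of all factors. */
class SharedCursorDemo {
  // Hypothetical stand-in for a pipeline: just a name and a replication factor.
  static final class Pipe {
    final String name;
    final int factor;
    Pipe(String name, int factor) { this.name = name; this.factor = factor; }
  }

  private final List<Pipe> activePipelines = new ArrayList<>();
  private int index = 0; // single cursor, regardless of the requested factor

  void add(String name, int factor) {
    activePipelines.add(new Pipe(name, factor));
  }

  /** Scan from the cursor, return the first match, and move the cursor past it. */
  String find(int factor) {
    int n = activePipelines.size();
    for (int i = 0; i < n; i++) {
      int pos = (index + i) % n;
      Pipe p = activePipelines.get(pos);
      if (p.factor == factor) {
        index = (pos + 1) % n;
        return p.name;
      }
    }
    return null; // no pipeline with this factor
  }

  /** Replays the request sequence from case a: factor 1, 3, 1, 3, 1. */
  static List<String> run() {
    SharedCursorDemo m = new SharedCursorDemo();
    m.add("A", 1);
    m.add("B", 1);
    m.add("C", 3);
    List<String> served = new ArrayList<>();
    for (int factor : new int[] {1, 3, 1, 3, 1}) {
      served.add(m.find(factor));
    }
    return served;
  }

  public static void main(String[] args) {
    System.out.println(run()); // B never appears in the result
  }
}
```

With alternating factor=1 and factor=3 requests, every factor=3 lookup wraps the 
cursor back to A, so B is never selected under this model.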

An alternative might be to maintain a separate list for each (factor, type) 
combination. This could be done with a nested map: the outer map's key is the 
factor and its value is another map, whose key is the type and whose value is a 
list of pipelines.
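
A rough sketch of that nested-map layout follows, assuming each (factor, type) 
list keeps its own round-robin cursor. All names here (PartitionedPipelines, 
Bucket, the type strings "STAND_ALONE" and "RATIS") are hypothetical and not 
taken from the patch; pipelines are represented by plain names, and the 
synchronized methods are just one possible answer to the threading question in 
comment 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one pipeline list (and cursor) per (factor, type) pair. */
class PartitionedPipelines {
  // Pipelines sharing one (factor, type), plus the cursor for that list only.
  static final class Bucket {
    final List<String> pipelines = new ArrayList<>();
    int next = 0;
  }

  // Outer key = factor; inner key = type; value = bucket of matching pipelines.
  private final Map<Integer, Map<String, Bucket>> byFactorThenType = new HashMap<>();

  synchronized void add(int factor, String type, String pipelineName) {
    byFactorThenType
        .computeIfAbsent(factor, f -> new HashMap<>())
        .computeIfAbsent(type, t -> new Bucket())
        .pipelines.add(pipelineName);
  }

  /** Round-robin within the matching bucket only; no skipping over other factors. */
  synchronized String find(int factor, String type) {
    Map<String, Bucket> byType = byFactorThenType.get(factor);
    if (byType == null) {
      return null;
    }
    Bucket bucket = byType.get(type);
    if (bucket == null || bucket.pipelines.isEmpty()) {
      return null;
    }
    String name = bucket.pipelines.get(bucket.next);
    bucket.next = (bucket.next + 1) % bucket.pipelines.size();
    return name;
  }

  /** Same alternating request pattern as case a, now with per-bucket cursors. */
  static List<String> run() {
    PartitionedPipelines m = new PartitionedPipelines();
    m.add(1, "STAND_ALONE", "A");
    m.add(1, "STAND_ALONE", "B");
    m.add(3, "RATIS", "C");
    List<String> served = new ArrayList<>();
    served.add(m.find(1, "STAND_ALONE"));
    served.add(m.find(3, "RATIS"));
    served.add(m.find(1, "STAND_ALONE"));
    served.add(m.find(3, "RATIS"));
    served.add(m.find(1, "STAND_ALONE"));
    return served;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Because each bucket advances independently, factor=3 requests no longer disturb 
the factor=1 rotation, and a lookup touches only pipelines that can match, 
which also covers case b.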



> Ozone: XceiverClientManager should cache objects based on pipeline name
> -----------------------------------------------------------------------
>
>                 Key: HDFS-12745
>                 URL: https://issues.apache.org/jira/browse/HDFS-12745
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12745-HDFS-7240.001.patch, 
> HDFS-12745-HDFS-7240.002.patch, HDFS-12745-HDFS-7240.003.patch, 
> HDFS-12745-HDFS-7240.004.patch, HDFS-12745-HDFS-7240.005.patch, 
> HDFS-12745-HDFS-7240.006.patch, HDFS-12745-HDFS-7240.007.patch
>
>
> With just the standalone pipeline, a new pipeline was created for each and 
> every container.
> This code can be optimized so that pipelines are created less frequently. 
> Caching using pipeline names will help with Ratis clients as well.
> a) Remove Container name from Pipeline object.
> b) XceiverClientManager should cache objects based on pipeline name
> c) XceiverClient and XceiverServer should be renamed to 
> XceiverClientStandAlone & XceiverServerRatis
> d) StandAlone pipeline should have notion of re-using pipeline objects.


