[ https://issues.apache.org/jira/browse/HDDS-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

runzhiwang updated HDDS-3240:
-----------------------------
    Description: 
Currently a follower cannot create a container until the leader has finished 
creating it. However, the follower and the leader could create the container 
in parallel rather than sequentially.
*Why the leader and follower currently create the container sequentially:*
1. From the code, the 
[future thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L672]
 that runs getCachedStateMachineData in readStateMachineData and the 
[future thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L459]
 that runs createContainer in writeStateMachineData are the same 
[thread|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L505].
 Because writeStateMachineData is called before readStateMachineData, the 
leader must wait for createContainer to finish before it can run 
getCachedStateMachineData and append logs to the follower. The leader and 
follower are therefore not independent in createContainer: the follower must 
wait for the leader to finish createContainer.
2. In the Jaeger UI, you can also see that the follower currently creates the 
container only after the leader has finished creating it.
 !screenshot-2.png! 
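
The ordering described in point 1 can be reproduced with a small standalone 
sketch. It is not the ContainerStateMachine code: the class name, the 200 ms 
delay and the event labels are made up for illustration, and the only thing 
taken from the description is that both tasks are submitted to the same 
single-threaded executor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model: one single-thread executor shared by the write and read paths.
class SharedExecutorDemo {

  // Submits both tasks to the same executor and returns their completion order.
  static List<String> run() {
    ExecutorService shared = Executors.newSingleThreadExecutor();
    List<String> events = new CopyOnWriteArrayList<>();
    try {
      // Stands in for createContainer inside writeStateMachineData (slow).
      CompletableFuture<Void> write = CompletableFuture.runAsync(() -> {
        sleepQuietly(200);
        events.add("createContainer");
      }, shared);
      // Stands in for getCachedStateMachineData inside readStateMachineData.
      // It is cheap, but it is queued behind the write on the same thread.
      CompletableFuture<Void> read = CompletableFuture.runAsync(
          () -> events.add("getCachedStateMachineData"), shared);
      CompletableFuture.allOf(write, read).join();
    } finally {
      shared.shutdown();
    }
    return events;
  }

  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    // prints [createContainer, getCachedStateMachineData]
    System.out.println(run());
  }
}
```

Because the executor has a single worker and a FIFO queue, the read task 
cannot start until the write task returns, which matches the leader-side wait 
described above.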

*How to improve it:*
I think this ordering can be improved by running getCachedStateMachineData 
and createContainer on different threads, while keeping 
[data = readStateMachineData(requestProto, term, logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L619]
 on the same thread as createContainer. If 
[stateMachineDataCache.get(logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L617]
 does not return null, the leader can get the stateMachineData from the cache 
and need not wait for createContainer to finish, so the leader and follower 
can be independent. But if it returns null, the leader must finish 
createContainer and then append logs to the follower. So I think only 
[data = readStateMachineData(requestProto, term, logIndex)|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L619]
 should use the same thread as createContainer, rather than the whole 
[getCachedStateMachineData|https://github.com/apache/hadoop-ozone/blob/master/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java#L614].
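
A minimal sketch of the proposed split, under stated assumptions: the class, 
field and method names below are hypothetical, the cache is assumed to be 
filled on the write path before the slow work is queued, and the cache lookup 
is assumed to run on the caller's thread. Only a cache miss is handed to the 
executor that also runs createContainer.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of the proposal, not the actual ContainerStateMachine.
class SplitReadDemo {
  // Single thread that runs createContainer and any cache-miss read.
  private final ExecutorService containerExecutor = Executors.newSingleThreadExecutor();
  // Stands in for stateMachineDataCache, keyed by log index.
  private final Map<Long, String> cache = new ConcurrentHashMap<>();

  // Write path: cache the data, then queue the slow container creation.
  CompletableFuture<Void> write(long logIndex, String data, long createMillis) {
    cache.put(logIndex, data);
    return CompletableFuture.runAsync(() -> sleepQuietly(createMillis), containerExecutor);
  }

  // Read path: the cache lookup happens on the caller's thread.
  CompletableFuture<String> read(long logIndex) {
    String cached = cache.get(logIndex);
    if (cached != null) {
      // Cache hit: no need to wait for createContainer.
      return CompletableFuture.completedFuture(cached);
    }
    // Cache miss: the read is queued behind createContainer on the same thread.
    return CompletableFuture.supplyAsync(() -> "read-from-container-" + logIndex,
        containerExecutor);
  }

  // Returns {cache hit was ready at once, cache miss finished only after create}.
  static boolean[] demo() {
    SplitReadDemo sm = new SplitReadDemo();
    try {
      CompletableFuture<Void> create = sm.write(1L, "chunk-1", 200);
      boolean hitReadyAtOnce = sm.read(1L).isDone();
      sm.read(2L).join();
      boolean missWaitedForCreate = create.isDone();
      return new boolean[] {hitReadyAtOnce, missWaitedForCreate};
    } finally {
      sm.containerExecutor.shutdown();
    }
  }

  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    boolean[] result = demo();
    System.out.println("hit ready at once: " + result[0]);
    System.out.println("miss waited for create: " + result[1]);
  }
}
```

In this model a cache hit returns immediately, so the leader could append 
logs to the follower while createContainer is still running, and a cache miss 
still completes only after createContainer because it shares that thread.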
 


> Improve write efficiency by creating container in parallel.
> -----------------------------------------------------------
>
>                 Key: HDDS-3240
>                 URL: https://issues.apache.org/jira/browse/HDDS-3240
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: runzhiwang
>            Assignee: runzhiwang
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png
>


