[
https://issues.apache.org/jira/browse/HDFS-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157850#comment-16157850
]
Xiaoyu Yao commented on HDFS-12387:
-----------------------------------
Thanks for working on this. The patch looks good to me overall. Here are some
comments.
ChunkGroupOutputStream.java
Line 293: comment does not apply and can be removed.
XceiverClientRatis.java
Line 116-123: can we add some document on this?
Line 122: can we add some error log including both the datanode and the
newPeers?
ContainerOperationClient.java
Line 98: can you add a TODO to update the Pipeline state from ALLOCATED to OPEN
Line 162: NIT: commentlog can be removed.
Line 163: NIT: should we change to debug log or remove it?
Line 97/201: how do we handle if the pipeline state is not in either ALLOCATED
or OPEN state? Can we add a precondition assert here to ensure the pipeline is
in OPEN state?
Line 260/299: we should add {{if (LOG.isDebugEnabled(O)) }} .
BlockContainerInfo.java
Line 28: class document needs to be updated, the name of the class itself may
also need to be udpated as the ContainerInfo is not only for Block svc only
anymore with this change.
Line 64: NIT: "last used time"
Line 80/82: the exceptions are not declared on Line 86
Pipeline.java
Line 62: should we remove containerName from the Pipeline? This can be handled
in a separate ticket. My understanding is that we will need a pipeline name for
the container like we defined in Ozone.proto.
Line 65/251: lifeCycleStates -> lifeCycleState
BlockManagerImpl.java
Line 200: can we update the javadoc with the new parameters?
Line 252 The size/usage information is not being persisted when allocateBlock
via BlockManager. Can we add an API on SCM to allow update the container usage
info? For now, we can still let block manager to call this API to update usage
before the container report is available. I found the TODO in ContainerMapping
to update the size via container report. With that, we still could have over
provision problem wrt. Container size.
Line 233: +1 for the idea of allowing different container selection policies?
With the new container selection algorithm, how do we handle more than 20
clients allocate block at the same time? We may not want to hand over the same
ALLOCATED containers to different clients for them to create on the DNs?
Otherwise, clients will need to compete for the container creation on DNs with
some fails and retry? I see the patch change the default container provision
size from 5 to 20. This can mitigate the above issue to some extent but cannot
solve it completely. As we discussed offline, we might need an additional state
to indicate that an ALLOCATED container has been assigned so that it will not
be handed over to other clients.
Line 305: can we add WARN or ERROR when the getMachines.size() == 0
Line 408-413: can re update the API that support get OPEN containers with type
and replication factors as parameter?
This can be done with a follow up JIRA?
ContainerMapping.java
Line 251: blockreports-> container report
ContainerMapping#updateContainerState needs to be updated (missing in the
change)
It should update the in memory map via
containerStateManager#updateContainerState
ContainerStateManager.java
Line 141: we will need to pass the containerStore from ContainerMapping so that
the ContainerStateManager can load the containers from persisted store after
reboot.
Line 161: NIT: can we change this OZONE_SCM_BLOCK_SIZE_KEY to
OZONE_SCM_BLOCK_SIZE_MB
Line 193: ContainerKey contains owner,type, replicationFactor and state. Will
the look up of container by name with the same container key using priority
queue be slow? Maybe a LinkedHashMap based LRU will have better performance for
look up than PriorityQueue here. This can also avoid the overhead of
maintaining a new lastUsed time in the containerInfo.
Line 468: factor -> replicationFactor
PipelineSelector.java
Line 74-75: please add TODO STAND_ALONE should not be always set for Ratis. Or
do we want to use this as a default and update later?
RatisManagerImpl.java
Line 53/67/194: ratisMembers never got updated. Can you add some comment on the
expected usage of it? If I understand correctly, we should update ratisMembers
after line 197 or when the client inform ratisManager that the pipeline has
been created.
Line 110/153/163: do we need to bind pipeline with a containerName? Especially
with line 110, what if client ask for a pipeline for container2 where we return
an existing open pipeline for container1.
Line 158: NIT: define const for pipeline prefix "Ratis-"
> Ozone: Support Ratis as a first class replication mechanism
> -----------------------------------------------------------
>
> Key: HDFS-12387
> URL: https://issues.apache.org/jira/browse/HDFS-12387
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Anu Engineer
> Assignee: Anu Engineer
> Priority: Critical
> Labels: ozoneMerge
> Attachments: HDFS-12387-HDFS-7240.001.patch
>
>
> Ozone container layer supports pluggable replication policies. This JIRA
> brings Apache Ratis based replication to Ozone. Apache Ratis is a java
> implementation of Raft protocol.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]