[
https://issues.apache.org/jira/browse/HDFS-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980071#comment-15980071
]
Weiwei Yang edited comment on HDFS-11493 at 4/22/17 6:34 PM:
-------------------------------------------------------------
Hi [~anu]
Thanks for posting the patch as well as the design doc, they look very nice. I
haven't read through the code thoroughly yet, but here are some quick thoughts
that I hope will help.
*Node State*
Right now the node state only has HEALTHY, STALE, DEAD, UNKNOWN. Would it be
useful to add the following states as well (a rough enum sketch follows the list)?
* MAINTENANCE: the admin could bring a node down and set it to the "MAINTENANCE"
state for planned maintenance; in this case SCM does not treat the containers on
this node as missing;
* DECOMMISSIONING and DECOMMISSIONED: the admin could gracefully decommission a
node from a given pool.
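To make the idea concrete, here is a rough sketch of such an enum; the enum name
and the helper method are made up for illustration, not taken from the patch:
{code:java}
/**
 * Hypothetical node state enum extended with maintenance and
 * decommission states (names are illustrative only).
 */
public enum NodeLifecycleState {
  HEALTHY,
  STALE,
  DEAD,
  UNKNOWN,
  // Proposed additions:
  MAINTENANCE,       // node is down for planned maintenance;
                     // SCM should not treat its containers as missing
  DECOMMISSIONING,   // node is being drained from its pool
  DECOMMISSIONED;    // node has been fully removed from its pool

  /** Illustrative helper: states whose containers should still be counted. */
  public boolean containersStillCounted() {
    return this == HEALTHY || this == STALE || this == MAINTENANCE;
  }
}
{code}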
*Pull Container Report*
The replication manager requests a pool of nodes to send container reports.
Imagine there are 3 pools being processed in parallel; does that mean container
reports from 24 * 3 = 72 nodes will arrive at SCM in a wave? Would that cause a
network problem?
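If that wave is a concern, one illustrative way to spread the requests over a
window is sketched below; the class and the {{ReportSender}} callback are made
up for illustration and are not part of the patch:
{code:java}
import java.util.List;
import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative only: spread container-report requests for a pool
 * over a window instead of asking all nodes at once.
 */
public class ReportRequestSpreader {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final Random random = new Random();

  /**
   * Ask each node in the pool for a report at a random offset within
   * windowSeconds, so e.g. 72 reports do not arrive as one wave.
   */
  public void requestReports(List<String> poolNodes, int windowSeconds,
                             ReportSender sender) {
    for (String node : poolNodes) {
      long delayMs = (long) (random.nextDouble() * windowSeconds * 1000);
      scheduler.schedule(() -> sender.sendReportRequest(node),
          delayMs, TimeUnit.MILLISECONDS);
    }
  }

  /** Hypothetical callback that actually issues the request. */
  public interface ReportSender {
    void sendReportRequest(String datanodeId);
  }
}
{code}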
*Scm configuration*
Can we move the configuration properties
OZONE_SCM_CONTAINER_REPORT_PROCCESSING_LAG,
OZONE_SCM_MAX_CONTAINER_REPORT_THREADS and
OZONE_SCM_MAX_WAIT_FOR_CONTAINER_REPORTS_SECONDS from {{OzoneConfigKeys}} to
{{ScmConfigKeys}}?
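Just to illustrate the suggestion, a sketch of how those constants could live in
{{ScmConfigKeys}}; the key strings below are placeholders I made up, not the
actual values from the patch:
{code:java}
// Sketch only: constants relocated into ScmConfigKeys.
// The key strings here are placeholders, not the real ones from the patch.
public final class ScmConfigKeys {
  public static final String OZONE_SCM_CONTAINER_REPORT_PROCCESSING_LAG =
      "ozone.scm.container.report.processing.lag";          // placeholder key
  public static final String OZONE_SCM_MAX_CONTAINER_REPORT_THREADS =
      "ozone.scm.max.container.report.threads";             // placeholder key
  public static final String OZONE_SCM_MAX_WAIT_FOR_CONTAINER_REPORTS_SECONDS =
      "ozone.scm.max.wait.for.container.reports.seconds";   // placeholder key

  private ScmConfigKeys() { }
}
{code}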
*CommandQueue*
It looks like the command queue maintains a list of commands for each datanode;
I suggest using a finer-grained lock for synchronization. More specifically, if
one thread wants to add a command for datanode A and another thread wants to add
a command for datanode B, we probably don't want either of them to wait for the
other.
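For example, per-datanode queues on top of {{ConcurrentHashMap}} would let those
two adds proceed without contending on one global lock; the class below is just
an illustration, not the code in the patch:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Illustrative per-datanode command queue: concurrent adds for
 * different datanodes do not block each other on one global lock.
 */
public class PerNodeCommandQueue<C> {
  private final ConcurrentHashMap<String, Queue<C>> queues =
      new ConcurrentHashMap<>();

  /** Append a command for the given datanode. */
  public void addCommand(String datanodeId, C command) {
    queues.computeIfAbsent(datanodeId, id -> new ConcurrentLinkedQueue<>())
          .add(command);
  }

  /** Drain and return all pending commands for the datanode. */
  public List<C> drainCommands(String datanodeId) {
    Queue<C> pending = queues.remove(datanodeId);
    List<C> result = new ArrayList<>();
    if (pending != null) {
      result.addAll(pending);
    }
    return result;
  }
}
{code}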
This is an in-memory queue; how do we make sure it does not run into an
inconsistent state? Imagine the replication manager has just processed container
reports from a pool and asked a datanode to replicate a container, and that
replication is still in progress. If SCM then crashes and restarts, how does SCM
get to know what the current state is?
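One possible direction, purely as an illustration and not implying it is the
right design: record in-flight replication commands in a durable store so they
can be reloaded after a restart. The {{DurableStore}} interface below is made up:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Purely illustrative: track in-flight replication commands in a
 * durable store so they survive an SCM restart.
 */
public class InFlightReplicationTracker {

  /** Hypothetical durable key-value store (e.g. backed by LevelDB). */
  public interface DurableStore {
    void put(String key, String value);
    void delete(String key);
    Map<String, String> readAll();
  }

  private final DurableStore store;
  private final Map<String, String> inFlight = new ConcurrentHashMap<>();

  public InFlightReplicationTracker(DurableStore store) {
    this.store = store;
    // On restart, reload whatever was in flight before the crash.
    inFlight.putAll(store.readAll());
  }

  /** Record that a container replication was handed to a datanode. */
  public void markIssued(String containerName, String targetDatanode) {
    inFlight.put(containerName, targetDatanode);
    store.put(containerName, targetDatanode);
  }

  /** Clear the record once a container report confirms the new replica. */
  public void markCompleted(String containerName) {
    inFlight.remove(containerName);
    store.delete(containerName);
  }
}
{code}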
The queue seems to be time ordered; I think it would be better to support
priority as well. Commands may have different priorities, for example,
replicating a container is usually higher priority than deleting a container
replica; replicating a container may also have different priorities depending on
how far it is from the desired number of replicas.
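To sketch what that ordering could look like, something like a
{{PriorityBlockingQueue}} with a comparator; the command wrapper and the
priority convention below are hypothetical:
{code:java}
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/**
 * Illustrative priority-ordered command queue: higher-priority
 * commands are dequeued first; ties fall back to arrival time.
 */
public class PrioritizedCommands {

  /** Hypothetical command wrapper carrying a priority and a timestamp. */
  public static class QueuedCommand {
    final int priority;        // larger value = more urgent
    final long enqueueTimeMs;  // keeps time order within one priority
    final Object command;

    QueuedCommand(int priority, Object command) {
      this.priority = priority;
      this.enqueueTimeMs = System.currentTimeMillis();
      this.command = command;
    }
  }

  private final PriorityBlockingQueue<QueuedCommand> queue =
      new PriorityBlockingQueue<>(64,
          Comparator.comparingInt((QueuedCommand c) -> c.priority)
                    .reversed()
                    .thenComparingLong(c -> c.enqueueTimeMs));

  public void add(int priority, Object command) {
    queue.add(new QueuedCommand(priority, command));
  }

  public QueuedCommand take() throws InterruptedException {
    return queue.take();
  }
}
{code}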
I will try to read more and comment. Thanks.
> Ozone: SCM: Add the ability to handle container reports
> ---------------------------------------------------------
>
> Key: HDFS-11493
> URL: https://issues.apache.org/jira/browse/HDFS-11493
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Anu Engineer
> Assignee: Anu Engineer
> Attachments: container-replication-storage.pdf,
> HDFS-11493-HDFS-7240.001.patch
>
>
> Once a datanode sends the container report it is SCM's responsibility to
> determine if the replication levels are acceptable. If it is not, SCM should
> initiate a replication request to another datanode. This JIRA tracks how SCM
> handles a container report.