[
https://issues.apache.org/jira/browse/HDDS-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918239#comment-16918239
]
Xiaoyu Yao commented on HDDS-2010:
----------------------------------
I would prefer option 1 for better scalability. Also, SCM always has its in-memory
pipeline map, built from the pipeline reports sent by DNs.
We also need another Jira that changes the current pipeline creation logic:
currently, SCM talks directly to the DN to create a pipeline, assuming that a
pending read/write that needs the pipeline will follow.
We should move pipeline creation/destruction into the DN heartbeat
response model. This way we get better SCM scalability.
cc: [~anu].
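To make the heartbeat response model concrete, here is a minimal sketch of what the SCM side could look like. It is an illustration under stated assumptions, not the actual Ozone code: the class PipelineCommandQueue, the Action enum, and the enqueue/onHeartbeat methods are all hypothetical names. The idea is that SCM only records the pending pipeline command, and the DN picks it up in the response to its next heartbeat, so SCM never opens a connection to the DN itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch: SCM-side queue of pipeline commands drained by DN heartbeats. */
class PipelineCommandQueue {

  /** Hypothetical command types; not the real Ozone protocol messages. */
  enum Action { CREATE, DESTROY }

  /** A pending instruction for one datanode about one pipeline. */
  static final class PipelineCommand {
    final Action action;
    final String pipelineId;

    PipelineCommand(Action action, String pipelineId) {
      this.action = action;
      this.pipelineId = pipelineId;
    }
  }

  // Pending commands keyed by datanode id.
  private final Map<String, Queue<PipelineCommand>> pending = new ConcurrentHashMap<>();

  /** Called when SCM decides to create or destroy a pipeline; no RPC to the DN happens here. */
  void enqueue(String datanodeId, Action action, String pipelineId) {
    pending.computeIfAbsent(datanodeId, id -> new ConcurrentLinkedQueue<>())
        .add(new PipelineCommand(action, pipelineId));
  }

  /** Called when a heartbeat arrives; the returned commands ride back in the heartbeat response. */
  List<PipelineCommand> onHeartbeat(String datanodeId) {
    List<PipelineCommand> response = new ArrayList<>();
    Queue<PipelineCommand> queue = pending.get(datanodeId);
    if (queue != null) {
      PipelineCommand cmd;
      // Drain everything queued since the previous heartbeat, in FIFO order.
      while ((cmd = queue.poll()) != null) {
        response.add(cmd);
      }
    }
    return response;
  }
}
```

The trade-off in this sketch is latency: a pipeline is created at most one heartbeat interval after SCM decides on it, in exchange for SCM doing no outbound calls to DNs.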
> PipelineID management for multi-raft, in SCM or in datanode?
> ------------------------------------------------------------
>
> Key: HDDS-2010
> URL: https://issues.apache.org/jira/browse/HDDS-2010
> Project: Hadoop Distributed Data Store
> Issue Type: New Feature
> Components: Ozone Datanode
> Reporter: Li Cheng
> Assignee: Li Cheng
> Priority: Major
> Fix For: 0.5.0
>
>
> With the intention of supporting multi-raft, I want to bring up a question on how
> the unique pipeline ids should be managed. Since every datanode can be a member of
> multiple raft pipelines, the pipeline ids need to be persisted with the
> datanode for recovery purposes (we can talk about recovery later). Generally
> there are two options:
> # Store in the datanode (like datanodeDetails): every time the pipeline mapping
> changes on a single datanode, the pipeline ids are serialized to a local file.
> This leads to many more local serializations of things like
> datanodeDetails, but the updates are only for local datanode changes.
> An improvement would be to link a serializable object to datanodeDetails
> and have the datanode write new pipeline ids to that serializable object
> instead of the details file. On the other hand, since the pipeline ids are
> stored only locally in the datanode, there will be no global view in SCM (or we
> can store a lazy copy?).
> # Store in SCM. SCM can maintain a large mapping between datanode ids and
> pipeline ids. But this leads to an exponentially increasing frequency
> of SCM updates, since the pipeline mapping changes are much more complex and
> happen all the time. Obviously this puts too much pressure on SCM, but it
> also gives SCM a global view for managing datanodes and multi-raft
> pipelines.
>
> Thoughts? [~xyao] [~Sammi]