[ https://issues.apache.org/jira/browse/HDDS-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vyacheslav Tutrinov updated HDDS-14872:
---------------------------------------
Description:
h2. Background
h3. Current Architecture Limitations
Ozone currently uses a single RAFT consensus group for the Ozone Manager (OM)
in high availability (HA) deployments. While this provides strong consistency
and automatic failover, it has several limitations:
# {*}Single Leader Bottleneck{*}: All write operations must go through a
single OM leader, limiting write throughput regardless of the number of OM
replicas
# {*}RAFT Log Contention{*}: A single RAFT log serializes all metadata
updates, creating a scalability bottleneck
# {*}Resource Underutilization{*}: In a 3-node or 5-node OM cluster, only one
node actively processes write requests
# {*}Limited Horizontal Scalability{*}: Adding more OM nodes improves read
capacity (with follower reads) but not write capacity
As a result, the current single-RAFT architecture becomes a significant
bottleneck for metadata write operations.
h2. Goals
# {*}Improve Write Throughput{*}: Distribute write load across multiple RAFT
groups to achieve near-linear scaling with the number of OM nodes
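The goal above can be illustrated with a minimal sketch of how writes might be spread across several RAFT groups. The issue does not specify a partitioning scheme, so everything here is an assumption made for illustration: the function names {{raft_group_for}} and {{write_distribution}} are hypothetical, and hashing the volume/bucket path is only one possible way to assign metadata to a group.

```python
import hashlib
from collections import Counter


def raft_group_for(volume: str, bucket: str, num_groups: int) -> int:
    """Map a bucket to a RAFT group index using a stable hash.

    Hypothetical scheme: the real feature may partition differently.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be >= 1")
    key = f"/{volume}/{bucket}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes so the result is the same on every node and run.
    return int.from_bytes(digest[:8], "big") % num_groups


def write_distribution(buckets, num_groups: int) -> Counter:
    """Count how many buckets each RAFT group (and thus its leader) would own."""
    return Counter(raft_group_for(v, b, num_groups) for v, b in buckets)


# Illustrative workload: 1000 buckets in one volume (made-up names).
buckets = [("vol1", f"bucket{i}") for i in range(1000)]

# One group: every write lands on the same leader.
single = write_distribution(buckets, 1)

# Three groups: writes are split across three independent leaders and logs.
multi = write_distribution(buckets, 3)
```

With one group, all buckets map to index 0, which mirrors the single leader bottleneck described above. With three groups, each group owns a share of the buckets and can commit to its own RAFT log independently. Whether throughput actually scales near-linearly depends on how evenly the real workload is distributed, which this sketch does not model.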
was:
h2. Background
h3. Current Architecture Limitations
Ozone currently uses a single RAFT consensus group for the Ozone Manager (OM)
in high availability (HA) deployments. While this provides strong consistency
and automatic failover, it has several limitations:
# {*}Single Leader Bottleneck{*}: All write operations must go through a
single OM leader, limiting write throughput regardless of the number of OM
replicas
# {*}RAFT Log Contention{*}: A single RAFT log serializes all metadata
updates, creating a scalability bottleneck
# {*}Resource Underutilization{*}: In a 3-node or 5-node OM cluster, only one
node actively processes write requests
# {*}Limited Horizontal Scalability{*}: Adding more OM nodes improves read
capacity (with follower reads) but not write capacity
The current single-raft architecture becomes a significant bottleneck for
metadata operations.
h2. Goals
# {*}Improve Write Throughput{*}: Distribute write load across multiple RAFT
leaders to achieve near-linear scaling with the number of OM nodes
> Design And Implement Ozone Manager Multi-RAFT feature
> -----------------------------------------------------
>
> Key: HDDS-14872
> URL: https://issues.apache.org/jira/browse/HDDS-14872
> Project: Apache Ozone
> Issue Type: Epic
> Reporter: Vyacheslav Tutrinov
> Priority: Major