Below are responses to the key concerns raised around RRRs in KIP-1248 and KIP-1254, organized by area:
1. Performance & Latency

1.1 Higher read latency
Yes. Historical reads add a hop (remote storage → RRR → client). This is intentional: RRRs target cold and analytic workloads where throughput and cost efficiency matter more than tail latency. Mitigations include prefetching, local caching, larger sequential reads, and AZ-local RRRs. Hot-path consumers continue to read directly from leaders.

1.2 Increased internal bandwidth
RRRs increase internal traffic, but they:
- Reduce load on leader brokers
- Centralize and optimize remote storage access
- Improve cost control versus per-client object storage reads

2. Client & Protocol

2.1 Client complexity
Client complexity is reduced, not eliminated. Brokers remain authoritative, clients stay storage-agnostic, and most complexity is encapsulated in shared libraries.

2.2 Redirect-based flow
Redirects are lightweight and Kafka-native (similar to leader/coordinator discovery). Clients follow broker instructions without understanding storage layouts or tiering.

3. Semantics & Features

3.1 Transactional semantics
Preserved. RRRs read canonical log segments, including transaction markers. read_committed semantics are supported.

3.2 Newer features
RRRs initially support standard log consumption only. Features requiring coordination or state mutation remain on main brokers by design.

4. Metadata & Routing

4.1 Partition assignment
RRRs are stateless: no partition ownership, ISR participation, or rebalancing. Routing is dynamic and broker/controller-driven.

4.2 AZ affinity
Handled via existing rack/AZ metadata and broker-directed redirects.

4.3 Failure handling
No state means no rebalancing. Clients retry against another RRR or fall back to brokers.

5. Operations & Scaling

5.1 Operational overhead
RRRs add a fleet but are stateless: no replication, elections, writes, or durability responsibilities. They are easy to automate and replace.

5.2 Autoscaling
A first-class goal.
RRRs scale with load, start quickly, and scale down safely without state migration.

6. Architectural Trade-off
Yes, complexity is shifted, but deliberately off the hot path. This isolates cold and bursty reads, protects real-time workloads, and cleanly separates durability, serving, and analytics concerns.

On 2025/12/14 10:58:32 Manan Gupta wrote:
> Hi all,
>
> This email starts the discussion thread for *KIP-1255: Remote Read Replicas
> for Kafka Tiered Storage*. The proposal introduces a lightweight broker
> role, *Remote Read Replica*, dedicated to serving historical reads directly
> from remote storage.
>
> We’d appreciate your initial thoughts and feedback on the proposal.
>
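To make the failure handling described in 4.3 concrete, here is a minimal Java sketch. All names in it (RrrFallbackSketch, fetchWithFallback, the "rrr-1"/"broker-0" endpoints) are hypothetical illustrations, not APIs from the KIP or from kafka-clients. The point is only that, because RRRs hold no partition state, a client can try any RRR and finally fall back to a regular broker without any rebalancing step in between.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the client-side fallback in 4.3: stateless RRRs
// mean any RRR can serve a historical read, so on failure the client
// simply tries the next endpoint, then falls back to a main broker.
public class RrrFallbackSketch {

    // Try each RRR in turn; if all fail, fall back to the broker path.
    static <T> T fetchWithFallback(List<String> rrrEndpoints,
                                   String brokerEndpoint,
                                   Function<String, T> fetch) {
        for (String rrr : rrrEndpoints) {
            try {
                return fetch.apply(rrr);    // ideally an AZ-local RRR first
            } catch (RuntimeException e) {
                // No state to rebalance: just move on to the next RRR.
            }
        }
        return fetch.apply(brokerEndpoint); // last resort: main broker
    }

    public static void main(String[] args) {
        // Simulated fetch: the first RRR is "down", the second answers.
        Function<String, String> fetch = endpoint -> {
            if (endpoint.equals("rrr-1")) throw new RuntimeException("unreachable");
            return "segment-bytes-from-" + endpoint;
        };
        String result = fetchWithFallback(List.of("rrr-1", "rrr-2"), "broker-0", fetch);
        System.out.println(result); // prints "segment-bytes-from-rrr-2"
    }
}
```

In a real client the fetch step would be an ordinary Kafka fetch against the endpoint the broker redirected to; the sketch only shows the retry ordering, not the wire protocol.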
