Adi,

Thanks for the write-up. Here are my thoughts:

I think you are suggesting a way to automatically restore a topic’s
replication factor in a specific scenario: permanent broker failures. I
agree that the partition reassignment mechanism should be used to add
replicas when they are lost to permanent broker failures. But I think the
KIP probably bites off more than we can chew.

Before we automate detection of permanent broker failures and have the
controller mitigate through automatic data balancing, I’d like to point out
that our current difficulty is not detection but the inability to generate
a workable partition assignment for rebalancing data in a cluster.

There are 2 problems with partition rebalancing today:

   1. Lack of replica throttling for balancing data: In the absence of
   replica throttling, even if you come up with an assignment that might be
   workable, it isn’t practical to kick it off without worrying about bringing
   the entire cluster down. I don’t think the hack of moving partitions in
   batches is effective, as it is at best a guess.
   2. Lack of support for policies in the rebalance tool that automatically
   generate a workable partition assignment: There is no easy way to generate
   a partition reassignment JSON file. An example of a policy is “end up with
   an equal number of partitions on every broker while minimizing data
   movement” (a rough sketch of such a policy follows this list). Other
   policies might make sense as well; we’d have to experiment.
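
To make the second point concrete, here is a rough sketch in Python, written
just for this email (none of these function names exist in Kafka), of such a
policy: spread replica counts evenly across live brokers while touching as
few replicas as possible, and emit the result in the JSON layout the
partition reassignment tool consumes.

import json
import math
from collections import Counter

def rebalance(current, brokers):
    """Greedy sketch: even out replica counts across live brokers while
    moving as few replicas as possible.
    current: {(topic, partition): [broker ids]}, brokers: list of live ids."""
    load = Counter({b: 0 for b in brokers})
    for replicas in current.values():
        load.update(b for b in replicas if b in load)
    target = math.ceil(sum(load.values()) / len(brokers))

    new_assignment = {tp: list(replicas) for tp, replicas in current.items()}
    for tp, replicas in sorted(new_assignment.items()):
        for i, b in enumerate(replicas):
            if b in load and load[b] <= target:
                continue  # replica sits on a live, non-overloaded broker
            # Replica is on a dead or overloaded broker: move it to the
            # least-loaded live broker not already holding this partition.
            candidates = [c for c in brokers if c not in replicas]
            if not candidates:
                continue
            dest = min(candidates, key=lambda c: load[c])
            if b in load:
                load[b] -= 1
            load[dest] += 1
            replicas[i] = dest
    return new_assignment

def to_reassignment_json(assignment):
    """Serialize to the JSON layout the reassignment tool accepts."""
    return json.dumps({
        "version": 1,
        "partitions": [{"topic": t, "partition": p, "replicas": r}
                       for (t, p), r in sorted(assignment.items())],
    }, indent=2)

The output of to_reassignment_json() could then be written to a file and
handed to kafka-reassign-partitions.sh via --reassignment-json-file with
--execute, which is exactly the step that needs throttling support to be
safe.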

Broadly speaking, the data balancing problem consists of 3 parts (a rough
sketch of this decomposition follows the list):

   1. Trigger: An event that causes data balancing to take place. KIP-46
   suggests a specific trigger, namely permanent broker failure, but several
   other events could also make sense: cluster expansion, decommissioning
   brokers, data imbalance.
   2. Policy: Given a set of constraints, generate a target partition
   assignment that can be executed when triggered.
   3. Mechanism: Given a partition assignment, make the state changes and
   actually move the data until the target assignment is achieved.
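
To illustrate how loosely coupled these three parts can be, here is a tiny
Python sketch; the class and method names are mine and purely illustrative,
nothing like this exists in Kafka today.

from abc import ABC, abstractmethod

class Trigger(ABC):
    @abstractmethod
    def should_rebalance(self, cluster_state):
        """e.g. permanent broker failure, cluster expansion, broker
        decommissioning, or detected data imbalance."""

class Policy(ABC):
    @abstractmethod
    def target_assignment(self, cluster_state):
        """Given a set of constraints, return a target
        {(topic, partition): [replica ids]} assignment."""

class Mechanism(ABC):
    @abstractmethod
    def execute(self, target_assignment, throttle_bytes_per_sec):
        """Make the state changes and move data until the target assignment
        is achieved, respecting the throttle."""

The point of the sketch is that the trigger can stay manual (or live in an
external script) long before the policy and mechanism are automated inside
the controller.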

Currently, the trigger is manual through the rebalance tool; there is no
support for any viable policy today; and we have a built-in mechanism that,
given a policy and upon a trigger, moves data in a cluster but does not
support throttling.

Given that both the policy and the throttling improvement to the mechanism
are hard problems, and given our past experience of operationalizing
partition reassignment (it required months of testing before we got it
right), I strongly recommend attacking this problem in stages. A more
practical approach would be to add the concept of pluggable policies to the
rebalance tool, implement a practical policy that generates a workable
partition assignment when the tool is triggered, and improve the mechanism to
support throttling so that a given policy can succeed without manual
intervention. If we solved these problems first, the rebalance tool would
be much more accessible to Kafka users and operators.

Assuming that we do this, the problem that KIP-46 aims to solve becomes
much easier. You can separate the detection of permanent broker failures
(trigger) from the mitigation (above-mentioned improvements to data
balancing). The latter will be a native capability in Kafka. Detecting
permanent hardware failures is much more easily done via an external
script that uses a simple health check (part 1 of KIP-46).
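
For illustration, here is a minimal sketch of such a script, assuming a
plain TCP connect against each broker’s listener and a grace period before
declaring a failure permanent; the hosts, thresholds, and the trigger hook
are all made up for this email.

import socket
import time

BROKERS = {1: ("broker1.example.com", 9092),
           2: ("broker2.example.com", 9092)}
GRACE_PERIOD_SECS = 30 * 60   # how long a broker may be down before we act
CHECK_INTERVAL_SECS = 60

def is_alive(host, port, timeout=5):
    """Simple health check: can we open a TCP connection to the broker?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def main():
    down_since = {}   # broker id -> timestamp of first failed check
    while True:
        for broker_id, (host, port) in BROKERS.items():
            if is_alive(host, port):
                down_since.pop(broker_id, None)
            elif broker_id not in down_since:
                down_since[broker_id] = time.time()
            elif time.time() - down_since[broker_id] > GRACE_PERIOD_SECS:
                # Declare the failure permanent and kick off the (throttled)
                # rebalance, e.g. by generating a reassignment JSON with a
                # policy like the one sketched earlier and invoking the
                # partition reassignment tool. Left as a placeholder here.
                print("broker %d considered permanently failed" % broker_id)
                down_since.pop(broker_id)
        time.sleep(CHECK_INTERVAL_SECS)

if __name__ == "__main__":
    main()

In practice you would probably also want to corroborate with the broker’s
registration in ZooKeeper and with host-level signals before moving any
data.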

I agree that it will be great to *eventually* be able to fully automate
both the trigger as well as the policies while also improving the
mechanism. But I’m highly skeptical of big-bang approaches that go from a
completely manual and cumbersome process to a fully automated one,
especially when that involves large-scale data movement in a running
cluster. Once we stabilize these changes and feel confident that they work,
we can push the policy into the controller and have it automatically be
triggered based on different events.

Thanks,
Neha

On Tue, Feb 2, 2016 at 6:13 PM, Aditya Auradkar <
aaurad...@linkedin.com.invalid> wrote:

> Hey everyone,
>
> I just created a kip to discuss automated replica reassignment when we lose
> a broker in the cluster.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-46%3A+Self+Healing+Kafka
>
> Any feedback is welcome.
>
> Thanks,
> Aditya
>



-- 
Thanks,
Neha
