Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-13 Thread Stanislav Kozlovski
Hey Viktor,

I intuitively think that reassignment is a form of (extra) replication, so
I think the non-additive version sounds more natural to me. Would be good
to see what others think

Thanks for summing up what you changed in the KIP's wording here.

Best,
Stanislav

On Fri, Dec 13, 2019 at 3:39 PM Viktor Somogyi-Vass 
wrote:

> Hey Stan,
>
> 1. Yes.
>
> 2. Yes and no :). My earlier suggestion was exactly that. In the last reply
> to you I meant that if the replication throttle is 20 and the reassignment
> throttle is 10 then we'd still have a total throttle of 20, but 10 of that
> could be used for general replication and 10 for reassignment. I think
> either your version or this one can be a good solution; the main difference
> is how you think about reassignment.
> If we think that reassignment is a special kind of replication then it
> might make sense to treat it as such, and to me it sounds logical that we
> sort of count it under the replication quota. It also protects you from
> setting too high a value, as you won't be able to configure something
> higher than replication.throttled.rate (but then what's left for general
> replication?). On the other hand, users may have to increase their
> replication.throttled.rate if they want to increase or set their
> reassignment quota. That isn't an issue when you treat them as unrelated
> quotas, but then you have to keep your total quota
> (replication+reassignment) in mind. Also, I don't usually see customers
> using replication throttling during normal operation, so for them the
> additive version (where 20 + 10 = 30) might be better from an
> operational perspective (fewer things to configure).
> I'll add this non-additive version to the rejected alternatives.
>
> Considering the mentioned drawbacks I think it's better to go with the
> additive one.
> The updated KIP:
> "The possible configuration variations are:
> - replication.throttled.rate is set but reassignment.throttled.rate isn't
> (or -1): any kind of replication (so including reassignment) can take up to
> replication.throttled.rate bytes.
> - replication.throttled.rate and reassignment.throttled.rate both set: both
> can use a bandwidth up to the configured limit and the total replication
> limit will be reassignment.throttled.rate + replication.throttled.rate
> - replication.throttled.rate is not set but reassignment.throttled.rate is
> set: in this case general replication has no bandwidth limits but
> reassignment.throttled.rate has the configured limits.
> - neither replication.throttled.rate nor reassignment.throttled.rate are
> set (or -1): no throttling is set on any replication."
>
> 3. Yeah, the motivation section might be a bit poorly worded, as both you and
> Ismael pointed out problems, so let me rephrase it:
> "A user is able to specify the partition and the throttle rate but it will
> be applied to all non-ISR replication traffic. This can be undesirable
> because during reassignment it also applies to non-reassignment replication
> and causes a replica to be throttled if it falls out of ISR. Also, if
> leadership changes during reassignment, the throttles have to be changed
> manually."
>
> Viktor
>
> On Tue, Dec 10, 2019 at 8:16 PM Stanislav Kozlovski <
> stanis...@confluent.io>
> wrote:
>
> > Hey Viktor,
> >
> > I like your latest idea regarding the replication/reassignment configs
> > interplay - I think it makes sense for replication to always be higher. A
> > small matrix of possibilities in the KIP may be useful to future readers
> > (users).
> > To be extra clear:
> > 1. if reassignment.throttle is -1, reassignment traffic is counted with
> > replication traffic against replication.throttle.
> > 2. if replication.throttle is 20 and reassignment.throttle is 10, we have
> > a total throttle of 30.
> > Is my understanding correct?
> >
> > Regarding the KIP - the motivation states
> >
> > > So a user is able to specify the partition and the throttle rate but it
> > will be applied to all non-ISR replication traffic. This is undesirable
> > because if a node that is being throttled falls out of ISR it would
> further
> > prevent it from catching up.
> >
> > This KIP does not solve this problem, right?
> > Or did you mean to mention the problem where reassignment replicas would
> > eat up the throttle and further limit the non-ISR "original" replicas
> from
> > catching up?
> >
> > Best,
> > Stanislav
> >
> > On Tue, Dec 10, 2019 at 9:09 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > wrote:
> >
> > > This config will only be applied to those replicas which are
> reassigning
> > > and not yet in ISR. When they become ISR then reassignment throttling
> > stops
> > > altogether and won't apply when they fall out of ISR. Specifically
> > > the validity of the config spans from the point when a reassignment is
> > > propagated by the adding_replicas field in the LeaderAndIsr request
> until
> > > the broker gets another LeaderAndIsr request saying that the 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-13 Thread Viktor Somogyi-Vass
Hey Stan,

1. Yes.

2. Yes and no :). My earlier suggestion was exactly that. In the last reply
to you I meant that if the replication throttle is 20 and the reassignment
throttle is 10 then we'd still have a total throttle of 20, but 10 of that
could be used for general replication and 10 for reassignment. I think
either your version or this one can be a good solution; the main difference
is how you think about reassignment.
If we think that reassignment is a special kind of replication then it
might make sense to treat it as such, and to me it sounds logical that we
sort of count it under the replication quota. It also protects you from
setting too high a value, as you won't be able to configure something
higher than replication.throttled.rate (but then what's left for general
replication?). On the other hand, users may have to increase their
replication.throttled.rate if they want to increase or set their
reassignment quota. That isn't an issue when you treat them as unrelated
quotas, but then you have to keep your total quota
(replication+reassignment) in mind. Also, I don't usually see customers
using replication throttling during normal operation, so for them the
additive version (where 20 + 10 = 30) might be better from an
operational perspective (fewer things to configure).
I'll add this non-additive version to the rejected alternatives.

Considering the mentioned drawbacks I think it's better to go with the
additive one.
The updated KIP:
"The possible configuration variations are:
- replication.throttled.rate is set but reassignment.throttled.rate isn't
(or -1): any kind of replication (so including reassignment) can take up to
replication.throttled.rate bytes.
- replication.throttled.rate and reassignment.throttled.rate both set: both
can use a bandwidth up to the configured limit and the total replication
limit will be reassignment.throttled.rate + replication.throttled.rate
- replication.throttled.rate is not set but reassignment.throttled.rate is
set: in this case general replication has no bandwidth limits but
reassignment.throttled.rate has the configured limits.
- neither replication.throttled.rate nor reassignment.throttled.rate are
set (or -1): no throttling is set on any replication."
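
To make this concrete, here is a rough sketch (illustration only, not code
from the KIP; a rate of -1 means the config is not set) of how a broker
could resolve the two budgets under the additive scheme:

    // Sketch only: budgets under the additive scheme; -1 means "not set".
    class AdditiveThrottleSketch {
        // Budget for non-reassignment (general) replication traffic.
        static long replicationBudget(long replicationRate) {
            return replicationRate < 0 ? Long.MAX_VALUE : replicationRate;
        }
        // Budget for reassignment traffic.
        static long reassignmentBudget(long replicationRate, long reassignmentRate) {
            if (reassignmentRate >= 0)
                return reassignmentRate; // has its own, additive budget
            // not set: reassignment is counted under the general replication budget
            return replicationRate < 0 ? Long.MAX_VALUE : replicationRate;
        }
        // e.g. replicationRate=20, reassignmentRate=10 -> up to 20 + 10 = 30 in total
    }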

3. Yeah, the motivation section might be a bit poorly worded, as both you and
Ismael pointed out problems, so let me rephrase it:
"A user is able to specify the partition and the throttle rate but it will
be applied to all non-ISR replication traffic. This can be undesirable
because during reassignment it also applies to non-reassignment replication
and causes a replica to be throttled if it falls out of ISR. Also, if
leadership changes during reassignment, the throttles have to be changed
manually."

Viktor

On Tue, Dec 10, 2019 at 8:16 PM Stanislav Kozlovski 
wrote:

> Hey Viktor,
>
> I like your latest idea regarding the replication/reassignment configs
> interplay - I think it makes sense for replication to always be higher. A
> small matrix of possibilities in the KIP may be useful to future readers
> (users).
> To be extra clear:
> 1. if reassignment.throttle is -1, reassignment traffic is counted with
> replication traffic against replication.throttle.
> 2. if replication.throttle is 20 and reassignment.throttle is 10, we have a
> total throttle of 30.
> Is my understanding correct?
>
> Regarding the KIP - the motivation states
>
> > So a user is able to specify the partition and the throttle rate but it
> will be applied to all non-ISR replication traffic. This is undesirable
> because if a node that is being throttled falls out of ISR it would further
> prevent it from catching up.
>
> This KIP does not solve this problem, right?
> Or did you mean to mention the problem where reassignment replicas would
> eat up the throttle and further limit the non-ISR "original" replicas from
> catching up?
>
> Best,
> Stanislav
>
> On Tue, Dec 10, 2019 at 9:09 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > This config will only be applied to those replicas which are reassigning
> > and not yet in ISR. When they become ISR then reassignment throttling
> stops
> > altogether and won't apply when they fall out of ISR. Specifically
> > the validity of the config spans from the point when a reassignment is
> > propagated by the adding_replicas field in the LeaderAndIsr request until
> > the broker gets another LeaderAndIsr request saying that the new replica
> is
> > added and in ISR. Furthermore, the config will be applied only to the
> > actual leader and follower, so if the leader changes in the meantime the
> > throttling changes with it (again based on the LeaderAndIsr requests).
> >
> > For instance when a new broker is added to offload some partitions there,
> > it will be safer to use this config instead of general fetch throttling
> for
> > this very reason: when an existing partition that is being reassigned
> falls
> > out of ISR then it will be propagated via the LeaderAndIsr 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-10 Thread Stanislav Kozlovski
Hey Viktor,

I like your latest idea regarding the replication/reassignment configs
interplay - I think it makes sense for replication to always be higher. A
small matrix of possibilities in the KIP may be useful to future readers
(users).
To be extra clear:
1. if reassignment.throttle is -1, reassignment traffic is counted with
replication traffic against replication.throttle.
2. if replication.throttle is 20 and reassignment.throttle is 10, we have a
total throttle of 30.
Is my understanding correct?

Regarding the KIP - the motivation states

> So a user is able to specify the partition and the throttle rate but it
will be applied to all non-ISR replication traffic. This is undesirable
because if a node that is being throttled falls out of ISR it would further
prevent it from catching up.

This KIP does not solve this problem, right?
Or did you mean to mention the problem where reassignment replicas would
eat up the throttle and further limit the non-ISR "original" replicas from
catching up?

Best,
Stanislav

On Tue, Dec 10, 2019 at 9:09 AM Viktor Somogyi-Vass 
wrote:

> This config will only be applied to those replicas which are reassigning
> and not yet in ISR. When they become ISR then reassignment throttling stops
> altogether and won't apply when they fall out of ISR. Specifically
> the validity of the config spans from the point when a reassignment is
> propagated by the adding_replicas field in the LeaderAndIsr request until
> the broker gets another LeaderAndIsr request saying that the new replica is
> added and in ISR. Furthermore, the config will be applied only to the actual
> leader and follower, so if the leader changes in the meantime the
> throttling changes with it (again based on the LeaderAndIsr requests).
>
> For instance when a new broker is added to offload some partitions there,
> it will be safer to use this config instead of general fetch throttling for
> this very reason: when an existing partition that is being reassigned falls
> out of ISR then it will be propagated via the LeaderAndIsr request so
> throttling also changes. This removes the need for changing the configs
> manually and would give an easy way for people to configure throttling yet
> would make better efforts to not throttle what's not needed to be throttled
> (the replica which is falling out of ISR).
>
> Viktor
>
> On Fri, Dec 6, 2019 at 5:12 PM Ismael Juma  wrote:
>
> > My concern is that we're very focused on reassignment where I think users
> > enable throttling to avoid overwhelming brokers with replica catch up
> > traffic (typically disk and/or bandwidth). The current approach achieves
> > that by not throttling ISR replication.
> >
> > The downside is that when a broker falls out of the ISR, it may suddenly
> > get throttled and never catch up. However, if the throttle can cause this
> > kind of issue, then it's broken for replicas being reassigned too, so one
> > could say that it's a configuration error.
> >
> > Do we have specific scenarios that would be solved by the proposed
> change?
> >
> > Ismael
> >
> > On Fri, Dec 6, 2019 at 2:26 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > wrote:
> >
> > > Thanks for the question. I think it depends on how the user will try to
> > fix
> > > it.
> > > - If they just replace the disk then I think it shouldn't count as a
> > > reassignment and should be allocated under the normal replication
> quotas.
> > > In this case there is no reassignment going on as far as I can tell,
> the
> > > broker shuts down serving replicas from that dir/disk, notifies the
> > > controller which changes the leadership. When the disk is fixed the
> > broker
> > > will be restarted to pick up the changes and it starts the replication
> > from
> > > the current leader.
> > > - If the user reassigns the partitions to other brokers then it will
> fall
> > > under the reassignment traffic.
> > > Also if the user moves a partition to a different disk it would also
> > count
> > > as normal replication as it is technically not a reassignment but an
> > > alter-replica-dir event but it's still done with the reassignment tool,
> > so
> > > I'd keep the current functionality of the
> > > --replica-alter-log-dirs-throttle.
> > > Is this aligned with your thinking?
> > >
> > > Viktor
> > >
> > > On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:
> > >
> > > > Thanks Viktor. How do we intend to handle the case where a broker
> loses
> > > its
> > > > disk and has to catch up from the beginning?
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass <
> > > viktorsomo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed.
> > I've
> > > > > updated the KIP.
> > > > >
> > > > > Viktor
> > > > >
> > > > > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi Viktor,
> > > > > >
> > > > > > The KIP states:
> > > > > >
> > > > > > "KIP-73
> > > > > > <
> > > > > >
> > > > >
> > > >
> 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-10 Thread Viktor Somogyi-Vass
This config will only be applied to those replicas which are reassigning
and not yet in ISR. Once they join the ISR, reassignment throttling stops
altogether and won't apply if they later fall out of ISR. Specifically,
the validity of the config spans from the point when a reassignment is
propagated by the adding_replicas field in the LeaderAndIsr request until
the broker gets another LeaderAndIsr request saying that the new replica is
added and in ISR. Furthermore, the config will be applied only to the actual
leader and follower, so if the leader changes in the meantime the
throttling changes with it (again based on the LeaderAndIsr requests).

For instance, when a new broker is added to offload some partitions onto it,
it will be safer to use this config instead of general fetch throttling for
this very reason: when an existing partition that is being reassigned falls
out of ISR, that will be propagated via the LeaderAndIsr request, so the
throttling also changes. This removes the need to change the configs
manually, would give people an easy way to configure throttling, and would
make a better effort not to throttle what doesn't need to be throttled
(the replica which is falling out of ISR).
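
To illustrate the intended scope (sketch only, made-up names, not actual
broker code):

    import java.util.Set;

    // A replica is subject to the reassignment throttle only while it is listed
    // in adding_replicas of the latest LeaderAndIsr state for the partition and
    // has not yet joined the ISR.
    class ReassignmentThrottleScopeSketch {
        static boolean subjectToReassignmentThrottle(int replicaId,
                                                     Set<Integer> addingReplicas,
                                                     Set<Integer> isr) {
            return addingReplicas.contains(replicaId) && !isr.contains(replicaId);
        }
    }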

Viktor

On Fri, Dec 6, 2019 at 5:12 PM Ismael Juma  wrote:

> My concern is that we're very focused on reassignment where I think users
> enable throttling to avoid overwhelming brokers with replica catch up
> traffic (typically disk and/or bandwidth). The current approach achieves
> that by not throttling ISR replication.
>
> The downside is that when a broker falls out of the ISR, it may suddenly
> get throttled and never catch up. However, if the throttle can cause this
> kind of issue, then it's broken for replicas being reassigned too, so one
> could say that it's a configuration error.
>
> Do we have specific scenarios that would be solved by the proposed change?
>
> Ismael
>
> On Fri, Dec 6, 2019 at 2:26 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Thanks for the question. I think it depends on how the user will try to
> fix
> > it.
> > - If they just replace the disk then I think it shouldn't count as a
> > reassignment and should be allocated under the normal replication quotas.
> > In this case there is no reassignment going on as far as I can tell, the
> > broker shuts down serving replicas from that dir/disk, notifies the
> > controller which changes the leadership. When the disk is fixed the
> broker
> > will be restarted to pick up the changes and it starts the replication
> from
> > the current leader.
> > - If the user reassigns the partitions to other brokers then it will fall
> > under the reassignment traffic.
> > Also if the user moves a partition to a different disk it would also
> count
> > as normal replication as it is technically not a reassignment but an
> > alter-replica-dir event but it's still done with the reassignment tool,
> so
> > I'd keep the current functionality of the
> > --replica-alter-log-dirs-throttle.
> > Is this aligned with your thinking?
> >
> > Viktor
> >
> > On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:
> >
> > > Thanks Viktor. How do we intend to handle the case where a broker loses
> > its
> > > disk and has to catch up from the beginning?
> > >
> > > Ismael
> > >
> > > On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed.
> I've
> > > > updated the KIP.
> > > >
> > > > Viktor
> > > >
> > > > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma 
> wrote:
> > > >
> > > > > Hi Viktor,
> > > > >
> > > > > The KIP states:
> > > > >
> > > > > "KIP-73
> > > > > <
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > > > > >
> > > > > added
> > > > > quotas for replication but it doesn't separate normal replication
> > > traffic
> > > > > from reassignment. So a user is able to specify the partition and
> the
> > > > > throttle rate but it will be applied to both ISR and non-ISR
> traffic"
> > > > >
> > > > > This is not true, ISR traffic is not throttled.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> > > > > viktorsomo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi People,
> > > > > >
> > > > > > I've created a KIP to improve replication quotas by handling
> > > > reassignment
> > > > > > related throttling as a separate case with its own configurable
> > > limits
> > > > > and
> > > > > > change the kafka-reassign-partitions tool to use these new
> configs
> > > > going
> > > > > > forward.
> > > > > > Please have a look, I'd be happy to receive any feedback and
> answer
> > > > > > all your questions.
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > > > > >
> > > > > > Thanks,
> > > > > > Viktor
> > > > > >
> > > > >
> > 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-06 Thread Ismael Juma
My concern is that we're very focused on reassignment where I think users
enable throttling to avoid overwhelming brokers with replica catch up
traffic (typically disk and/or bandwidth). The current approach achieves
that by not throttling ISR replication.

The downside is that when a broker falls out of the ISR, it may suddenly
get throttled and never catch up. However, if the throttle can cause this
kind of issue, then it's broken for replicas being reassigned too, so one
could say that it's a configuration error.

Do we have specific scenarios that would be solved by the proposed change?

Ismael

On Fri, Dec 6, 2019 at 2:26 AM Viktor Somogyi-Vass 
wrote:

> Thanks for the question. I think it depends on how the user will try to fix
> it.
> - If they just replace the disk then I think it shouldn't count as a
> reassignment and should be allocated under the normal replication quotas.
> In this case there is no reassignment going on as far as I can tell, the
> broker shuts down serving replicas from that dir/disk, notifies the
> controller which changes the leadership. When the disk is fixed the broker
> will be restarted to pick up the changes and it starts the replication from
> the current leader.
> - If the user reassigns the partitions to other brokers then it will fall
> under the reassignment traffic.
> Also if the user moves a partition to a different disk it would also count
> as normal replication as it is technically not a reassignment but an
> alter-replica-dir event but it's still done with the reassignment tool, so
> I'd keep the current functionality of the
> --replica-alter-log-dirs-throttle.
> Is this aligned with your thinking?
>
> Viktor
>
> On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:
>
> > Thanks Viktor. How do we intend to handle the case where a broker loses
> its
> > disk and has to catch up from the beginning?
> >
> > Ismael
> >
> > On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> > wrote:
> >
> > > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed. I've
> > > updated the KIP.
> > >
> > > Viktor
> > >
> > > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma  wrote:
> > >
> > > > Hi Viktor,
> > > >
> > > > The KIP states:
> > > >
> > > > "KIP-73
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > > > >
> > > > added
> > > > quotas for replication but it doesn't separate normal replication
> > traffic
> > > > from reassignment. So a user is able to specify the partition and the
> > > > throttle rate but it will be applied to both ISR and non-ISR traffic"
> > > >
> > > > This is not true, ISR traffic is not throttled.
> > > >
> > > > Ismael
> > > >
> > > > On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> > > > viktorsomo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi People,
> > > > >
> > > > > I've created a KIP to improve replication quotas by handling
> > > reassignment
> > > > > related throttling as a separate case with its own configurable
> > limits
> > > > and
> > > > > change the kafka-reassign-partitions tool to use these new configs
> > > going
> > > > > forward.
> > > > > Please have a look, I'd be happy to receive any feedback and answer
> > > > > all your questions.
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > > > >
> > > > > Thanks,
> > > > > Viktor
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-06 Thread Viktor Somogyi-Vass
Thanks for the question. I think it depends on how the user will try to fix
it.
- If they just replace the disk then I think it shouldn't count as a
reassignment and should be allocated under the normal replication quotas.
In this case there is no reassignment going on as far as I can tell: the
broker stops serving replicas from that dir/disk and notifies the
controller, which changes the leadership. When the disk is fixed, the broker
will be restarted to pick up the changes and it starts replicating from
the current leader.
- If the user reassigns the partitions to other brokers then it will fall
under the reassignment traffic.
Also, if the user moves a partition to a different disk, it would count
as normal replication, as it is technically not a reassignment but an
alter-replica-dir event; but since it's still done with the reassignment
tool, I'd keep the current functionality of --replica-alter-log-dirs-throttle.
Is this aligned with your thinking?
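
For context, this is roughly how the tool is invoked today (the numbers are
arbitrary): --throttle sets the inter-broker replication throttle, which is
the part that would move to the new reassignment configs, while
--replica-alter-log-dirs-throttle would keep working as it does now.

    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
      --reassignment-json-file reassign.json --execute \
      --throttle 50000000 \
      --replica-alter-log-dirs-throttle 100000000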

Viktor

On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:

> Thanks Viktor. How do we intend to handle the case where a broker loses its
> disk and has to catch up from the beginning?
>
> Ismael
>
> On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass 
> wrote:
>
> > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed. I've
> > updated the KIP.
> >
> > Viktor
> >
> > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma  wrote:
> >
> > > Hi Viktor,
> > >
> > > The KIP states:
> > >
> > > "KIP-73
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > > >
> > > added
> > > quotas for replication but it doesn't separate normal replication
> traffic
> > > from reassignment. So a user is able to specify the partition and the
> > > throttle rate but it will be applied to both ISR and non-ISR traffic"
> > >
> > > This is not true, ISR traffic is not throttled.
> > >
> > > Ismael
> > >
> > > On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> > > viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Hi People,
> > > >
> > > > I've created a KIP to improve replication quotas by handling
> > reassignment
> > > > related throttling as a separate case with its own configurable
> limits
> > > and
> > > > change the kafka-reassign-partitions tool to use these new configs
> > going
> > > > forward.
> > > > Please have a look, I'd be happy to receive any feedback and answer
> > > > all your questions.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > > >
> > > > Thanks,
> > > > Viktor
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-04 Thread Ismael Juma
Thanks Viktor. How do we intend to handle the case where a broker loses its
disk and has to catch up from the beginning?

Ismael

On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass 
wrote:

> Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed. I've
> updated the KIP.
>
> Viktor
>
> On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma  wrote:
>
> > Hi Viktor,
> >
> > The KIP states:
> >
> > "KIP-73
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > >
> > added
> > quotas for replication but it doesn't separate normal replication traffic
> > from reassignment. So a user is able to specify the partition and the
> > throttle rate but it will be applied to both ISR and non-ISR traffic"
> >
> > This is not true, ISR traffic is not throttled.
> >
> > Ismael
> >
> > On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > wrote:
> >
> > > Hi People,
> > >
> > > I've created a KIP to improve replication quotas by handling
> reassignment
> > > related throttling as a separate case with its own configurable limits
> > and
> > > change the kafka-reassign-partitions tool to use these new configs
> going
> > > forward.
> > > Please have a look, I'd be happy to receive any feedback and answer
> > > all your questions.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > >
> > > Thanks,
> > > Viktor
> > >
> >
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-04 Thread Viktor Somogyi-Vass
Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed. I've
updated the KIP.

Viktor

On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma  wrote:

> Hi Viktor,
>
> The KIP states:
>
> "KIP-73
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> >
> added
> quotas for replication but it doesn't separate normal replication traffic
> from reassignment. So a user is able to specify the partition and the
> throttle rate but it will be applied to both ISR and non-ISR traffic"
>
> This is not true, ISR traffic is not throttled.
>
> Ismael
>
> On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Hi People,
> >
> > I've created a KIP to improve replication quotas by handling reassignment
> > related throttling as a separate case with its own configurable limits
> and
> > change the kafka-reassign-partitions tool to use these new configs going
> > forward.
> > Please have a look, I'd be happy to receive any feedback and answer
> > all your questions.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> >
> > Thanks,
> > Viktor
> >
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-03 Thread Ismael Juma
Hi Viktor,

The KIP states:

"KIP-73

added
quotas for replication but it doesn't separate normal replication traffic
from reassignment. So a user is able to specify the partition and the
throttle rate but it will be applied to both ISR and non-ISR traffic"

This is not true, ISR traffic is not throttled.

Ismael

On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass 
wrote:

> Hi People,
>
> I've created a KIP to improve replication quotas by handling reassignment
> related throttling as a separate case with its own configurable limits and
> change the kafka-reassign-partitions tool to use these new configs going
> forward.
> Please have a look, I'd be happy to receive any feedback and answer
> all your questions.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
>
> Thanks,
> Viktor
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-03 Thread Viktor Somogyi-Vass
Hi Stan,

I was about to start a vote on this one but I think I have one more idea
regarding your last point about the total cap.
What if we said that (leader|follower).replication.throttled.rate is
the overall limit which we allow for replication (so the total
cap) and (leader|follower).reassignment.throttled.rate must have a value
lower than that? By default it'd be -1, which would mean that
replication.throttled.rate should be applied (so the backward-compatible
behavior). Increasing this value would mean that we put a limit on
reassignment throttling. For other replication traffic,
replication.throttled.rate - reassignment.throttled.rate would be applied.
If replication.throttled.rate is not specified but
reassignment.throttled.rate is specified, then the reassignment is bounded
and other replication traffic isn't. Finally,
replication.throttled.replicas would be applied to reassignment too if
specified, so the reassignment won't "escape" the boundaries given by the
replication throttle side.
I think this is a fair solution to the total cap problem and would be
aligned with the current config.
What do you think?
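
To spell out my reading of it (sketch only, not text from the KIP; a rate of
-1 means "not set", and replication.throttled.rate acts as the total cap):

    // Sketch: budgets under the "replication rate is the total cap" scheme.
    class TotalCapSketch {
        static long reassignmentBudget(long replicationRate, long reassignmentRate) {
            if (reassignmentRate < 0) // default: backward-compatible behavior
                return replicationRate < 0 ? Long.MAX_VALUE : replicationRate;
            return reassignmentRate;  // must be configured lower than replicationRate
        }
        static long otherReplicationBudget(long replicationRate, long reassignmentRate) {
            if (replicationRate < 0) return Long.MAX_VALUE;   // no total cap configured
            if (reassignmentRate < 0) return replicationRate; // everything shares the old limit
            return replicationRate - reassignmentRate;        // what's left of the total cap
        }
    }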

Viktor

On Mon, Nov 4, 2019 at 3:55 PM Viktor Somogyi-Vass 
wrote:

> Exactly. I also can't envision scenarios where we would like to throttle
> the reassignment traffic to only a subset of the reassigning replicas.
>
> The other day I was wondering about that with specialized quotas we can
> solve the incremental partition reassignment too. Basically the controller
> would throttle most of the partitions to 0 and let only some of them
> reassign, but I discarded the idea because it is more intuitive to actually
> break up a big reassignment into smaller steps (and more traceable too).
> But perhaps there is a need for throttling the reassigning replicas
> differently depending on the produce rate on those partitions, however in
> my mind I was planning with the incremental partition reassignment so
> perhaps it'd be the best if the controller would be able to decide how many
> partition can be fitted into the given bandwidth and we'd just expose
> simple configs.
>
> If we always take the lowest value, this means that the reassignment
> throttle must always be equal to or lower than the replication throttle.
> Doesn't that mean that the reassigning partitions may never catch up? I
> guess not, since we expect to always be moving less than the total number
> of partitions at one time.
> I have mixed feelings about this - I like the flexibility of being able to
> configure whatever value we please, yet I struggle to come up with a
> scenario where we would want a higher reassignment throttle than
> replication. Perhaps your suggestion is better.
>
> Yes, it could mean that; however, my concern with preferring reassignment
> quotas is that it could cause the "bootstrapping broker problem", so the
> sum of follower reassignment + replication quotas would eat away the
> bandwidth from the leaders. In this case I think it's a better problem to
> have a reassignment that you can't finish than leaders unable to answer
> fetch requests fast enough. The reassignment problem can be mitigated
> by carefully increasing the replication & reassignment quotas
> for the given partition. I'll set up a test environment for this
> though and get back if something doesn't add up.
>
> This begs another question - since we're separating the replication
> throttle from the reassignment throttle, the maximum traffic a broker may
> replicate now becomes `replication.throttled.rate` + `
> reassignment.throttled.rate`
> Seems like we would benefit from having a total cap to ensure users don't
> shoot themselves in the foot.
>
> We could have a new config that denotes the total possible throttle rate
> and we then divide that by reassignment and replication. But that assumes
> that we would set the replication.throttled.rate much lower than what the
> broker could handle.
>
> Perhaps the best approach would be to denote how much the broker can handle
> (total.replication.throttle.rate) and then allow only up to N% of that to go
> towards reassignments (reassignment.throttled.rate) in a best-effort way
> (preferring replication traffic). That sounds tricky to implement, though.
> Interested to hear what others think.
>
> Good catch. I'm also leaning towards having simpler configs and
> improving the broker/controller code to make more intelligent decisions. I
> also agree with having a total.replication.throttle.rate but I think we
> should stay with the byte based notation as that is more conventional in
> the quota world and easier to handle. That way you can say that your total
> replication quota is 10, your leader and follower replication quota is 3
> each, the reassignment ones are 2 each and then you maxed out your limit.
> We can print warnings/errors if the overall value doesn't match up to the
> max.
>
> Viktor
>
> On Mon, Nov 4, 2019 at 12:27 PM Stanislav Kozlovski <
> 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-11-04 Thread Viktor Somogyi-Vass
Exactly. I also can't envision scenarios where we would like to throttle
the reassignment traffic to only a subset of the reassigning replicas.

The other day I was wondering whether, with specialized quotas, we could
solve incremental partition reassignment too. Basically the controller
would throttle most of the partitions to 0 and let only some of them
reassign, but I discarded the idea because it is more intuitive to actually
break up a big reassignment into smaller steps (and more traceable too).
But perhaps there is a need for throttling the reassigning replicas
differently depending on the produce rate on those partitions; however, in
my mind I was planning for incremental partition reassignment, so
perhaps it'd be best if the controller could decide how many
partitions can be fitted into the given bandwidth and we'd just expose
simple configs.

If we always take the lowest value, this means that the reassignment
throttle must always be equal to or lower than the replication throttle.
Doesn't that mean that the reassigning partitions may never catch up? I
guess not, since we expect to always be moving less than the total number
of partitions at one time.
I have mixed feelings about this - I like the flexibility of being able to
configure whatever value we please, yet I struggle to come up with a
scenario where we would want a higher reassignment throttle than
replication. Perhaps your suggestion is better.

Yes, it could mean that; however, my concern with preferring reassignment
quotas is that it could cause the "bootstrapping broker problem", so the sum
of follower reassignment + replication quotas would eat away the bandwidth
from the leaders. In this case I think it's a better problem to have a
reassignment that you can't finish than leaders unable to answer fetch
requests fast enough. The reassignment problem can be mitigated by carefully
increasing the replication & reassignment quotas for the given partition.
I'll set up a test environment for this though and get back if something
doesn't add up.

This begs another question - since we're separating the replication
throttle from the reassignment throttle, the maximum traffic a broker may
replicate now becomes `replication.throttled.rate` + `
reassignment.throttled.rate`
Seems like we would benefit from having a total cap to ensure users don't
shoot themselves in the foot.

We could have a new config that denotes the total possible throttle rate
and we then divide that by reassignment and replication. But that assumes
that we would set the replication.throttled.rate much lower than what the
broker could handle.

Perhaps the best approach would be to denote how much the broker can handle
(total.replication.throttle.rate) and then allow only up to N% of that to go
towards reassignments (reassignment.throttled.rate) in a best-effort way
(preferring replication traffic). That sounds tricky to implement, though.
Interested to hear what others think.

Good catch. I'm also leaning towards having simpler configs and
improving the broker/controller code to make more intelligent decisions. I
also agree with having a total.replication.throttle.rate, but I think we
should stay with the byte-based notation as that is more conventional in
the quota world and easier to handle. That way you can say that your total
replication quota is 10, your leader and follower replication quotas are 3
each, the reassignment ones are 2 each, and then you've maxed out your limit.
We can print warnings/errors if the overall value doesn't add up to the
max.
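
Spelled out with the config names discussed in this thread (note that
total.replication.throttle.rate is only a proposed name, it doesn't exist
today, and the reassignment configs are the ones this KIP would add):

    total.replication.throttle.rate=10
    leader.replication.throttled.rate=3
    follower.replication.throttled.rate=3
    leader.reassignment.throttled.rate=2
    follower.reassignment.throttled.rate=2
    # 3 + 3 + 2 + 2 = 10, so the per-type limits add up exactly to the total cap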

Viktor

On Mon, Nov 4, 2019 at 12:27 PM Stanislav Kozlovski 
wrote:

> Hi Viktor,
>
> > As for the first question, I think there is no need for *.throttled.replicas in
> case of reassignment because the LeaderAndIsrRequest exactly specifies the
> replicas needed to be throttled.
>
> Exactly. I also can't envision scenarios where we would like to throttle
> the reassignment traffic to only a subset of the reassigning replicas.
>
> > For instance, on a bootstrapping server where all replicas are throttled
> and there are reassigning replicas with the reassignment throttle set higher,
> I think we should still apply the replication throttle to ensure the broker
> won't have problems. What do you think?
>
> If we always take the lowest value, this means that the reassignment
> throttle must always be equal to or lower than the replication throttle.
> Doesn't that mean that the reassigning partitions may never catch up? I
> guess not, since we expect to always be moving less than the total number
> of partitions at one time.
> I have mixed feelings about this - I like the flexibility of being able to
> configure whatever value we please, yet I struggle to come up with a
> scenario where we would want a higher reassignment throttle than
> replication. Perhaps your suggestion is better.
>
> This begs another question - since we're separating the replication
> throttle from the reassignment throttle, the 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-11-04 Thread Stanislav Kozlovski
Hi Viktor,

> As for the first question, I think there is no need for *.throttled.replicas in
case of reassignment because the LeaderAndIsrRequest exactly specifies the
replicas needed to be throttled.

Exactly. I also can't envision scenarios where we would like to throttle
the reassignment traffic to only a subset of the reassigning replicas.

> For instance, on a bootstrapping server where all replicas are throttled and
there are reassigning replicas with the reassignment throttle set higher, I
think we should still apply the replication throttle to ensure the broker
won't have problems. What do you think?

If we always take the lowest value, this means that the reassignment
throttle must always be equal to or lower than the replication throttle.
Doesn't that mean that the reassigning partitions may never catch up? I
guess not, since we expect to always be moving less than the total number
of partitions at one time.
I have mixed feelings about this - I like the flexibility of being able to
configure whatever value we please, yet I struggle to come up with a
scenario where we would want a higher reassignment throttle than
replication. Perhaps your suggestion is better.

This begs another question - since we're separating the replication
throttle from the reassignment throttle, the maximum traffic a broker may
replicate now becomes `replication.throttled.rate` + `
reassignment.throttled.rate`
Seems like we would benefit from having a total cap to ensure users don't
shoot themselves in the foot.

We could have a new config that denotes the total possible throttle rate
and we then divide that by reassignment and replication. But that assumes
that we would set the replication.throttled.rate much lower than what the
broker could handle.

Perhaps the best approach would be to denote how much the broker can handle
(total.replication.throttle.rate) and then allow only up to N% of that to go
towards reassignments (reassignment.throttled.rate) in a best-effort way
(preferring replication traffic). That sounds tricky to implement, though.
Interested to hear what others think.
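
Roughly what I have in mind (sketch only, names are not final):

    // At most reassignmentPercent of the broker-wide cap may go to reassignment
    // traffic, granted best-effort after regular replication traffic is served.
    class PercentageCapSketch {
        static long reassignmentBudget(long totalReplicationRate, int reassignmentPercent,
                                       long currentReplicationUsage) {
            long cap = totalReplicationRate * reassignmentPercent / 100;
            long leftover = Math.max(0L, totalReplicationRate - currentReplicationUsage);
            return Math.min(cap, leftover); // replication traffic is preferred
        }
    }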

Best,
Stanislav


On Mon, Nov 4, 2019 at 11:08 AM Viktor Somogyi-Vass 
wrote:

> Hey Stan,
>
> > We will introduce two new configs in order to eventually replace
> *.replication.throttled.rate.
> Just to clarify, you mean to replace said config in the context of
> reassignment throttling, right? We are not planning to remove that config
>
> Yes, I don't want to remove that config either. Removed that sentence.
>
> And also to clarify, *.throttled.replicas will not apply to the new
> *reassignment* configs, correct? We will throttle all reassigning replicas.
> (I am +1 on this, I believe it is easier to reason about. We could always
> add a new config later)
>
> Are you asking whether there is a need for a
> leader.reassignment.throttled.replicas and
> follower.reassignment.throttled.replicas config or are you interested in
> the behavior between the old and the new configs?
> As for the first question, I think there is no need for *.throttled.replicas in
> case of reassignment because the LeaderAndIsrRequest exactly specifies the
> replicas needed to be throttled.
> As for the second, see below.
>
> I have one comment about backwards-compatibility - should we ensure that
> the old `*.replication.throttled.rate` and `*.throttled.replicas` still
> apply to reassigning traffic if set? We could have the new config take
> precedence, but still preserve backwards compatibility.
>
> Sure, we should apply replication throttling to reassignment too if set.
> But instead of the new config taking precedence, I'd apply whichever has the
> lower value.
> For instance, on a bootstrapping server where all replicas are throttled and
> there are reassigning replicas with the reassignment throttle set higher, I
> think we should still apply the replication throttle to ensure the broker
> won't have problems. What do you think?
>
> Thanks,
> Viktor
>
>
> On Fri, Nov 1, 2019 at 9:57 AM Stanislav Kozlovski  >
> wrote:
>
> > Hey Viktor. Thanks for the KIP!
> >
> > > We will introduce two new configs in order to eventually replace
> > *.replication.throttled.rate.
> > Just to clarify, you mean to replace said config in the context of
> > reassignment throttling, right? We are not planning to remove that config
> >
> > And also to clarify, *.throttled.replicas will not apply to the new
> > *reassignment* configs, correct? We will throttle all reassigning
> replicas.
> > (I am +1 on this, I believe it is easier to reason about. We could always
> > add a new config later)
> >
> > I have one comment about backwards-compatibility - should we ensure that
> > the old `*.replication.throttled.rate` and `*.throttled.replicas` still
> > apply to reassigning traffic if set? We could have the new config take
> > precedence, but still preserve backwards compatibility.
> >
> > Thanks,
> > Stanislav
> >
> > On Thu, Oct 24, 2019 at 1:38 PM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > wrote:
> >
> > > Hi People,
> 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-11-04 Thread Viktor Somogyi-Vass
Hey Stan,

> We will introduce two new configs in order to eventually replace
*.replication.throttled.rate.
Just to clarify, you mean to replace said config in the context of
reassignment throttling, right? We are not planning to remove that config

Yes, I don't want to remove that config either. Removed that sentence.

And also to clarify, *.throttled.replicas will not apply to the new
*reassignment* configs, correct? We will throttle all reassigning replicas.
(I am +1 on this, I believe it is easier to reason about. We could always
add a new config later)

Are you asking whether there is a need for a
leader.reassignment.throttled.replicas and
follower.reassignment.throttled.replicas config or are you interested in
the behavior between the old and the new configs?
As for the first question, I think there is no need for *.throttled.replicas in
case of reassignment because the LeaderAndIsrRequest exactly specifies the
replicas needed to be throttled.
As for the second, see below.

I have one comment about backwards-compatibility - should we ensure that
the old `*.replication.throttled.rate` and `*.throttled.replicas` still
apply to reassigning traffic if set? We could have the new config take
precedence, but still preserve backwards compatibility.

Sure, we should apply replication throttling to reassignment too if set.
But instead of the new config taking precedence, I'd apply whichever has the
lower value.
For instance, on a bootstrapping server where all replicas are throttled and
there are reassigning replicas with the reassignment throttle set higher, I
think we should still apply the replication throttle to ensure the broker
won't have problems. What do you think?
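
A sketch of the "whichever is lower" resolution I have in mind (illustration
only; -1 means the config is not set):

    // Effective throttle for a reassigning replica when both the old replication
    // throttle and the new reassignment throttle could apply: take the stricter one.
    class LowerValueWinsSketch {
        static long effectiveThrottle(long replicationRate, long reassignmentRate) {
            if (replicationRate < 0) return reassignmentRate;  // may itself be -1 (no throttle)
            if (reassignmentRate < 0) return replicationRate;
            return Math.min(replicationRate, reassignmentRate);
        }
    }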

Thanks,
Viktor


On Fri, Nov 1, 2019 at 9:57 AM Stanislav Kozlovski 
wrote:

> Hey Viktor. Thanks for the KIP!
>
> > We will introduce two new configs in order to eventually replace
> *.replication.throttled.rate.
> Just to clarify, you mean to replace said config in the context of
> reassignment throttling, right? We are not planning to remove that config
>
> And also to clarify, *.throttled.replicas will not apply to the new
> *reassignment* configs, correct? We will throttle all reassigning replicas.
> (I am +1 on this, I believe it is easier to reason about. We could always
> add a new config later)
>
> I have one comment about backwards-compatibility - should we ensure that
> the old `*.replication.throttled.rate` and `*.throttled.replicas` still
> apply to reassigning traffic if set? We could have the new config take
> precedence, but still preserve backwards compatibility.
>
> Thanks,
> Stanislav
>
> On Thu, Oct 24, 2019 at 1:38 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Hi People,
> >
> > I've created a KIP to improve replication quotas by handling reassignment
> > related throttling as a separate case with its own configurable limits
> and
> > change the kafka-reassign-partitions tool to use these new configs going
> > forward.
> > Please have a look, I'd be happy to receive any feedback and answer
> > all your questions.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> >
> > Thanks,
> > Viktor
> >
>
>
> --
> Best,
> Stanislav
>


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-11-01 Thread Stanislav Kozlovski
Hey Viktor. Thanks for the KIP!

> We will introduce two new configs in order to eventually replace
*.replication.throttled.rate.
Just to clarify, you mean to replace said config in the context of
reassignment throttling, right? We are not planning to remove that config

And also to clarify, *.throttled.replicas will not apply to the new
*reassignment* configs, correct? We will throttle all reassigning replicas.
(I am +1 on this, I believe it is easier to reason about. We could always
add a new config later)

I have one comment about backwards-compatibility - should we ensure that
the old `*.replication.throttled.rate` and `*.throttled.replicas` still
apply to reassigning traffic if set? We could have the new config take
precedence, but still preserve backwards compatibility.

Thanks,
Stanislav

On Thu, Oct 24, 2019 at 1:38 PM Viktor Somogyi-Vass 
wrote:

> Hi People,
>
> I've created a KIP to improve replication quotas by handling reassignment
> related throttling as a separate case with its own configurable limits and
> change the kafka-reassign-partitions tool to use these new configs going
> forward.
> Please have a look, I'd be happy to receive any feedback and answer
> all your questions.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
>
> Thanks,
> Viktor
>


-- 
Best,
Stanislav


[DISCUSS] KIP-542: Partition Reassignment Throttling

2019-10-24 Thread Viktor Somogyi-Vass
Hi People,

I've created a KIP to improve replication quotas by handling reassignment
related throttling as a separate case with its own configurable limits and
change the kafka-reassign-partitions tool to use these new configs going
forward.
Please have a look, I'd be happy to receive any feedback and answer
all your questions.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling

Thanks,
Viktor