[jira] [Commented] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-12 Thread Andreas (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323763#comment-16323763
 ] 

Andreas commented on KAFKA-6442:


For what it's worth, restarting node4 (for maintenance) seems to have got 
everything unstuck. I guess it triggered a leader re-election internally?

> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Andreas
>
> PS. I classified this as a bug because I think the cluster should not get 
> stuck in this situation; apologies if that is wrong.
> Hi,
> I found myself in a situation that is a bit difficult to explain, so I will 
> skip how I ended up here and go straight to the problem.
> Some of the brokers in my cluster are permanently gone. Consequently, some 
> partitions had offline leaders, so I used 
> {{kafka-reassign-partitions.sh}} to rebalance my topics, and for the most part 
> that worked. It did not work for partitions whose leaders, replicas and ISRs 
> were all on the gone brokers. Those got stuck halfway through, in a state 
> that now looks like
> Topic: topicA Partition: 32 Leader: -1 Replicas: 1,6,2,7,3,8 Isr: 
> (1,2,3 are legit, 6,7,8 permanently gone)
> So the first catch-22 is that I cannot elect a new leader, because the 
> leader needs to be elected from the ISR, and I cannot recreate the ISR 
> because the partition has no leader.
> The second catch-22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
> because the previous run is supposedly still in progress, and I cannot 
> increase the number of partitions to account for the now permanently offline 
> partitions, because that produces the error {{Error while executing 
> topic command requirement failed: All partitions should have the same number 
> of replicas.}}, from which I cannot recover because I cannot run 
> {{kafka-reassign-partitions.sh}}.
> Is there a way to recover from such a situation?
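
For reference, the target state the reporter is after (partition 32 of topicA hosted only on the surviving brokers) can be written down as a reassignment file of the kind that {{kafka-reassign-partitions.sh}} reads via --reassignment-json-file. The following is only a sketch in Python under stated assumptions: the helper name build_reassignment is hypothetical, the live broker list is taken from the ZooKeeper output quoted elsewhere in this thread, and generating the file does not by itself get past the "still in progress" check described above.

```python
import json


def build_reassignment(topic, partition, replicas, live_brokers):
    """Return a reassignment plan that keeps only replicas on live brokers.

    build_reassignment is a hypothetical helper for illustration; it is not
    part of Kafka. The dict layout follows the JSON format accepted by
    kafka-reassign-partitions.sh.
    """
    live = set(live_brokers)
    # Preserve the original replica order so the preferred leader stays first.
    kept = [broker for broker in replicas if broker in live]
    if not kept:
        raise ValueError("no replica of this partition is on a live broker")
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": kept}
        ],
    }


# Values from the issue description: replicas 1,6,2,7,3,8 with 6,7,8 gone.
plan = build_reassignment("topicA", 32, [1, 6, 2, 7, 3, 8], [1, 2, 3, 4])
print(json.dumps(plan, indent=2))
```

The sketch keeps the replica order because the first entry in the list is the preferred leader; whether the cluster accepts the plan still depends on the earlier reassignment no longer being marked as in progress.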



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-12 Thread Andreas (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323747#comment-16323747
 ] 

Andreas commented on KAFKA-6442:


Thanks for the reply. I am afraid "unclean.leader.election.enable" is not set 
at all, so it should default to true.
Running ./zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids" returns

WatchedEvent state:SyncConnected type:None path:null
[1, 2, 3, 4]

which is legit.
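
To cross-check that broker list against the stuck partition, the partition line from the issue description can be compared with the ids registered in ZooKeeper. This is a minimal sketch in Python, not part of any Kafka tooling; the function name diagnose_partition and the keys of the returned dict are made up for illustration, and "stuck" here simply means "leader is -1 and the ISR is empty".

```python
import re


def _to_ids(csv):
    """Turn '1,6,2' into [1, 6, 2]; an empty string gives an empty list."""
    return [int(item) for item in csv.split(",") if item]


def diagnose_partition(describe_line, live_brokers):
    """Split a partition's replicas into live and dead brokers.

    describe_line is one partition line in the 'Key: value' style shown in
    the issue description. diagnose_partition is a hypothetical helper.
    """
    fields = dict(re.findall(r"(\w+):[ \t]*(\S*)", describe_line))
    live = set(live_brokers)
    replicas = _to_ids(fields.get("Replicas", ""))
    isr = _to_ids(fields.get("Isr", ""))
    leader = int(fields["Leader"])
    return {
        "topic": fields["Topic"],
        "partition": int(fields["Partition"]),
        "leader": leader,
        "isr": isr,
        "live_replicas": [b for b in replicas if b in live],
        "dead_replicas": [b for b in replicas if b not in live],
        # No leader and nothing in sync: the state described in the issue.
        "stuck": leader == -1 and not isr,
    }


line = "Topic: topicA Partition: 32 Leader: -1 Replicas: 1,6,2,7,3,8 Isr: "
print(diagnose_partition(line, [1, 2, 3, 4]))
```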



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

*Topic: topicA  Partition: 32  Leader: -1 
Replicas: 1,6,2,7,3,8  Isr: *
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

Topic: topicA  Partition: 32  Leader: -1 
Replicas: 1,6,2,7,3,8  Isr: 
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

Topic: topicA  Partition: 32  Leader: -1 
Replicas: 1,6,2,7,3,8  Isr: 
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1 }}{{ 
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

Topic: topicA  Partition: 32  Leader: -1 
Replicas: 1,6,2,7,3,8  Isr: 
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

*Topic: topicA  Partition: 32  Leader: -1 
Replicas: 1,6,2,7,3,8  Isr: *
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1 }}{{ 
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1 }} 
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1 }} 
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, rs and irs completely in the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  
Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs completely on the gone brokers. Those got stuck 
halfway through, in what now looks like

{{Topic: topicA  Partition: 32  Leader: -1  Replicas: 1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 



[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Fix Version/s: (was: 0.8.2.1)

> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Andreas
>
> PS. I classified this as a bug because I think the cluster should not be 
> stuck in that situation, apologies if that is wrong.
> Hi,
> I found myself in a situation a bit difficult to explain so I will skip the 
> how I ended up in this situation, but here is the problem.
> Some of the brokers of my cluster are permanently gone. Consequently, I had 
> some partitions that now had offline leaders etc so, I used the 
> {{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
> that worked ok. Where that did not work ok, was for partitions that had 
> leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
> through to what now looks like
> {{Topic: topicA  Partition: 32  Leader: -1   Replicas: 
> 1,6,2,7,3,8  Isr: }}
> (1,2,3 are legit, 6,7,8 permanently gone)
> So the first catch 22, is that I cannot elect a new leader, because the 
> leader needs to be elected from the ISR, and I cannot recreate the ISR 
> because the topic has no leader.
> The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
> because the previous one is supposedly still in progress, and I cannot 
> increase the number of partitions to account for the now permanently offline 
> partitions, because that produces the following error {{Error while executing 
> topic command requirement failed: All partitions should have the same number 
> of replicas.}}, from which I cannot recover because I cannot run 
> {{kafka-reassign-partitions.sh}}.
> Is there a way to recover from such a situation? 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Affects Version/s: 0.8.2.1

> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Andreas
>
> PS. I classified this as a bug because I think the cluster should not be 
> stuck in that situation, apologies if that is wrong.
> Hi,
> I found myself in a situation a bit difficult to explain so I will skip the 
> how I ended up in this situation, but here is the problem.
> Some of the brokers of my cluster are permanently gone. Consequently, I had 
> some partitions that now had offline leaders etc so, I used the 
> {{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
> that worked ok. Where that did not work ok, was for partitions that had 
> leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
> through to what now looks like
> {{Topic: topicA  Partition: 32  Leader: -1   Replicas: 
> 1,6,2,7,3,8  Isr: }}
> (1,2,3 are legit, 6,7,8 permanently gone)
> So the first catch 22, is that I cannot elect a new leader, because the 
> leader needs to be elected from the ISR, and I cannot recreate the ISR 
> because the topic has no leader.
> The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
> because the previous one is supposedly still in progress, and I cannot 
> increase the number of partitions to account for the now permanently offline 
> partitions, because that produces the following error {{Error while executing 
> topic command requirement failed: All partitions should have the same number 
> of replicas.}}, from which I cannot recover because I cannot run 
> {{kafka-reassign-partitions.sh}}.
> Is there a way to recover from such a situation? 





[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic}}: topicA  Partition: 32  Leader: -1   Replicas: 
1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1   Replicas: 
1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 


> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Andreas
>

[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA  Partition: 32  Leader: -1   Replicas: 
1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

-{{ Topic: topicA  Partition: 32  Leader: -1   Replicas: 
1,6,2,7,3,8  Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 


> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andreas
> Fix For: 0.8.2.1
>
>

[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{ Topic: topicA  Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   
Isr: }}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because I cannot run 
{{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   Isr:}}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because {{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 


> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andreas
> Fix For: 0.8.2.1
>
>

[jira] [Updated] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas updated KAFKA-6442:
---
Description: 
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
{{kafka-reassign-partitions.sh}} to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

{{Topic: topicA Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   Isr:}}
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun {{kafka-reassign-partitions.sh}} 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error {{Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.}}, 
from which I cannot recover because {{kafka-reassign-partitions.sh}}.

Is there a way to recover from such a situation? 

  was:
PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
`kafka-reassign-partitions.sh` to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

`Topic: topicA  Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   Isr:`
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.

The second catch 22 is that I cannot rerun `kafka-reassign-partitions.sh` 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error `Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.`, 
from which I cannot recover because `kafka-reassign-partitions.sh`.

Is there a way to recover from such a situation? 


> Catch 22 with cluster rebalancing
> -
>
> Key: KAFKA-6442
> URL: https://issues.apache.org/jira/browse/KAFKA-6442
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andreas
> Fix For: 0.8.2.1
>
>

[jira] [Created] (KAFKA-6442) Catch 22 with cluster rebalancing

2018-01-11 Thread Andreas (JIRA)
Andreas created KAFKA-6442:
--

 Summary: Catch 22 with cluster rebalancing
 Key: KAFKA-6442
 URL: https://issues.apache.org/jira/browse/KAFKA-6442
 Project: Kafka
  Issue Type: Bug
Reporter: Andreas
 Fix For: 0.8.2.1


PS. I classified this as a bug because I think the cluster should not be stuck 
in that situation, apologies if that is wrong.

Hi,
I found myself in a situation a bit difficult to explain so I will skip the how 
I ended up in this situation, but here is the problem.

Some of the brokers of my cluster are permanently gone. Consequently, I had 
some partitions that now had offline leaders etc so, I used the 
`kafka-reassign-partitions.sh` to rebalance my topics and for the most part 
that worked ok. Where that did not work ok, was for partitions that had 
leaders, replicas and ISRs entirely on the gone brokers. Those got stuck halfway 
through to what now looks like

`Topic: topicA  Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   Isr:`
(1,2,3 are legit, 6,7,8 permanently gone)

So the first catch 22, is that I cannot elect a new leader, because the leader 
needs to be elected from the ISR, and I cannot recreate the ISR because the 
topic has no leader.
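
The first deadlock can be illustrated with a toy model. This is only a sketch, assuming the simplified rule that a leader may be chosen only from ISR members that are still alive; `elect_leader` and `NO_LEADER` are hypothetical names for illustration, not Kafka code.

```python
from typing import Iterable, List

NO_LEADER = -1  # matches the "Leader: -1" shown in the describe output


def elect_leader(isr: List[int], live_brokers: Iterable[int]) -> int:
    """Return the first ISR member that is alive, or NO_LEADER if none is.

    Hypothetical simplification of ISR-based election, for illustration only.
    """
    live = set(live_brokers)
    for broker in isr:
        if broker in live:
            return broker
    return NO_LEADER


# State from the report: replicas 1,6,2,7,3,8, empty ISR, brokers 6,7,8 gone.
replicas = [1, 6, 2, 7, 3, 8]
live_brokers = [1, 2, 3]

# With an empty ISR there is no candidate, so the partition stays leaderless.
print(elect_leader([], live_brokers))

# Only if the surviving replicas were allowed as candidates would one be picked.
print(elect_leader(replicas, live_brokers))
```

Under this model the empty ISR alone is enough to keep the partition at leader -1, even though brokers 1, 2 and 3 are alive and listed as replicas.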

The second catch 22 is that I cannot rerun `kafka-reassign-partitions.sh` 
because the previous one is supposedly still in progress, and I cannot increase 
the number of partitions to account for the now permanently offline partitions, 
because that produces the following error `Error while executing topic command 
requirement failed: All partitions should have the same number of replicas.`, 
from which I cannot recover because `kafka-reassign-partitions.sh`.

Is there a way to recover from such a situation? 
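
For reference, the target state for the stuck partition can be written down as a reassignment file. The following is a minimal sketch, assuming the describe line quoted above and the JSON layout accepted by `kafka-reassign-partitions.sh`; `parse_describe_line` and `reassignment_json` are hypothetical helpers. It only expresses the desired assignment and does not by itself get around the "still in progress" refusal described in the report.

```python
import json
import re
from typing import Dict, List, Set

# The partition line quoted in the report.
DESCRIBE_LINE = "Topic: topicA  Partition: 32   Leader: -1  Replicas: 1,6,2,7,3,8   Isr:"


def _to_ids(csv: str) -> List[int]:
    """Turn "1,6,2" into [1, 6, 2]; an empty string gives an empty list."""
    return [int(item) for item in csv.split(",") if item]


def parse_describe_line(line: str) -> Dict[str, object]:
    """Parse one partition line of the describe output (hypothetical helper)."""
    match = re.search(
        r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+"
        r"Replicas:\s*([\d,]*)\s+Isr:\s*([\d,]*)",
        line,
    )
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    topic, partition, leader, replicas, isr = match.groups()
    return {
        "topic": topic,
        "partition": int(partition),
        "leader": int(leader),
        "replicas": _to_ids(replicas),
        "isr": _to_ids(isr),
    }


def reassignment_json(state: Dict[str, object], live_brokers: Set[int]) -> str:
    """Build a reassignment file that keeps only replicas on live brokers."""
    survivors = [b for b in state["replicas"] if b in live_brokers]
    plan = {
        "version": 1,
        "partitions": [
            {
                "topic": state["topic"],
                "partition": state["partition"],
                "replicas": survivors,
            }
        ],
    }
    return json.dumps(plan)


state = parse_describe_line(DESCRIBE_LINE)
# Brokers 1, 2 and 3 are the ones the report calls legit.
print(reassignment_json(state, {1, 2, 3}))
```

The set of live brokers is passed in explicitly here because the describe output alone does not say which brokers are permanently gone.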


