[jira] [Updated] (KAFKA-3924) Data loss due to halting when LEO is larger than leader's LEO

2016-11-30 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3924:
---
Labels: reliability  (was: )

> Data loss due to halting when LEO is larger than leader's LEO
> -
>
> Key: KAFKA-3924
> URL: https://issues.apache.org/jira/browse/KAFKA-3924
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
>  Labels: reliability
> Fix For: 0.10.0.1
>
> Attachments: deadlock-stack
>
>
> Currently the follower broker panics when its LEO is larger than its leader's
> LEO and, on the assumption that this is an impossible state to reach, halts
> the process to prevent any further damage.
> {code}
> if (leaderEndOffset < replica.logEndOffset.messageOffset) {
>   // Prior to truncating the follower's log, ensure that doing so is not
>   // disallowed by the configuration for unclean leader election.
>   // This situation could only happen if the unclean election configuration
>   // for a topic changes while a replica is down. Otherwise, we should never
>   // encounter this situation since a non-ISR leader cannot be elected if
>   // disallowed by the broker configuration.
>   if (!LogConfig.fromProps(brokerConfig.originals,
>       AdminUtils.fetchEntityConfig(replicaMgr.zkUtils, ConfigType.Topic,
>         topicAndPartition.topic)).uncleanLeaderElectionEnable) {
>     // Log a fatal error and shut down the broker to ensure that data loss
>     // does not unexpectedly occur.
>     fatal("...")
>     Runtime.getRuntime.halt(1)
>   }
> {code}
> First, this assumption is invalid: there are legitimate cases (examples
> below) in which this state can actually occur. Second, halting causes the
> broker to lose its un-flushed data, and if multiple brokers halt
> simultaneously there is a chance that both the leader and the followers of a
> partition are among the halted brokers, which would result in permanent data
> loss.
> Given that this is a legitimate state, we suggest replacing the halt with a
> graceful shutdown to avoid propagating data loss to the entire cluster.
> Details:
> One legitimate case in which this can occur is when a troubled broker
> shrinks its partitions right before crashing (KAFKA-3410 and KAFKA-3861). In
> this case the broker has lost some data, but the controller still cannot
> elect any of the other replicas as the leader. When the crashed broker comes
> back up, the controller elects it as the leader, and as a result all the
> other brokers now following it halt, since their LEOs are larger than those
> of the shrunk partitions on the restarted broker. We actually had a case in
> which bringing up a crashed broker took down the entire cluster at once,
> which, as explained above, can result in data loss.
> The other legitimate case is when multiple brokers shut down ungracefully at
> the same time. In this case both the leader and the followers lose their
> un-flushed data, but one of them ends up with a larger HW than the other.
> The controller elects the one that comes back up sooner as the leader, and
> if its LEO is less than that of its future follower, the follower will halt
> (and probably lose more data). Simultaneous ungraceful shutdowns can happen
> due to hardware issues (e.g., a rack power failure), operator errors, or
> software issues (e.g., the case above, further explained in KAFKA-3410 and
> KAFKA-3861, which causes simultaneous halts on multiple brokers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3924) Data loss due to halting when LEO is larger than leader's LEO

2016-08-13 Thread Alexey Ozeritskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Ozeritskiy updated KAFKA-3924:
-
Attachment: deadlock-stack

> Data loss due to halting when LEO is larger than leader's LEO
> -
>
> Key: KAFKA-3924
> URL: https://issues.apache.org/jira/browse/KAFKA-3924
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
> Fix For: 0.10.0.1
>
> Attachments: deadlock-stack
>


[jira] [Updated] (KAFKA-3924) Data loss due to halting when LEO is larger than leader's LEO

2016-07-27 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3924:
---
    Resolution: Fixed
 Fix Version/s: 0.10.0.1
        Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1634
[https://github.com/apache/kafka/pull/1634]

> Data loss due to halting when LEO is larger than leader's LEO
> -
>
> Key: KAFKA-3924
> URL: https://issues.apache.org/jira/browse/KAFKA-3924
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
> Fix For: 0.10.0.1
>


[jira] [Updated] (KAFKA-3924) Data loss due to halting when LEO is larger than leader's LEO

2016-07-18 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated KAFKA-3924:
---
Status: Patch Available  (was: Open)

Submitting a simple patch that replaces halt with exit, which in turn will 
trigger the shutdown hook that gracefully shuts down the broker.
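
For illustration, here is a minimal standalone sketch (not the actual patch)
of the JVM behavior this relies on: Runtime.getRuntime.halt skips shutdown
hooks entirely, while System.exit runs them, which is what gives the broker's
registered shutdown hook a chance to flush logs and close down cleanly.
{code}
// Standalone JVM demo, not Kafka code: with the "halt" argument the hook
// below never runs; with exit(1) the JVM runs the hook before terminating.
object HaltVsExit {
  def main(args: Array[String]): Unit = {
    Runtime.getRuntime.addShutdownHook(new Thread() {
      override def run(): Unit =
        println("shutdown hook: flush logs, release resources")
    })
    if (args.headOption.contains("halt"))
      Runtime.getRuntime.halt(1)  // bypasses all shutdown hooks
    else
      System.exit(1)              // triggers registered shutdown hooks
  }
}
{code}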

[~noslowerdna] You seem to be the most knowledgeable about this part of the
code, but I did not manage to tag you on the GitHub pull request. Do you think
you could review the patch, or perhaps redirect us to someone who could? Thanks.

> Data loss due to halting when LEO is larger than leader's LEO
> -
>
> Key: KAFKA-3924
> URL: https://issues.apache.org/jira/browse/KAFKA-3924
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0
>Reporter: Maysam Yabandeh
>


[jira] [Updated] (KAFKA-3924) Data loss due to halting when LEO is larger than leader's LEO

2016-07-18 Thread Maysam Yabandeh (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maysam Yabandeh updated KAFKA-3924:
---
Description: 
Currently the follower broker panics when its LEO is larger than its leader's
LEO and, on the assumption that this is an impossible state to reach, halts
the process to prevent any further damage.
{code}
if (leaderEndOffset < replica.logEndOffset.messageOffset) {
  // Prior to truncating the follower's log, ensure that doing so is not
  // disallowed by the configuration for unclean leader election.
  // This situation could only happen if the unclean election configuration
  // for a topic changes while a replica is down. Otherwise, we should never
  // encounter this situation since a non-ISR leader cannot be elected if
  // disallowed by the broker configuration.
  if (!LogConfig.fromProps(brokerConfig.originals,
      AdminUtils.fetchEntityConfig(replicaMgr.zkUtils, ConfigType.Topic,
        topicAndPartition.topic)).uncleanLeaderElectionEnable) {
    // Log a fatal error and shut down the broker to ensure that data loss
    // does not unexpectedly occur.
    fatal("...")
    Runtime.getRuntime.halt(1)
  }
{code}

First, this assumption is invalid: there are legitimate cases (examples
below) in which this state can actually occur. Second, halting causes the
broker to lose its un-flushed data, and if multiple brokers halt
simultaneously there is a chance that both the leader and the followers of a
partition are among the halted brokers, which would result in permanent data
loss.

Given that this is a legitimate state, we suggest replacing the halt with a
graceful shutdown to avoid propagating data loss to the entire cluster.

Details:
One legitimate case in which this can occur is when a troubled broker shrinks
its partitions right before crashing (KAFKA-3410 and KAFKA-3861). In this case
the broker has lost some data, but the controller still cannot elect any of
the other replicas as the leader. When the crashed broker comes back up, the
controller elects it as the leader, and as a result all the other brokers now
following it halt, since their LEOs are larger than those of the shrunk
partitions on the restarted broker. We actually had a case in which bringing
up a crashed broker took down the entire cluster at once, which, as explained
above, can result in data loss.

The other legitimate case is when multiple brokers shut down ungracefully at
the same time. In this case both the leader and the followers lose their
un-flushed data, but one of them ends up with a larger HW than the other. The
controller elects the one that comes back up sooner as the leader, and if its
LEO is less than that of its future follower, the follower will halt (and
probably lose more data). Simultaneous ungraceful shutdowns can happen due to
hardware issues (e.g., a rack power failure), operator errors, or software
issues (e.g., the case above, further explained in KAFKA-3410 and KAFKA-3861,
which causes simultaneous halts on multiple brokers).

  was:
Currently the follower broker panics when its LEO is less than its leader's
LEO and, on the assumption that this is an impossible state to reach, halts
the process to prevent any further damage.
{code}
if (leaderEndOffset < replica.logEndOffset.messageOffset) {
  // Prior to truncating the follower's log, ensure that doing so is not
  // disallowed by the configuration for unclean leader election.
  // This situation could only happen if the unclean election configuration
  // for a topic changes while a replica is down. Otherwise, we should never
  // encounter this situation since a non-ISR leader cannot be elected if
  // disallowed by the broker configuration.
  if (!LogConfig.fromProps(brokerConfig.originals,
      AdminUtils.fetchEntityConfig(replicaMgr.zkUtils, ConfigType.Topic,
        topicAndPartition.topic)).uncleanLeaderElectionEnable) {
    // Log a fatal error and shut down the broker to ensure that data loss
    // does not unexpectedly occur.
    fatal("...")
    Runtime.getRuntime.halt(1)
  }
{code}

First, this assumption is invalid: there are legitimate cases (examples
below) in which this state can actually occur. Second, halting causes the
broker to lose its un-flushed data, and if multiple brokers halt
simultaneously there is a chance that both the leader and the followers of a
partition are among the halted brokers, which would result in permanent data
loss.

Given that this is a legitimate state, we suggest replacing the halt with a
graceful shutdown to avoid propagating data loss to the entire cluster.

Details:
One legitimate case in which this can occur is when a troubled broker shrinks
its partitions right before crashing (KAFKA-3410 and KAFKA-3861). In this case
the broker has lost some data, but the controller still cannot elect any of
the other replicas as the leader. When the crashed broker comes back up, the
controller elects it as the leader, and as a result all other