[jira] [Commented] (KAFKA-1530) howto update continuously

2016-08-17 Thread Alexey Ozeritskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425454#comment-15425454
 ] 

Alexey Ozeritskiy commented on KAFKA-1530:
--

I think this ticket may be closed.
Setting unclean.leader.election.enable=false helps us.
We have also developed a tool, kafka-restarter, that restarts Kafka node by 
node and checks ISR status,
and a tool, fix-isr, that can repair the ISR after a cluster power failure.
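
The node-by-node approach described above can be sketched roughly as follows. This is not the kafka-restarter tool itself, whose source is not shown in this thread. `restart_broker` and `under_replicated_count` are hypothetical callables that an operator would have to supply, for example by wrapping a service manager and a query for partitions whose ISR has shrunk.

```python
import time
from typing import Callable, Iterable, List


def _wait_for_full_isr(
    under_replicated_count: Callable[[], int],
    poll_seconds: float,
    timeout_seconds: float,
) -> None:
    """Block until no partition is under-replicated, or raise on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while under_replicated_count() > 0:
        if time.monotonic() >= deadline:
            raise TimeoutError("ISR did not recover; aborting rolling restart")
        time.sleep(poll_seconds)


def rolling_restart(
    broker_ids: Iterable[int],
    restart_broker: Callable[[int], None],      # hypothetical: restarts one broker
    under_replicated_count: Callable[[], int],  # hypothetical: partitions with a shrunken ISR
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
) -> List[int]:
    """Restart brokers one at a time, waiting for every ISR to be full in between."""
    restarted: List[int] = []
    for broker_id in broker_ids:
        # Never take a broker down while some partition is already short of replicas.
        _wait_for_full_isr(under_replicated_count, poll_seconds, timeout_seconds)
        restart_broker(broker_id)
        restarted.append(broker_id)
    # Make sure the last broker has caught up before reporting success.
    _wait_for_full_isr(under_replicated_count, poll_seconds, timeout_seconds)
    return restarted
```

Aborting on timeout rather than continuing is a deliberate choice in this sketch: restarting the next broker while replicas are still catching up is exactly the situation the thread reports as losing data.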

> howto update continuously
> -
>
> Key: KAFKA-1530
> URL: https://issues.apache.org/jira/browse/KAFKA-1530
> Project: Kafka
> Issue Type: Wish
> Reporter: Stanislav Gilmulin
> Assignee: Guozhang Wang
> Priority: Minor
> Labels: operating_manual, performance
>
> Hi,
>  
> Could I ask you a question about the Kafka update procedure?
> Is there a way to update the software that doesn't require a service 
> interruption or lead to data loss?
> We can't stop message brokering during the update as we have a strict SLA.
>  
> Best regards
> Stanislav Gilmulin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-17 Thread Oleg Golovin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064815#comment-14064815
 ] 

Oleg Golovin commented on KAFKA-1530:
-

Thank you for mentioning the option unclean.leader.election.enable. It seems 
to be a new option that we didn't know about.
We will need some time to test it. We will report how it went as soon as we 
have done this testing.



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-11 Thread Andrey Stepachev (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058568#comment-14058568
 ] 

Andrey Stepachev commented on KAFKA-1530:
-

It looks like [~ovgolovin]'s problem with wrong replica election could be fixed 
by adding a notion of min-replicas somewhere around this code: 
https://github.com/apache/kafka/blob/3c4ca854fd2da5e5fcecdaf0856a38a9ebe4763c/core/src/main/scala/kafka/cluster/Partition.scala#L165
We could then restrict leader election/re-election to partitions whose ISR has 
the configured size.

As for [~renew]'s situation, it is not realistic to lose data when the leader 
is stopped and one of the replicas becomes the leader, _if_ the required acks 
is greater than 1. Kafka maintains a 'high watermark' for each partition, and 
for each request it waits for the required replicas to catch up with the 
leader before it responds to the client. So if it is not a correlated failure 
(where we lose 2 replicas at once), it will work correctly. If there were 2 
replicas in the ISR and 1 replica outside it, and both in the ISR die, then it 
is possible to bring up the third replica, and the new data held only by those 
two replicas will be lost.
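
The high-watermark argument above can be illustrated with a small model. This is a simplified sketch, not Kafka's implementation: it assumes the producer waits for all in-sync replicas, treats the high watermark as the minimum log end offset across the ISR, and uses made-up replica ids and offsets.

```python
from typing import Dict, Set


def high_watermark(log_end_offsets: Dict[int, int], isr: Set[int]) -> int:
    """Highest offset replicated to every in-sync replica (simplified model)."""
    return min(log_end_offsets[replica] for replica in isr)


# Replica id -> log end offset (illustrative numbers only).
log_end_offsets = {1: 100, 2: 100, 3: 60}  # replica 3 lags and is outside the ISR
isr = {1, 2}

hw = high_watermark(log_end_offsets, isr)

# Single failure: if leader 1 dies, replica 2 already holds everything up to
# the high watermark, so no acknowledged message is lost.
survives_single_failure = log_end_offsets[2] >= hw

# Correlated failure: if replicas 1 and 2 both die and replica 3 is brought up
# as leader, every acknowledged message past its log end offset is gone.
lost_after_correlated_failure = hw - log_end_offsets[3]
```

In this model a single failure loses nothing, while the correlated failure loses the 40 messages that only the two in-sync replicas held.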

Just to be clear, Kafka is a 'primary-backup' replication system, so it doesn't 
tolerate correlated failures, as opposed to a quorum system, but it gives high 
throughput in return. That's how it stands :)



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-11 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058893#comment-14058893
 ] 

Jun Rao commented on KAFKA-1530:


We do have an option unclean.leader.election.enable to prevent unclean leader 
election. So, if you care more about durability than availability, you can set 
this option to false. Then, the new leader will only be elected from isr. The 
unavailability window of a partition could be longer though since we have to 
wait until at least one broker in isr is back online.
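
The trade-off described above can be sketched as a small decision function. This is an illustrative model of the behaviour, not the controller's actual code; the function name and parameters are made up for the example.

```python
from typing import List, Optional, Set


def elect_leader(
    assigned_replicas: List[int],
    isr: Set[int],
    live_brokers: Set[int],
    unclean_election_enabled: bool,
) -> Optional[int]:
    """Pick a leader for one partition, or None if it must stay offline."""
    # Prefer a live replica that is in sync: no acknowledged data is lost.
    for replica in assigned_replicas:
        if replica in isr and replica in live_brokers:
            return replica
    # No in-sync replica is alive. An unclean election trades data for
    # availability by promoting a replica that may be missing messages.
    if unclean_election_enabled:
        for replica in assigned_replicas:
            if replica in live_brokers:
                return replica
    # With unclean election disabled, the partition waits for an ISR member.
    return None
```

For example, with replicas `[1, 2, 3]`, ISR `{1}` and only brokers 2 and 3 alive, the function returns 2 when unclean election is enabled and `None` when it is disabled, which corresponds to the longer unavailability window mentioned above.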



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Stanislav Gilmulin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056049#comment-14056049
 ] 

Stanislav Gilmulin commented on KAFKA-1530:
---

Thank you,
I'd like to ask some questions.
If a cluster has a lot of nodes and a few of them are lagging or down, we can't 
guarantee we would stop and restart nodes properly and in the right order.
Is there any recommended way to manage it?
Or even an existing script or tool for it?
Replication factor = 3.
Version 0.8.1.1



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056339#comment-14056339
 ] 

Guozhang Wang commented on KAFKA-1530:
--

Not sure I understand your question. What do you mean by "we can't guarantee we 
would stop and restart nodes properly and in the right order"?



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Stanislav Gilmulin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056411#comment-14056411
 ] 

Stanislav Gilmulin commented on KAFKA-1530:
---

It means we have risks.
You're right, my question wasn't clear. Let me try to explain.

First of all, according to business requirements, we can't stop the service, so 
we can't stop all nodes before updating.
And, as you've advised, our option would be to update step by step.
But if we update without following the right procedure, we could lose an 
unknown number of messages, as in the example below.

Let's consider this case as an example.
We have 3 replicas of one partition, with 2 of them lagging behind. Then we 
restart the leader. At that very moment, one of the two lagging replicas 
becomes the new leader. After that, when the former leader (which in fact has 
the newest data) starts working again, it truncates all the newest data to 
match the newly elected leader.
This situation happens quite often when we restart a highly loaded Kafka 
cluster, so we lose some part of our data.
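
The truncation in this example can be made concrete with a toy model. It is only a sketch under a strong simplification: each log is a plain list, and the returning replica keeps exactly as many messages as the new leader has, assuming both logs share that prefix. The message names are invented.

```python
from typing import List


def discarded_on_rejoin(old_leader_log: List[str], new_leader_log: List[str]) -> List[str]:
    """Messages the former leader drops when it starts following the new leader."""
    # The former leader truncates its log to the new leader's length, so
    # everything beyond that point is thrown away.
    return old_leader_log[len(new_leader_log):]


old_leader_log = ["m0", "m1", "m2", "m3", "m4"]  # former leader, has the newest data
lagging_log = ["m0", "m1"]                       # lagging replica that was elected

discarded = discarded_on_rejoin(old_leader_log, lagging_log)
```

Here the former leader discards `m2`, `m3` and `m4`, even though it was the only replica that held them.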





[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Oleg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056422#comment-14056422
 ] 

Oleg commented on KAFKA-1530:
-

Another situation that happened in our production:
We have replication factor 3. One of the 3 replicas started lagging behind (due 
to network connectivity problems, etc.). Then, while upgrading/restarting Kafka, 
we restarted the whole cluster. After the upgrade, Kafka starts electing 
leaders for each partition. It is quite likely to elect the lagging replica as 
leader, which in turn leads to truncation of the two other replicas. In this 
case we lose data.

So we are seeking a means of restarting/upgrading Kafka without data loss.



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-09 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056448#comment-14056448
 ] 

Guozhang Wang commented on KAFKA-1530:
--

Hi Stanislav/Oleg,

The Kafka server has a config called controlled.shutdown.enable. When it is 
turned on, the shutdown process first waits for all partition leaders on the 
node being shut down to migrate to other nodes before shutting down the 
server (http://kafka.apache.org/documentation.html#brokerconfigs).

For your first case, where the node being shut down is the only replica in the 
ISR, the shutdown process will block until other nodes are back in the ISR 
and can take over the partitions. For your second case, where there is more 
than one node in the ISR, it is guaranteed that the leaders on the node being 
shut down will be moved to another ISR node.
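
The two cases above can be sketched as a planning step that runs before a broker stops. This models only the behaviour described in this comment, not the broker's real shutdown logic; the function name and the partition layout are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def plan_leader_moves(
    shutting_down: int,
    partitions: Dict[str, Tuple[int, Set[int]]],  # name -> (leader, ISR)
) -> Tuple[Dict[str, int], List[str]]:
    """Return (new leaders for movable partitions, partitions that block shutdown)."""
    moves: Dict[str, int] = {}
    blocked: List[str] = []
    for name, (leader, isr) in sorted(partitions.items()):
        if leader != shutting_down:
            continue  # this broker does not lead the partition, nothing to move
        candidates = sorted(isr - {shutting_down})
        if candidates:
            # Another in-sync replica exists, so leadership can move safely.
            moves[name] = candidates[0]
        else:
            # The broker is the only ISR member: shutdown has to wait.
            blocked.append(name)
    return moves, blocked


partitions = {
    "t-0": (1, {1, 2}),  # broker 1 leads, broker 2 is in sync
    "t-1": (1, {1}),     # broker 1 leads and is the only ISR member
    "t-2": (2, {1, 2}),  # led by another broker
}
moves, blocked = plan_leader_moves(1, partitions)
```

With this layout, leadership of `t-0` moves to broker 2, while `t-1` blocks the shutdown of broker 1 until another replica rejoins the ISR.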



[jira] [Commented] (KAFKA-1530) howto update continuously

2014-07-08 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055437#comment-14055437
 ] 

Guozhang Wang commented on KAFKA-1530:
--

Hi Stanislav,

Upgrading a Kafka server does require bouncing it with the new jar. However, if 
you have a cluster of more than one server and data replication is turned on 
(i.e. replication factor > 1 for all topics hosted), then this should not 
interrupt the message brokering, since you will only bounce the brokers 
sequentially, and while one node is down, its brokering functionality will be 
moved to other replicas.

The only exception is when you upgrade from 0.7 to 0.8.*, more details can be 
found on this wiki:

https://cwiki.apache.org/confluence/display/KAFKA/Changes+in+Kafka+0.8
