[jira] [Updated] (KAFKA-5322) Resolve AddPartitions response error code inconsistency
[ https://issues.apache.org/jira/browse/KAFKA-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-5322:
--------------------------------------
    Labels: exactly-once  (was: )

> Resolve AddPartitions response error code inconsistency
> -------------------------------------------------------
>
>                 Key: KAFKA-5322
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5322
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Blocker
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
> The AddPartitions request does not support partial failures: either all partitions are successfully added to the transaction or none of them are. Currently we return a separate error code for each partition that was added to the transaction, which begs the question of what error code to return when we have not actually encountered a fatal partition-level error for some partition. For example, suppose we send AddPartitions with partitions A and B. If A is not authorized, we will not even attempt to add B to the transaction, so what error code should we use for B? The current solution is to include only partition A and its error code in the response, but this is a little inconsistent with most other request types. Alternatives that have been proposed:
> 1. Instead of a partition-level error, use one global error. We can add a global error message to return friendlier details to the user about which partition had a fault. The downside is that we would have to parse the message contents if we wanted to do any partition-specific handling. We could not easily fill the set of topics in {{TopicAuthorizationException}}, for example.
> 2. Add a new error code to indicate that the broker did not even attempt to add the partition to the transaction, for example OPERATION_NOT_ATTEMPTED or something like that.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
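The second alternative can be illustrated with a small, self-contained sketch. The class, enum, and method names below are hypothetical stand-ins, not Kafka's actual protocol code: on the first fatal error, every other partition is reported with a dedicated OPERATION_NOT_ATTEMPTED code, so the response covers all requested partitions instead of omitting some.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AddPartitionsSketch {
    enum Error { NONE, TOPIC_AUTHORIZATION_FAILED, OPERATION_NOT_ATTEMPTED }

    // Returns an error per requested partition. When any partition fails,
    // the remaining ones are reported as OPERATION_NOT_ATTEMPTED rather
    // than being silently left out of the response.
    static Map<String, Error> addPartitions(List<String> partitions, String unauthorized) {
        Map<String, Error> errors = new LinkedHashMap<>();
        boolean failed = false;
        for (String tp : partitions) {
            if (failed) {
                errors.put(tp, Error.OPERATION_NOT_ATTEMPTED);
            } else if (tp.equals(unauthorized)) {
                errors.put(tp, Error.TOPIC_AUTHORIZATION_FAILED);
                failed = true;
            } else {
                errors.put(tp, Error.NONE); // tentative: stands only if the whole batch succeeds
            }
        }
        if (failed) // all-or-nothing semantics: downgrade tentative successes too
            errors.replaceAll((tp, e) -> e == Error.NONE ? Error.OPERATION_NOT_ATTEMPTED : e);
        return errors;
    }
}
```

With this shape the client can still do partition-specific handling (e.g. collecting unauthorized topics) without parsing a global error message.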
[jira] [Assigned] (KAFKA-5282) Transactions integration test: Use factory methods to keep track of open producers and consumers and close them all on tearDown
[ https://issues.apache.org/jira/browse/KAFKA-5282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-5282:
-----------------------------------------
    Assignee:     (was: Apurva Mehta)

> Transactions integration test: Use factory methods to keep track of open producers and consumers and close them all on tearDown
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5282
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5282
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Apurva Mehta
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
> See: https://github.com/apache/kafka/pull/3093/files#r117354588
> The current transactions integration test creates individual producers and consumers per test and closes them independently. It would be more robust to create them through a central factory method that keeps track of each instance and then closes those instances on `tearDown`.
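The factory idea can be sketched as follows. This is a minimal, hypothetical harness (the class and method names are illustrative, not the actual test code): every client is created through the harness, which remembers it and closes everything in tearDown.

```java
import java.util.ArrayList;
import java.util.List;

public class TransactionsTestHarness {
    private final List<AutoCloseable> clients = new ArrayList<>();

    // Factory method: track each created producer/consumer so tearDown can close it.
    <T extends AutoCloseable> T register(T client) {
        clients.add(client);
        return client;
    }

    // Close every client created during the test, even the ones the test forgot about.
    void tearDown() {
        for (AutoCloseable c : clients) {
            try {
                c.close();
            } catch (Exception e) {
                // best effort: keep closing the remaining clients
            }
        }
        clients.clear();
    }
}
```

A test would then write `producer = harness.register(createProducer(...))` instead of constructing clients directly, and leaks disappear by construction.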
[jira] [Assigned] (KAFKA-5093) Load only batch header when rebuilding producer ID map
[ https://issues.apache.org/jira/browse/KAFKA-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-5093:
-----------------------------------------
    Assignee:     (was: Jason Gustafson)

> Load only batch header when rebuilding producer ID map
> ------------------------------------------------------
>
>                 Key: KAFKA-5093
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5093
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jason Gustafson
>            Priority: Critical
>             Fix For: 0.11.0.0
>
> When rebuilding the producer ID map for KIP-98, we unnecessarily load the full record data into memory when scanning through the log. It would be better to load only the batch header, since that is all that is needed.
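The optimization amounts to reading fixed-size headers and skipping the payload. The sketch below uses a deliberately simplified header layout as a stand-in for the real v2 wire format, and all names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class ProducerStateScan {
    // Simplified header: baseOffset(8) | payloadSize(4) | producerId(8) | lastSeq(4)
    static final int HEADER_SIZE = 8 + 4 + 8 + 4;

    // Rebuild a producerId -> last sequence map by reading only headers.
    static Map<Long, Integer> rebuild(ByteBuffer log) {
        Map<Long, Integer> lastSeqByPid = new HashMap<>();
        while (log.remaining() >= HEADER_SIZE) {
            log.getLong();                        // baseOffset (not needed for this map)
            int payloadSize = log.getInt();
            long producerId = log.getLong();
            int lastSeq = log.getInt();
            lastSeqByPid.put(producerId, lastSeq);
            log.position(log.position() + payloadSize); // skip record data entirely
        }
        return lastSeqByPid;
    }
}
```

The point is that memory use becomes proportional to the number of batches scanned, not the size of the record data.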
[jira] [Assigned] (KAFKA-5251) Producer should drop queued sends when transaction is aborted
[ https://issues.apache.org/jira/browse/KAFKA-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-5251:
-----------------------------------------
    Assignee:     (was: Apurva Mehta)

> Producer should drop queued sends when transaction is aborted
> -------------------------------------------------------------
>
>                 Key: KAFKA-5251
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5251
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Jason Gustafson
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
> As an optimization, if a transaction is aborted, we can drop any records which have not yet been sent to the brokers. However, to avoid the sequence number getting out of sync, we need to continue sending any request which has been sent at least once.
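The rule described above can be sketched with a toy batch queue (all names here are illustrative, not the actual producer internals): on abort, drop only the batches the broker has never seen; anything sent at least once must still complete so the per-partition sequence numbers stay in sync with the broker.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class AbortDropSketch {
    static class Batch {
        final int firstSeq;
        boolean sentAtLeastOnce;
        Batch(int firstSeq) { this.firstSeq = firstSeq; }
    }

    final Deque<Batch> queue = new ArrayDeque<>();

    // Called when the transaction is aborted.
    void onAbort() {
        // Dropping a batch the broker may already have seen would leave a
        // sequence-number gap, so only never-sent batches are discarded.
        queue.removeIf(b -> !b.sentAtLeastOnce);
    }
}
```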
[jira] [Assigned] (KAFKA-5032) Think through implications of max.message.size affecting record batches in message format V2
[ https://issues.apache.org/jira/browse/KAFKA-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-5032:
-----------------------------------------
    Assignee:     (was: Apurva Mehta)

> Think through implications of max.message.size affecting record batches in message format V2
> --------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5032
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5032
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Ismael Juma
>            Priority: Critical
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
> It's worth noting that the new behaviour for uncompressed messages is the same as the existing behaviour for compressed messages. A few things to think about:
> 1. Do the producer settings max.request.size and batch.size still make sense, and do we need to update the documentation? My conclusion is that things are still fine, but we may need to revise the docs.
> 2. Consider changing the default max message set size to include the record batch overhead. This is currently defined as:
> {code}
> val MessageMaxBytes = 1000000 + MessageSet.LogOverhead
> {code}
> We should consider changing it to (I haven't thought it through though):
> {code}
> val MessageMaxBytes = 1000000 + DefaultRecordBatch.RECORD_BATCH_OVERHEAD
> {code}
> 3. When a record batch is too large, we throw RecordTooLargeException, which is confusing because there's also a RecordBatchTooLargeException. We should consider renaming these exceptions to make the behaviour clearer.
> 4. We should consider deprecating max.message.bytes (topic config) and message.max.bytes (server config) in favour of configs that make it clear that we are talking about record batches instead of individual messages.
> Part of the work in this JIRA is working out what should be done for 0.11.0.0 and what can be done later.
[jira] [Assigned] (KAFKA-5024) Old clients don't support message format V2
[ https://issues.apache.org/jira/browse/KAFKA-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-5024:
-----------------------------------------
    Assignee:     (was: Apurva Mehta)

> Old clients don't support message format V2
> -------------------------------------------
>
>                 Key: KAFKA-5024
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5024
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Ismael Juma
>             Fix For: 0.11.0.0
>
> Is this OK? If so, we can close this JIRA, but we should make that decision consciously.
[jira] [Commented] (KAFKA-5151) Refactor TransactionCoordinator in-memory structure and error handling logic
[ https://issues.apache.org/jira/browse/KAFKA-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011385#comment-16011385 ]

Sriram Subramanian commented on KAFKA-5151:
-------------------------------------------
[~guozhang] can we close this?

> Refactor TransactionCoordinator in-memory structure and error handling logic
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-5151
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5151
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>
> Current status:
> 1. We have two types of threads: the request handling threads, which serve client requests as well as controller requests for `immigration` and `emigration`, and the marker sender thread, which drains queued marker entries and handles responses. They maintain different in-memory cache structures, like the `txnMetadataCache` and the `pendingTxnMap`, which store the same info, and they access some shared structures concurrently, like the markers queue and the markerPurgatory.
> 2. We have one queue per broker today, and for emigration purposes we would probably need one queue per brokerId + TxnLogPartitionId + DataPartitionId, which would result in a lot of queues to handle.
> This ticket is for collapsing some of these structures and simplifying access to them from concurrent threads.
[jira] [Updated] (KAFKA-5249) Transaction index recovery does not snapshot properly
[ https://issues.apache.org/jira/browse/KAFKA-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-5249:
--------------------------------------
    Labels: exactly-once  (was: )

> Transaction index recovery does not snapshot properly
> -----------------------------------------------------
>
>                 Key: KAFKA-5249
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5249
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>              Labels: exactly-once
>
> When recovering the transaction index, we should take snapshots of the producer state after recovering each segment. Currently, the snapshot offset is not updated correctly, so we will reread the segment multiple times. Additionally, it appears that we do not remove snapshots with offsets higher than the log end offset in all cases upon truncation.
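The intended behavior can be sketched as a toy recovery loop (all names here are hypothetical stand-ins for the real log-manager code): advance the snapshot offset and write a snapshot after each recovered segment, and on truncation delete snapshots beyond the new log end offset.

```java
import java.util.TreeMap;

public class SnapshotSketch {
    final TreeMap<Long, String> snapshots = new TreeMap<>(); // offset -> snapshot file name
    long snapshotOffset = 0L;

    void recoverSegment(long segmentEndOffset) {
        // ... replay the segment into producer state ...
        // Advancing this offset after each segment is what prevents
        // rereading the same segment on the next recovery pass.
        snapshotOffset = segmentEndOffset;
        snapshots.put(snapshotOffset, "snap-" + snapshotOffset);
    }

    void truncateTo(long logEndOffset) {
        // Remove any snapshot whose offset lies beyond the new log end offset.
        snapshots.tailMap(logEndOffset, false).clear();
    }
}
```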
[jira] [Updated] (KAFKA-5251) Producer should drop queued sends when transaction is aborted
[ https://issues.apache.org/jira/browse/KAFKA-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-5251:
--------------------------------------
    Labels: exactly-once  (was: )

> Producer should drop queued sends when transaction is aborted
> -------------------------------------------------------------
>
>                 Key: KAFKA-5251
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5251
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>            Reporter: Jason Gustafson
>            Assignee: Apurva Mehta
>              Labels: exactly-once
>             Fix For: 0.11.0.0
[jira] [Commented] (KAFKA-3986) completedReceives can contain closed channels
[ https://issues.apache.org/jira/browse/KAFKA-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394900#comment-15394900 ]

Sriram Subramanian commented on KAFKA-3986:
-------------------------------------------
Given that it cannot be consistently reproduced, is it a blocker for 0.10.0.1? cc [~ijuma]

> completedReceives can contain closed channels
> ---------------------------------------------
>
>                 Key: KAFKA-3986
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3986
>             Project: Kafka
>          Issue Type: Bug
>          Components: network
>            Reporter: Ryan P
>             Fix For: 0.10.0.1
>
> I'm not entirely sure why at this point, but it is possible to throw a NullPointerException when processing completedReceives. This happens when a fairly substantial number of connections are initiated with the server simultaneously. The processor thread does carry on, but it may be worth investigating how a channel could be both closed and present in completedReceives. The NPE in question is thrown here:
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/network/SocketServer.scala#L490
> It cannot be consistently reproduced either.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
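One defensive shape for the processing loop implied by the report is sketched below. `Channel` and `Receive` are simplified stand-ins for the classes in kafka.network.SocketServer, not the actual implementation: receives whose channel has gone away are skipped instead of dereferenced.

```java
import java.util.ArrayList;
import java.util.List;

public class ReceiveLoopSketch {
    static class Channel { boolean open = true; }
    record Receive(Channel channel, String payload) {}

    static List<String> processCompletedReceives(List<Receive> completedReceives) {
        List<String> processed = new ArrayList<>();
        for (Receive r : completedReceives) {
            // The channel can be closed (or removed) between the poll that
            // produced this receive and the moment we process it.
            if (r.channel() == null || !r.channel().open)
                continue;
            processed.add(r.payload());
        }
        return processed;
    }
}
```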
[jira] [Commented] (KAFKA-1524) Implement transactional producer
[ https://issues.apache.org/jira/browse/KAFKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371788#comment-15371788 ]

Sriram Subramanian commented on KAFKA-1524:
-------------------------------------------
We hope to provide an update on this soon.

> Implement transactional producer
> --------------------------------
>
>                 Key: KAFKA-1524
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1524
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Joel Koshy
>            Assignee: Raul Castro Fernandez
>              Labels: transactions
>         Attachments: KAFKA-1524.patch, KAFKA-1524.patch, KAFKA-1524.patch, KAFKA-1524_2014-08-18_09:39:34.patch, KAFKA-1524_2014-08-20_09:14:59.patch
>
> Implement the basic transactional producer functionality as outlined in https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka
> The scope of this jira is basic functionality (i.e., to be able to begin and commit or abort a transaction) without the failure scenarios.
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168996#comment-14168996 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
[~gwenshap] +1 on your suggestion. We can get the documentation ready and then do the linking during the release.

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
>
> In a mission-critical application, we expect a Kafka cluster with 3 brokers to satisfy two requirements:
> 1. When 1 broker is down, no message loss or service blocking happens.
> 2. In worse cases, such as two brokers down, service can be blocked, but no message loss happens.
> We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three behaviors:
> 1. When choosing a new leader from 2 followers in ISR, the one with fewer messages may be chosen as the leader.
> 2. Even when replica.lag.max.messages=0, a follower can stay in ISR when it has fewer messages than the leader.
> 3. ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker.
> The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning all 3 replicas (leader A, followers B and C) are in sync, i.e., they have the same messages and are all in ISR. According to the value of request.required.acks (acks for short), there are the following cases:
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirements.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. B and C restart and are removed from ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.
> In summary, no existing configuration can satisfy the requirements.
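The acks/min_isr interaction debated in this thread can be reduced to one broker-side check. The sketch below is a toy model, not Kafka code, and all names are illustrative: with acks=-1 and a minimum ISR size of 2, a produce request is rejected rather than acknowledged while only one replica is in sync, so an acknowledged write is always on at least two replicas.

```java
import java.util.Set;

public class MinIsrSketch {
    // Broker-side check before acknowledging an acks=-1 produce request.
    static String tryAppend(Set<String> isr, int minIsr) {
        if (isr.size() < minIsr)
            return "NOT_ENOUGH_REPLICAS"; // refuse to ack a write only one replica would hold
        return "NONE";                    // ack only after all ISR members have the message
    }
}
```

In the acks=-1 scenario from the proof above, the producer's write to A alone would be rejected under minIsr=2 instead of being acknowledged and later lost.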
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167113#comment-14167113 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
Awesome. I suggest we document the guarantees provided by the different knobs. That would be very useful.

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167464#comment-14167464 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
My vote would be to update our documentation - http://kafka.apache.org/documentation.html - which currently refers to 0.8.1. We should make 0.8.2 the current one after the release. The Design section can have a Guarantees portion that describes what guarantees Kafka gives w.r.t. consistency vs. availability, and when. What do the rest think?

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch, KAFKA-1555.5.patch, KAFKA-1555.6.patch, KAFKA-1555.8.patch, KAFKA-1555.9.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147474#comment-14147474 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
1. Great. Not supporting values above ack=1 is a good step. We are essentially not using it as an integer any more. I would still love it to be made more explicit with an enum, for clarity.
2. Also, by setting ack=-1 and min_isr=2, we still do not avoid data loss when one broker goes down. The issue is the way we select a leader. When a request was written to the leader, the min_isr check could have succeeded and we would have written to min_isr - 1 replicas. However, the replicas could subsequently fall out of the ISR. When the leader fails after that, we would have an unclean leader election and select any replica as the leader, possibly one that was lagging. To completely guarantee no data loss, we would need to:
   a. Ensure logs do not diverge on unclean leader elections.
   b. Choose the broker with the longest log as the leader.
3. We may not have documented acks above 1, but since it is an integer, there is a chance somebody is using them. In such a case this could be a backwards-incompatible change; it would be worth mentioning in the release notes.
4. Long term, I think the min_isr config should be in the API. This gives better control per message and explicitly lets the caller know what guarantees they get.

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147967#comment-14147967 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
ack=-1 with clean leader election already prevents data loss. If min_isr=2, I would expect the data never to be lost when the leader fails. That should be the simplest guarantee the system provides. We should not add further clauses to this, or it becomes impossible to define the system. If we were to say that with min_isr=2 and ack=-1 you have merely reduced the probability of loss, but data could still be lost under unclean leader election, we will lose credibility in these settings.

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148407#comment-14148407 ]

Sriram Subramanian commented on KAFKA-1555:
-------------------------------------------
Thank you for summarizing all the thoughts, Jay.
1. I had issues with how ack was designed initially alongside the min_isr config, and it looks a lot better now with ack=0, ack=1 and ack=-1. I still think ack should be an enum explaining what it does rather than -1 or other arbitrary integers.
2. I don't see the value of min_isr if it does not prevent data loss under unclean leader election. If it was a clean leader election, we would always have one other replica that has the data, and min_isr does not add any more value. It is completely possible to ensure there is no data loss with unclean leader election using min_isr, and I think that is its real benefit.
3. As I said previously, I like the sender to know what guarantees they get when they send the request, and would opt for min_isr being exposed at the API level.
4. W.r.t. your last point, I think it may not be possible to avoid duplicates by failing before writing to the log. The reason is that the ISR could drop below min_isr just after the check, and we could still end up failing the request after a timeout. Agreed, this is an edge case, and we end up with far fewer duplicates. So I think you would need the check in both places.

> provide strong consistency with reasonable availability
> --------------------------------------------------------
>
>                 Key: KAFKA-1555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Jiang Wu
>            Assignee: Gwen Shapira
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14148520#comment-14148520 ] Sriram Subramanian commented on KAFKA-1555: --- 2. I agree. I think what min_isr helps in is to have a way to specify I don't want to loose my data as long as min_isr - 1 number of nodes are down. For example, if no_of_replicas=3, min_isr = 2 and ack=-1, we should not loose data as long as one node is down even when there is an unclean leader election. In this particular case, when the leader fails, it is expected that all replica nodes are up but could be out of the isr. Under such constraints it is definitely possible to prevent data loss (ignoring data loss due to system failures and data not flushed to disk) by making the node with the longest log (assuming we ensure they don't diverge) as the leader. 3. I prefer b or c. d is attractive since you could use just one variable to define your required guarantees but it is hard to understand at the API level. 4. I totally agree. The issue is ISR takes a while to reflect the actual reality. Assume we failed early before writing to the local log and did not have any checks after writing. Replicas go down. It would take a while for the isr to reflect that the replicas are not in the isr anymore. During this time, we would simply write the messages to the log and loose it later. provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch, KAFKA-1555.4.patch, KAFKA-1555.5.patch In a mission critical application, we expect a kafka cluster with 3 brokers can satisfy two requirements: 1. When 1 broker is down, no message loss or service blocking happens. 2. 
In worse cases, such as when two brokers are down, service can be blocked, but no message loss happens.
We found that the current Kafka version (0.8.1.1) cannot achieve these requirements due to three of its behaviors:
1. When choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader.
2. Even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
3. The ISR can contain only 1 broker, therefore acknowledged messages may be stored in only 1 broker.
The following is an analytical proof. We consider a cluster with 3 brokers and a topic with 3 replicas, and assume that at the beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e., they have the same messages and are all in the ISR. According to the value of request.required.acks (acks for short), there are the following cases.
1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, C is still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
3. acks=-1. B and C restart and are removed from the ISR. Producer sends a message m to A, and receives an acknowledgement. Disk failure happens in A before B and C replicate m. Message m is lost.
In summary, no existing configuration can satisfy the requirements.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
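The acks=2 case in the proof above can be sketched as a toy simulation (plain Python, not Kafka code; the broker names A/B/C follow the example):

```python
# Toy model of the acks=2 failure mode: replicas A, B, C all start in the ISR.
# The producer's write of message m is acknowledged once A and B have it, but
# C stays in the ISR even though it has not replicated m yet.
logs = {"A": ["m"], "B": ["m"], "C": []}  # C has not received m
isr = {"A", "B", "C"}                     # C is still considered in-sync

# Leader A is killed; any remaining ISR member may be elected, including C.
isr.discard("A")
new_leader = "C"  # a legal (but lossy) election under these rules

# Consumers now read from C, whose log is missing the acknowledged message.
message_lost = "m" not in logs[new_leader]
print(message_lost)  # True
```

The same structure shows why acks=2 alone cannot help: acknowledgement counts replicas that wrote the message, while leader election draws from the ISR, and the two sets can disagree.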
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14146751#comment-14146751 ] Sriram Subramanian commented on KAFKA-1555: ---
Jun, I am not arguing that we should not have the feature. I am arguing about the best way to expose that feature. I think ack being one number along with min isr being another number is very confusing. The ack really does not indicate whether the system is opting for availability or consistency today. The min_isr also works only for ack=-1. Cases where ack=2 and min_isr=2 are very confusing to reason about. In this case, we would still end up writing only to the ISR and returning success. If ISR = 1, it just makes the system behave unpredictably. We should either change how ack is implemented today or move these options to the API so that the caller knows what they are opting for. If this is an interim solution, I would like to see a JIRA filed to revisit this. It is usually hard to change things later once users get used to how a system behaves.
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14146751#comment-14146751 ] Sriram Subramanian edited comment on KAFKA-1555 at 9/24/14 9:32 PM:
Jun, I am not arguing that we should not have the feature. I am arguing about the best way to expose that feature. I think ack being one number along with min isr being another number is very confusing. The ack really does not indicate whether the system is opting for availability or consistency today. The min_isr also works only for ack=-1. What is the difference between ack=2 and ack=-1 with min_isr=2? These differences are so subtle that it gets hard to explain what the system does. We should either change how ack is implemented today or move these options to the API so that the caller knows what they are opting for. If this is an interim solution, I would like to see a JIRA filed to revisit this. It is usually hard to change things later once users get used to how a system behaves.
was (Author: sriramsub): Jun, I am not arguing that we should not have the feature. I am arguing about what is the best way to expose that feature. I think ack being a number along with min isr being another number is very confusing. The ack really does not indicate if the system is opting for availability or consistency today. The min_isr also works only for ack=-1. Cases where ack = 2 and min_isr = 2 are very confusing to reason about. In this case, we would still end up writing only to the ISR and return success. If ISR = 1, it just make system not behave in any predictable way. We should either change how ack is implemented today or move these options to the API so that the caller knows what they are opting for. If this is an interim solution, I would like to see a JIRA filed to revisit this. It is usually hard to change things later if the users get used to how a system behaves.
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch, KAFKA-1555.2.patch, KAFKA-1555.3.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14142967#comment-14142967 ] Sriram Subramanian commented on KAFKA-1555: ---
Sorry to be late here, but I think this is an important change and we need to ensure this is exactly the right behavior for the long term. To summarize the discussion and code change so far:
1. We would set the min.isr per topic in the log config.
2. We would use this config only when ack is set to -1, and fail the call if the number of in-sync replicas is less than min isr.
The main drawbacks I see with this approach are:
1. If we plan to set this value at a per-topic level, this should be part of create/modify topic and should be set during topic creation or modified later. This ensures that if we do expose a createTopic api in the protocol, it would be available to be set/modified.
2. I could see scenarios where multiple writers could have different requirements on the same topic and may not have any knowledge of how the topic was created.
3. I think what we are really solving for is to make the write durable either on all replicas or on just the in-sync replicas. The min.isr value provides the option of a number, and I think any value other than 0 or no_of_replicas is of no value. This would only confuse the clients when they create the topic.
This is how I interpret the acks w.r.t the clients:
0 - No response required. I don't really care if the write happened.
1 - I need a response after the write happened to the leader successfully.
-1 - I need the write to happen on all replicas before a response. This has two options: a. Response is sent after the write happens to replicas in the ISR. b. Response is sent after the write happens to all replicas.
Having an enum for ack as below is a lot clearer and sets the expectations right in my opinion.
enum AckType {
  No_Response,
  Write_To_Leader,
  Write_To_ISR,          // chooses availability over consistency
  Write_To_All_Replicas  // chooses consistency over availability
}
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
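A minimal sketch of the semantics behind the proposed enum (illustrative Python, not broker code; the function name and parameters are hypothetical):

```python
from enum import Enum

class AckType(Enum):
    NO_RESPONSE = 0            # fire and forget
    WRITE_TO_LEADER = 1        # ack after the leader's local write
    WRITE_TO_ISR = 2           # ack after all current ISR members have it
    WRITE_TO_ALL_REPLICAS = 3  # ack only after every assigned replica has it

def acked(ack: AckType, leader_has_it: bool, isr_acks: int, isr_size: int,
          replica_acks: int, replica_count: int) -> bool:
    """Decide whether a produce request can be acknowledged."""
    if ack is AckType.NO_RESPONSE:
        return True
    if ack is AckType.WRITE_TO_LEADER:
        return leader_has_it
    if ack is AckType.WRITE_TO_ISR:        # availability over consistency
        return isr_acks >= isr_size
    return replica_acks >= replica_count   # consistency over availability

# With a shrunken ISR of 1, WRITE_TO_ISR still acks -- the weakness discussed:
print(acked(AckType.WRITE_TO_ISR, True, 1, 1, 1, 3))           # True
print(acked(AckType.WRITE_TO_ALL_REPLICAS, True, 1, 1, 1, 3))  # False
```

The WRITE_TO_ISR case makes the ack=-1 concern explicit: once the ISR has shrunk to a single broker, a write is acknowledged after reaching only that broker.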
[jira] [Commented] (KAFKA-977) Implement generation/term per leader to reconcile messages correctly
[ https://issues.apache.org/jira/browse/KAFKA-977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14142990#comment-14142990 ] Sriram Subramanian commented on KAFKA-977: -- I would like to bring this issue to discussion again. Kafka is used a lot more now for use cases other than just moving data from point A to point B. For example, consider the case where Kafka acts as the log and materialized views are created by consuming these logs. In such scenarios, it is important that the logs are consistent and do not diverge even under unclean leader elections (Replaying these replicas should create the same view). Having a generation/term is essential for log replication and it would be great for Kafka to have the same guarantees as other log replication protocols. I would be happy to give more detailed examples for this but would want to know if we think this is an issue to address soon. Implement generation/term per leader to reconcile messages correctly Key: KAFKA-977 URL: https://issues.apache.org/jira/browse/KAFKA-977 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian During unclean leader election, the log messages can diverge and when the followers come back up Kafka does not reconcile correctly. To implement it correctly, we need to add a term/generation to each message and use that to reconcile. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
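The reconciliation this ticket asks for can be illustrated with epoch-tagged entries: a returning follower finds the longest prefix on which it agrees with the new leader, truncates everything after it, and re-replicates (a toy sketch of the general technique, not Kafka's implementation):

```python
# Each log entry carries (epoch, payload). Comparing epochs -- not just
# offsets -- lets a follower detect entries written by a deposed leader.
def divergence_point(leader_log, follower_log):
    """Return the number of leading entries on which the two logs agree."""
    i = 0
    while (i < len(leader_log) and i < len(follower_log)
           and leader_log[i] == follower_log[i]):
        i += 1
    return i

# The follower wrote (epoch 1, "c"), which the new leader (epoch 2) never saw:
leader   = [(1, "a"), (1, "b"), (2, "d")]
follower = [(1, "a"), (1, "b"), (1, "c")]

cut = divergence_point(leader, follower)
reconciled = follower[:cut] + leader[cut:]  # truncate, then copy from leader
print(reconciled == leader)  # True: the logs no longer diverge
```

Without the epoch, both logs have three entries at the same offsets, so an offset-only comparison would never notice that entry 2 differs.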
[jira] [Commented] (KAFKA-1034) Improve partition reassignment to optimize writes to zookeeper
[ https://issues.apache.org/jira/browse/KAFKA-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14142999#comment-14142999 ] Sriram Subramanian commented on KAFKA-1034: --- At a high level, we were making too many writes to ZK during partition reassignment. A lot of things have changed since then, and we would need to revisit the code to see if this is still an issue. It would be useful if someone who has touched this code recently could comment on it. Improve partition reassignment to optimize writes to zookeeper -- Key: KAFKA-1034 URL: https://issues.apache.org/jira/browse/KAFKA-1034 Project: Kafka Issue Type: Bug Affects Versions: 0.8.0, 0.8.1 Reporter: Sriram Subramanian Assignee: Sriram Subramanian Labels: newbie++ Fix For: 0.8.2 For the ReassignPartition tool, check if optimizing the writes to ZK after every replica reassignment is possible -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143578#comment-14143578 ] Sriram Subramanian commented on KAFKA-1555: ---
[~gwenshap] A key thing I would like to ensure when a feature is added is whether it can be easily explained to the end users of the system. A system can provide a great deal of flexibility by exposing its functionality as configs, but what gets hard with these data systems is that over time there is config bloat and it gets complex to specify the guarantees the system provides. Let us say we had N replicas.
min isr = 0, 1 is trivial to explain.
min isr = N - use it when you need strong durability.
min isr = 2 ... N-1 - I need your help here. What would be good guidance to give the users on what values to use between 2 and N-1?
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
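For context on where this pairing of knobs ended up: later Kafka versions expose exactly this combination. A sketch of the settings (values illustrative for replication.factor=3; the producer property was request.required.acks=-1 in the old producer and acks=all in the newer one):

```properties
# Topic-level (or broker default) durability floor: acked writes are
# rejected unless at least this many replicas are in sync.
min.insync.replicas=2

# Producer side: wait for the full ISR before the write is acknowledged.
acks=all
```

With these two settings together, an acknowledged write is on at least 2 replicas, so losing any single broker cannot lose acked data.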
[jira] [Commented] (KAFKA-1555) provide strong consistency with reasonable availability
[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143628#comment-14143628 ] Sriram Subramanian commented on KAFKA-1555: --- Surviving a leader crash makes sense. That is provided by any value from 2 to N. I am more interested in being able to specify why 2 and not N. Performance? Availability? Probability of data loss? If so, we should be able to quantify it. I don't want to drag this discussion out, but I think it is a common mistake not to quantify the benefits of choosing one value over another between 2...N-1, and to push that choice to the users by providing a fine-grained config. It would be great to document this use case with an example indicating how performance, availability, and data loss are affected by choosing one value over another.
provide strong consistency with reasonable availability --- Key: KAFKA-1555 URL: https://issues.apache.org/jira/browse/KAFKA-1555 Project: Kafka Issue Type: Improvement Components: controller Affects Versions: 0.8.1.1 Reporter: Jiang Wu Assignee: Gwen Shapira Fix For: 0.8.2 Attachments: KAFKA-1555.0.patch, KAFKA-1555.1.patch
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
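One way to frame the quantification being asked for: assume each replica fails independently with probability p. Acked data is lost only if every replica that has it fails, so durability improves geometrically with min_isr, while the chance of rejecting writes (unavailability) grows. A toy calculation (the model, names, and numbers are all illustrative, not a claim about real failure rates):

```python
from math import comb

p = 0.01  # assumed independent probability that a given replica is down
N = 3     # replication factor

for min_isr in range(1, N + 1):
    # Acked data is lost only if all min_isr replicas holding it fail.
    loss = p ** min_isr
    # Writes are rejected whenever fewer than min_isr replicas are up.
    unavailable = sum(comb(N, k) * (1 - p) ** k * p ** (N - k)
                      for k in range(min_isr))
    print(f"min_isr={min_isr}: loss~{loss:.2e}, unavailable~{unavailable:.2e}")
```

Under this toy model each step from 2 toward N buys roughly a factor of 1/p in durability at the cost of a higher write-rejection probability, which is one concrete way to document the 2...N-1 trade-off.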
[jira] [Commented] (KAFKA-1546) Automate replica lag tuning
[ https://issues.apache.org/jira/browse/KAFKA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068731#comment-14068731 ] Sriram Subramanian commented on KAFKA-1546: --- the lagBegin does not persist across shutdowns or leader transitions. A safe assumption to make is that all fetchers are lagging when a node becomes a leader till we get the first fetch. This would ensure we don't assume there is no lag when a fetcher is down and a new leader is elected. Automate replica lag tuning --- Key: KAFKA-1546 URL: https://issues.apache.org/jira/browse/KAFKA-1546 Project: Kafka Issue Type: Improvement Components: replication Affects Versions: 0.8.0, 0.8.1, 0.8.1.1 Reporter: Neha Narkhede Labels: newbie++ Currently, there is no good way to tune the replica lag configs to automatically account for high and low volume topics on the same cluster. For the low-volume topic it will take a very long time to detect a lagging replica, and for the high-volume topic it will have false-positives. One approach to making this easier would be to have the configuration be something like replica.lag.max.ms and translate this into a number of messages dynamically based on the throughput of the partition. -- This message was sent by Atlassian JIRA (v6.2#6252)
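The dynamic translation suggested in the issue description, a time-based bound converted into a per-partition message count from observed throughput, can be sketched as (hypothetical function, not the actual replica manager):

```python
def lag_threshold_messages(replica_lag_max_ms: float,
                           observed_msgs_per_sec: float) -> int:
    """Convert a time-based lag bound into a per-partition message count,
    scaled by that partition's current throughput."""
    return max(1, int(observed_msgs_per_sec * replica_lag_max_ms / 1000.0))

# A high-volume partition tolerates more absolute lag than a low-volume one
# for the same time bound, avoiding both slow detection and false positives:
print(lag_threshold_messages(10_000, 5_000.0))  # 50000 messages
print(lag_threshold_messages(10_000, 0.5))      # 5 messages
```

This is exactly the asymmetry the ticket describes: a fixed message-count threshold is either far too lax for the quiet topic or trips constantly on the busy one.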
[jira] [Commented] (KAFKA-1539) Due to OS caching Kafka might lose offset files which causes full reset of data
[ https://issues.apache.org/jira/browse/KAFKA-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068848#comment-14068848 ] Sriram Subramanian commented on KAFKA-1539: --- I had encountered the same issue in another project and had to explicitly use fsync to fix it. Due to OS caching Kafka might lose offset files which causes full reset of data Key: KAFKA-1539 URL: https://issues.apache.org/jira/browse/KAFKA-1539 Project: Kafka Issue Type: Bug Components: log Affects Versions: 0.8.1.1 Reporter: Dmitry Bugaychenko Assignee: Jay Kreps Attachments: KAFKA-1539.patch Seen this while testing power failures and disk failures. Due to caching at the OS level (e.g., XFS can cache data for 30 seconds), after a failure we got offset files of zero length. This dramatically slows down broker startup (it has to re-check all segments), and if the high watermark offsets are lost it simply erases all data and starts recovering from other brokers (looks funny - first spending 2-3 hours re-checking logs and then deleting them all due to the missing high watermark). Proposal: introduce offset file rotation. Keep two versions of the offset file, write to the oldest, read from the newest valid one. In this case we would be able to configure the offset checkpoint time in a way that at least one file is always flushed and valid. -- This message was sent by Atlassian JIRA (v6.2#6252)
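The fsync fix mentioned in the comment is commonly paired with a write-temp-then-rename pattern, which also prevents the zero-length files described in the report (a generic sketch of the technique, not Kafka's OffsetCheckpoint code):

```python
import os

def write_checkpoint_atomically(path: str, contents: str) -> None:
    """Write a small checkpoint file so that a crash leaves either the old
    or the new version on disk, never a truncated one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(contents)
        f.flush()
        os.fsync(f.fileno())  # force the data out of the OS page cache
    os.replace(tmp, path)     # atomic rename over the old checkpoint
    # fsync the directory so the rename itself survives power loss
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The rename is what makes the rotation proposal in the description unnecessary for crash safety: readers only ever see a fully written file under the real name.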
[jira] [Commented] (KAFKA-1298) Controlled shutdown tool doesn't seem to work out of the box
[ https://issues.apache.org/jira/browse/KAFKA-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13923378#comment-13923378 ] Sriram Subramanian commented on KAFKA-1298: --- I don't think we have enabled it by default, which we should. The issue is that today the controller does not let a broker shut down if it is the only one up. We try to move the leader for all the topic partitions on a broker, and we fail if there are no other brokers to move the leadership to. We should probably succeed if the number of replicas = 1, because there is not much the controller can do. This, however, may not be something we can fix in 0.8.1. Controlled shutdown tool doesn't seem to work out of the box Key: KAFKA-1298 URL: https://issues.apache.org/jira/browse/KAFKA-1298 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Labels: usability Download Kafka and try to use our shutdown tool. Got this: bin/kafka-run-class.sh kafka.admin.ShutdownBroker --zookeeper localhost:2181 --broker 0 [2014-03-06 16:58:23,636] ERROR Operation failed due to controller failure (kafka.admin.ShutdownBroker$) java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: jkreps-mn.linkedin.biz; nested exception is: java.net.ConnectException: Connection refused] at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:340) at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:249) at kafka.admin.ShutdownBroker$.kafka$admin$ShutdownBroker$$invokeShutdown(ShutdownBroker.scala:56) at kafka.admin.ShutdownBroker$.main(ShutdownBroker.scala:109) at kafka.admin.ShutdownBroker.main(ShutdownBroker.scala) Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: jkreps-mn.linkedin.biz; nested exception is: java.net.ConnectException: Connection refused] at 
com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:101) at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:185) at javax.naming.InitialContext.lookup(InitialContext.java:392) at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1888) at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1858) at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:257) ... 4 more Caused by: java.rmi.ConnectException: Connection refused to host: jkreps-mn.linkedin.biz; nested exception is: java.net.ConnectException: Connection refused at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:601) at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:198) at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:184) at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:322) at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source) at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:97) ... 9 more Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431) at java.net.Socket.connect(Socket.java:527) at java.net.Socket.connect(Socket.java:476) at java.net.Socket.init(Socket.java:373) at java.net.Socket.init(Socket.java:187) at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:22) at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:128) at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:595) ... 14 more Oh god, RMI?!!!??? Presumably this is because we stopped setting the JMX port by default. 
This is good because setting the JMX port breaks the quickstart which requires running multiple nodes on a single machine. The root cause imo is just using RMI here instead of our regular RPC. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (KAFKA-1289) Misc. nitpicks in log cleaner
[ https://issues.apache.org/jira/browse/KAFKA-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13919022#comment-13919022 ] Sriram Subramanian commented on KAFKA-1289: --- +1 Misc. nitpicks in log cleaner - Key: KAFKA-1289 URL: https://issues.apache.org/jira/browse/KAFKA-1289 Project: Kafka Issue Type: Bug Affects Versions: 0.8.1 Reporter: Jay Kreps Attachments: KAFKA-1289-v1.patch There are a couple of minor annoyances in the log cleaner in 0.8.1. Since this is one of the major features it would be nice to address these. Problems: 1. Logging is no longer going to the kafka-cleaner.log 2. Shutdown when the log cleaner is enabled is very slow 3. TestLogCleaner uses obsolete properties for the producer and consumer In addition I want to change the configuration from dedupe to compact as we don't use the terminology dedupe anywhere else and I think it is less intuitive. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-930: - Attachment: KAFKA-930_2014-02-24_01:59:46.patch Integrate preferred replica election logic into kafka - Key: KAFKA-930 URL: https://issues.apache.org/jira/browse/KAFKA-930 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Fix For: 0.8.1 Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch, KAFKA-930_2014-01-27_13:28:51.patch, KAFKA-930_2014-02-24_01:59:46.patch It seems useful to integrate the preferred replica election logic into kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiate the move. We could come up with some heuristics to initiate the move only if the imbalance over a specific threshold in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (KAFKA-1258) Delete temporary data directory after unit test finishes
[ https://issues.apache.org/jira/browse/KAFKA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13898861#comment-13898861 ] Sriram Subramanian commented on KAFKA-1258: --- Look into http://junit.org/javadoc/4.9/org/junit/rules/TemporaryFolder.html. Helps to manage temp folders in junit. It may be supported only in java 7. Delete temporary data directory after unit test finishes Key: KAFKA-1258 URL: https://issues.apache.org/jira/browse/KAFKA-1258 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Fix For: 0.9.0 Today in unit testsuite most of the time when a test case is setup a temporary directory will be created with a random int as suffix, and will not be deleted after the test. After a few unit tests this will create tons of directories in java.io.tmpdir (/tmp for Linux). Would be better to remove them for clean unit tests. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (KAFKA-330) Add delete topic support
[ https://issues.apache.org/jira/browse/KAFKA-330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13894628#comment-13894628 ] Sriram Subramanian commented on KAFKA-330: -- Can we get this patch (https://issues.apache.org/jira/secure/attachment/12625445/KAFKA-930_2014-01-27_13%3A28%3A51.patch) merged now that delete support is in? Add delete topic support - Key: KAFKA-330 URL: https://issues.apache.org/jira/browse/KAFKA-330 Project: Kafka Issue Type: Bug Components: controller, log, replication Affects Versions: 0.8.0, 0.8.1 Reporter: Neha Narkhede Assignee: Neha Narkhede Priority: Blocker Labels: features, project Fix For: 0.8.1 Attachments: KAFKA-330.patch, KAFKA-330_2014-01-28_15:19:20.patch, KAFKA-330_2014-01-28_22:01:16.patch, KAFKA-330_2014-01-31_14:19:14.patch, KAFKA-330_2014-01-31_17:45:25.patch, KAFKA-330_2014-02-01_11:30:32.patch, KAFKA-330_2014-02-01_14:58:31.patch, KAFKA-330_2014-02-05_09:31:30.patch, KAFKA-330_2014-02-06_07:48:40.patch, KAFKA-330_2014-02-06_09:42:38.patch, KAFKA-330_2014-02-06_10:29:31.patch, KAFKA-330_2014-02-06_11:37:48.patch, kafka-330-v1.patch, kafka-330-v2.patch One proposal of this API is here - https://cwiki.apache.org/confluence/display/KAFKA/Kafka+replication+detailed+design+V2#KafkareplicationdetaileddesignV2-Deletetopic -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883343#comment-13883343 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch, KAFKA-930_2014-01-27_13:28:51.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
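The threshold heuristic described in the issue can be sketched as a simple ratio check: trigger a preferred replica election only when the fraction of partitions led by a non-preferred broker exceeds a configured threshold. All names below are illustrative; this is not the controller's actual code:

```java
import java.util.List;

public class PreferredLeaderCheck {
    // A partition's current leader and its preferred (first-assigned) replica.
    record Partition(int leader, int preferredReplica) {}

    // Fraction of partitions whose current leader is not the preferred replica.
    static double imbalanceRatio(List<Partition> partitions) {
        if (partitions.isEmpty()) return 0.0;
        long misplaced = partitions.stream()
                .filter(p -> p.leader() != p.preferredReplica())
                .count();
        return (double) misplaced / partitions.size();
    }

    // The background thread would initiate the move only above the threshold,
    // to avoid rebalancing too aggressively.
    static boolean shouldRebalance(List<Partition> partitions, double threshold) {
        return imbalanceRatio(partitions) > threshold;
    }

    public static void main(String[] args) {
        List<Partition> parts = List.of(
                new Partition(1, 1), new Partition(2, 1),
                new Partition(3, 3), new Partition(1, 2));
        System.out.println(imbalanceRatio(parts));       // 0.5
        System.out.println(shouldRebalance(parts, 0.1)); // true
    }
}
```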
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-930:
-------------------------------------

    Attachment: KAFKA-930_2014-01-27_13:28:51.patch

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch, KAFKA-930_2014-01-27_13:28:51.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Closed] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian closed KAFKA-930.
------------------------------------

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Resolved] (KAFKA-838) Update design document to match Kafka 0.8 design
[ https://issues.apache.org/jira/browse/KAFKA-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian resolved KAFKA-838.
--------------------------------------

    Resolution: Fixed

> Update design document to match Kafka 0.8 design
> ------------------------------------------------
>
> Key: KAFKA-838
> URL: https://issues.apache.org/jira/browse/KAFKA-838
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Neha Narkhede
> Assignee: Sriram Subramanian
>
> The Kafka 0.8 design is significantly different from the Kafka 0.7 design.
[jira] [Commented] (KAFKA-906) Invoke halt on shutdown and startup failure to ensure the jvm is brought down
[ https://issues.apache.org/jira/browse/KAFKA-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13855810#comment-13855810 ]

Sriram Subramanian commented on KAFKA-906:
------------------------------------------

We don't need this anymore.

> Invoke halt on shutdown and startup failure to ensure the jvm is brought down
> -----------------------------------------------------------------------------
>
> Key: KAFKA-906
> URL: https://issues.apache.org/jira/browse/KAFKA-906
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.8.1
>
> Sometimes kafka is made to run as a service in an external container. This container usually prevents the individual services from exiting the process by installing a security manager. The right fix is to implement the startup and shutdown logic of kafka using the interfaces provided by these containers, which would involve more work. For 0.8, we will simply call halt as the last step of shutdown, and on a startup failure.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13854441#comment-13854441 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13854458#comment-13854458 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-930:
-------------------------------------

    Attachment: KAFKA-930_2013-12-20_11:22:36.patch

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13854605#comment-13854605 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Checked in to trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch, KAFKA-930_2013-12-20_11:13:01.patch, KAFKA-930_2013-12-20_11:22:36.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844011#comment-13844011 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-930:
-------------------------------------

    Attachment: KAFKA-930_2013-12-09_22:51:57.patch

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9.0
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch, KAFKA-930_2013-12-09_22:51:57.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829130#comment-13829130 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch, KAFKA-930_2013-11-19_17:38:49.patch, KAFKA-930_2013-11-21_09:42:11.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-930:
-------------------------------------

    Attachment: KAFKA-930.patch

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9
> Attachments: KAFKA-930.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827208#comment-13827208 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Created reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9
> Attachments: KAFKA-930.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827217#comment-13827217 ]

Sriram Subramanian commented on KAFKA-930:
------------------------------------------

Updated reviewboard https://reviews.apache.org/r/15711/ against branch origin/trunk.

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Updated] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-930:
-------------------------------------

    Attachment: KAFKA-930_2013-11-19_17:37:29.patch

> Integrate preferred replica election logic into kafka
> -----------------------------------------------------
>
> Key: KAFKA-930
> URL: https://issues.apache.org/jira/browse/KAFKA-930
> Project: Kafka
> Issue Type: Bug
> Reporter: Sriram Subramanian
> Assignee: Sriram Subramanian
> Fix For: 0.9
> Attachments: KAFKA-930.patch, KAFKA-930_2013-11-19_17:37:29.patch
>
> It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
[jira] [Commented] (KAFKA-1097) Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper
[ https://issues.apache.org/jira/browse/KAFKA-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802132#comment-13802132 ]

Sriram Subramanian commented on KAFKA-1097:
-------------------------------------------

From the description, it does not seem to cause anything bad to happen. The non-existent replica would stay in the ISR until new data comes into the partition. Are there any bad things that can happen in this state?

> Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper
> ---------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-1097
> URL: https://issues.apache.org/jira/browse/KAFKA-1097
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Critical
> Fix For: 0.8
>
> While moving partitions, the controller moves the old replicas through the following state changes: ONLINE -> OFFLINE -> NON_EXISTENT. During the offline state change, the controller removes the old replica, writes the updated ISR to zookeeper, and notifies the leader. Note that it doesn't notify the old replicas to stop fetching from the leader (to be fixed in KAFKA-1032). During the non-existent state change, the controller does not write the updated ISR or replica list to zookeeper. Right after the non-existent state change, the controller writes the new replica list to zookeeper, but does not update the ISR. So an old replica can send a fetch request after the offline state change, essentially letting the leader add it back to the ISR. The problem is that if no new data is coming in for the partition and the old replica is fully caught up, the leader cannot remove it from the ISR. That lets a non-existent replica live in the ISR at least until new data comes in to the partition.
[jira] [Updated] (KAFKA-1052) integrate add-partitions command into topicCommand
[ https://issues.apache.org/jira/browse/KAFKA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-1052:
--------------------------------------

    Attachment: KAFKA-1052.patch

> integrate add-partitions command into topicCommand
> --------------------------------------------------
>
> Key: KAFKA-1052
> URL: https://issues.apache.org/jira/browse/KAFKA-1052
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.1
> Reporter: Jun Rao
> Assignee: Sriram Subramanian
> Attachments: KAFKA-1052.patch
>
> After merging from 0.8 (kafka-1051), we dragged in a new admin command, add-partitions. This needs to be integrated with the general topicCommand.
[jira] [Updated] (KAFKA-1052) integrate add-partitions command into topicCommand
[ https://issues.apache.org/jira/browse/KAFKA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-1052:
--------------------------------------

    Attachment: KAFKA-1052_2013-10-09_10:55:05.patch

> integrate add-partitions command into topicCommand
> --------------------------------------------------
>
> Key: KAFKA-1052
> URL: https://issues.apache.org/jira/browse/KAFKA-1052
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.1
> Reporter: Jun Rao
> Assignee: Sriram Subramanian
> Attachments: KAFKA-1052_2013-10-09_10:55:05.patch, KAFKA-1052.patch
>
> After merging from 0.8 (kafka-1051), we dragged in a new admin command, add-partitions. This needs to be integrated with the general topicCommand.
[jira] [Work started] (KAFKA-1052) integrate add-partitions command into topicCommand
[ https://issues.apache.org/jira/browse/KAFKA-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on KAFKA-1052 started by Sriram Subramanian.

> integrate add-partitions command into topicCommand
> --------------------------------------------------
>
> Key: KAFKA-1052
> URL: https://issues.apache.org/jira/browse/KAFKA-1052
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.1
> Reporter: Jun Rao
> Assignee: Sriram Subramanian
>
> After merging from 0.8 (kafka-1051), we dragged in a new admin command, add-partitions. This needs to be integrated with the general topicCommand.
[jira] [Commented] (KAFKA-1008) Unmap before resizing
[ https://issues.apache.org/jira/browse/KAFKA-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773517#comment-13773517 ]

Sriram Subramanian commented on KAFKA-1008:
-------------------------------------------

+1

> Unmap before resizing
> ---------------------
>
> Key: KAFKA-1008
> URL: https://issues.apache.org/jira/browse/KAFKA-1008
> Project: Kafka
> Issue Type: Bug
> Components: core, log
> Affects Versions: 0.8
> Environment: Windows, Linux, Mac OS
> Reporter: Elizabeth Wei
> Assignee: Jay Kreps
> Labels: patch
> Fix For: 0.8
> Attachments: KAFKA-0.8-1008-v7.patch, KAFKA-0.8-1008-v8.patch, KAFKA-1008-v6.patch, KAFKA-trunk-1008-v7.patch, unmap-v5.patch
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> While I was studying how MappedByteBuffer works, I saw a sharing runtime exception on Windows. I applied what I learned to generate a patch which uses an internal open JDK API to solve this problem. Following Jay's advice, I made a helper method called tryUnmap().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-1043) Time-consuming FetchRequest could block other request in the response queue
[ https://issues.apache.org/jira/browse/KAFKA-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771574#comment-13771574 ]

Sriram Subramanian commented on KAFKA-1043:
-------------------------------------------

We no longer block on the socket buffers if they get full. We do block on a slow I/O. That can be fixed either by capping it based on time or by fetching only a subset of topics. Even if there is no slow I/O and the socket buffers do not get full, we could spend too much time processing the fetch response; in that case, fetching only a subset of topics should help. Anything more is just over-engineering to me.

> Time-consuming FetchRequest could block other requests in the response queue
> ----------------------------------------------------------------------------
>
> Key: KAFKA-1043
> URL: https://issues.apache.org/jira/browse/KAFKA-1043
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.1
> Reporter: Guozhang Wang
> Assignee: Guozhang Wang
> Fix For: 0.8, 0.8.1
>
> Since in SocketServer the processor that takes a request is also responsible for writing the response for that request, we make each processor own its own response queue. If a FetchRequest takes irregularly long to write to the channel buffer, it blocks all other responses in the queue.
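The subset-of-topics mitigation mentioned above can be sketched as a rotating cursor over the partition list, so a single large fetch response cannot monopolize the processor. This is a hypothetical illustration, not Kafka's actual SocketServer code:

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedFetch {
    // Full list of topic-partitions a fetch could cover; the cursor remembers
    // where the previous fetch stopped so every partition is eventually served.
    private final List<String> partitions;
    private int cursor = 0;

    BoundedFetch(List<String> partitions) {
        this.partitions = partitions;
    }

    // Serve at most maxPerFetch partitions per request, round-robin.
    List<String> nextSubset(int maxPerFetch) {
        List<String> subset = new ArrayList<>();
        int n = Math.min(maxPerFetch, partitions.size());
        for (int i = 0; i < n; i++) {
            subset.add(partitions.get(cursor));
            cursor = (cursor + 1) % partitions.size();
        }
        return subset;
    }

    public static void main(String[] args) {
        BoundedFetch fetch = new BoundedFetch(List.of("A-0", "A-1", "B-0", "B-1", "C-0"));
        System.out.println(fetch.nextSubset(2)); // [A-0, A-1]
        System.out.println(fetch.nextSubset(2)); // [B-0, B-1]
        System.out.println(fetch.nextSubset(2)); // [C-0, A-0]
    }
}
```

The same bounding idea applies to a time cap: stop filling the response once a time budget elapses and defer the remaining partitions to the next fetch.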
[jira] [Commented] (KAFKA-1012) Implement an Offset Manager and hook offset requests to it
[ https://issues.apache.org/jira/browse/KAFKA-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753726#comment-13753726 ]

Sriram Subramanian commented on KAFKA-1012:
-------------------------------------------

Looks promising. Took a look at OffsetManager. The interface for OffsetManager needs more thinking. The zkClient and logManager are implementation specific. triggerLoadOffsets should also not be part of the interface. Imagine someone needs to implement a new OffsetManager and has to add a new k/v store client: they would end up modifying the interface to add their client. All of this should, at a minimum, be part of the constructor. Even better would be to make the OffsetManager completely pluggable without needing to rebuild the kafka jar. Removing triggerLoadOffsets is going to need some amount of thinking, though; at a minimum we need a better name.

    trait OffsetManager extends Logging {
      protected var zkClient: ZkClient = null
      def startup(zkClient: ZkClient, logManager: LogManager = null)
      def getOffset(key: GroupTopicPartition): OffsetMetadataAndError
      def putOffset(key: GroupTopicPartition, offset: OffsetAndMetadata)
      def triggerLoadOffsets(partition: Int)
      def shutdown()
    }

> Implement an Offset Manager and hook offset requests to it
> ----------------------------------------------------------
>
> Key: KAFKA-1012
> URL: https://issues.apache.org/jira/browse/KAFKA-1012
> Project: Kafka
> Issue Type: Sub-task
> Components: consumer
> Reporter: Tejas Patil
> Assignee: Tejas Patil
> Priority: Minor
> Attachments: KAFKA-1012.patch, KAFKA-1012-v2.patch
>
> After KAFKA-657, we have a protocol for consumers to commit and fetch offsets from brokers. Currently, consumers are not using this API and talk directly with Zookeeper. This Jira will involve the following:
> 1. Add a special topic in kafka for storing offsets
> 2. Add an OffsetManager interface which would handle storing, accessing, loading and maintaining consumer offsets
> 3. Implement offset managers for both of these two choices: existing ZK-based storage or inbuilt storage for offsets
> 4. Leader brokers would now maintain an additional hash table of offsets for the group-topic-partitions that they lead
> 5. Consumers should now use the OffsetCommit and OffsetFetch APIs
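The constructor-injection suggestion in the comment above can be sketched as follows: the interface exposes only offset operations, while store-specific dependencies (a ZkClient, a LogManager, or a new k/v store client) stay in each implementation's constructor, so adding a new backend never changes the interface. This is a Java sketch with illustrative names; the actual trait under review is Scala:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class OffsetManagerSketch {
    // Narrow contract: no store-specific arguments leak into the interface.
    interface OffsetManager {
        Optional<Long> getOffset(String groupTopicPartition);
        void putOffset(String groupTopicPartition, long offset);
        void shutdown();
    }

    // Store-specific dependencies would live in a constructor like this one,
    // so a ZK-backed or log-backed manager needs no interface change.
    static class InMemoryOffsetManager implements OffsetManager {
        private final Map<String, Long> offsets = new ConcurrentHashMap<>();

        public Optional<Long> getOffset(String key) {
            return Optional.ofNullable(offsets.get(key));
        }

        public void putOffset(String key, long offset) {
            offsets.put(key, offset);
        }

        public void shutdown() {
            offsets.clear();
        }
    }

    public static void main(String[] args) {
        OffsetManager mgr = new InMemoryOffsetManager();
        mgr.putOffset("group1/topicA/0", 42L);
        System.out.println(mgr.getOffset("group1/topicA/0").get()); // 42
    }
}
```

Full pluggability without rebuilding the kafka jar would then amount to instantiating the configured implementation class reflectively at broker startup.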
[jira] [Resolved] (KAFKA-886) Update info on Controlled shutdown and Preferred replica election tool
[ https://issues.apache.org/jira/browse/KAFKA-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian resolved KAFKA-886. -- Resolution: Fixed Update info on Controlled shutdown and Preferred replica election tool -- Key: KAFKA-886 URL: https://issues.apache.org/jira/browse/KAFKA-886 Project: Kafka Issue Type: Sub-task Affects Versions: 0.8 Reporter: Sriram Subramanian Assignee: Sriram Subramanian Priority: Blocker Labels: p1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (KAFKA-886) Update info on Controlled shutdown and Preferred replica election tool
[ https://issues.apache.org/jira/browse/KAFKA-886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian closed KAFKA-886. Update info on Controlled shutdown and Preferred replica election tool -- Key: KAFKA-886 URL: https://issues.apache.org/jira/browse/KAFKA-886 Project: Kafka Issue Type: Sub-task Affects Versions: 0.8 Reporter: Sriram Subramanian Assignee: Sriram Subramanian Priority: Blocker Labels: p1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Resolution: Fixed Status: Resolved (was: Patch Available) change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, KAFKA-347-v5.patch We will need an admin tool to change the number of partitions of a topic online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian closed KAFKA-347. change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, KAFKA-347-v5.patch We will need an admin tool to change the number of partitions of a topic online. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (KAFKA-988) Make ReassignReplica tool more usable
[ https://issues.apache.org/jira/browse/KAFKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian resolved KAFKA-988. -- Resolution: Fixed Make ReassignReplica tool more usable - Key: KAFKA-988 URL: https://issues.apache.org/jira/browse/KAFKA-988 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian As part of this first iteration, we will have two options - the manual option takes a list of (topic, partition, replicas) entries and reassigns them; the automatic option takes a list of topics and a list of brokers to move the topics to. The tool assigns the replicas for the topic partitions to these brokers using the default assignment strategy. A dry run will be provided to see the assignment before actually doing the assignment.
[jira] [Closed] (KAFKA-988) Make ReassignReplica tool more usable
[ https://issues.apache.org/jira/browse/KAFKA-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian closed KAFKA-988. Make ReassignReplica tool more usable - Key: KAFKA-988 URL: https://issues.apache.org/jira/browse/KAFKA-988 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian As part of this first iteration, we will have two options - the manual option takes a list of (topic, partition, replicas) entries and reassigns them; the automatic option takes a list of topics and a list of brokers to move the topics to. The tool assigns the replicas for the topic partitions to these brokers using the default assignment strategy. A dry run will be provided to see the assignment before actually doing the assignment.
[jira] [Created] (KAFKA-1034) Port AddPartition and ReassignReplicas changes to trunk
Sriram Subramanian created KAFKA-1034: - Summary: Port AddPartition and ReassignReplicas changes to trunk Key: KAFKA-1034 URL: https://issues.apache.org/jira/browse/KAFKA-1034 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Once the 0.8 changes are merged to trunk, we need to do the following: 1. Integrate AddPartition command with other admin tools in trunk 2. Remove usage of the first index of the partition list to find the number of replicas 3. For ReassignPartition tool, check if optimizing the writes to ZK after every replica reassignment is possible
[jira] [Updated] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-990: - Attachment: KAFKA-990-v3.patch Fix ReassignPartitionCommand and improve usability -- Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch, KAFKA-990-v2.patch, KAFKA-990-v3.patch 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Commented] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13743944#comment-13743944 ] Sriram Subramanian commented on KAFKA-990: -- 2.1 will do so. 2.2 We cannot make it mandatory. It is not required when an explicit list is specified. In the case when only topics are specified, we do make it mandatory. 31. There is already a tool for that. It is called CheckReassignmentStatus. Fix ReassignPartitionCommand and improve usability -- Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch, KAFKA-990-v2.patch 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Updated] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-990: - Attachment: KAFKA-990-v2.patch Fix ReassignPartitionCommand and improve usability -- Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch, KAFKA-990-v2.patch 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Commented] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737416#comment-13737416 ] Sriram Subramanian commented on KAFKA-990: -- Neha - fixed what you suggested. Jun - 1. KafkaController: 1.1 done 1.2 we can fail for now; we can revisit this. 1.3 done 2. ReassignPartitionsCommand 2.1 I did not do that as it makes the code ugly and it does not cause any harm. Let me know if you feel strongly about this. 2.2 I think it is safer to be explicit instead of using the live brokers for the move and causing perf issues. 2.3 done 3. Polluting the page cache is debatable. We could do the log appends on the follower bypassing the cache, but when the follower becomes the leader, it could cause a lot of IO. Another option is to throttle the rate at which the appends happen on the follower, which reduces the sudden influx of messages at the follower and fetch requests at the leader. Both of these are outside the scope of this JIRA. Fix ReassignPartitionCommand and improve usability -- Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch, KAFKA-990-v2.patch 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Created] (KAFKA-1007) Document new tools before 0.8 release
Sriram Subramanian created KAFKA-1007: - Summary: Document new tools before 0.8 release Key: KAFKA-1007 URL: https://issues.apache.org/jira/browse/KAFKA-1007 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian We need to document the following tools before the 0.8 release: 1. Add partition tool 2. ReassignPartition tool
[jira] [Updated] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
[ https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-990: - Attachment: KAFKA-990-v1.patch Fix ReassignPartitionCommand and improve usability -- Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Attachments: KAFKA-990-v1.patch 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Commented] (KAFKA-999) Controlled shutdown never succeeds until the broker is killed
[ https://issues.apache.org/jira/browse/KAFKA-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729869#comment-13729869 ] Sriram Subramanian commented on KAFKA-999: -- +1 on the proposed fix. Also, this could happen only if the retry attempt is infinite. Controlled shutdown never succeeds until the broker is killed - Key: KAFKA-999 URL: https://issues.apache.org/jira/browse/KAFKA-999 Project: Kafka Issue Type: Bug Components: controller Affects Versions: 0.8 Reporter: Neha Narkhede Assignee: Neha Narkhede Priority: Critical A race condition in the way the leader and isr request is handled by the broker and controlled shutdown can lead to a situation where controlled shutdown can never succeed, and the only way to bounce the broker is to kill it. The root cause is that the broker uses an optimization to avoid fetching from a leader that is not alive according to the controller. This leads to the broker aborting a become-follower request. And in cases where the replication factor is 2, the leader can never be transferred to a follower since it keeps rejecting the become-follower request and stays out of the ISR. This causes controlled shutdown to fail forever. One sequence of events that led to this bug is as follows - Broker 2 is leader and controller - Broker 2 is bounced (uncontrolled shutdown) - Controller fails over - Controlled shutdown is invoked on broker 1 - Controller starts leader election for partitions that broker 2 led - Controller sends a become-follower request with leader as broker 1 to broker 2. At the same time, it does not include broker 1 in the alive broker list sent as part of the leader and isr request - Broker 2 rejects the leaderAndIsr request since the leader is not in the list of alive brokers - Broker 1 fails to transfer leadership to broker 2 since broker 2 is not in the ISR - Controlled shutdown can never succeed on broker 1. Since controlled shutdown is a config option, if there are bugs in controlled shutdown, there is no option but to kill the broker
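The failure mode above hinges on a single check in the become-follower path. A minimal sketch, with illustrative names (not the actual broker code):

```java
import java.util.Set;

// Hypothetical sketch of the check described in KAFKA-999: the broker drops a
// become-follower transition when the designated leader is absent from the
// alive-broker list carried by the same leaderAndIsr request. With a stale
// list, the replica never starts fetching, never rejoins the ISR, and
// controlled shutdown can never hand leadership over to it.
public class BecomeFollowerCheck {
    /** Returns true if the broker should act on the become-follower request. */
    public static boolean shouldBecomeFollower(int newLeaderId, Set<Integer> aliveBrokers) {
        // The optimization: skip following a leader the controller says is
        // dead. Correct for a fresh list, fatal for a stale one.
        return aliveBrokers.contains(newLeaderId);
    }
}
```

In the reported sequence, broker 2 evaluates this check with broker 1 missing from the alive list, so the transition is rejected forever.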
[jira] [Commented] (KAFKA-615) Avoid fsync on log segment roll
[ https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729935#comment-13729935 ] Sriram Subramanian commented on KAFKA-615: -- Do you not want to reset the recoveryPoint to the logEndOffset on startup? If logEndOffset is less than the recoveryPoint on startup, I think we could end up getting writes to the truncated offsets and we would not flush them. No? Avoid fsync on log segment roll --- Key: KAFKA-615 URL: https://issues.apache.org/jira/browse/KAFKA-615 Project: Kafka Issue Type: Bug Reporter: Jay Kreps Assignee: Neha Narkhede Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch, KAFKA-615-v7.patch It still isn't feasible to run without an application-level fsync policy. This is a problem as fsync locks the file, and tuning such a policy so that the flushes aren't so frequent that seeks reduce throughput, yet not so infrequent that the fsync is writing so much data that there is a noticeable jump in latency, is very challenging. The remaining problem is the way that log recovery works. Our current policy is that if a clean shutdown occurs we do no recovery. If an unclean shutdown occurs we recover the last segment of all logs. To make this correct we need to ensure that each segment is fsync'd before we create a new segment. Hence the fsync during roll. Obviously if the fsync during roll is the only time fsync occurs then it will potentially write out the entire segment, which for a 1GB segment at 50mb/sec might take many seconds. The goal of this JIRA is to eliminate this and make it possible to run with no application-level fsyncs at all, depending entirely on replication and background writeback for durability.
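The startup concern raised in the comment above can be made concrete. A minimal sketch, assuming a saved recovery point and the log end offset observed at startup (names are illustrative, not the actual log code):

```java
// Hypothetical sketch of the KAFKA-615 startup concern: if the log end offset
// is below the saved recovery point (e.g. the log was truncated), later
// appends to those re-used offsets would sit below the recovery point and
// never be flushed. Clamping the recovery point on startup avoids that.
public class RecoveryPointStartup {
    public static long adjustedRecoveryPoint(long savedRecoveryPoint, long logEndOffset) {
        // Nothing above logEndOffset exists anymore, so nothing above it can
        // be considered already flushed.
        return Math.min(savedRecoveryPoint, logEndOffset);
    }
}
```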
[jira] [Commented] (KAFKA-615) Avoid fsync on log segment roll
[ https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729989#comment-13729989 ] Sriram Subramanian commented on KAFKA-615: -- +1 Avoid fsync on log segment roll --- Key: KAFKA-615 URL: https://issues.apache.org/jira/browse/KAFKA-615 Project: Kafka Issue Type: Bug Reporter: Jay Kreps Assignee: Neha Narkhede Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch, KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch, KAFKA-615-v7.patch, KAFKA-615-v8.patch It still isn't feasible to run without an application-level fsync policy. This is a problem as fsync locks the file, and tuning such a policy so that the flushes aren't so frequent that seeks reduce throughput, yet not so infrequent that the fsync is writing so much data that there is a noticeable jump in latency, is very challenging. The remaining problem is the way that log recovery works. Our current policy is that if a clean shutdown occurs we do no recovery. If an unclean shutdown occurs we recover the last segment of all logs. To make this correct we need to ensure that each segment is fsync'd before we create a new segment. Hence the fsync during roll. Obviously if the fsync during roll is the only time fsync occurs then it will potentially write out the entire segment, which for a 1GB segment at 50mb/sec might take many seconds. The goal of this JIRA is to eliminate this and make it possible to run with no application-level fsyncs at all, depending entirely on replication and background writeback for durability.
[jira] [Commented] (KAFKA-984) Avoid a full rebalance in cases when a new topic is discovered but container/broker set stay the same
[ https://issues.apache.org/jira/browse/KAFKA-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728601#comment-13728601 ] Sriram Subramanian commented on KAFKA-984: -- Agreed. I share the same thought and have pushed back on this change. We can work around this for now by partitioning the topics between the different instances. Avoid a full rebalance in cases when a new topic is discovered but container/broker set stay the same - Key: KAFKA-984 URL: https://issues.apache.org/jira/browse/KAFKA-984 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.8 Attachments: KAFKA-984.v1.patch, KAFKA-984.v2.patch, KAFKA-984.v2.patch Currently a full rebalance will be triggered on high level consumers even when just a new topic is added to ZK. Better avoid this behavior but only rebalance on this newly added topic.
[jira] [Created] (KAFKA-990) Fix ReassignPartitionCommand and improve usability
Sriram Subramanian created KAFKA-990: Summary: Fix ReassignPartitionCommand and improve usability Key: KAFKA-990 URL: https://issues.apache.org/jira/browse/KAFKA-990 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian 1. The tool does not register for IsrChangeListener on controller failover. 2. There is a race condition where the previous listener can fire on controller failover and the replicas can be in ISR. Even after re-registering the ISR listener after failover, it will never be triggered. 3. The input to the tool is a static list, which is very hard to use. To improve this, as a first step the tool needs to take a list of topics and a list of brokers to do the assignment to, and then generate the reassignment plan.
[jira] [Created] (KAFKA-988) Make ReassignReplica tool more usable
Sriram Subramanian created KAFKA-988: Summary: Make ReassignReplica tool more usable Key: KAFKA-988 URL: https://issues.apache.org/jira/browse/KAFKA-988 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian As part of this first iteration, we will have two options - the manual option takes a list of (topic, partition, replicas) entries and reassigns them; the automatic option takes a list of topics and a list of brokers to move the topics to. The tool assigns the replicas for the topic partitions to these brokers using the default assignment strategy. A dry run will be provided to see the assignment before actually doing the assignment.
[jira] [Commented] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715390#comment-13715390 ] Sriram Subramanian commented on KAFKA-347: -- v2.4 the reason to expose it is for manual replica assignment. It is more explicit to specify the rep factor and the assignments for those. Rebased without the zkconsumer connector change. change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Attachment: KAFKA-347-v2-rebased.patch change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Commented] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715469#comment-13715469 ] Sriram Subramanian commented on KAFKA-347: -- 20. Indentation seems fine to me. 21.2 It is present to make manual assignment more clear. 25 Yes, the test was done. I will do another sanity check after the patch is committed. change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Attachment: KAFKA-347-v3.patch change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Attachment: KAFKA-347-v4.patch change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Commented] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715943#comment-13715943 ] Sriram Subramanian commented on KAFKA-347: -- added a script for addpartitions change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, KAFKA-347-v5.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Attachment: KAFKA-347-v5.patch change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch, KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, KAFKA-347-v5.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Commented] (KAFKA-982) Logo for Kafka
[ https://issues.apache.org/jira/browse/KAFKA-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13715946#comment-13715946 ] Sriram Subramanian commented on KAFKA-982: -- +1 for 294. 294 is what I like but it does not seem to get any love. Second choice 298 Logo for Kafka -- Key: KAFKA-982 URL: https://issues.apache.org/jira/browse/KAFKA-982 Project: Kafka Issue Type: Improvement Reporter: Jay Kreps Attachments: 289.jpeg, 294.jpeg, 296.png, 298.jpeg We should have a logo for kafka.
[jira] [Updated] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian updated KAFKA-347: - Attachment: kafka-347-v2.patch change number of partitions of a topic online - Key: KAFKA-347 URL: https://issues.apache.org/jira/browse/KAFKA-347 Project: Kafka Issue Type: Improvement Components: core Affects Versions: 0.8 Reporter: Jun Rao Assignee: Sriram Subramanian Labels: features Fix For: 0.8.1 Attachments: kafka-347.patch, kafka-347-v2.patch We will need an admin tool to change the number of partitions of a topic online.
[jira] [Created] (KAFKA-979) Add jitter for time based rolling
Sriram Subramanian created KAFKA-979: Summary: Add jitter for time based rolling Key: KAFKA-979 URL: https://issues.apache.org/jira/browse/KAFKA-979 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Currently, for low volume topics time based rolling happens at the same time. This causes a lot of IO on a typical cluster and creates back pressure. We need to add a jitter to prevent them from happening at the same time.
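The jitter idea in KAFKA-979 can be sketched as follows. This is a minimal illustration, not the eventual implementation; the parameter names and bounds are assumptions:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: draw a fixed random jitter once per segment, so
// segments created at the same instant reach their time-based roll deadline
// at slightly different moments instead of rolling (and fsync-ing) together.
public class SegmentRollSchedule {
    private final long deadlineMs;

    public SegmentRollSchedule(long createdMs, long rollMs, long maxJitterMs) {
        // Cap the jitter by the roll interval so a segment never becomes
        // eligible to roll before it is created.
        long bound = Math.min(maxJitterMs, rollMs);
        long jitter = bound <= 0 ? 0 : ThreadLocalRandom.current().nextLong(bound);
        this.deadlineMs = createdMs + rollMs - jitter;
    }

    public boolean shouldRoll(long nowMs) {
        return nowMs >= deadlineMs;
    }
}
```

Because each segment draws its own jitter, a fleet of low-volume topic segments created together spreads its rolls over the jitter window rather than hitting the disk at once.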
[jira] [Created] (KAFKA-977) Implement generation/term per leader to reconcile messages correctly
Sriram Subramanian created KAFKA-977: Summary: Implement generation/term per leader to reconcile messages correctly Key: KAFKA-977 URL: https://issues.apache.org/jira/browse/KAFKA-977 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian During unclean leader election, the log messages can diverge, and when the followers come back up, Kafka does not reconcile correctly. To implement it correctly, we need to add a term/generation to each message and use that to reconcile.
[jira] [Work started] (KAFKA-930) Integrate preferred replica election logic into kafka
[ https://issues.apache.org/jira/browse/KAFKA-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on KAFKA-930 started by Sriram Subramanian. Integrate preferred replica election logic into kafka - Key: KAFKA-930 URL: https://issues.apache.org/jira/browse/KAFKA-930 Project: Kafka Issue Type: Bug Reporter: Sriram Subramanian Assignee: Sriram Subramanian Fix For: 0.9 It seems useful to integrate the preferred replica election logic into the kafka controller. A simple way to implement this would be to have a background thread that periodically finds the topic partitions that are not assigned to the preferred broker and initiates the move. We could come up with some heuristics to initiate the move only if the imbalance is over a specific threshold, in order to avoid rebalancing too aggressively. Making the software do this reduces operational cost.
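The threshold heuristic described in KAFKA-930 might look like this. An illustrative sketch with assumed names; the real controller logic differs:

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: count partitions whose current leader is not the
// preferred replica (the first replica in the assigned list) and trigger a
// preferred replica election only when that fraction exceeds a threshold.
public class ImbalanceChecker {
    /** replicaAssignment: partition -> ordered replica list (first entry = preferred). */
    public static boolean needsRebalance(Map<String, List<Integer>> replicaAssignment,
                                         Map<String, Integer> currentLeaders,
                                         double threshold) {
        if (replicaAssignment.isEmpty()) return false;
        long imbalanced = replicaAssignment.entrySet().stream()
            // A partition with no known leader also counts as imbalanced.
            .filter(e -> !e.getValue().get(0).equals(currentLeaders.get(e.getKey())))
            .count();
        return (double) imbalanced / replicaAssignment.size() > threshold;
    }
}
```

A background thread could evaluate this periodically and initiate the move only when it returns true, which avoids rebalancing too aggressively on small, transient imbalances.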
[jira] [Assigned] (KAFKA-838) Update design document to match Kafka 0.8 design
[ https://issues.apache.org/jira/browse/KAFKA-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriram Subramanian reassigned KAFKA-838: Assignee: Sriram Subramanian Update design document to match Kafka 0.8 design Key: KAFKA-838 URL: https://issues.apache.org/jira/browse/KAFKA-838 Project: Kafka Issue Type: Sub-task Reporter: Neha Narkhede Assignee: Sriram Subramanian Kafka 0.8 design is significantly different as compared to Kafka 0.7
[jira] [Resolved] (KAFKA-781) Add option to the controlled shutdown tool to timeout after n secs
[ https://issues.apache.org/jira/browse/KAFKA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian resolved KAFKA-781.
--------------------------------------
Resolution: Fixed

We have moved this logic into the broker.

Add option to the controlled shutdown tool to timeout after n secs
------------------------------------------------------------------

Key: KAFKA-781
URL: https://issues.apache.org/jira/browse/KAFKA-781
Project: Kafka
Issue Type: Improvement
Components: tools
Affects Versions: 0.8
Reporter: Neha Narkhede
Priority: Critical
Labels: replication-tools

Right now, the controlled shutdown tool has a number-of-retries option. This is required since it might take multiple retries to move leaders off a broker. However, it would also be convenient to have an option that lets the tool time out after n secs, retrying until then.
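The requested behavior, combining the existing retry count with a wall-clock deadline, can be sketched as below. This is an illustrative loop, not the tool's actual code; `attemptShutdown` stands in for the real shutdown request.

```java
public class ShutdownWithTimeout {
    // Stand-in for one controlled-shutdown request; true means all leaders moved.
    interface Attempt { boolean run(); }

    static boolean shutdown(Attempt attemptShutdown, int maxRetries, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        for (int i = 0; i < maxRetries; i++) {
            if (System.currentTimeMillis() >= deadline) break; // n secs elapsed
            if (attemptShutdown.run()) return true;            // shutdown succeeded
        }
        return false; // retries exhausted or deadline hit
    }

    public static void main(String[] args) {
        System.out.println(shutdown(() -> true, 3, 5000));  // succeeds on first try
        System.out.println(shutdown(() -> false, 3, 5000)); // exhausts retries
    }
}
```

Either bound alone is awkward: a pure retry count gives unpredictable wall-clock behavior, while a pure timeout can loop too hot; checking both, as the ticket asks, bounds the tool in both dimensions.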
[jira] [Closed] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up
[ https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian closed KAFKA-969.

Need to prevent failure of rebalance when there are no brokers available when consumer comes up
-----------------------------------------------------------------------------------------------

Key: KAFKA-969
URL: https://issues.apache.org/jira/browse/KAFKA-969
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Fix For: 0.8
Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer comes up before the Kafka brokers, usually in a test scenario. In such conditions, instead of failing the rebalance we just log the error and subscribe to broker changes; when a broker comes back up, we trigger the rebalance.
[jira] [Created] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up
Sriram Subramanian created KAFKA-969:
-------------------------------------

Summary: Need to prevent failure of rebalance when there are no brokers available when consumer comes up
Key: KAFKA-969
URL: https://issues.apache.org/jira/browse/KAFKA-969
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer comes up before the Kafka brokers, usually in a test scenario. In such conditions, instead of failing the rebalance we just log the error and subscribe to broker changes; when a broker comes back up, we trigger the rebalance.
[jira] [Updated] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up
[ https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-969:
-------------------------------------
Attachment: emptybrokeronrebalance-1.patch

Need to prevent failure of rebalance when there are no brokers available when consumer comes up
-----------------------------------------------------------------------------------------------

Key: KAFKA-969
URL: https://issues.apache.org/jira/browse/KAFKA-969
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer comes up before the Kafka brokers, usually in a test scenario. In such conditions, instead of failing the rebalance we just log the error and subscribe to broker changes; when a broker comes back up, we trigger the rebalance.
[jira] [Updated] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up
[ https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-969:
-------------------------------------
Status: Patch Available (was: Open)

Need to prevent failure of rebalance when there are no brokers available when consumer comes up
-----------------------------------------------------------------------------------------------

Key: KAFKA-969
URL: https://issues.apache.org/jira/browse/KAFKA-969
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer comes up before the Kafka brokers, usually in a test scenario. In such conditions, instead of failing the rebalance we just log the error and subscribe to broker changes; when a broker comes back up, we trigger the rebalance.
[jira] [Commented] (KAFKA-969) Need to prevent failure of rebalance when there are no brokers available when consumer comes up
[ https://issues.apache.org/jira/browse/KAFKA-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705461#comment-13705461 ]

Sriram Subramanian commented on KAFKA-969:
------------------------------------------

The other issues you mention are separate from this one. You should file JIRAs for those.

Need to prevent failure of rebalance when there are no brokers available when consumer comes up
-----------------------------------------------------------------------------------------------

Key: KAFKA-969
URL: https://issues.apache.org/jira/browse/KAFKA-969
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: emptybrokeronrebalance-1.patch

There are some rare instances when a consumer comes up before the Kafka brokers, usually in a test scenario. In such conditions, instead of failing the rebalance we just log the error and subscribe to broker changes; when a broker comes back up, we trigger the rebalance.
[jira] [Commented] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
[ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699500#comment-13699500 ]

Sriram Subramanian commented on KAFKA-911:
------------------------------------------

This has been fixed.

Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
------------------------------------------------------------------------------------------------------------

Key: KAFKA-911
URL: https://issues.apache.org/jira/browse/KAFKA-911
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Blocker
Labels: kafka-0.8, p1
Attachments: kafka-911-v1.patch, kafka-911-v2.patch

The controlled shutdown logic in the controller first tries to move the leaders off the broker being shut down. It then tries to remove the broker from the ISR list. During that operation it does not synchronize on the controllerLock, which causes a race condition while dispatching data using the controller's channel manager.
[jira] [Updated] (KAFKA-911) Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
[ https://issues.apache.org/jira/browse/KAFKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-911:
-------------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)

Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
------------------------------------------------------------------------------------------------------------

Key: KAFKA-911
URL: https://issues.apache.org/jira/browse/KAFKA-911
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Blocker
Labels: kafka-0.8, p1
Attachments: kafka-911-v1.patch, kafka-911-v2.patch

The controlled shutdown logic in the controller first tries to move the leaders off the broker being shut down. It then tries to remove the broker from the ISR list. During that operation it does not synchronize on the controllerLock, which causes a race condition while dispatching data using the controller's channel manager.
[jira] [Commented] (KAFKA-184) Log retention size and file size should be a long
[ https://issues.apache.org/jira/browse/KAFKA-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699510#comment-13699510 ]

Sriram Subramanian commented on KAFKA-184:
------------------------------------------

We have already fixed all the config naming and types in 0.8. We can resolve this.

Log retention size and file size should be a long
-------------------------------------------------

Key: KAFKA-184
URL: https://issues.apache.org/jira/browse/KAFKA-184
Project: Kafka
Issue Type: Bug
Affects Versions: 0.7
Reporter: Joel Koshy
Priority: Minor
Fix For: 0.8.1
Attachments: KAFKA-184-0.8.patch

Realized this in a local setup: the log.retention.size config option should be a long, or we're limited to 2GB. Also, the name can be improved to log.retention.size.bytes or Mbytes as appropriate. The same comments apply to log.file.size. If we rename the configs, it would be better to resolve KAFKA-181 first.
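The 2GB limit mentioned in KAFKA-184 follows directly from Java integer arithmetic: a signed 32-bit int tops out at 2^31 - 1 bytes (about 2 GiB) and wraps negative past that, while a long comfortably holds petabyte-scale sizes. A minimal demonstration:

```java
public class RetentionSizeDemo {
    public static void main(String[] args) {
        int maxInt = Integer.MAX_VALUE;         // 2147483647 bytes, roughly 2 GiB
        System.out.println(maxInt + 1);         // int overflow wraps to -2147483648
        long fourGb = 4L * 1024 * 1024 * 1024;  // the same size fits easily in a long
        System.out.println(fourGb);             // prints 4294967296
    }
}
```

Note the `4L` literal: without it the multiplication is done in int arithmetic and overflows before the result is ever widened, which is the same bug class the ticket is guarding against.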
[jira] [Commented] (KAFKA-717) scala 2.10 build support
[ https://issues.apache.org/jira/browse/KAFKA-717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697490#comment-13697490 ]

Sriram Subramanian commented on KAFKA-717:
------------------------------------------

I get the following errors when I try to apply the patches to 0.8 in the order they were added on 14th Jun:

1. patch 1 failed

../Downloads/0001-common-changes-for-2.10.patch:34: trailing whitespace.
unmanagedSourceDirectories in Compile += (sourceDirectory in Compile, scalaVersion){ (s,v) =>
../Downloads/0001-common-changes-for-2.10.patch:36: trailing whitespace.
case v if v.startsWith("2.8.") => "2.8.x"
../Downloads/0001-common-changes-for-2.10.patch:38: trailing whitespace.
}))
../Downloads/0001-common-changes-for-2.10.patch:54: trailing whitespace.
*
../Downloads/0001-common-changes-for-2.10.patch:69: trailing whitespace.
* Indicates that the annotated class is meant to be threadsafe. For an abstract class it is an part of the interface that an implementation
error: patch failed: project/Build.scala:28
error: project/Build.scala: patch does not apply

2. patch 3 failed

error: patch failed: project/Build.scala:28
error: project/Build.scala: patch does not apply

3. patch 4 failed

error: patch failed: core/build.sbt:22
error: core/build.sbt: patch does not apply
error: patch failed: core/src/test/scala/unit/kafka/metrics/KafkaTimerTest.scala:36
error: core/src/test/scala/unit/kafka/metrics/KafkaTimerTest.scala: patch does not apply
error: patch failed: project/Build.scala:28
error: project/Build.scala: patch does not apply

I used git apply.

scala 2.10 build support
------------------------

Key: KAFKA-717
URL: https://issues.apache.org/jira/browse/KAFKA-717
Project: Kafka
Issue Type: Improvement
Components: packaging
Affects Versions: 0.8
Reporter: Viktor Taranenko
Labels: build
Attachments: 0001-common-changes-for-2.10.patch, 0001-common-changes-for-2.10.patch, 0001-KAFKA-717-Convert-to-scala-2.10.patch, 0002-java-conversions-changes.patch, 0002-java-conversions-changes.patch, 0003-add-2.9.3.patch, 0003-add-2.9.3.patch, 0004-Fix-cross-compile-of-tests-update-to-2.10.2-and-set-.patch, KAFKA-717-complex.patch, KAFKA-717-simple.patch, kafka_scala_2.10.tar.gz
[jira] [Commented] (KAFKA-950) bytesSinceLastIndexEntry needs to be reset after log segment is truncated
[ https://issues.apache.org/jira/browse/KAFKA-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13690036#comment-13690036 ]

Sriram Subramanian commented on KAFKA-950:
------------------------------------------

+1

bytesSinceLastIndexEntry needs to be reset after log segment is truncated
-------------------------------------------------------------------------

Key: KAFKA-950
URL: https://issues.apache.org/jira/browse/KAFKA-950
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Attachments: kafka-950.patch

bytesSinceLastIndexEntry needs to be reset after a log segment is truncated. Otherwise, it's possible to add an index entry that points to the first message in a log segment.
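The bug class in KAFKA-950 can be mirrored in a minimal sketch (illustrative names, not Kafka's actual LogSegment code): the segment adds an index entry once `bytesSinceLastIndexEntry` exceeds the index interval, so if truncation does not zero that counter, the stale byte count can cause the very next append to index the segment's first message.

```java
public class SegmentSketch {
    private final long indexIntervalBytes;
    private long bytesSinceLastIndexEntry = 0;
    private int indexEntries = 0;

    SegmentSketch(long indexIntervalBytes) { this.indexIntervalBytes = indexIntervalBytes; }

    void append(long messageBytes) {
        if (bytesSinceLastIndexEntry > indexIntervalBytes) {
            indexEntries++;                // index this message's position
            bytesSinceLastIndexEntry = 0;
        }
        bytesSinceLastIndexEntry += messageBytes;
    }

    void truncateToStart() {
        indexEntries = 0;
        bytesSinceLastIndexEntry = 0;      // the fix: forget pre-truncation bytes
    }

    int indexEntries() { return indexEntries; }

    public static void main(String[] args) {
        SegmentSketch s = new SegmentSketch(100);
        s.append(60); s.append(60); s.append(10); // 120 bytes accumulated -> 1 index entry
        System.out.println(s.indexEntries());     // prints 1
        s.truncateToStart();
        s.append(10);                             // without the counter reset, the stale 10-byte
        System.out.println(s.indexEntries());     // count could index this first message; prints 0
    }
}
```

Commenting out the counter reset in `truncateToStart` makes the post-truncation append inherit the old byte count, reproducing the spurious index entry the patch eliminates.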
[jira] [Assigned] (KAFKA-347) change number of partitions of a topic online
[ https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian reassigned KAFKA-347:

Assignee: Sriram Subramanian

change number of partitions of a topic online
---------------------------------------------

Key: KAFKA-347
URL: https://issues.apache.org/jira/browse/KAFKA-347
Project: Kafka
Issue Type: Improvement
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Sriram Subramanian
Labels: features
Fix For: 0.8.1

We will need an admin tool to change the number of partitions of a topic online.
[jira] [Updated] (KAFKA-935) Fix shutdown tool to work with new shutdown api
[ https://issues.apache.org/jira/browse/KAFKA-935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriram Subramanian updated KAFKA-935:
-------------------------------------
Attachment: shutdowntool.patch

Fix shutdown tool to work with new shutdown api
-----------------------------------------------

Key: KAFKA-935
URL: https://issues.apache.org/jira/browse/KAFKA-935
Project: Kafka
Issue Type: Bug
Reporter: Sriram Subramanian
Assignee: Sriram Subramanian
Attachments: shutdowntool.patch

This seems to have been missed in the last patch, "Integrating controlled shutdown".