Helping out
Hey, I now have roughly a day a week I can dedicate to working on Kafka, so I am looking for issues in the 0.8.1 batch that you think might be good starting points. Input would be much appreciated. Speaking of issues, I think it would be good to either fix https://issues.apache.org/jira/browse/KAFKA-946 for 0.8 or just drop the code from the release. /Sam
Re: [VOTE] Apache Kafka 0.8.0-beta1 candidate 2
I am +1 as well.

The vote is closed. With four +1 votes the release passes.

Awesome and amazing work by everyone that worked on and contributed to this release, thank you!!!

I am going to follow up with getting a dist folder (https://issues.apache.org/jira/browse/INFRA-6400) and see again now about publishing (https://issues.apache.org/jira/browse/INFRA-6414); maybe I don't need the mvn encrypt, or I did something wrong.

I will also update kafka.apache.org with the links to the artifacts we are releasing for 0.8.0 beta 1 for download, and when that is done send an announcement too.

On Sun, Jun 23, 2013 at 5:25 PM, Jay Kreps jay.kr...@gmail.com wrote:

+1 with reservations

Two things to note:
1. The README references a target release-zip that I don't think exists anymore.
2. The README doesn't explain how to start the server, which seems like useful information to have.

I think we can probably live with these and document on the website.

-Jay

On Fri, Jun 21, 2013 at 5:47 PM, Joel Koshy jjkosh...@gmail.com wrote:

+1 Thanks Joe!

On Thu, Jun 20, 2013 at 8:25 AM, Jun Rao jun...@gmail.com wrote:

+1. Verified both unit tests and quick start.

Thanks,
Jun

On Wed, Jun 19, 2013 at 9:28 PM, Joe Stein crypt...@gmail.com wrote:

This is the second release candidate vote for the Apache Kafka 0.8.0-beta1 release.

0.8.0 beta1 Release Notes
http://people.apache.org/~joestein/kafka-0.8.0-beta1-candidate2/RELEASE_NOTES.html

*** The vote will be open for 72 hours (longer if needed) ***

We are voting on artifacts for release from
http://people.apache.org/~joestein/kafka-0.8.0-beta1-candidate2/

The tag to be voted upon (off the 0.8 branch):
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=34a278b4cc1bbce0a876e9255b2bfa7c2133f313

Kafka's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/kafka/KEYS

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
*/
[jira] [Created] (KAFKA-953) Remove release-zip from README we are not releasing with it
Joe Stein created KAFKA-953:

Summary: Remove release-zip from README we are not releasing with it
Key: KAFKA-953
URL: https://issues.apache.org/jira/browse/KAFKA-953
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8
Reporter: Joe Stein
Fix For: 0.8

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (KAFKA-954) tidy up README file for better general availability
Joe Stein created KAFKA-954:

Summary: tidy up README file for better general availability
Key: KAFKA-954
URL: https://issues.apache.org/jira/browse/KAFKA-954
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8
Reporter: Joe Stein
Fix For: 0.8

e.g. how to start server after building and all would be good too
[jira] [Updated] (KAFKA-953) Remove release-zip from README we are not releasing with it
[ https://issues.apache.org/jira/browse/KAFKA-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Stein updated KAFKA-953:
Priority: Blocker (was: Major)

Remove release-zip from README we are not releasing with it

Key: KAFKA-953
URL: https://issues.apache.org/jira/browse/KAFKA-953
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8
Reporter: Joe Stein
Priority: Blocker
Labels: 0.8.0-beta1
Fix For: 0.8
[jira] [Commented] (KAFKA-937) ConsumerFetcherThread can deadlock
[ https://issues.apache.org/jira/browse/KAFKA-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692085#comment-13692085 ]

Jun Rao commented on KAFKA-937:

Alexey,

This issue seems to be unrelated to this patch. The exception is thrown in SimpleConsumer and this patch doesn't touch SimpleConsumer. Could you describe how you get to this issue and how reproducible it is?

ConsumerFetcherThread can deadlock

Key: KAFKA-937
URL: https://issues.apache.org/jira/browse/KAFKA-937
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Fix For: 0.8
Attachments: kafka-937_delta.patch, kafka-937.patch

We have the following access pattern that can introduce a deadlock.

AbstractFetcherThread.processPartitionsWithError() ->
ConsumerFetcherThread.processPartitionsWithError() ->
ConsumerFetcherManager.addPartitionsWithError() wait for lock ->
LeaderFinderThread holding lock while calling AbstractFetcherManager.shutdownIdleFetcherThreads() ->
AbstractFetcherManager calling fetcher.shutdown, which needs to wait until AbstractFetcherThread.processPartitionsWithError() completes.
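The access pattern in this issue is a wait cycle: one thread needs a lock while another thread holds that lock and waits for the first thread to finish. A minimal Python sketch of that shape follows. All names are hypothetical stand-ins rather than Kafka code, and the timeouts exist only so the demo reports the stall instead of hanging forever.

```python
import threading

# Hypothetical stand-ins, not Kafka code: `manager_lock` plays the lock in
# ConsumerFetcherManager; the "fetcher" thread plays an AbstractFetcherThread.
manager_lock = threading.Lock()
lock_held = threading.Event()
results = {}


def fetcher_body():
    # processPartitionsWithError() -> addPartitionsWithError(): needs the lock.
    lock_held.wait()  # start only once the leader finder holds the lock
    got_lock = manager_lock.acquire(timeout=0.5)
    results["fetcher_got_lock"] = got_lock
    if got_lock:
        manager_lock.release()


def leader_finder_body(fetcher):
    # Holds the lock while shutting the fetcher down, i.e. waiting for it.
    with manager_lock:
        lock_held.set()
        fetcher.join(timeout=0.1)  # stands in for fetcher.shutdown
        # The fetcher is still alive: it is blocked on the lock we hold.
        results["shutdown_stalled"] = fetcher.is_alive()
    # Only because the join above timed out is the lock released here.


def run_demo():
    results.clear()
    lock_held.clear()
    fetcher = threading.Thread(target=fetcher_body)
    finder = threading.Thread(target=leader_finder_body, args=(fetcher,))
    fetcher.start()
    finder.start()
    finder.join()
    fetcher.join()
    return dict(results)
```

With both timeouts removed, neither wait would ever return. The usual ways out are to avoid waiting for another thread while holding a lock that thread may need, or to make sure the waited-on thread never needs that lock.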
[jira] [Commented] (KAFKA-926) Error in consumer when the leader for some partitions is failing over to another replica.
[ https://issues.apache.org/jira/browse/KAFKA-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692127#comment-13692127 ]

Jun Rao commented on KAFKA-926:

Balaji,

Is this still happening in latest 0.8?

Error in consumer when the leader for some partitions is failing over to another replica.

Key: KAFKA-926
URL: https://issues.apache.org/jira/browse/KAFKA-926
Project: Kafka
Issue Type: Bug
Reporter: BalajiSeshadri
Fix For: 0.8.1

Error in consumer when the leader for some partitions is failing over to another replica. Created as per request from Neha.
Re: [VOTE] Apache Kafka 0.8.0-beta1 candidate 2
Hey Joe,

There is actually a fair amount of updating on docs needed; we still don't have an updated design doc. Let me take a shot at updating these; I think I can get it done today. Let me know if you have a new published release and I can update the downloads page too.

-Jay

On Mon, Jun 24, 2013 at 3:59 AM, Joe Stein crypt...@gmail.com wrote:

I am +1 as well.

The vote is closed. With four +1 votes the release passes.

Awesome and amazing work by everyone that worked on and contributed to this release, thank you!!!

I am going to follow up with getting a dist folder (https://issues.apache.org/jira/browse/INFRA-6400) and see again now about publishing (https://issues.apache.org/jira/browse/INFRA-6414); maybe I don't need the mvn encrypt, or I did something wrong.

I will also update kafka.apache.org with the links to the artifacts we are releasing for 0.8.0 beta 1 for download, and when that is done send an announcement too.
Re: FAQ
Yes, I think that would be better.

Thanks,
Jun

On Mon, Jun 24, 2013 at 10:30 AM, Jay Kreps jay.kr...@gmail.com wrote:

I have noticed we don't do a good job of updating the FAQs. Would we do better if I migrated it to the wiki so it was easier to edit?

-Jay
[jira] Subscription: outstanding kafka patches
Issue Subscription
Filter: outstanding kafka patches (75 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key         Summary
KAFKA-946   Kafka Hadoop Consumer fails when verifying message checksum
            https://issues.apache.org/jira/browse/KAFKA-946
KAFKA-943   Move all configuration key string to constants
            https://issues.apache.org/jira/browse/KAFKA-943
KAFKA-932   System Test - set retry.backoff.ms=300 to all test cases
            https://issues.apache.org/jira/browse/KAFKA-932
KAFKA-925   Add optional partition key override in producer
            https://issues.apache.org/jira/browse/KAFKA-925
KAFKA-923   Improve controller failover latency
            https://issues.apache.org/jira/browse/KAFKA-923
KAFKA-922   System Test - set retry.backoff.ms=300 to testcase_0119
            https://issues.apache.org/jira/browse/KAFKA-922
KAFKA-917   Expose zk.session.timeout.ms in console consumer
            https://issues.apache.org/jira/browse/KAFKA-917
KAFKA-915   System Test - Mirror Maker testcase_5001 failed
            https://issues.apache.org/jira/browse/KAFKA-915
KAFKA-911   Bug in controlled shutdown logic in controller leads to controller not sending out some state change request
            https://issues.apache.org/jira/browse/KAFKA-911
KAFKA-898   Add a KafkaMetricsReporter that wraps Librato's reporter
            https://issues.apache.org/jira/browse/KAFKA-898
KAFKA-896   merge 0.8 (988d4d8e65a14390abd748318a64e281e4a37c19) to trunk
            https://issues.apache.org/jira/browse/KAFKA-896
KAFKA-885   sbt package builds two kafka jars
            https://issues.apache.org/jira/browse/KAFKA-885
KAFKA-881   Kafka broker not respecting log.roll.hours
            https://issues.apache.org/jira/browse/KAFKA-881
KAFKA-877   Still getting kafka.common.NotLeaderForPartitionException
            https://issues.apache.org/jira/browse/KAFKA-877
KAFKA-873   Consider replacing zkclient with curator (with zkclient-bridge)
            https://issues.apache.org/jira/browse/KAFKA-873
KAFKA-868   System Test - add test case for rolling controlled shutdown
            https://issues.apache.org/jira/browse/KAFKA-868
KAFKA-863   System Test - update 0.7 version of kafka-run-class.sh for Migration Tool test cases
            https://issues.apache.org/jira/browse/KAFKA-863
KAFKA-859   support basic auth protection of mx4j console
            https://issues.apache.org/jira/browse/KAFKA-859
KAFKA-855   Ant+Ivy build for Kafka
            https://issues.apache.org/jira/browse/KAFKA-855
KAFKA-854   Upgrade dependencies for 0.8
            https://issues.apache.org/jira/browse/KAFKA-854
KAFKA-852   Remove clientId from OffsetFetchResponse and OffsetCommitResponse
            https://issues.apache.org/jira/browse/KAFKA-852
KAFKA-836   Update quickstart for Kafka 0.8
            https://issues.apache.org/jira/browse/KAFKA-836
KAFKA-835   Update 0.8 configs on the website
            https://issues.apache.org/jira/browse/KAFKA-835
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option
            https://issues.apache.org/jira/browse/KAFKA-815
KAFKA-745   Remove getShutdownReceive() and other kafka specific code from the RequestChannel
            https://issues.apache.org/jira/browse/KAFKA-745
KAFKA-739   Handle null values in Message payload
            https://issues.apache.org/jira/browse/KAFKA-739
KAFKA-735   Add looping and JSON output for ConsumerOffsetChecker
            https://issues.apache.org/jira/browse/KAFKA-735
KAFKA-717   scala 2.10 build support
            https://issues.apache.org/jira/browse/KAFKA-717
KAFKA-705   Controlled shutdown doesn't seem to work on more than one broker in a cluster
            https://issues.apache.org/jira/browse/KAFKA-705
KAFKA-686   0.8 Kafka broker should give a better error message when running against 0.7 zookeeper
            https://issues.apache.org/jira/browse/KAFKA-686
KAFKA-682   java.lang.OutOfMemoryError: Java heap space
            https://issues.apache.org/jira/browse/KAFKA-682
KAFKA-677   Retention process gives exception if an empty segment is chosen for collection
            https://issues.apache.org/jira/browse/KAFKA-677
KAFKA-674   Clean Shutdown Testing - Log segments checksums mismatch
            https://issues.apache.org/jira/browse/KAFKA-674
KAFKA-652   Create testcases for clean shut-down
            https://issues.apache.org/jira/browse/KAFKA-652
KAFKA-649   Cleanup log4j logging
            https://issues.apache.org/jira/browse/KAFKA-649
KAFKA-645   Create a shell script to run System Test with DEBUG details and tee console output to a file
            https://issues.apache.org/jira/browse/KAFKA-645
KAFKA-637   Separate log4j environment variable from KAFKA_OPTS in kafka-run-class.sh
            https://issues.apache.org/jira/browse/KAFKA-637
KAFKA-621   System Test 9051 : ConsoleConsumer doesn't receives any data for 20 topics but works for 10
[jira] [Commented] (KAFKA-937) ConsumerFetcherThread can deadlock
[ https://issues.apache.org/jira/browse/KAFKA-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692418#comment-13692418 ]

Alexey Ozeritskiy commented on KAFKA-937:

https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=blobdiff;f=core/src/main/scala/kafka/consumer/SimpleConsumer.scala;h=1c283280873eef597018f2f0a5ddfec942356c18;hp=bdeee9174a32a02209d769c18a0337ade0356e99;hb=5bd33c1517bb2e7734166dc3e787ac90a4ef8f86;hpb=640026467cf705fbcf6fd6bcada058b18a95bff5

ConsumerFetcherThread can deadlock

Key: KAFKA-937
URL: https://issues.apache.org/jira/browse/KAFKA-937
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Fix For: 0.8
Attachments: kafka-937_delta.patch, kafka-937.patch
[jira] [Comment Edited] (KAFKA-937) ConsumerFetcherThread can deadlock
[ https://issues.apache.org/jira/browse/KAFKA-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692418#comment-13692418 ]

Alexey Ozeritskiy edited comment on KAFKA-937 at 6/24/13 9:44 PM:

This patch touches SimpleConsumer. Proof:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=blobdiff;f=core/src/main/scala/kafka/consumer/SimpleConsumer.scala;h=1c283280873eef597018f2f0a5ddfec942356c18;hp=bdeee9174a32a02209d769c18a0337ade0356e99;hb=5bd33c1517bb2e7734166dc3e787ac90a4ef8f86;hpb=640026467cf705fbcf6fd6bcada058b18a95bff5

was (Author: aozeritsky):
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=blobdiff;f=core/src/main/scala/kafka/consumer/SimpleConsumer.scala;h=1c283280873eef597018f2f0a5ddfec942356c18;hp=bdeee9174a32a02209d769c18a0337ade0356e99;hb=5bd33c1517bb2e7734166dc3e787ac90a4ef8f86;hpb=640026467cf705fbcf6fd6bcada058b18a95bff5

ConsumerFetcherThread can deadlock

Key: KAFKA-937
URL: https://issues.apache.org/jira/browse/KAFKA-937
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Fix For: 0.8
Attachments: kafka-937_delta.patch, kafka-937.patch
[jira] [Commented] (KAFKA-133) Publish kafka jar to a public maven repository
[ https://issues.apache.org/jira/browse/KAFKA-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692519#comment-13692519 ]

Russell Jurney commented on KAFKA-133:

When I type: ./sbt actions, I see one to create a POM.xml.

We're having to build kafka and host it on S3 ourselves, which is unfortunate.

Publish kafka jar to a public maven repository

Key: KAFKA-133
URL: https://issues.apache.org/jira/browse/KAFKA-133
Project: Kafka
Issue Type: Improvement
Affects Versions: 0.6, 0.8
Reporter: Neha Narkhede
Labels: patch
Fix For: 0.8
Attachments: KAFKA-133.patch, pom.xml

The released kafka jar must be downloaded manually and then deployed to a private repository before it can be used by a developer using maven2. Similar to other Apache projects, it would be nice to have a way to publish Kafka releases to a public maven repo.

In the past, we gave it a try using sbt publish to the Sonatype Nexus maven repo, but ran into some authentication problems. It would be good to revisit this and get it resolved.
[jira] [Commented] (KAFKA-656) Add Quotas to Kafka
[ https://issues.apache.org/jira/browse/KAFKA-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692627#comment-13692627 ]

Prashanth Menon commented on KAFKA-656:

Hey Jonathan, have you made any headway on this? Let me know, I'd like to give this a go if you're tied up :)

Add Quotas to Kafka

Key: KAFKA-656
URL: https://issues.apache.org/jira/browse/KAFKA-656
Project: Kafka
Issue Type: New Feature
Components: core
Affects Versions: 0.8.1
Reporter: Jay Kreps
Labels: project

It would be nice to implement a quota system in Kafka to improve our support for highly multi-tenant usage. The goal of this system would be to prevent one naughty user from accidentally overloading the whole cluster.

There are several quantities we would want to track:
1. Requests per second
2. Bytes written per second
3. Bytes read per second

There are two reasonable groupings we would want to aggregate and enforce these thresholds at:
1. Topic level
2. Client level (e.g. by client id from the request)

When a request hits one of these limits we will simply reject it with a QUOTA_EXCEEDED exception.

To avoid suddenly breaking things without warning, we should ideally support two thresholds: a soft threshold at which we produce some kind of warning and a hard threshold at which we give the error. The soft threshold could just be defined as 80% (or whatever) of the hard threshold.

There are nuances to getting this right. If you measure second-by-second a single burst may exceed the threshold, so we need a sustained measurement over a period of time.

Likewise, when do we stop giving this error? To make this work right we likely need to charge against the quota for request *attempts*, not just successful requests. Otherwise a client that is overloading the server will just flap on and off, i.e. we would disable them for a period of time, but when we re-enabled them they would likely still be abusing us.

It would be good to have a wiki design on how this would all work as a starting point for discussion.
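To make the nuances in the description concrete, here is a rough Python sketch of one way the pieces could fit together: a rate averaged over a sliding window (so a single burst does not trip the limit), a soft threshold at 80% of the hard one, and every attempt charged whether or not it is accepted. The class, method names and return values are all hypothetical; nothing here is a Kafka API or the design the issue will end up with.

```python
from collections import deque


class WindowedQuota:
    """Toy per-client (or per-topic) quota; hypothetical, not a Kafka API."""

    def __init__(self, hard_limit_per_sec, window_sec=10.0, soft_fraction=0.8):
        self.hard = hard_limit_per_sec
        self.soft = hard_limit_per_sec * soft_fraction
        self.window = window_sec
        self.events = deque()  # (timestamp, amount) pairs still in the window
        self.total = 0.0

    def record_attempt(self, now, amount=1.0):
        """Charge one attempt and return 'ok', 'warn' or 'reject'."""
        # Drop samples that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old_amount = self.events.popleft()
            self.total -= old_amount
        # Charge before deciding, so rejected attempts still count and an
        # abusive client does not flap back on while it keeps hammering.
        self.events.append((now, amount))
        self.total += amount
        rate = self.total / self.window  # sustained rate, not per-second
        if rate > self.hard:
            return "reject"  # where QUOTA_EXCEEDED would be returned
        if rate > self.soft:
            return "warn"
        return "ok"
```

For example, with `WindowedQuota(10, window_sec=10.0)` a burst of 50 requests in one instant is still "ok" (5 per second averaged over the window), the 81st request in the window starts to "warn", and the 101st is rejected. The same object could count bytes instead of requests by passing the request size as `amount`.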
[jira] [Commented] (KAFKA-656) Add Quotas to Kafka
[ https://issues.apache.org/jira/browse/KAFKA-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692629#comment-13692629 ]

Jonathan Creasy commented on KAFKA-656:

Go for it, I knew someone would likely get to this before I got back to it, I didn't really do anything on it yet.

Add Quotas to Kafka

Key: KAFKA-656
URL: https://issues.apache.org/jira/browse/KAFKA-656
Project: Kafka
Issue Type: New Feature
Components: core
Affects Versions: 0.8.1
Reporter: Jay Kreps
Labels: project
[jira] [Updated] (KAFKA-937) ConsumerFetcherThread can deadlock
[ https://issues.apache.org/jira/browse/KAFKA-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-937:
Attachment: kafka-937_ConsumerOffsetChecker.patch

Thanks for reporting this. It is actually a real issue. However, the problem is not because of the change in SimpleConsumer, but in how ConsumerOffsetChecker uses SimpleConsumer. It should only close a SimpleConsumer after it's no longer needed. Could you try the attached patch?

ConsumerFetcherThread can deadlock

Key: KAFKA-937
URL: https://issues.apache.org/jira/browse/KAFKA-937
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
Fix For: 0.8
Attachments: kafka-937_ConsumerOffsetChecker.patch, kafka-937_delta.patch, kafka-937.patch
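The rule stated in the comment, close a consumer only once nothing needs it any more, can be sketched as follows. This is an illustration only: `FakeConsumer` and `check_offsets` are hypothetical, the attached patch is not reproduced here, and the idea that one consumer per broker is shared across several partitions is an assumption made for the example.

```python
class FakeConsumer:
    """Hypothetical per-broker connection; not the Kafka SimpleConsumer."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.closed = False

    def latest_offset(self, topic, partition):
        if self.closed:
            raise RuntimeError("consumer for broker %d is closed" % self.broker_id)
        return 0  # placeholder; a real client would ask the broker

    def close(self):
        self.closed = True


def check_offsets(partition_leaders):
    """partition_leaders maps (topic, partition) -> leader broker id."""
    consumers = {}  # broker id -> consumer shared by every partition it leads
    offsets = {}
    try:
        for (topic, partition), broker_id in partition_leaders.items():
            consumer = consumers.get(broker_id)
            if consumer is None:
                consumer = consumers[broker_id] = FakeConsumer(broker_id)
            # Closing the consumer here, per partition, would break the next
            # partition that maps to the same broker.
            offsets[(topic, partition)] = consumer.latest_offset(topic, partition)
    finally:
        # Close each shared consumer exactly once, after the last use.
        for consumer in consumers.values():
            consumer.close()
    return offsets, consumers
```

Calling `check_offsets({("t", 0): 1, ("t", 1): 1, ("t", 2): 2})` reuses the broker 1 consumer for two partitions and closes both consumers only at the end.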
[jira] [Created] (KAFKA-955) After a leader change, messages sent with ack=0 are lost
Jason Rosenberg created KAFKA-955:

Summary: After a leader change, messages sent with ack=0 are lost
Key: KAFKA-955
URL: https://issues.apache.org/jira/browse/KAFKA-955
Project: Kafka
Issue Type: Bug
Reporter: Jason Rosenberg

If the leader changes for a partition, and a producer is sending messages with ack=0, then messages will be lost, since the producer has no active way of knowing that the leader has changed until its next metadata refresh update.

The broker receiving the message, which is no longer the leader, logs a message like this:

Produce request with correlation id 7136261 from client on partition [mytopic,0] failed due to Leader not local for partition [mytopic,0] on broker 508818741

This is exacerbated by the controlled shutdown mechanism, which forces an immediate leader change.

A possible solution would be for a broker that receives a message for a partition it no longer leads (when the ack level is 0) to silently forward the message to the current leader.
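The failure mode and the proposed fix can be shown with a toy model. The sketch below is hypothetical throughout (`Broker`, `handle_produce` and `send_with_stale_metadata` are invented names, and there is no networking); it only contrasts dropping a misdirected ack=0 message with forwarding it to the current leader.

```python
class Broker:
    """Toy broker; a hypothetical model, not Kafka code."""

    def __init__(self, broker_id, forward_when_not_leader=False):
        self.broker_id = broker_id
        self.forward_when_not_leader = forward_when_not_leader
        self.log = []
        self.dropped = 0

    def handle_produce(self, message, current_leader):
        if current_leader is self:
            self.log.append(message)
        elif self.forward_when_not_leader:
            # Proposed behaviour: pass the message on to the current leader.
            current_leader.handle_produce(message, current_leader)
        else:
            # Reported behaviour: the request fails, and with ack=0 the
            # producer never reads the error, so the message is lost.
            self.dropped += 1


def send_with_stale_metadata(messages, forward):
    old_leader = Broker(1, forward_when_not_leader=forward)
    new_leader = Broker(2, forward_when_not_leader=forward)
    # The producer has not refreshed metadata and still targets broker 1.
    for message in messages:
        old_leader.handle_produce(message, current_leader=new_leader)
    return new_leader.log, old_leader.dropped
```

Without forwarding, every message sent before the next metadata refresh is counted in `dropped`; with forwarding, the same messages end up in the new leader's log.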