[jira] [Updated] (KAFKA-2002) It does not work when kafka_mx4jenable is false
[ https://issues.apache.org/jira/browse/KAFKA-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yaguo Zhou updated KAFKA-2002: -- Attachment: kafka_mx4jenable.patch It does not work when kafka_mx4jenable is false --- Key: KAFKA-2002 URL: https://issues.apache.org/jira/browse/KAFKA-2002 Project: Kafka Issue Type: Bug Components: core Reporter: Yaguo Zhou Priority: Minor Attachments: kafka_mx4jenable.patch The MX4J loader should return false immediately when kafka_mx4jenable is false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
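For context, a minimal Scala sketch of the shape such a fix would take (hypothetical code, not the attached patch): check the flag first and bail out before any MX4J classes are touched.
{code}
// Hypothetical sketch: return early when the flag is off, so the MX4J
// classes are never loaded. The real change is in kafka_mx4jenable.patch.
def maybeLoad(props: java.util.Properties): Boolean = {
  val enabled = props.getProperty("kafka_mx4jenable", "false").toBoolean
  if (!enabled)
    return false // before the fix, loading proceeded regardless of the flag
  // ... reflectively load and start the MX4J HttpAdaptor here ...
  true
}
{code}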
[jira] [Commented] (KAFKA-1908) Split brain
[ https://issues.apache.org/jira/browse/KAFKA-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347063#comment-14347063 ] Alexey Ozeritskiy commented on KAFKA-1908: -- Hi all, the following is our scenario. We use a custom consumer which works on broker hosts and always consumes leader partitions from localhost. The consumer reads data and pushes it to a 3rd-party system. We send a metadata request to localhost and don't use the zk data. We use zk locks to guarantee that we read a single partition in one process. Sometimes we release the locks, and the consumer can then begin to consume data from the broken host and reset offsets. Split brain --- Key: KAFKA-1908 URL: https://issues.apache.org/jira/browse/KAFKA-1908 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.2.0 Reporter: Alexey Ozeritskiy In some cases, there may be two leaders for one partition. Steps to reproduce:
1. We have 3 brokers, 1 partition with 3 replicas:
{code}
TopicAndPartition: [partition,0] Leader: 1 Replicas: [2,1,3] ISR: [1,2,3]
{code}
2. The controller runs on broker 3.
3. Let the kafka port be 9092. We execute on broker 1:
{code}
iptables -A INPUT -p tcp --dport 9092 -j REJECT
{code}
4. Initiate a replica election.
5. As a result, Broker 1 reports:
{code}
TopicAndPartition: [partition,0] Leader: 1 Replicas: [2,1,3] ISR: [1,2,3]
{code}
while Broker 2 reports:
{code}
TopicAndPartition: [partition,0] Leader: 2 Replicas: [2,1,3] ISR: [1,2,3]
{code}
6. Flush the iptables rules on broker 1.
Now we can produce messages to [partition,0]. Replica-1 will not receive new data. A consumer can read data from replica-1 or replica-2. When it reads from replica-1 it resets the offsets and then can read duplicates from replica-2. We saw this situation in our production cluster when it had network problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
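To observe the divergence from the outside, one can ask each broker directly (bypassing ZooKeeper, as the reporter's consumer does) and compare the leader ids they report. A rough Scala sketch against the 0.8 javaapi, assuming the default port and the topic name from the report:
{code}
import kafka.javaapi.TopicMetadataRequest
import kafka.javaapi.consumer.SimpleConsumer
import scala.collection.JavaConverters._

// Returns partitionId -> leader broker id as reported by one broker.
def leadersSeenBy(host: String): Map[Int, Int] = {
  val consumer = new SimpleConsumer(host, 9092, 10000, 64 * 1024, "leader-check")
  try {
    val resp = consumer.send(new TopicMetadataRequest(List("partition").asJava))
    (for {
      tm <- resp.topicsMetadata.asScala
      pm <- tm.partitionsMetadata.asScala
      if pm.leader != null
    } yield pm.partitionId -> pm.leader.id).toMap
  } finally consumer.close()
}

// In the scenario above, leadersSeenBy("broker1") and leadersSeenBy("broker2")
// disagree: broker 1 reports leader 1 while broker 2 reports leader 2.
{code}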
[jira] [Commented] (KAFKA-1809) Refactor brokers to allow listening on multiple ports and IPs
[ https://issues.apache.org/jira/browse/KAFKA-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347236#comment-14347236 ] Jun Rao commented on KAFKA-1809: Thanks for the patch. Some comments.
1. KafkaConfig:
1.1 val interBrokerSecurityProtocol vs. val intraBrokerSecurityProtocol: pick one name and use it consistently.
1.2 In getListeners() and getAdvertisedListeners(), let's use the same default port 9092 as before.
{code}
Utils.listenerListToEndPoints("PLAINTEXT://" + props.getString("advertised.host.name", props.getString("host.name", "")) + ":" + props.getInt("advertised.port", props.getInt("port", 6667)).toString)
{code}
1.3 In the comment, "lets use them" should be "let's use them".
1.4 In the following comment, those explicitly enumerated versions may get out of date. We can just point to ApiVersion for valid values.
{code}
* Valid values are: 0.8.2.0, 0.8.3.0
{code}
1.5 We probably should validate that all specified security protocols are valid (i.e., those defined in SecurityProtocol).
2. ApiVersion: Instead of 0.8.2.0, should we just use 0.8.2, since the protocol for 0.8.2.0 and 0.8.2.1 is the same?
3. EndPoint: For the following, could we give a plain-English description of the URI format and give a few examples?
{code}
val uriParseExp = """^(.*)://\[?([0-9a-z\-.:]*)\]?:([0-9]+)""".r
{code}
4. AdminUtils: In the following, there is no need to do id => id.toInt since id is already an Int.
{code}
replicaInfo = getBrokerInfoFromCache(zkClient, cachedBrokerInfo, replicas.map(id => id.toInt)).map(_.getBrokerEndPoint(protocol))
{code}
5. Broker: In the following, instead of representing the endpoints as a single string, should we represent them as an array of strings, one for each endpoint?
{code}
* endpoints: {PLAINTEXT://host1:9092,SSL://host1:9093}
{code}
6. ConsumerConfig: Do we need to expose the following config? Not sure how it's used in testing.
{code}
val securityProtocol = SecurityProtocol.withName(props.getString("security.protocol", "PLAINTEXT"))
{code}
7. Unused imports: FetchTest, KafkaConfigTest (under server), KafkaServerTestHarness, ProducerSendTest, RequestResponseSerializationTest, SocketServer.
8. KafkaConfigTest: Can isValidKafkaConfig be private?
9. KafkaServer: Why is the getBrokerId() call moved after the creation of SocketServer?
{code}
socketServer = new SocketServer(config.brokerId, config.listeners, config.numNetworkThreads, config.queuedMaxRequests, config.socketSendBufferBytes, config.socketReceiveBufferBytes, config.socketRequestMaxBytes, config.maxConnectionsPerIp, config.connectionsMaxIdleMs, config.maxConnectionsPerIpOverrides)
socketServer.startup()
/* generate brokerId */
config.brokerId = getBrokerId
this.logIdent = "[Kafka Server " + config.brokerId + "], "
{code}
10. SecurityProtocol: When will TRACE be used? It's not used in ProducerConfig.
11. Some of the files have extra empty lines (e.g. SocketServerTest).
12. testcase_1_properties.json: Could you explain why we need to change the log segment size?
13. UpdateMetadataRequest: sizeInBytes() needs to be versioned as well. Is there no unit test covering this? If so, we should add one.
14. Testing.
14.1 Did you run the system tests?
14.2 Did you try running existing tools?
14.3 How should we test the protocol upgrade from 0.8.2 to 0.8.3?
Refactor brokers to allow listening on multiple ports and IPs -- Key: KAFKA-1809 URL: https://issues.apache.org/jira/browse/KAFKA-1809 Project: Kafka Issue Type: Sub-task Components: security Reporter: Gwen Shapira Assignee: Gwen Shapira Attachments: KAFKA-1809.patch, KAFKA-1809.v1.patch, KAFKA-1809.v2.patch, KAFKA-1809.v3.patch, KAFKA-1809_2014-12-25_11:04:17.patch, KAFKA-1809_2014-12-27_12:03:35.patch, KAFKA-1809_2014-12-27_12:22:56.patch, KAFKA-1809_2014-12-29_12:11:36.patch, KAFKA-1809_2014-12-30_11:25:29.patch, KAFKA-1809_2014-12-30_18:48:46.patch, KAFKA-1809_2015-01-05_14:23:57.patch, KAFKA-1809_2015-01-05_20:25:15.patch, KAFKA-1809_2015-01-05_21:40:14.patch, KAFKA-1809_2015-01-06_11:46:22.patch, KAFKA-1809_2015-01-13_18:16:21.patch, KAFKA-1809_2015-01-28_10:26:22.patch, KAFKA-1809_2015-02-02_11:55:16.patch, KAFKA-1809_2015-02-03_10:45:31.patch, KAFKA-1809_2015-02-03_10:46:42.patch, KAFKA-1809_2015-02-03_10:52:36.patch The goal is to eventually support different security mechanisms on different ports. Currently brokers are defined as a host+port pair, and this definition exists throughout the codebase, therefore some refactoring is needed to support multiple listeners per broker.
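As a side note on item 3 of Jun's comments above, the quoted regex already handles the common listener forms; a short Scala sketch of how it decomposes a listener string (the IPv6 example is mine):
{code}
// The uriParseExp regex from the review comment in action.
val uriParseExp = """^(.*)://\[?([0-9a-z\-.:]*)\]?:([0-9]+)""".r

"PLAINTEXT://host1:9092" match {
  case uriParseExp(protocol, host, port) =>
    println(s"protocol=$protocol host=$host port=$port")
}
// Bracketed IPv6 also parses: "SSL://[::1]:9093" -> ("SSL", "::1", "9093")
{code}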
[jira] [Updated] (KAFKA-1863) Exception categories / hierarchy in clients
[ https://issues.apache.org/jira/browse/KAFKA-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1863: - Status: Patch Available (was: Open) Exception categories / hierarchy in clients --- Key: KAFKA-1863 URL: https://issues.apache.org/jira/browse/KAFKA-1863 Project: Kafka Issue Type: Bug Reporter: Guozhang Wang Assignee: Guozhang Wang Fix For: 0.9.0 Attachments: KAFKA-1863.patch In the new clients package we introduced a new set of exceptions, but the hierarchy is not very clear as of today:
{code}
RuntimeException
 - KafkaException
   - BufferExhaustedException
   - ConfigException
   - SerializationException
   - QuotaViolationException
   - SchemaException
   - ApiException

ApiException
 - InvalidTopicException
 - OffsetMetadataTooLarge (probably needs to be renamed)
 - RecordBatchTooLargeException
 - RecordTooLargeException
 - UnknownServerException
 - RetriableException

RetriableException
 - CorruptRecordException
 - InvalidMetadataException
 - NotEnoughReplicasAfterAppendException
 - NotEnoughReplicasException
 - OffsetOutOfRangeException
 - TimeoutException
 - UnknownTopicOrPartitionException
{code}
KafkaProducer.send() may throw KafkaExceptions that are not ApiExceptions; other exceptions will be set in the returned future metadata. We need to:
1. Re-examine the hierarchy. For example, for producers, only exceptions that are thrown directly from the caller thread before the record is appended to the batch buffer should be ApiExceptions; some exceptions could be renamed / merged.
2. Clearly document the exception category / hierarchy as part of the release.
[~criccomini] may have some more feedback for this issue from Samza's usage experience. [~jkreps] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
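Point 2 of the proposal is essentially about making code like the following sketch possible, where a caller branches on the exception category rather than on individual classes (class and package names as in the 0.8.2 clients jar; the handling logic is illustrative):
{code}
import org.apache.kafka.clients.producer.{Callback, RecordMetadata}
import org.apache.kafka.common.errors.RetriableException

val callback = new Callback {
  override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
    exception match {
      case null                  => // success
      case _: RetriableException => // transient: safe to retry the send
      case e                     => throw new RuntimeException("fatal send error", e)
    }
}
{code}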
Review Request 31735: Fix KAFKA-1863
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31735/ --- Review request for kafka.
Bugs: KAFKA-1863
https://issues.apache.org/jira/browse/KAFKA-1863
Repository: kafka
Description
---
Add docs for ApiExceptions in callbacks and send / constructors
Diffs
-
clients/src/main/java/org/apache/kafka/clients/producer/Callback.java b89aa582f64e8aa38e75cef1d153760400df335e
clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java ed9c63a6679e3aaf83d19fde19268553a4c107c2
clients/src/main/java/org/apache/kafka/common/protocol/types/Field.java 899195819159a3544e14ebfb09aff1b9152ff5dd
clients/src/main/java/org/apache/kafka/common/protocol/types/Schema.java 716470125866641210591df1d5c773183949819a
clients/src/main/java/org/apache/kafka/common/protocol/types/SchemaException.java ea4e46f4d768ef90a0d725ad117aab4645f6cb76
clients/src/main/java/org/apache/kafka/common/protocol/types/Struct.java ff89f0e37d5fa787b0218eff86d169aaeae2107b
clients/src/main/java/org/apache/kafka/common/protocol/types/Type.java f0d5a8286380d0138b7a815ea7492311603cc72f
Diff: https://reviews.apache.org/r/31735/diff/
Testing
---
Thanks,
Guozhang Wang
[jira] [Updated] (KAFKA-527) Compression support does numerous byte copies
[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuhiro Matsuda updated KAFKA-527: --- Attachment: KAFKA-527.patch Compression support does numerous byte copies - Key: KAFKA-527 URL: https://issues.apache.org/jira/browse/KAFKA-527 Project: Kafka Issue Type: Bug Components: compression Reporter: Jay Kreps Priority: Critical Attachments: KAFKA-527.message-copy.history, KAFKA-527.patch, java.hprof.no-compression.txt, java.hprof.snappy.text The data path for compressing or decompressing messages is extremely inefficient. We do something like 7 (?) complete copies of the data, often for simple things like adding a 4 byte size to the front. I am not sure how this went by unnoticed. This is likely the root cause of the performance issues we saw in doing bulk recompression of data in mirror maker. The mismatch between the InputStream and OutputStream interfaces and the Message/MessageSet interfaces which are based on byte buffers is the cause of many of these. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-527) Compression support does numerous byte copies
[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347441#comment-14347441 ] Yasuhiro Matsuda commented on KAFKA-527: Created reviewboard https://reviews.apache.org/r/31742/diff/ against branch origin/trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-527) Compression support does numerous byte copies
[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuhiro Matsuda updated KAFKA-527: --- Assignee: Yasuhiro Matsuda Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1863) Exception categories / hierarchy in clients
[ https://issues.apache.org/jira/browse/KAFKA-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347342#comment-14347342 ] Guozhang Wang commented on KAFKA-1863: -- Added some more docs on the possible exception hierarchy; with this and Jay's modifications to the exceptions thrown in send(), I think this ticket can be covered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] KIP-2 - Refactor brokers to allow listening on multiple ports and IPs
+1 Thanks, Jun On Tue, Mar 3, 2015 at 12:14 PM, Gwen Shapira gshap...@cloudera.com wrote: Details are in the wiki: https://cwiki.apache.org/confluence/display/KAFKA/KIP-2+-+Refactor+brokers+to+allow+listening+on+multiple+ports+and+IPs
[jira] [Commented] (KAFKA-527) Compression support does numerous byte copies
[ https://issues.apache.org/jira/browse/KAFKA-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347478#comment-14347478 ] Yasuhiro Matsuda commented on KAFKA-527: This patch introduces BufferingOutputStream, an alternative to ByteArrayOutputStream. It is backed by a chain of byte arrays, so it does not copy bytes when increasing its capacity. Also, it has a method that writes the content to a ByteBuffer directly, so there is no need to create an intermediate array instance to transfer the content. Lastly, it supports deferred writes: you can reserve a number of bytes before knowing the value and fill it in later. In MessageWriter (a new class), this is used for writing the CRC value and the payload length. On my laptop, I tested the performance using TestLinearWriteSpeed with snappy:
Previously: 26.64786026813998 MB per sec
With the patch: 35.78401869390889 MB per sec
That is about 34% better throughput. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
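The core idea is easy to show in miniature. A simplified Scala sketch (hypothetical names, far less complete than the actual BufferingOutputStream/MessageWriter in the patch): capacity grows by chaining chunks instead of reallocating and copying.
{code}
import java.io.OutputStream
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

class ChunkedOutputStream(chunkSize: Int) extends OutputStream {
  private val chunks = ArrayBuffer(new Array[Byte](chunkSize))
  private var pos = 0 // write position within the last chunk

  override def write(b: Int): Unit = {
    if (pos == chunkSize) { chunks += new Array[Byte](chunkSize); pos = 0 }
    chunks.last(pos) = b.toByte
    pos += 1
  }

  // Transfer straight into a ByteBuffer -- no intermediate toByteArray copy.
  def writeTo(buffer: ByteBuffer): Unit = {
    chunks.init.foreach(c => buffer.put(c))
    buffer.put(chunks.last, 0, pos)
  }
}
{code}
The deferred write the comment describes would additionally record a (chunk, offset) position on reserve and fill it once the CRC or length is known; that bookkeeping is omitted here for brevity.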
[jira] [Updated] (KAFKA-1863) Exception categories / hierarchy in clients
[ https://issues.apache.org/jira/browse/KAFKA-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-1863: - Attachment: KAFKA-1863.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1863) Exception categories / hierarchy in clients
[ https://issues.apache.org/jira/browse/KAFKA-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347338#comment-14347338 ] Guozhang Wang commented on KAFKA-1863: -- Created reviewboard https://reviews.apache.org/r/31735/diff/ against branch origin/trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 31742: Patch for KAFKA-527
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31742/ --- Review request for kafka.
Bugs: KAFKA-527
https://issues.apache.org/jira/browse/KAFKA-527
Repository: kafka
Description
---
Fewer byte copies.
Diffs
-
core/src/main/scala/kafka/message/ByteBufferMessageSet.scala 9c694719dc9b515fb3c3ae96435a87b334044272
core/src/main/scala/kafka/message/MessageWriter.scala PRE-CREATION
core/src/test/scala/unit/kafka/message/MessageWriterTest.scala PRE-CREATION
Diff: https://reviews.apache.org/r/31742/diff/
Testing
---
Thanks,
Yasuhiro Matsuda
[jira] [Commented] (KAFKA-1998) Partitions Missing From MetadataResponse
[ https://issues.apache.org/jira/browse/KAFKA-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347645#comment-14347645 ] Mayuresh Gharat commented on KAFKA-1998: I suppose what we need to do is to expose the availablePartitionsForTopic() API in the producer. I discussed this with [~guozhang] and he said that it's going to happen as part of another patch that Jun is working on. [~guozhang], can you comment on this? We can then decide whether to close this ticket. Partitions Missing From MetadataResponse Key: KAFKA-1998 URL: https://issues.apache.org/jira/browse/KAFKA-1998 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.2.0 Reporter: Evan Huus Assignee: Mayuresh Gharat It is known behaviour that when a partition is entirely offline (it has no leader because all of its replicas are down) then that partition will not be included in the metadata returned by other brokers. For example, if topic foo has 3 partitions, but all replicas of partition 3 are offline, then requesting metadata for foo will only return information about partitions 1 and 2. This means that there is no way to reliably determine the number of partitions for a topic via kafka's metadata API; if I receive information on partitions 1 and 2, I don't know if partition 3 is offline or if there are simply only two partitions total. (You can presumably still ask zookeeper directly, but that is a work-around.) This ambiguity, in turn, can lead to a consistency problem with the default partitioner, since that effectively implements `hash(key) mod #partitions`. If a partition goes offline and is removed from the metadata response, then the number of partitions the producer knows about will change (on its next metadata refresh) and the mapping from keys to partitions will also change. Instead of distributing messages among (for example) 3 partitions, and failing to produce to the offline partition, it will distribute *all* messages among the two online partitions. This results in messages being sent to the wrong partition. Since kafka already returns partitions with error messages in many cases (e.g. `LeaderNotAvailable`) I think it makes much more sense, and fixes the above partition problem, if it would simply return offline partitions as well with the appropriate error (whether that is `LeaderNotAvailable` or it would be better to add an additional error is up to you). CC [~guozhang] (This issue was originally described/discussed on the kafka-users mailing list, in the thread involving https://mail-archives.apache.org/mod_mbox/kafka-users/201503.mbox/%3CCAA4pprAZvp2XhdNmy0%2BqVZ1UVdVxmUfz3DDArhGbwP-iiH%2BGyg%40mail.gmail.com%3E) If there are any questions I am happy to clarify, I realize the scenario is somewhat complex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
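The partitioner problem in the description is easy to see with concrete numbers (illustrative values):
{code}
// hash(key) mod #partitions: the same key maps to a different partition as
// soon as the visible partition count changes.
val hash = 1748      // stand-in for math.abs(key.hashCode)
println(hash % 3)    // 2 -> partition 2 while all 3 partitions are visible
println(hash % 2)    // 0 -> partition 0 once partition 3 drops out of metadata
{code}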
[jira] [Created] (KAFKA-2003) Add upgrade tests
Gwen Shapira created KAFKA-2003: --- Summary: Add upgrade tests Key: KAFKA-2003 URL: https://issues.apache.org/jira/browse/KAFKA-2003 Project: Kafka Issue Type: Improvement Reporter: Gwen Shapira Assignee: Ashish K Singh To test protocol changes, compatibility, and the upgrade process, we need a good way to test different versions of the product together and to test the end-to-end upgrade process. For example, for a 0.8.2 to 0.8.3 test we want to check:
* Can we start a cluster with a mix of 0.8.2 and 0.8.3 brokers?
* Can a cluster of 0.8.3 brokers bump the protocol level one broker at a time?
* Can 0.8.2 clients run against a cluster of 0.8.3 brokers?
There are probably more questions, but an automated framework that can test these and report results will be a good start. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [KIP-DISCUSSION] KIP-7 Security - IP Filtering
Thanks for confirming. My sample-set is obviously limited to the companies where I have worked - where operations folks do have root access on the services that they manage. That said, it seems fairly straightforward to add this, and it would help users who don't have the benefit of su privileges. Joel On Wed, Mar 04, 2015 at 12:01:59AM -0500, Jeff Holoman wrote: Hey Joel, good questions. As a first thought, my experience with customers in large corporate environments probably has me somewhat jaded :). You know it really shouldn't take 3 weeks to get ports opened on a load balancer, but that really does happen. Coordination across those teams also can and should / does happen, but I've noted that operators appreciate measures they can take that keep them out of more internal process.
1) Yes, probably. After all, we're really just checking what's returned from InetAddress and trusting that. The check is pretty lightweight. I think what you are getting at is that a security check that doesn't go all the way can be bad, as it can engender a false sense of security and end up leaving the system more vulnerable to attack than if other, more standard, approaches are taken. This is a fair point. I'm not deep enough into network security to comment all that intelligently, but I do think that reducing the exposure to, say, IP spoofing on internal traffic versus free-for-all data consumption is a step in the right direction.
2) Yes, they may have access to this, and it could be redundant. At customers that I interface with, operators typically get their root-level privileges through something like PowerBroker, so access to iptables is not a given, and even if it's available it typically does not fall within their realm of accepted responsibilities. Additionally, when I first got this request I suggested iptables and was told that due to the difficulties and complexities of configuration and management (from their perspective) it would not be an acceptable solution (also the "it's not in the corporate standard" line). I noted in the KIP that I look at this not only as a potential security measure, reducing the attack-vector size, but also as a guard against human error. Hardcoded configs sometimes make their way all the way to production, and this would help to limit that. You could argue that it might not be Kafka's responsibility to enforce this type of control, but there is precedent here with HDFS (dfs.hosts and dfs.hosts.exclude) and Flume (https://issues.apache.org/jira/browse/FLUME-2189). In short, I don't think this supplants more robust security functionality, but I do think it gives an additional (lightweight) control which would be useful. Security is about defense in depth, and this raises the bar a tad. Thanks Jeff On Tue, Mar 3, 2015 at 8:58 PM, Joel Koshy jjkosh...@gmail.com wrote: The proposal itself looks reasonable, but I have a couple of questions, as you made reference to "operators of the system" and "network team" in your wiki.
- Are spoofing attacks a concern even with this in place? If so, it would require some sort of internal ingress filtering, which presumably needs cooperation with network teams, right?
- Also, the operators of the (Kafka) system really should have access to iptables on the Kafka brokers, so wouldn't this feature be effectively redundant?
Thanks, Joel On Thu, Jan 22, 2015 at 01:50:41PM -0500, Joe Stein wrote: Hey Jeff, thanks for the patch and writing this up. I think the approach to explicitly deny and then set what is allowed, or explicitly allow and then deny specifics, makes sense. Supporting CIDR notation and both IPv4 and IPv6 is good too. Waiting for KAFKA-1845 to get committed before reworking this any more right now makes sense, yes. Andrii posted a patch yesterday for it, so hopefully in the next ~week(s). Not sure what other folks think of this approach, but whatever that is, it would be good to have it in prior to reworking for the config def changes. Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC, http://www.stealth.ly, Twitter: @allthingshadoop On Wed, Jan 21, 2015 at 8:47 PM, Jeff Holoman jholo...@cloudera.com wrote: Posted a KIP for IP Filtering: https://cwiki.apache.org/confluence/display/KAFKA/KIP-7+-+Security+-+IP+Filtering Relevant JIRA: https://issues.apache.org/jira/browse/KAFKA-1810 Appreciate any feedback. Thanks Jeff -- Jeff Holoman Systems Engineer -- Joel
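For reference, the kind of check the KIP implies is small. A hedged Scala sketch (IPv4 only, not the KIP's actual code, and /0 prefixes are not handled):
{code}
import java.net.InetAddress

def ipv4ToInt(a: InetAddress): Int =
  a.getAddress.foldLeft(0)((acc, b) => (acc << 8) | (b & 0xff))

// True if ip falls inside the CIDR block, e.g. inCidr(addr, "10.0.0.0/8").
def inCidr(ip: InetAddress, cidr: String): Boolean = {
  val Array(base, bits) = cidr.split("/")
  val mask = -1 << (32 - bits.toInt)
  (ipv4ToInt(ip) & mask) == (ipv4ToInt(InetAddress.getByName(base)) & mask)
}
{code}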
[jira] [Updated] (KAFKA-1997) Refactor Mirror Maker
[ https://issues.apache.org/jira/browse/KAFKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1997: Attachment: KAFKA-1997_2015-03-04_15:42:45.patch Refactor Mirror Maker - Key: KAFKA-1997 URL: https://issues.apache.org/jira/browse/KAFKA-1997 Project: Kafka Issue Type: Improvement Reporter: Jiangjie Qin Assignee: Jiangjie Qin Attachments: KAFKA-1997.patch, KAFKA-1997_2015-03-03_16:28:46.patch, KAFKA-1997_2015-03-04_15:07:46.patch, KAFKA-1997_2015-03-04_15:42:45.patch Refactor mirror maker based on KIP-3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[KIP-DISCUSSION] KIP-13 Quotas
Posted a KIP for quotas in kafka. https://cwiki.apache.org/confluence/display/KAFKA/KIP-13+-+Quotas Appreciate any feedback. Aditya
[jira] [Commented] (KAFKA-1313) Support adding replicas to existing topic partitions
[ https://issues.apache.org/jira/browse/KAFKA-1313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347795#comment-14347795 ] Geoffrey Anderson commented on KAFKA-1313: -- Quoting the Kafka docs: The first step is to hand craft the custom reassignment plan in a json file:
{code}
$ cat increase-replication-factor.json
{"version":1, "partitions":[{"topic":"foo","partition":0,"replicas":[5,6,7]}]}
{code}
Then, use the json file with the --execute option to start the reassignment process:
{code}
$ bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
{code}
As I understand it, the pain point is hand crafting the reassignment plan. From there on out, updating the replication factor is functionally identical to any other reassignment. So my proposal is to add another file type (--replication-factor-json-file) to kafka-reassign-partitions.sh --generate (and/or --rebalance per KIP-6) which allows users to specify the desired replication_factor per topic/partition. For example:
{code}
$ cat update-replication-factor.json
{"version":1, "partitions":[{"topic":"foo","partition":0,"replication_factor":3}]}
$ bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --replication-factor-json-file update-replication-factor.json --generate
{code}
The user would then feed the generated output into kafka-reassign-partitions.sh --execute as before. [~nehanarkhede], let me know if this seems reasonable / is in the ballpark of what you had in mind. Support adding replicas to existing topic partitions Key: KAFKA-1313 URL: https://issues.apache.org/jira/browse/KAFKA-1313 Project: Kafka Issue Type: New Feature Components: tools Affects Versions: 0.8.0 Reporter: Marc Labbe Assignee: Geoffrey Anderson Priority: Critical Labels: newbie++ Fix For: 0.9.0 There is currently no easy way to add replicas to existing topic partitions. For example, topic create-test has been created with ReplicationFactor=1:
{code}
Topic:create-test  PartitionCount:3  ReplicationFactor:1  Configs:
  Topic: create-test  Partition: 0  Leader: 1  Replicas: 1  Isr: 1
  Topic: create-test  Partition: 1  Leader: 2  Replicas: 2  Isr: 2
  Topic: create-test  Partition: 2  Leader: 3  Replicas: 3  Isr: 3
{code}
I would like to increase the ReplicationFactor to 2 (or more) so it shows up like this instead:
{code}
Topic:create-test  PartitionCount:3  ReplicationFactor:2  Configs:
  Topic: create-test  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
  Topic: create-test  Partition: 1  Leader: 2  Replicas: 2,3  Isr: 2,3
  Topic: create-test  Partition: 2  Leader: 3  Replicas: 3,1  Isr: 3,1
{code}
Use cases for this:
- adding brokers and thus increasing fault tolerance
- fixing human errors for topics created with wrong values
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-1997) Refactor Mirror Maker
[ https://issues.apache.org/jira/browse/KAFKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347743#comment-14347743 ] Jiangjie Qin commented on KAFKA-1997: - Updated reviewboard https://reviews.apache.org/r/31706/diff/ against branch origin/trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1997) Refactor Mirror Maker
[ https://issues.apache.org/jira/browse/KAFKA-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jiangjie Qin updated KAFKA-1997: Attachment: KAFKA-1997_2015-03-04_15:07:46.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2003) Add upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347804#comment-14347804 ] Ashish K Singh commented on KAFKA-2003: --- [~gwenshap] does the following outline make sense?
$ some_tool from_version m_instances to_version n_instances
1. Compile jars into from_version_target
2. Compile jars into to_version_target
3. Bring up n to_version brokers
4. Run producer/consumer tests with clients from from_version_target
5. Bring up m from_version brokers
6. Run producer/consumer tests with clients from from_version_target
7. Run producer/consumer tests with clients from to_version_target
8.
{noformat}
foreach(broker: brokers)
  Bump up the interbroker.protocol.version
  Run producer/consumer tests with clients from from_version_target
  Run producer/consumer tests with clients from to_version_target
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KAFKA-2004) Write Kafka messages directly to HDFS
sutanu das created KAFKA-2004: - Summary: Write Kafka messages directly to HDFS Key: KAFKA-2004 URL: https://issues.apache.org/jira/browse/KAFKA-2004 Project: Kafka Issue Type: Bug Components: consumer, core, producer Affects Versions: 0.8.1.1 Reporter: sutanu das Assignee: Neha Narkhede Priority: Critical 1. Is there a way to write Kafka messages directly to HDFS without writing any consumer code? 2. Is there any way to integrate Kafka with Storm or Spark so messages go directly from Kafka consumers to an HDFS sink? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSS] KIP-6 - New reassignment partition logic for re-balancing
Apologize for the late comment on this... So fair assignment by count (taking into account the current partition count of each broker) is very good. However, it's worth noting that all partitions are not created equal. We have actually been performing more rebalance work based on the partition size on disk, as given equal retention of all topics, the size on disk is a better indicator of the amount of traffic a partition gets, both in terms of storage and network traffic. Overall, this seems to be a better balance. In addition to this, I think there is very much a need to have Kafka be rack-aware. That is, to be able to assure that for a given cluster, you never assign all replicas for a given partition in the same rack. This would allow us to guard against maintenances or power failures that affect a full rack of systems (or a given switch). I think it would make sense to implement the reassignment logic as a pluggable component. That way it would be easy to select a scheme when performing a reassignment (count, size, rack aware), as sketched after this thread. Configuring a default scheme for a cluster would allow for the brokers to create new topics and partitions in compliance with the requested policy. -Todd On Thu, Jan 22, 2015 at 10:13 PM, Joe Stein joe.st...@stealth.ly wrote: I will go back through the ticket and code and write more up. Should be able to do that sometime next week. The intention was not to replace existing functionality but to issue a WARN on use. In the version following its release we could then deprecate it... I will fix the KIP for that too. On Fri, Jan 23, 2015 at 12:34 AM, Neha Narkhede n...@confluent.io wrote: Hey Joe,
1. Could you add details to the Public Interface section of the KIP? This should include the proposed changes to the partition reassignment tool. Also, maybe the new option can be named --rebalance instead of --re-balance?
2. It makes sense to list --decommission-broker as part of this KIP. Similarly, shouldn't we also have an --add-broker option? The way I see this is that there are several events when a partition reassignment is required. Before this functionality is automated on the broker, the tool will generate an ideal replica placement for each such event. The users should merely have to specify the nature of the event, e.g. adding a broker or decommissioning an existing broker or merely rebalancing.
3. If I understand the KIP correctly, the upgrade plan for this feature includes removing the existing --generate option on the partition reassignment tool in 0.8.3 while adding all the new options in the same release. Is that correct?
Thanks, Neha On Thu, Jan 22, 2015 at 9:23 PM, Jay Kreps jay.kr...@gmail.com wrote: Ditto on this one. Can you give the algorithm we want to implement? Also, I think in terms of scope this is just proposing to change the logic in ReassignPartitionsCommand? I think we've had the discussion various times on the mailing list that what people really want is just for Kafka to do its best to balance data in an online fashion (for some definition of balance), i.e. if you add a new node, partitions would slowly migrate to it, and if a node dies, partitions slowly migrate off it. This could potentially be more work, but I'm not sure how much more. Has anyone thought about how to do it? -Jay On Wed, Jan 21, 2015 at 10:11 PM, Joe Stein joe.st...@stealth.ly wrote: Posted a KIP for --re-balance for partition assignment in the reassignment tool: https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+re-balancing JIRA: https://issues.apache.org/jira/browse/KAFKA-1792 While going through the KIP I thought of one thing from the JIRA that we should change. We should preserve --generate to be existing functionality for the next release it is in. If folks want to use --re-balance then great, it just won't break any upgrade paths, yet. Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC, http://www.stealth.ly, Twitter: @allthingshadoop -- Thanks, Neha
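Todd's pluggable suggestion could look something like the Scala sketch below (an entirely hypothetical interface, not from the KIP): each scheme - count, size, rack-aware - becomes one implementation behind a common trait.
{code}
case class TopicPartition(topic: String, partition: Int)

trait ReplicaPlacementPolicy {
  def place(partitions: Seq[TopicPartition], brokers: Seq[Int],
            replicationFactor: Int): Map[TopicPartition, Seq[Int]]
}

// Count-based default: spread replicas round-robin over the broker list.
// Assumes replicationFactor <= brokers.size so replicas stay distinct.
object CountBasedPlacement extends ReplicaPlacementPolicy {
  def place(partitions: Seq[TopicPartition], brokers: Seq[Int],
            replicationFactor: Int): Map[TopicPartition, Seq[Int]] =
    partitions.zipWithIndex.map { case (tp, i) =>
      tp -> (0 until replicationFactor).map(r => brokers((i + r) % brokers.size))
    }.toMap
}
{code}
A size-based or rack-aware scheme would implement the same trait, which is what makes selecting one per reassignment (or as a cluster default) straightforward.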
Error when submit a patch set which contains a file rename.
Hi, I was trying to submit a patch set for review. In this patch set, I removed one file and created another to replace it. However, when I ran kafka-patch-review.py I got an error:
python kafka-patch-review.py -b origin/trunk -j KAFKA-1926
Configuring reviewboard url to https://reviews.apache.org
Updating your remote branches to pull the latest changes
Verifying JIRA connection configurations
ERROR: Error validating diff
core/src/main/scala/kafka/utils/Utils.scala: The file was not found in the repository. (HTTP 400, API Error 207)
ERROR: reviewboard update failed. Exiting.
I wonder if it is because I do not have enough permission? The file to be renamed is clearly there. Any help will be appreciated. Thanks. Tong Li OpenStack Kafka Community Development Building 501/B205 liton...@us.ibm.com
[jira] [Commented] (KAFKA-1926) Replace kafka.utils.Utils with o.a.k.common.utils.Utils
[ https://issues.apache.org/jira/browse/KAFKA-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348038#comment-14348038 ] Tong Li commented on KAFKA-1926: Jay, I finally put together a new patch set just for this issue. (The SystemTime issue we can address separately, since this patch set is already very big.) This patch set is an attempt to refactor the core utils class and remove the namespace overlap: a Utils class was defined by both the clients and core projects, and many methods are duplicated. The patch does the following:
1. Renames the core kafka.utils.Utils class to kafka.utils.CoreUtils.
2. Removes the abs method in kafka.utils.CoreUtils; anywhere using the abs method is changed to use o.a.k.c.u.Utils.abs.
3. Removes the Crc32 class in the kafka.utils package and uses the one defined in o.a.k.c.u instead.
4. Removes the asString method in CoreUtils since it was not used.
5. Removes readUnsignedInt (two methods) and writeUnsignedInt (two methods) since they are complete duplicates of the client methods; both signatures and implementations are identical.
I will submit the patch set for you to review. Since this patch set touches so many files, it is very time consuming; if we can get this in faster, I won't need to do so many rounds of rebase. Thanks so much. Replace kafka.utils.Utils with o.a.k.common.utils.Utils --- Key: KAFKA-1926 URL: https://issues.apache.org/jira/browse/KAFKA-1926 Project: Kafka Issue Type: Improvement Affects Versions: 0.8.2.0 Reporter: Jay Kreps Labels: newbie, patch Attachments: KAFKA-1926.patch, KAFKA-1926.patch There is currently a lot of duplication between the Utils class in common and the one in core. Our plan has been to deprecate duplicate code in the server and replace it with the new common code. As such we should evaluate each method in the scala Utils and do one of the following:
1. Migrate it to o.a.k.common.utils.Utils if it is a sensible general purpose utility in active use that is not Kafka-specific. If we migrate it we should really think about the API and make sure there is some test coverage. A few things in there are kind of funky and we shouldn't just blindly copy them over.
2. Create a new class ServerUtils or ScalaUtils in kafka.utils that will hold any utilities that really need to make use of Scala features to be convenient.
3. Delete it if it is not used, or has a bad api.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
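On point 2 above, the common-utils abs that replaces the core one has behavior worth noting: for partitioning, Kafka's abs masks the sign bit rather than negating, so Int.MinValue maps to a non-negative value instead of overflowing. A Scala sketch of the behavior (not the exact source):
{code}
def abs(n: Int): Int = n & 0x7fffffff

abs(-1)           // 2147483647
abs(Int.MinValue) // 0, whereas math.abs(Int.MinValue) overflows to -2147483648
{code}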
Re: Review Request 31706: Patch for KAFKA-1997
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31706/ --- (Updated March 4, 2015, 11:42 p.m.) Review request for kafka.
Bugs: KAFKA-1997
https://issues.apache.org/jira/browse/KAFKA-1997
Repository: kafka
Description (updated)
---
Addressed Guozhang's comments. Changed the exit behavior on send failure because close(0) is not ready yet. Will submit followup patch after KAFKA-1660 is checked in. Expanded imports from _ and * to full class paths.
Diffs (updated)
-
clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java d5c79e2481d5e9a2524ac2ef6a6879f61cb7cb5f
core/src/main/scala/kafka/consumer/PartitionAssignor.scala e6ff7683a0df4a7d221e949767e57c34703d5aad
core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 5487259751ebe19f137948249aa1fd2637d2deb4
core/src/main/scala/kafka/javaapi/consumer/ConsumerRebalanceListener.java 7f45a90ba6676290172b7da54c15ee5dc1a42a2e
core/src/main/scala/kafka/tools/MirrorMaker.scala 5374280dc97dc8e01e9b3ba61fd036dc13ae48cb
core/src/test/scala/unit/kafka/consumer/PartitionAssignorTest.scala 543070f4fd3e96f3183cae9ee2ccbe843409ee58
core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala a17e8532c44aadf84b8da3a57bcc797a848b5020
Diff: https://reviews.apache.org/r/31706/diff/
Testing
---
Thanks,
Jiangjie Qin
[jira] [Commented] (KAFKA-1998) Partitions Missing From MetadataResponse
[ https://issues.apache.org/jira/browse/KAFKA-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347740#comment-14347740 ] Jay Kreps commented on KAFKA-1998: -- This seems like a bug, shouldn't we always return all the partitions and mark them as having no leader? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30126: Patch for KAFKA-1845
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30126/ --- (Updated March 4, 2015, 11:12 p.m.) Review request for kafka.
Bugs: KAFKA-1845
https://issues.apache.org/jira/browse/KAFKA-1845
Repository: kafka
Description (updated)
---
KAFKA-1845 - Fixed merge conflicts, ported added configs to KafkaConfig
KAFKA-1845 - KafkaConfig to ConfigDef: moved validateValues so it's called on instantiating KafkaConfig
KAFKA-1845 - KafkaConfig to ConfigDef: MaxConnectionsPerIpOverrides refactored
KAFKA-1845 - code review fixes, merge conflicts after rebase
KAFKA-1845 - rebase to trunk
Diffs (updated)
-
clients/src/main/java/org/apache/kafka/common/config/ConfigDef.java 852a9b39a2f9cd71d176943be86531a43ede
core/src/main/scala/kafka/Kafka.scala 77a49e12af6f869e63230162e9f87a7b0b12b610
core/src/main/scala/kafka/controller/KafkaController.scala e9b4dc62df3f139de8d4dc688e48ccc0a5513123
core/src/main/scala/kafka/controller/PartitionLeaderSelector.scala 4a31c7271c2d0a4b9e8b28be729340ecfa0696e5
core/src/main/scala/kafka/server/KafkaConfig.scala 14bf3216bae030331bdf76b3266ed0e73526c3de
core/src/main/scala/kafka/server/KafkaServer.scala 8e3def9e9edaf49c0443a2e08cf37277d0f25306
core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 6879e730282185bda3d6bc3659cb15af0672cecf
core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala 5650b4a7b950b48af3e272947bfb5e271c4238c9
core/src/test/scala/integration/kafka/api/ProducerCompressionTest.scala e63558889272bc76551accdfd554bdafde2e0dd6
core/src/test/scala/integration/kafka/api/ProducerFailureHandlingTest.scala d34ee3a40dcc8475e183435ad6842cd3d0a13ade
core/src/test/scala/integration/kafka/api/ProducerSendTest.scala 8154a4210dc8dcceb26549c98bbbc4a1282a1c67
core/src/test/scala/unit/kafka/admin/AddPartitionsTest.scala 1bf2667f47853585bc33ffb3e28256ec5f24ae84
core/src/test/scala/unit/kafka/admin/AdminTest.scala e28979827110dfbbb92fe5b152e7f1cc973de400
core/src/test/scala/unit/kafka/admin/DeleteConsumerGroupTest.scala d530338728be41282925b3a62030d1f316b4f9c5
core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala c8f336aa034ab5702c7644a669fd32c746512d29
core/src/test/scala/unit/kafka/consumer/ConsumerIteratorTest.scala c0355cc0135c6af2e346b4715659353a31723b86
core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala a17e8532c44aadf84b8da3a57bcc797a848b5020
core/src/test/scala/unit/kafka/integration/AutoOffsetResetTest.scala 95303e098d40cd790fb370e9b5a47d20860a6da3
core/src/test/scala/unit/kafka/integration/FetcherTest.scala 25845abbcad2e79f56f729e59239b738d3ddbc9d
core/src/test/scala/unit/kafka/integration/PrimitiveApiTest.scala aeb7a19acaefabcc161c2ee6144a56d9a8999a81
core/src/test/scala/unit/kafka/integration/RollingBounceTest.scala eab4b5f619015af42e4554660eafb5208e72ea33
core/src/test/scala/unit/kafka/integration/TopicMetadataTest.scala 35dc071b1056e775326981573c9618d8046e601d
core/src/test/scala/unit/kafka/integration/UncleanLeaderElectionTest.scala ba3bcdcd1de9843e75e5395dff2fc31b39a5a9d5
core/src/test/scala/unit/kafka/javaapi/consumer/ZookeeperConsumerConnectorTest.scala d6248b09bb0f86ee7d3bd0ebce5b99135491453b
core/src/test/scala/unit/kafka/log/LogTest.scala 1a4be70a21fe799d4d71dd13e84968e40fb8ad92
core/src/test/scala/unit/kafka/log4j/KafkaLog4jAppenderTest.scala 4ea0489c9fd36983fe190491a086b39413f3a9cd
core/src/test/scala/unit/kafka/metrics/MetricsTest.scala 111e4a26c1efb6f7c151ca9217dbe107c27ab617
core/src/test/scala/unit/kafka/producer/AsyncProducerTest.scala 1db6ac329f7b54e600802c8a623f80d159d4e69b
core/src/test/scala/unit/kafka/producer/ProducerTest.scala ce65dab4910d9182e6774f6ef1a7f45561ec0c23
core/src/test/scala/unit/kafka/producer/SyncProducerTest.scala d60d8e0f49443f4dc8bc2cad6e2f951eda28f5cb
core/src/test/scala/unit/kafka/server/AdvertiseBrokerTest.scala f0c4a56b61b4f081cf4bee799c6e9c523ff45e19
core/src/test/scala/unit/kafka/server/DynamicConfigChangeTest.scala ad121169a5e80ebe1d311b95b219841ed69388e2
core/src/test/scala/unit/kafka/server/HighwatermarkPersistenceTest.scala 8913fc1d59f717c6b3ed12c8362080fb5698986b
core/src/test/scala/unit/kafka/server/ISRExpirationTest.scala a703d2715048c5602635127451593903f8d20576
core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala PRE-CREATION
core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala 82dce80d553957d8b5776a9e140c346d4e07f766
core/src/test/scala/unit/kafka/server/LeaderElectionTest.scala c2ba07c5fdbaf0e65ca033b2e4d88f45a8a15b2e
core/src/test/scala/unit/kafka/server/LogOffsetTest.scala c06ee756bf0fe07e5d3c92823a476c960b37afd6
core/src/test/scala/unit/kafka/server/LogRecoveryTest.scala
[jira] [Updated] (KAFKA-1845) KafkaConfig should use ConfigDef
[ https://issues.apache.org/jira/browse/KAFKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrii Biletskyi updated KAFKA-1845: Attachment: KAFKA-1845_2015-03-05_01:12:22.patch KafkaConfig should use ConfigDef - Key: KAFKA-1845 URL: https://issues.apache.org/jira/browse/KAFKA-1845 Project: Kafka Issue Type: Sub-task Reporter: Gwen Shapira Assignee: Andrii Biletskyi Labels: newbie Fix For: 0.8.3 Attachments: KAFKA-1845.patch, KAFKA-1845_2015-02-08_17:05:22.patch, KAFKA-1845_2015-03-05_01:12:22.patch ConfigDef is already used for the new producer and for TopicConfig. Will be nice to standardize and use one configuration and validation library across the board. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
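For readers unfamiliar with ConfigDef, the style this ticket moves KafkaConfig toward looks roughly like the Scala sketch below (property names and defaults here are illustrative):
{code}
import org.apache.kafka.common.config.ConfigDef
import org.apache.kafka.common.config.ConfigDef.{Importance, Type}

// Each property is declared once with its type, default, and doc string;
// parsing and validation then come for free from ConfigDef.
val defs = new ConfigDef()
  .define("port", Type.INT, 9092, Importance.HIGH, "the port the broker listens on")
  .define("host.name", Type.STRING, "", Importance.HIGH, "the broker host name")
{code}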
[jira] [Commented] (KAFKA-1845) KafkaConfig should use ConfigDef
[ https://issues.apache.org/jira/browse/KAFKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347751#comment-14347751 ] Andrii Biletskyi commented on KAFKA-1845: - Updated reviewboard https://reviews.apache.org/r/30126/diff/ against branch origin/trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSS] KIP-6 - New reassignment partition logic for re-balancing
Todd, I think plugable design is good with solid default. The only issue I feel is when you use one and switch to another, will we end up with some unread messages hanging around and no one thinks or knows it is their responsibility to take care of them? Thanks. Tong Sent from my iPhone On Mar 5, 2015, at 10:46 AM, Todd Palino tpal...@gmail.com wrote: Apologize for the late comment on this... So fair assignment by count (taking into account the current partition count of each broker) is very good. However, it's worth noting that all partitions are not created equal. We have actually been performing more rebalance work based on the partition size on disk, as given equal retention of all topics, the size on disk is a better indicator of the amount of traffic a partition gets, both in terms of storage and network traffic. Overall, this seems to be a better balance. In addition to this, I think there is very much a need to have Kafka be rack-aware. That is, to be able to assure that for a given cluster, you never assign all replicas for a given partition in the same rack. This would allow us to guard against maintenances or power failures that affect a full rack of systems (or a given switch). I think it would make sense to implement the reassignment logic as a pluggable component. That way it would be easy to select a scheme when performing a reassignment (count, size, rack aware). Configuring a default scheme for a cluster would allow for the brokers to create new topics and partitions in compliance with the requested policy. -Todd On Thu, Jan 22, 2015 at 10:13 PM, Joe Stein joe.st...@stealth.ly wrote: I will go back through the ticket and code and write more up. Should be able to-do that sometime next week. The intention was to not replace existing functionality by issue a WARN on use. The following version it is released we could then deprecate it... I will fix the KIP for that too. On Fri, Jan 23, 2015 at 12:34 AM, Neha Narkhede n...@confluent.io wrote: Hey Joe, 1. Could you add details to the Public Interface section of the KIP? This should include the proposed changes to the partition reassignment tool. Also, maybe the new option can be named --rebalance instead of --re-balance? 2. It makes sense to list --decommission-broker as part of this KIP. Similarly, shouldn't we also have an --add-broker option? The way I see this is that there are several events when a partition reassignment is required. Before this functionality is automated on the broker, the tool will generate an ideal replica placement for each such event. The users should merely have to specify the nature of the event e.g. adding a broker or decommissioning an existing broker or merely rebalancing. 3. If I understand the KIP correctly, the upgrade plan for this feature includes removing the existing --generate option on the partition reassignment tool in 0.8.3 while adding all the new options in the same release. Is that correct? Thanks, Neha On Thu, Jan 22, 2015 at 9:23 PM, Jay Kreps jay.kr...@gmail.com wrote: Ditto on this one. Can you give the algorithm we want to implement? Also I think in terms of scope this is just proposing to change the logic in ReassignPartitionsCommand? I think we've had the discussion various times on the mailing list that what people really want is just for Kafka to do it's best to balance data in an online fashion (for some definition of balance). i.e. if you add a new node partitions would slowly migrate to it, and if a node dies, partitions slowly migrate off it. 
This could potentially be more work, but I'm not sure how much more. Has anyone thought about how to do it? -Jay On Wed, Jan 21, 2015 at 10:11 PM, Joe Stein joe.st...@stealth.ly wrote: Posted a KIP for --re-balance for partition assignment in the reassignment tool. https://cwiki.apache.org/confluence/display/KAFKA/KIP-6+-+New+reassignment+partition+logic+for+re-balancing JIRA https://issues.apache.org/jira/browse/KAFKA-1792 While going through the KIP I thought of one thing from the JIRA that we should change. We should preserve --generate as existing functionality for the next release it is in. If folks want to use --re-balance then great; it just won't break any upgrade paths, yet. Joe Stein, Founder, Principal Consultant, Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop -- Thanks, Neha
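To make the rack-aware scheme discussed in this thread concrete, here is a minimal sketch of replica placement that prefers distinct racks per partition. It is purely illustrative and is not the KIP-6 proposal or Kafka's actual assignment code; the broker list, rack map, and every name in it are invented for the example.
{code}
// Hypothetical sketch only, not the KIP-6 algorithm.
object RackAwareAssignmentSketch {
  // Assign replicationFactor replicas per partition, preferring one rack per replica.
  def assignReplicas(brokers: Seq[Int],
                     rackOf: Map[Int, String],
                     numPartitions: Int,
                     replicationFactor: Int): Map[Int, Seq[Int]] = {
    require(replicationFactor <= brokers.size)
    (0 until numPartitions).map { p =>
      // Rotate the broker list per partition so leadership is spread around.
      val shift = p % brokers.size
      val ordered = brokers.drop(shift) ++ brokers.take(shift)
      val replicas = scala.collection.mutable.ArrayBuffer[Int]()
      val usedRacks = scala.collection.mutable.Set[String]()
      // First pass: at most one replica per rack.
      for (b <- ordered if replicas.size < replicationFactor) {
        val rack = rackOf.getOrElse(b, "unknown")
        if (!usedRacks(rack)) { replicas += b; usedRacks += rack }
      }
      // Second pass: fewer racks than replicas, so fill with remaining brokers.
      for (b <- ordered if replicas.size < replicationFactor && !replicas.contains(b))
        replicas += b
      p -> replicas.toSeq
    }.toMap
  }
}
{code}
For example, with six brokers spread over racks A, B, and C and replicationFactor = 3, each partition ends up with one replica in every rack, and the per-partition rotation spreads leaders across brokers.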
Re: Review Request 31706: Patch for KAFKA-1997
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31706/ --- (Updated March 4, 2015, 11:07 p.m.) Review request for kafka. Bugs: KAFKA-1997 https://issues.apache.org/jira/browse/KAFKA-1997 Repository: kafka Description (updated) --- Addressed Guozhang's comments. Changed the exit behavior on send failure because close(0) is not ready yet. Will submit followup patch after KAFKA-1660 is checked in. Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/producer/internals/RecordAccumulator.java d5c79e2481d5e9a2524ac2ef6a6879f61cb7cb5f core/src/main/scala/kafka/consumer/PartitionAssignor.scala e6ff7683a0df4a7d221e949767e57c34703d5aad core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 5487259751ebe19f137948249aa1fd2637d2deb4 core/src/main/scala/kafka/javaapi/consumer/ConsumerRebalanceListener.java 7f45a90ba6676290172b7da54c15ee5dc1a42a2e core/src/main/scala/kafka/tools/MirrorMaker.scala 5374280dc97dc8e01e9b3ba61fd036dc13ae48cb core/src/test/scala/unit/kafka/consumer/PartitionAssignorTest.scala 543070f4fd3e96f3183cae9ee2ccbe843409ee58 core/src/test/scala/unit/kafka/consumer/ZookeeperConsumerConnectorTest.scala a17e8532c44aadf84b8da3a57bcc797a848b5020 Diff: https://reviews.apache.org/r/31706/diff/ Testing --- Thanks, Jiangjie Qin
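The "exit behavior on send failure" mentioned in the description could look roughly like the following producer callback. This is a sketch under the assumption that the new (org.apache.kafka.clients) producer is used; it is not the actual KAFKA-1997 patch.
{code}
import org.apache.kafka.clients.producer.{Callback, RecordMetadata}

// Sketch only, not the actual MirrorMaker change: fail fast when a send
// fails, since a forceful producer.close(0) was not yet available (KAFKA-1660).
val exitOnSendFailure: Callback = new Callback {
  override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
    if (exception != null) {
      // Exiting beats silently dropping the message being mirrored.
      System.exit(-1)
    }
}
{code}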
[jira] [Commented] (KAFKA-2004) Write Kafka messages directly to HDFS
[ https://issues.apache.org/jira/browse/KAFKA-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347828#comment-14347828 ] Jiangjie Qin commented on KAFKA-2004: - Is this just a question, or is it a ticket to be followed up... Write Kafka messages directly to HDFS - Key: KAFKA-2004 URL: https://issues.apache.org/jira/browse/KAFKA-2004 Project: Kafka Issue Type: Bug Components: consumer, core, producer Affects Versions: 0.8.1.1 Reporter: sutanu das Assignee: Neha Narkhede Priority: Critical 1. Is there a way to write Kafka messages directly to HDFS without writing any consumer code? 2. Is there any way to integrate Kafka with Storm or Spark so messages go directly from Kafka consumers to an HDFS sink? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
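On question 1 above: dedicated tools (e.g. LinkedIn's Camus, or Flume's Kafka source) already move Kafka data to HDFS without custom consumer code. For a sense of what a hand-rolled bridge involves, here is a rough sketch using the 0.8.x high-level consumer and the Hadoop FileSystem API; the topic, ZooKeeper address, and HDFS path are placeholders, and offset management, file rolling, and error handling are deliberately omitted.
{code}
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch only: assumes kafka core and hadoop-client on the classpath.
object KafkaToHdfsSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("zookeeper.connect", "localhost:2181") // placeholder
    props.put("group.id", "hdfs-bridge")             // placeholder
    val connector = Consumer.create(new ConsumerConfig(props))

    // One stream for the topic; payloads arrive as raw bytes.
    val stream = connector.createMessageStreams(Map("my-topic" -> 1))("my-topic").head

    val fs = FileSystem.get(new Configuration())
    val out = fs.create(new Path("/data/kafka/my-topic.log")) // placeholder path
    try {
      for (msg <- stream) {        // blocks until shutdown
        out.write(msg.message())   // write the raw payload
        out.write('\n')
      }
    } finally {
      out.close()
      connector.shutdown()
    }
  }
}
{code}
In practice you would roll files, survive consumer rebalances, and commit offsets only after a successful HDFS flush, which is exactly the bookkeeping the dedicated tools handle for you.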
Re: [VOTE] KIP-2 - Refactor brokers to allow listening on multiple ports and IPs
+1 binding On Tue, Mar 03, 2015 at 12:14:33PM -0800, Gwen Shapira wrote: Details are in the wiki: https://cwiki.apache.org/confluence/display/KAFKA/KIP-2+-+Refactor+brokers+to+allow+listening+on+multiple+ports+and+IPs
[jira] [Comment Edited] (KAFKA-2003) Add upgrade tests
[ https://issues.apache.org/jira/browse/KAFKA-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347804#comment-14347804 ] Ashish K Singh edited comment on KAFKA-2003 at 3/4/15 11:57 PM: [~gwenshap] does the following outline make sense? $ some_tool from_version m_instances to_version n_instances 1. Compile jars into from_version_target 2. Compile jars into to_version_target 3. Bring up n to_version brokers 4. Run producer/consumer tests with clients from from_version_target 5. Bring up m from_version brokers 6. Run producer/consumer tests with clients from from_version_target 7. Run producer/consumer tests with clients from to_version_target 8. {noformat} foreach(broker: brokers) Bump up the interbroker.protocol.version Run producer/consumer tests with clients from from_version_target Run producer/consumer tests with clients from to_version_target {noformat} 9. Cleanup Add upgrade tests - Key: KAFKA-2003 URL: https://issues.apache.org/jira/browse/KAFKA-2003 Project: Kafka Issue Type: Improvement Reporter: Gwen Shapira Assignee: Ashish K Singh To test protocol changes, compatibility, and the upgrade process, we need a good way to test different versions of the product together and to test the end-to-end upgrade process. For example, for the 0.8.2 to 0.8.3 test we want to check: * Can we start a cluster with a mix of 0.8.2 and 0.8.3 brokers? * Can a cluster of 0.8.3 brokers bump the protocol level one broker at a time? * Can 0.8.2 clients run against a cluster of 0.8.3 brokers? There are probably more questions, but an automated framework that can test those and report results will be a good start. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
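A rough sketch of how step 8's rolling bump could be driven follows. Everything here (broker IDs, the helper scripts, the test hook) is hypothetical and does not exist in Kafka itself; it is a placeholder for whatever the eventual test framework provides.
{code}
import scala.sys.process._

// Hypothetical driver for the rolling-bump step above; not Kafka tooling.
object RollingBumpSketch {
  val brokerIds = Seq(0, 1, 2)

  def runClientTests(targetDir: String): Int =
    Seq(s"$targetDir/bin/run-tests.sh").!       // placeholder test hook

  def main(args: Array[String]): Unit =
    for (id <- brokerIds) {
      // Raise the inter-broker protocol version in this broker's config,
      // then bounce only this broker (both scripts are hypothetical).
      Seq("./set-protocol-version.sh", id.toString, "0.8.3").!
      Seq("./restart-broker.sh", id.toString).!
      require(runClientTests("from_version_target") == 0, s"old clients failed after bumping broker $id")
      require(runClientTests("to_version_target") == 0, s"new clients failed after bumping broker $id")
    }
}
{code}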
Re: [kafka-clients] Re: [VOTE] 0.8.2.1 Candidate 2
+1. Verified quick start, unit tests. On Tue, Mar 3, 2015 at 12:09 PM, Joe Stein joe.st...@stealth.ly wrote: Ok, let's fix the transient test failure on trunk; agreed, it's not a blocker. +1 quick start passed, verified artifacts, updates in scala https://github.com/stealthly/scala-kafka/tree/0.8.2.1 and go https://github.com/stealthly/go_kafka_client/tree/0.8.2.1 look good ~ Joe Stein, http://www.stealth.ly On Tue, Mar 3, 2015 at 12:30 PM, Jun Rao j...@confluent.io wrote: Hi, Joe, Yes, that unit test does have transient failures from time to time. The issue seems to be with the unit test itself and not the actual code. So, this is not a blocker for the 0.8.2.1 release. I think we can just fix it in trunk. Thanks, Jun On Tue, Mar 3, 2015 at 9:08 AM, Joe Stein joe.st...@stealth.ly wrote: Jun, most everything looks good, except I keep getting test failures from: wget https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/kafka-0.8.2.1-src.tgz tar -xvf kafka-0.8.2.1-src.tgz cd kafka-0.8.2.1-src gradle ./gradlew test kafka.api.ProducerFailureHandlingTest testNotEnoughReplicasAfterBrokerShutdown FAILED org.scalatest.junit.JUnitTestFailedError: Expected NotEnoughReplicasException when producing to topic with fewer brokers than min.insync.replicas at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:101) at org.scalatest.junit.JUnit3Suite.newAssertionFailedException(JUnit3Suite.scala:149) at org.scalatest.Assertions$class.fail(Assertions.scala:711) at org.scalatest.junit.JUnit3Suite.fail(JUnit3Suite.scala:149) at kafka.api.ProducerFailureHandlingTest.testNotEnoughReplicasAfterBrokerShutdown(ProducerFailureHandlingTest.scala:355) This happens to me all the time on a few different machines. ~ Joe Stein, http://www.stealth.ly On Mon, Mar 2, 2015 at 7:36 PM, Jun Rao j...@confluent.io wrote: +1 from me. Verified quickstart and unit tests. Thanks, Jun On Thu, Feb 26, 2015 at 2:59 PM, Jun Rao j...@confluent.io wrote: This is the second candidate for release of Apache Kafka 0.8.2.1. This fixes 4 critical issues in 0.8.2.0. Release Notes for the 0.8.2.1 release: https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/RELEASE_NOTES.html *** Please download, test and vote by Monday, Mar 2, 3pm PT Kafka's KEYS file containing PGP keys we use to sign the release: http://kafka.apache.org/KEYS in addition to the md5, sha1 and sha2 (SHA256) checksums. * Release artifacts to be voted upon (source and binary): https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/ * Maven artifacts to be voted upon prior to release: https://repository.apache.org/content/groups/staging/ * scala-doc https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/scaladoc/ * java-doc https://people.apache.org/~junrao/kafka-0.8.2.1-candidate2/javadoc/ * The tag to be voted upon (off the 0.8.2 branch) is the 0.8.2.1 tag https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=bd1bfb63ec73c10d08432ac893a23f28281ea021 (git commit ee1267b127f3081db491fa1bf9a287084c324e36) Thanks, Jun
-- Thanks, Neha
Re: [VOTE] KIP-2 - Refactor brokers to allow listening on multiple ports and IPs
Thanks guys! We have 4 binding +1s, 2 non-binding +1s and no -1s. I'll update the wiki to note that this proposal is accepted. If you want to comment on the implementation, feel free to take a look at KAFKA-1809 :) Gwen On Wed, Mar 4, 2015 at 6:31 PM, Joel Koshy jjkosh...@gmail.com wrote: +1 binding On Tue, Mar 03, 2015 at 12:14:33PM -0800, Gwen Shapira wrote: Details are in the wiki: https://cwiki.apache.org/confluence/display/KAFKA/KIP-2+-+Refactor+brokers+to+allow+listening+on+multiple+ports+and+IPs
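For readers who have not followed the KIP: the accepted design lets one broker expose several endpoints, each tagged with a security protocol. A representative configuration fragment (hosts and ports below are examples; the key names are those proposed in the KIP wiki) would be:
{code}
# One PLAINTEXT and one SSL endpoint on the same broker (example values).
listeners=PLAINTEXT://broker1.example.com:9092,SSL://broker1.example.com:9093
# Endpoints advertised to clients may differ from the bind addresses.
advertised.listeners=PLAINTEXT://broker1.example.com:9092,SSL://broker1.example.com:9093
{code}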
[jira] [Commented] (KAFKA-1451) Broker stuck due to leader election race
[ https://issues.apache.org/jira/browse/KAFKA-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346728#comment-14346728 ] Aldrin Seychell commented on KAFKA-1451: I just encountered this issue on version 0.8.2.0 after a period of slow PC performance, perhaps because zookeeper and kafka were slow to communicate with each other (possibly the same issue highlighted by [~sinewy]). This resulted in an infinite loop attempting to create the ephemeral node. The logs that were continuously being written are as follows: [2015-03-03 13:48:19,831] INFO conflict in /brokers/ids/0 data: {jmx_port:-1,timestamp:1425386833617,host:MTDKP119.ix.com,version:1,port:9092} stored data: {jmx_port:-1,timestamp:1425380575230,host:MTDKP119.ix.com,version:1,port:9092} (kafka.utils.ZkUtils$) [2015-03-03 13:48:19,832] INFO I wrote this conflicted ephemeral node [{jmx_port:-1,timestamp:1425386833617,host:MTDKP119.ix.com,version:1,port:9092}] at /brokers/ids/0 a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) [2015-03-03 13:48:25,844] INFO conflict in /brokers/ids/0 data: {jmx_port:-1,timestamp:1425386833617,host:MTDKP119.ix.com,version:1,port:9092} stored data: {jmx_port:-1,timestamp:1425380575230,host:MTDKP119.ix.com,version:1,port:9092} (kafka.utils.ZkUtils$) Broker stuck due to leader election race - Key: KAFKA-1451 URL: https://issues.apache.org/jira/browse/KAFKA-1451 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.1.1 Reporter: Maciek Makowski Assignee: Manikumar Reddy Priority: Minor Labels: newbie Fix For: 0.8.2.0 Attachments: KAFKA-1451.patch, KAFKA-1451_2014-07-28_20:27:32.patch, KAFKA-1451_2014-07-29_10:13:23.patch h3. Symptoms The broker does not become available due to being stuck in an infinite loop while electing a leader. This can be recognised by the following line being repeatedly written to server.log: {code} [2014-05-14 04:35:09,187] INFO I wrote this conflicted ephemeral node [{version:1,brokerid:1,timestamp:1400060079108}] at /controller a while back in a different session, hence I will backoff for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$) {code} h3. Steps to Reproduce In a single kafka 0.8.1.1 node, single zookeeper 3.4.6 node setup (it will likely behave the same with the ZK version included in the Kafka distribution): # start both zookeeper and kafka (in any order) # stop zookeeper # stop kafka # start kafka # start zookeeper h3. Likely Cause {{ZookeeperLeaderElector}} subscribes to data changes on startup, and then triggers an election.
If the deletion of the ephemeral {{/controller}} node associated with the broker's previous zookeeper session happens after the subscription to changes in the new session, the election will be invoked twice, once from {{startup}} and once from {{handleDataDeleted}}: * {{startup}}: acquire {{controllerLock}} * {{startup}}: subscribe to data changes * zookeeper: delete {{/controller}} since the session that created it timed out * {{handleDataDeleted}}: {{/controller}} was deleted * {{handleDataDeleted}}: wait on {{controllerLock}} * {{startup}}: elect, which writes {{/controller}} * {{startup}}: release {{controllerLock}} * {{handleDataDeleted}}: acquire {{controllerLock}} * {{handleDataDeleted}}: elect, which attempts to write {{/controller}} and then gets into an infinite loop as a result of the conflict. {{createEphemeralPathExpectConflictHandleZKBug}} assumes that the existing znode was written from a different session, which is not true in this case; it was written from the same session. That adds to the confusion. h3. Suggested Fix In {{ZookeeperLeaderElector.startup}} first run {{elect}} and then subscribe to data changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
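The suggested fix is just a reordering inside {{startup}}; in sketch form (names follow the description above and may not match the actual ZookeeperLeaderElector source):
{code}
// Sketch only: elect before subscribing, so a late deletion of the stale
// /controller znode cannot trigger a second, conflicting election from
// the listener while startup still holds the lock.
def startup() {
  inLock(controllerLock) {
    elect                                                // may write /controller
    zkClient.subscribeDataChanges(electionPath, leaderChangeListener)
  }
}
{code}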
[jira] [Created] (KAFKA-2002) It does not work when kafka_mx4jenable is false
Yaguo Zhou created KAFKA-2002: - Summary: It does not work when kafka_mx4jenable is false Key: KAFKA-2002 URL: https://issues.apache.org/jira/browse/KAFKA-2002 Project: Kafka Issue Type: Bug Components: core Reporter: Yaguo Zhou Priority: Minor Should return false immediately when kafka_mx4jenable is false -- This message was sent by Atlassian JIRA (v6.3.4#6332)
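The requested behavior is a simple early return. As a sketch (the real kafka.utils.Mx4jLoader differs in detail; the property handling below is invented for illustration):
{code}
// Sketch only: bail out before loading MX4J when the kafka_mx4jenable
// system property is absent or set to false. The property name comes from
// the report; the surrounding code is illustrative.
def maybeLoad(): Boolean = {
  val props = System.getProperties
  if (!java.lang.Boolean.parseBoolean(props.getProperty("kafka_mx4jenable", "false")))
    return false
  // ... reflectively load the MX4J HTTP adaptor and register MBeans ...
  true
}
{code}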