[jira] [Created] (KAFKA-2349) `contributing` website page should link to Contributing Code Changes wiki page
Ismael Juma created KAFKA-2349:
---

Summary: `contributing` website page should link to Contributing Code Changes wiki page
Key: KAFKA-2349
URL: https://issues.apache.org/jira/browse/KAFKA-2349
Project: Kafka
Issue Type: Task
Reporter: Ismael Juma
Assignee: Ismael Juma

This should be merged at the same time as https://issues.apache.org/jira/browse/KAFKA-2321 and only after a vote takes place in the mailing list.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2349) `contributing` website page should link to Contributing Code Changes wiki page
[ https://issues.apache.org/jira/browse/KAFKA-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-2349:
---

Status: Patch Available (was: Open)
Key: KAFKA-2349
URL: https://issues.apache.org/jira/browse/KAFKA-2349
Attachments: KAFKA-2349.patch
Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?
On Mon, Jul 13, 2015 at 6:01 PM, Guozhang Wang wangg...@gmail.com wrote:

> changing the statement in wiki that you could create a PR with [KAFKA-xxxx] or [MINOR], [HOTFIX], etc

I went with this for now:

"The PR title should usually be of the form [KAFKA-xxxx] Title, where KAFKA-xxxx is the relevant JIRA id and Title may be the JIRA's title or a more specific title describing the PR itself. For trivial cases where a JIRA is not required (see the JIRA section for more details), [MINOR] or [HOTFIX] can be used as the PR title prefix."

We can always change it if it doesn't work well.

Ismael
Re: [DISCUSS] JIRA issue required even for minor/hotfix pull requests?
On Mon, Jul 20, 2015 at 10:24 AM, Ismael Juma ism...@juma.me.uk wrote:

> I went with this for now:

Actually, I changed it to the following to match our existing commit prefix convention (instead of Spark's):

"The PR title should usually be of the form KAFKA-xxxx; Title, where KAFKA-xxxx is the relevant JIRA id and Title may be the JIRA's title or a more specific title describing the PR itself. For trivial cases where a JIRA is not required (see the JIRA section for more details), MINOR; or HOTFIX; can be used as the PR title prefix."

The script already works this way; I had just forgotten to update the documentation to match.

Ismael
[jira] [Created] (KAFKA-2348) Drop support for Scala 2.9
Ismael Juma created KAFKA-2348:
---

Summary: Drop support for Scala 2.9
Key: KAFKA-2348
URL: https://issues.apache.org/jira/browse/KAFKA-2348
Project: Kafka
Issue Type: Task
Reporter: Ismael Juma
Assignee: Ismael Juma

Summary of why we should drop Scala 2.9:

* It doubles the number of builds required from 2 to 4 (2.9.1 and 2.9.2 are not binary compatible).
* Code that doesn't build with Scala 2.9 was committed to trunk weeks ago and no-one seems to have noticed or cared (well, I filed https://issues.apache.org/jira/browse/KAFKA-2325). Can we really support a version if we don't test it?
* The new clients library is written in Java and won't be affected. It has also received a lot of work and is much improved since the last release.
* Scala 2.9 was released 4 years ago, has been unsupported for a long time, and most projects have dropped support for it (for example, we use a different version of ScalaTest for Scala 2.9).
* Scala 2.10 introduced Futures and a few useful features like string interpolation and value classes.
* Scala 2.9 doesn't work with Java 8 (https://issues.apache.org/jira/browse/KAFKA-2203).

Vote thread: http://search-hadoop.com/m/uyzND1DIE422mz94I1
[jira] [Updated] (KAFKA-2349) `contributing` website page should link to Contributing Code Changes wiki page
[ https://issues.apache.org/jira/browse/KAFKA-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-2349:
---

Attachment: KAFKA-2349.patch

Links to Contributing Code Changes, adds a section on contributing changes to the website and includes a few clean-ups.
[jira] [Commented] (KAFKA-2348) Drop support for Scala 2.9
[ https://issues.apache.org/jira/browse/KAFKA-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633397#comment-14633397 ]

ASF GitHub Bot commented on KAFKA-2348:
---

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/87

    KAFKA-2348; Drop support for Scala 2.9

    `testAll` passed locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka kafka-2348-drop-support-for-scala-2.9

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/87.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #87

commit 00ac57ac12ce56d06311845916cae45a9db48d5e
Author: Ismael Juma ism...@juma.me.uk
Date: 2015-07-18T14:57:16Z

    KAFKA-2348; Drop support for Scala 2.9
[GitHub] kafka pull request: KAFKA-2348; Drop support for Scala 2.9
GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/87

    KAFKA-2348; Drop support for Scala 2.9

    `testAll` passed locally.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka kafka-2348-drop-support-for-scala-2.9

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/87.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #87

commit 00ac57ac12ce56d06311845916cae45a9db48d5e
Author: Ismael Juma ism...@juma.me.uk
Date: 2015-07-18T14:57:16Z

    KAFKA-2348; Drop support for Scala 2.9

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
Re: [VOTE] Drop support for Scala 2.9 for the next release
Thank you for voting. 72 hours have passed and the vote passed with 5 binding +1s and 5 non-binding +1s. I filed https://issues.apache.org/jira/browse/KAFKA-2348 and created a pull request with the change.

Best,
Ismael

On Fri, Jul 17, 2015 at 11:26 AM, Ismael Juma ism...@juma.me.uk wrote:

> Hi all,
>
> I would like to start a vote on dropping support for Scala 2.9 for the next release. People seemed to be in favour of the idea in previous discussions:
>
> * http://search-hadoop.com/m/uyzND1uIW3k2fZVfU1
> * http://search-hadoop.com/m/uyzND1KMLNK11Rmo72
>
> Summary of why we should drop Scala 2.9:
>
> * It doubles the number of builds required from 2 to 4 (2.9.1 and 2.9.2 are not binary compatible).
> * Code that doesn't build with Scala 2.9 was committed to trunk weeks ago and no-one seems to have noticed or cared (well, I filed https://issues.apache.org/jira/browse/KAFKA-2325). Can we really support a version if we don't test it?
> * The new clients library is written in Java and won't be affected. It has also received a lot of work and is much improved since the last release.
> * Scala 2.9 was released 4 years ago, has been unsupported for a long time, and most projects have dropped support for it (for example, we use a different version of ScalaTest for Scala 2.9).
> * Scala 2.10 introduced Futures and a few useful features like string interpolation and value classes.
> * Scala 2.9 doesn't work with Java 8 (https://issues.apache.org/jira/browse/KAFKA-2203).
>
> The reason not to drop it is to maintain compatibility for people stuck on 2.9 who also want to upgrade both client and broker to the next Kafka release.
>
> The vote will run for 72 hours. +1 (non-binding) from me.
>
> Best,
> Ismael
[jira] [Updated] (KAFKA-2348) Drop support for Scala 2.9
[ https://issues.apache.org/jira/browse/KAFKA-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-2348:
---

Status: Patch Available (was: Open)

`testAll` passed.
[GitHub] kafka pull request: fixed typo
Github user mosch closed the pull request at:

    https://github.com/apache/kafka/pull/17
[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633579#comment-14633579 ]

Sriharsha Chintalapani commented on KAFKA-1690:
---

Updated reviewboard https://reviews.apache.org/r/33620/diff/ against branch origin/trunk

new java producer needs ssl support as a client

Key: KAFKA-1690
URL: https://issues.apache.org/jira/browse/KAFKA-1690
Project: Kafka
Issue Type: Sub-task
Reporter: Joe Stein
Assignee: Sriharsha Chintalapani
Fix For: 0.8.3
Attachments: KAFKA-1690.patch, KAFKA-1690.patch, KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch
[jira] [Updated] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sriharsha Chintalapani updated KAFKA-1690:
---

Attachment: KAFKA-1690_2015-07-20_06:10:42.patch
Re: Review Request 33620: Patch for KAFKA-1690
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33620/

(Updated July 20, 2015, 1:10 p.m.)

Review request for kafka.

Bugs: KAFKA-1690
    https://issues.apache.org/jira/browse/KAFKA-1690

Repository: kafka

Description (updated)
---

KAFKA-1690. new java producer needs ssl support as a client.
KAFKA-1690. new java producer needs ssl support as a client.
KAFKA-1690. new java producer needs ssl support as a client.
KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests.
KAFKA-1690. new java producer needs ssl support as a client. Added PrincipalBuilder.
KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.
KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.
KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews.
KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues with the patch.
KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues with the patch.
KAFKA-1690. new java producer needs ssl support as a client.
KAFKA-1690. new java producer needs ssl support as a client.
Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1
KAFKA-1690. Broker side ssl changes.
KAFKA-1684. SSL for socketServer.
KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL.
Merge branch 'trunk' into KAFKA-1690-V1
KAFKA-1690. Post merge fixes.
KAFKA-1690. Added SSLProducerSendTest.
KAFKA-1690. Minor fixes based on patch review comments.
Merge commit
KAFKA-1690. Added SSL Consumer Test.
KAFKA-1690. SSL Support.
KAFKA-1690. Addressing reviews.
Merge branch 'trunk' into KAFKA-1690-V1 Merge branch 'trunk' into KAFKA-1690-V1 Diffs (updated) - build.gradle fb9084307ae41bbb62e32720ccd7b94f26e910c8 checkstyle/import-control.xml 19e0659ef9385433d9f94dee43cd70a52b18c9e5 clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 70377ae2fa46deb381139d28590ce6d4115e1adc clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java aa264202f2724907924985a5ecbe74afc4c6c04b clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java bae528d31516679bed88ee61b408f209f185a8cc clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/Authenticator.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java df0e6d5105ca97b7e1cb4d334ffb7b443506bd0b clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/KafkaChannel.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 3ca0098b8ec8cfdf81158465b2d40afc47eb6f80 clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java PRE-CREATION 
clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/Selectable.java 618a0fa53848ae6befea7eba39c2f3285b734494 clients/src/main/java/org/apache/kafka/common/network/Selector.java aaf60c98c2c0f4513a8d65ee0db67953a529d598 clients/src/main/java/org/apache/kafka/common/network/Send.java 8f6daadf6b67c3414911cda77765512131e56fd3 clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java dab1a94dd29563688b6ecf4eeb0e180b06049d3f clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/security/auth/KafkaPrincipal.java PRE-CREATION
Failing kafka-trunk-git-pr builds now fixed
Hi,

All GitHub pull request builds were failing after we had a few successful ones. This should now be fixed and the same issue should not happen again. See the following for details: https://issues.apache.org/jira/browse/BUILDS-99

Best,
Ismael
[GitHub] kafka pull request: Trunk
Github user abayer closed the pull request at:

    https://github.com/apache/kafka/pull/42
[GitHub] kafka pull request: KAFKA-294
Github user fsaintjacques closed the pull request at:

    https://github.com/apache/kafka/pull/2
[jira] [Commented] (KAFKA-294) Path length must be 0 error during startup
[ https://issues.apache.org/jira/browse/KAFKA-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633653#comment-14633653 ]

ASF GitHub Bot commented on KAFKA-294:
---

Github user fsaintjacques closed the pull request at:

    https://github.com/apache/kafka/pull/2

Path length must be 0 error during startup

Key: KAFKA-294
URL: https://issues.apache.org/jira/browse/KAFKA-294
Project: Kafka
Issue Type: Bug
Reporter: Thomas Dudziak
Fix For: 0.8.2.0

When starting Kafka 0.7.0 using zkclient-0.1.jar, I get this error:

    INFO 2012-03-06 02:39:04,072 main kafka.server.KafkaZooKeeper Registering broker /brokers/ids/1
    FATAL 2012-03-06 02:39:04,111 main kafka.server.KafkaServer Fatal error during startup.
    java.lang.IllegalArgumentException: Path length must be 0
        at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:48)
        at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:35)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:620)
        at org.I0Itec.zkclient.ZkConnection.create(ZkConnection.java:87)
        at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:308)
        at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:304)
        at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
        at org.I0Itec.zkclient.ZkClient.create(ZkClient.java:304)
        at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:213)
        at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
        at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
        at kafka.utils.ZkUtils$.createParentPath(ZkUtils.scala:48)
        at kafka.utils.ZkUtils$.createEphemeralPath(ZkUtils.scala:60)
        at kafka.utils.ZkUtils$.createEphemeralPathExpectConflict(ZkUtils.scala:72)
        at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:57)
        at kafka.log.LogManager.startup(LogManager.scala:124)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:80)
        at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:47)
        at kafka.Kafka$.main(Kafka.scala:60)
        at kafka.Kafka.main(Kafka.scala)

The problem seems to be this code in ZkClient's createPersistent method:

    String parentDir = path.substring(0, path.lastIndexOf('/'));
    createPersistent(parentDir, createParents);
    createPersistent(path, createParents);

which doesn't check whether parentDir is an empty string, which it will become for /brokers/ids/1 after two recursions.
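The recursion described in the report can be reproduced in isolation. The sketch below uses a hypothetical `parentDir` helper standing in for the substring logic inside ZkClient's createPersistent (it is not the actual ZkClient source); it shows the parent of /brokers/ids/1 degenerating to the empty string after two steps:

```java
public class ParentPathDemo {
    // Hypothetical stand-in for the parent computation inside
    // ZkClient.createPersistent quoted in the bug report.
    static String parentDir(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        String path = "/brokers/ids/1";
        // Walk up the hierarchy the way the recursive createPersistent does.
        while (path.indexOf('/') >= 0) {
            path = parentDir(path);
            System.out.println("create parent: \"" + path + "\"");
        }
        // The walk ends with create parent: "" -- the empty path that
        // ZooKeeper's validation rejects with "Path length must be 0".
        // A guard such as `if (parentDir.isEmpty()) return;` before
        // recursing would avoid the error.
    }
}
```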
Re: Failing kafka-trunk-git-pr builds now fixed
Hello Ismael,

Can you please trigger the build for all of the currently open pull requests? E.g. on my PR https://github.com/apache/kafka/pull/85 the last automatically added comment says the build failed, while it should have been a success, since only javadoc changes are included in the PR.

Kind regards,
Stevo Slavic.

On Mon, Jul 20, 2015 at 4:34 PM, Ismael Juma ism...@juma.me.uk wrote:

> Hi,
>
> All GitHub pull request builds were failing after we had a few successful ones. This should now be fixed and the same issue should not happen again. See the following for details: https://issues.apache.org/jira/browse/BUILDS-99
>
> Best,
> Ismael
Re: Review Request 36570: Patch for KAFKA-2337
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92282

Ship it!

- Edward Ribeiro

On July 20, 2015, 5:37 p.m., Grant Henke wrote:

> This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/
>
> (Updated July 20, 2015, 5:37 p.m.)
>
> Review request for kafka.
>
> Bugs: KAFKA-2337
>     https://issues.apache.org/jira/browse/KAFKA-2337
>
> Repository: kafka
>
> Description
> ---
>
> KAFKA-2337: Verify that metric names will not collide when creating new topics
>
> Diffs
> ---
>
> core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b
> core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6
> core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805
> core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8
> core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961
>
> Diff: https://reviews.apache.org/r/36570/diff/
>
> Thanks,
> Grant Henke
Re: Review Request 36590: Patch for KAFKA-2275
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92286

clients/src/main/java/org/apache/kafka/clients/Metadata.java (line 52)
https://reviews.apache.org/r/36590/#comment146348

    It's a best practice to cluster fields together at the beginning of the class, so we had better move this to L#43.

- Edward Ribeiro

On July 20, 2015, 5:44 p.m., Ashish Singh wrote:

> This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/
>
> (Updated July 20, 2015, 5:44 p.m.)
>
> Review request for kafka.
>
> Bugs: KAFKA-2275
>     https://issues.apache.org/jira/browse/KAFKA-2275
>
> Repository: kafka
>
> Description
> ---
>
> Add logic to get all topics when needMetadataForAllTopics is set on metadata
> Return metadata for all topics if an empty list is passed to partitionsFor
> KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... topics) API to the new consumer
>
> Diffs
> ---
>
> clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779
> clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b
> clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a
> clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba
> clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6
> clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8
> clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e
> core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca
>
> Diff: https://reviews.apache.org/r/36590/diff/
>
> Thanks,
> Ashish Singh
Re: Review Request 36590: Patch for KAFKA-2275
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92293

clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java (line 184)
https://reviews.apache.org/r/36590/#comment146356

    Same here, regarding diamond operators:

    Map<String, List<PartitionInfo>> map = new HashMap<>();

- Edward Ribeiro
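For readers unfamiliar with the diamond operator the reviewer refers to, a minimal self-contained sketch (Java 7+), using Integer as a stand-in for Kafka's PartitionInfo type:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiamondDemo {
    public static void main(String[] args) {
        // Pre-Java 7 style: type arguments repeated on the right-hand side.
        Map<String, List<Integer>> verbose = new HashMap<String, List<Integer>>();
        // Diamond operator: the compiler infers the type arguments,
        // unlike a raw `new HashMap()`, which triggers an unchecked warning.
        Map<String, List<Integer>> concise = new HashMap<>();
        concise.put("topic-0", new ArrayList<>());
        System.out.println(concise.containsKey("topic-0")); // prints "true"
    }
}
```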
Re: Review Request 36570: Patch for KAFKA-2337
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92296

Ship it!

- Ashish Singh
Re: Review Request 36590: Patch for KAFKA-2275
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92298

clients/src/main/java/org/apache/kafka/common/Cluster.java (line 194)
https://reviews.apache.org/r/36590/#comment146362

    This for-loop is unnecessary, as we are not doing any processing on PartitionInfo inside the loop. The for-loop can be replaced by:

    partitionInfos.addAll(partitionInfo);

- Edward Ribeiro
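The reviewer's suggested replacement behaves identically to an element-by-element loop. A minimal sketch, with Integer standing in for PartitionInfo:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AddAllDemo {
    public static void main(String[] args) {
        List<Integer> partitions = Arrays.asList(0, 1, 2);

        // Element-by-element copy with a for-loop.
        List<Integer> viaLoop = new ArrayList<>();
        for (Integer p : partitions) {
            viaLoop.add(p);
        }

        // Equivalent single call, as suggested in the review.
        List<Integer> viaAddAll = new ArrayList<>();
        viaAddAll.addAll(partitions);

        System.out.println(viaLoop.equals(viaAddAll)); // prints "true"
    }
}
```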
Re: Review Request 36590: Patch for KAFKA-2275
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92301

clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 414)
https://reviews.apache.org/r/36590/#comment146372

    Totally unrelated to this issue, but worth mentioning (imho) as the changes eventually touch this file: wouldn't it be safer to make `closed` a volatile variable too?

- Edward Ribeiro
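The visibility concern behind this suggestion can be sketched with a hypothetical Poller class (not the KafkaConsumer code): without volatile, a `closed` flag written by one thread is not guaranteed to become visible to another thread's loop; volatile establishes that visibility.

```java
public class VolatileCloseDemo {
    static class Poller implements Runnable {
        // volatile: the writer's update to `closed` is guaranteed to be
        // visible to the polling thread without extra synchronization.
        private volatile boolean closed = false;

        void close() { closed = true; }

        @Override
        public void run() {
            while (!closed) {
                Thread.yield(); // stand-in for the poll-loop body
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Poller poller = new Poller();
        Thread t = new Thread(poller);
        t.start();
        poller.close();  // signal shutdown from another thread
        t.join(5000);    // the loop observes the volatile write and exits
        System.out.println(t.isAlive() ? "still running" : "stopped");
    }
}
```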
Re: Review Request 33620: Patch for KAFKA-1690
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/33620/ --- (Updated July 20, 2015, 7 p.m.) Review request for kafka. Bugs: KAFKA-1690 https://issues.apache.org/jira/browse/KAFKA-1690 Repository: kafka Description (updated) --- KAFKA-1690. new java producer needs ssl support as a client. KAFKA-1690. new java producer needs ssl support as a client. KAFKA-1690. new java producer needs ssl support as a client. KAFKA-1690. new java producer needs ssl support as a client. SSLFactory tests. KAFKA-1690. new java producer needs ssl support as a client. Added PrincipalBuilder. KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews. KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews. KAFKA-1690. new java producer needs ssl support as a client. Addressing reviews. KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues with the patch. KAFKA-1690. new java producer needs ssl support as a client. Fixed minor issues with the patch. KAFKA-1690. new java producer needs ssl support as a client. KAFKA-1690. new java producer needs ssl support as a client. Merge remote-tracking branch 'refs/remotes/origin/trunk' into KAFKA-1690-V1 KAFKA-1690. Broker side ssl changes. KAFKA-1684. SSL for socketServer. KAFKA-1690. Added SSLProducerSendTest and fixes to get right port for SSL. Merge branch 'trunk' into KAFKA-1690-V1 KAFKA-1690. Post merge fixes. KAFKA-1690. Added SSLProducerSendTest. KAFKA-1690. Minor fixes based on patch review comments. Merge commit KAFKA-1690. Added SSL Consumer Test. KAFKA-1690. SSL Support. KAFKA-1690. Addressing reviews. Merge branch 'trunk' into KAFKA-1690-V1 Merge branch 'trunk' into KAFKA-1690-V1 KAFKA-1690. Addressing reviews. Removed interestOps from SSLTransportLayer. KAFKA-1690. Addressing reviews. 
Diffs (updated) - build.gradle fb9084307ae41bbb62e32720ccd7b94f26e910c8 checkstyle/import-control.xml 19e0659ef9385433d9f94dee43cd70a52b18c9e5 clients/src/main/java/org/apache/kafka/clients/ClientUtils.java 0d68bf1e1e90fe9d5d4397ddf817b9a9af8d9f7a clients/src/main/java/org/apache/kafka/clients/CommonClientConfigs.java 2c421f42ed3fc5d61cf9c87a7eaa7bb23e26f63b clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java 70377ae2fa46deb381139d28590ce6d4115e1adc clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java 03b8dd23df63a8d8a117f02eabcce4a2d48c44f7 clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java aa264202f2724907924985a5ecbe74afc4c6c04b clients/src/main/java/org/apache/kafka/common/config/AbstractConfig.java bae528d31516679bed88ee61b408f209f185a8cc clients/src/main/java/org/apache/kafka/common/config/SSLConfigs.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/Authenticator.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/ByteBufferSend.java df0e6d5105ca97b7e1cb4d334ffb7b443506bd0b clients/src/main/java/org/apache/kafka/common/network/ChannelBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/DefaultAuthenticator.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/KafkaChannel.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/NetworkReceive.java 3ca0098b8ec8cfdf81158465b2d40afc47eb6f80 clients/src/main/java/org/apache/kafka/common/network/PlainTextChannelBuilder.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/PlainTextTransportLayer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/SSLChannelBuilder.java PRE-CREATION 
clients/src/main/java/org/apache/kafka/common/network/SSLFactory.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/SSLTransportLayer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/network/Selectable.java 618a0fa53848ae6befea7eba39c2f3285b734494 clients/src/main/java/org/apache/kafka/common/network/Selector.java aaf60c98c2c0f4513a8d65ee0db67953a529d598 clients/src/main/java/org/apache/kafka/common/network/Send.java 8f6daadf6b67c3414911cda77765512131e56fd3 clients/src/main/java/org/apache/kafka/common/network/TransportLayer.java PRE-CREATION clients/src/main/java/org/apache/kafka/common/protocol/SecurityProtocol.java dab1a94dd29563688b6ecf4eeb0e180b06049d3f clients/src/main/java/org/apache/kafka/common/security/auth/DefaultPrincipalBuilder.java PRE-CREATION
[jira] [Commented] (KAFKA-2169) Upgrade to zkclient-0.5
[ https://issues.apache.org/jira/browse/KAFKA-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633766#comment-14633766 ] ASF GitHub Bot commented on KAFKA-2169: --- Github user Parth-Brahmbhatt closed the pull request at: https://github.com/apache/kafka/pull/61 Upgrade to zkclient-0.5 --- Key: KAFKA-2169 URL: https://issues.apache.org/jira/browse/KAFKA-2169 Project: Kafka Issue Type: Bug Affects Versions: 0.8.2.0 Reporter: Neha Narkhede Assignee: Parth Brahmbhatt Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-2169.patch, KAFKA-2169.patch, KAFKA-2169_2015-05-11_13:52:57.patch, KAFKA-2169_2015-05-15_10:18:41.patch zkclient-0.5 is released http://mvnrepository.com/artifact/com.101tec/zkclient/0.5 and has the fix for KAFKA-824 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92288 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1042) https://reviews.apache.org/r/36590/#comment146351 With the Java 7 diamond operator this line can be simplified as: Map<String, List<PartitionInfo>> topicAndPartitionInfoMap = new HashMap<>(); - Edward Ribeiro
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92290 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1066) https://reviews.apache.org/r/36590/#comment146353 why declare this method variable as final? - Edward Ribeiro
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92291 --- clients/src/main/java/org/apache/kafka/common/Cluster.java (line 189) https://reviews.apache.org/r/36590/#comment146354 Same here, use the diamond operator: Set<PartitionInfo> partitionInfos = new HashSet<>(); - Edward Ribeiro
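For readers unfamiliar with the Java 7 feature these reviews mention, here is a standalone illustration of the diamond operator. String stands in for Kafka's PartitionInfo, which is not assumed here:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DiamondDemo {
    public static void main(String[] args) {
        // Pre-Java-7 style: type arguments repeated on both sides.
        Map<String, List<String>> verbose = new HashMap<String, List<String>>();

        // Java 7 diamond operator: the compiler infers the type arguments
        // on the right-hand side from the declared type on the left.
        Map<String, List<String>> concise = new HashMap<>();
        List<String> partitions = new ArrayList<>();
        partitions.add("partition-0");
        concise.put("topic-a", partitions);

        verbose.putAll(concise); // both maps have the identical parameterized type
        System.out.println(verbose);
    }
}
```

The two declarations are type-identical; the diamond form only removes the redundant repetition.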
[jira] [Resolved] (KAFKA-1230) shell script files under bin don't work with cygwin (bash on windows)
[ https://issues.apache.org/jira/browse/KAFKA-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alok Lal resolved KAFKA-1230. - Resolution: Cannot Reproduce shell script files under bin don't work with cygwin (bash on windows) - Key: KAFKA-1230 URL: https://issues.apache.org/jira/browse/KAFKA-1230 Project: Kafka Issue Type: Bug Components: tools Affects Versions: 0.8.0 Environment: The changes have been tested under GNU bash, version 4.1.11(2)-release (x86_64-unknown-cygwin) running on Windows 7 Enterprise. Reporter: Alok Lal Fix For: 0.8.3 Attachments: 0001-Added-changes-so-that-bin-.sh-files-can-work-with-CY.patch Original Estimate: 24h Remaining Estimate: 24h h3. Introduction This bug is being created for a pull request that I had submitted earlier. Per Jun, this is so the changes conform to the Apache license. h3. Background The script files to run Kafka under Windows don't work as is. One needs to hand-tweak them, since their location is not bin but bin/windows. Further, the script files under bin/windows are not a complete replica of those under bin. To be sure, this isn't a complaint. To the contrary, most projects nowadays don't bother to support running on Windows, or do so very late. Just that, because of these limitations, it might be more prudent to make the script files under bin itself run under Windows rather than trying to make the files under bin/windows work or to make them complete. h3. Change Summary The most common unix-like shell on Windows is the bash shell, which is part of the Cygwin project. Out of the box the scripts don't work, mostly due to peculiarities of the directory paths and classpath separators. This change set makes a focused change to a single file under bin so that all of the script files under bin work as is on the Windows platform when using the bash shell of the Cygwin distribution. h3.
Motivation Acceptance of this change would enable the vast body of developers who use (or have to use) Windows as their development/testing/production platform to use Kafka with ease. More importantly, by making the examples run smoothly on Windows with Cygwin bash, it would make the process of evaluating Kafka simpler and smoother, and potentially make for a favorable evaluation, since it would show the commitment of the Kafka team to support deployments on Windows (albeit only under Cygwin). Further, as the number of people who use Kafka on Windows increases, one would attract people who can eventually fix the script files under bin/windows itself, so that the need to run under Cygwin would go away, too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] kafka pull request: Added changes so that bin/*.sh files can work ...
Github user aloklal99 closed the pull request at: https://github.com/apache/kafka/pull/13 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Updated] (KAFKA-2275) Add a ListTopics() API to the new consumer
[ https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish K Singh updated KAFKA-2275: -- Attachment: KAFKA-2275_2015-07-20_10:44:19.patch Add a ListTopics() API to the new consumer -- Key: KAFKA-2275 URL: https://issues.apache.org/jira/browse/KAFKA-2275 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Ashish K Singh Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-2275.patch, KAFKA-2275_2015-07-17_21:39:27.patch, KAFKA-2275_2015-07-20_10:44:19.patch With regex subscription like {code} consumer.subscribe(topic*) {code} the partition assignment is automatically done on the Kafka side, while there are some use cases where consumers want regex subscriptions but not Kafka-side partition assignment, but rather their own specific partition assignment. With ListTopics() they can periodically check for topic list changes and specifically subscribe to the partitions of the new topics. For implementation, it involves sending a TopicMetadataRequest to a random broker and parsing the response. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer
[ https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633891#comment-14633891 ] Ashish K Singh commented on KAFKA-2275: --- Updated reviewboard https://reviews.apache.org/r/36590/ against branch trunk Add a ListTopics() API to the new consumer -- Key: KAFKA-2275 URL: https://issues.apache.org/jira/browse/KAFKA-2275 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Ashish K Singh Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-2275.patch, KAFKA-2275_2015-07-17_21:39:27.patch, KAFKA-2275_2015-07-20_10:44:19.patch With regex subscription like {code} consumer.subscribe(topic*) {code} the partition assignment is automatically done on the Kafka side, while there are some use cases where consumers want regex subscriptions but not Kafka-side partition assignment, but rather their own specific partition assignment. With ListTopics() they can periodically check for topic list changes and specifically subscribe to the partitions of the new topics. For implementation, it involves sending a TopicMetadataRequest to a random broker and parsing the response. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92294 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1072) https://reviews.apache.org/r/36590/#comment146357 why did you declare this method variable as final? - Edward Ribeiro
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92295 --- clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java (line 188) https://reviews.apache.org/r/36590/#comment146359 I would rewrite this snippet as: List<PartitionInfo> parts = this.partitions.get(topic); if (parts == null) { parts = Collections.<PartitionInfo>emptyList(); } map.put(topic, parts); But it's more a question of taste than anything else, I confess. - Edward Ribeiro
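The null-check fallback pattern suggested in the review above can be demonstrated standalone. String stands in for Kafka's PartitionInfo here; this is an illustrative sketch, not the MockConsumer code. Java 8's Map.getOrDefault, where available, collapses the same logic into one call:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DefaultValueDemo {
    public static void main(String[] args) {
        Map<String, List<String>> partitions = new HashMap<>();
        partitions.put("known-topic", Collections.singletonList("p0"));

        // The reviewer's pattern: fall back to an immutable empty list when absent.
        List<String> parts = partitions.get("missing-topic");
        if (parts == null) {
            parts = Collections.<String>emptyList();
        }
        System.out.println(parts.size());

        // Java 8 alternative: getOrDefault does the lookup and fallback in one call.
        List<String> known = partitions.getOrDefault("known-topic", Collections.<String>emptyList());
        System.out.println(known.size());
    }
}
```

Using `Collections.emptyList()` rather than `new ArrayList<>()` for the fallback avoids allocating a fresh object per miss, at the cost of the returned list being immutable.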
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92302 --- clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java (line 119) https://reviews.apache.org/r/36590/#comment146373 This is unrelated to the issue (imho): declaring the access modifier (i.e., ``public``) is redundant in Java interfaces, as every declared method signature is public by default. Not a big deal, but worth mentioning. ;-) - Edward Ribeiro
[jira] [Commented] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633941#comment-14633941 ] Sriharsha Chintalapani commented on KAFKA-1690: --- Updated reviewboard https://reviews.apache.org/r/33620/diff/ against branch origin/trunk new java producer needs ssl support as a client --- Key: KAFKA-1690 URL: https://issues.apache.org/jira/browse/KAFKA-1690 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Sriharsha Chintalapani Fix For: 0.8.3 Attachments: KAFKA-1690.patch, KAFKA-1690.patch, KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, KAFKA-1690_2015-07-20_11:59:57.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-1690) new java producer needs ssl support as a client
[ https://issues.apache.org/jira/browse/KAFKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sriharsha Chintalapani updated KAFKA-1690: -- Attachment: KAFKA-1690_2015-07-20_11:59:57.patch new java producer needs ssl support as a client --- Key: KAFKA-1690 URL: https://issues.apache.org/jira/browse/KAFKA-1690 Project: Kafka Issue Type: Sub-task Reporter: Joe Stein Assignee: Sriharsha Chintalapani Fix For: 0.8.3 Attachments: KAFKA-1690.patch, KAFKA-1690.patch, KAFKA-1690_2015-05-10_23:20:30.patch, KAFKA-1690_2015-05-10_23:31:42.patch, KAFKA-1690_2015-05-11_16:09:36.patch, KAFKA-1690_2015-05-12_16:20:08.patch, KAFKA-1690_2015-05-15_07:18:21.patch, KAFKA-1690_2015-05-20_14:54:35.patch, KAFKA-1690_2015-05-21_10:37:08.patch, KAFKA-1690_2015-06-03_18:52:29.patch, KAFKA-1690_2015-06-23_13:18:20.patch, KAFKA-1690_2015-07-20_06:10:42.patch, KAFKA-1690_2015-07-20_11:59:57.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92304 --- clients/src/main/java/org/apache/kafka/common/Cluster.java (line 191) https://reviews.apache.org/r/36590/#comment146378 This if-condition is unnecessary (as of *now*). See, partitionsByTopic is defined as a final Map (L#27), so it will never be ``null``. PS: we could leave this if-condition as defensive programming for future changes, but it would never be considered a best practice to make a final field non-final, imho. - Edward Ribeiro
Re: [DISCUSS] KIP-27 - Conditional Publish
Hi Jun, Thanks for the close reading! Responses inline. Thanks for the write-up. The single producer use case you mentioned makes sense. It would be useful to include that in the KIP wiki. Great -- I'll make sure that the wiki is clear about this. 1. What happens when the leader of the partition changes in the middle of a produce request? In this case, the producer client is not sure whether the request succeeds or not. If there is only a single message in the request, the producer can just resend the request. If it sees an OffsetMismatch error, it knows that the previous send actually succeeded and can proceed with the next write. This is nice since it not only allows the producer to proceed during transient failures in the broker, it also avoids duplicates during producer resend. One caveat is when there are multiple messages in the same partition in a produce request. The issue is that in our current replication protocol, it's possible for some, but not all, messages in the request to be committed. This makes resend a bit harder to deal with since on receiving an OffsetMismatch error, it's not clear which messages have been committed. One possibility is to expect that compression is enabled, in which case multiple messages are compressed into a single message. I was thinking that another possibility is for the broker to return the current high watermark when sending an OffsetMismatch error. Based on this info, the producer can resend the subset of messages that have not been committed. However, this may not work in a compacted topic since there can be holes in the offset. This is an excellent question. It's my understanding that at least a *prefix* of messages will be committed (right?) -- which seems to be enough for many cases. I'll try and come up with a more concrete answer here. 2. Is this feature only intended to be used with ack = all? The client doesn't get the offset with ack = 0. 
With ack = 1, it's possible for a previously acked message to be lost during leader transition, which will make the client logic more complicated. It's true that acks = 0 doesn't seem to be particularly useful; in all the cases I've come across, the client eventually wants to know about the mismatch error. However, it seems like there are some cases where acks = 1 would be fine -- eg. in a bulk load of a fixed dataset, losing messages during a leader transition just means you need to rewind / restart the load, which is not especially catastrophic. For many other interesting cases, acks = all is probably preferable. 3. How does the producer client know the offset to send the first message? Do we need to expose an API in producer to get the current high watermark? You're right, it might be irritating to have to go through the consumer API just for this. There are some cases where the offsets are already available -- like the commit-log-for-KV-store example -- but in general, being able to get the offsets from the producer interface does sound convenient. We plan to have a KIP discussion meeting tomorrow at 11am PST. Perhaps you can describe this KIP a bit then? Sure, happy to join. Thanks, Jun On Sat, Jul 18, 2015 at 10:37 AM, Ben Kirwin b...@kirw.in wrote: Just wanted to flag a little discussion that happened on the ticket: https://issues.apache.org/jira/browse/KAFKA-2260?focusedCommentId=14632259page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14632259 In particular, Yasuhiro Matsuda proposed an interesting variant on this that performs the offset check on the message key (instead of just the partition), with bounded space requirements, at the cost of potentially some spurious failures. (ie. the produce request may fail even if that particular key hasn't been updated recently.) This addresses a couple of the drawbacks of the per-key approach mentioned at the bottom of the KIP. 
On Fri, Jul 17, 2015 at 6:47 PM, Ben Kirwin b...@kirw.in wrote: Hi all, So, perhaps it's worth adding a couple of specific examples of where this feature is useful, to make this a bit more concrete: - Suppose I'm using Kafka as a commit log for a partitioned KV store, like Samza or Pistachio (?) do. We bootstrap the process state by reading from that partition, and log all state updates to that partition when we're running. Now imagine that one of my processes locks up -- GC or similar -- and the system transitions that partition over to another node. When the GC is finished, the old 'owner' of that partition might still be trying to write to the commit log at the same time as the new one is. A process might detect this by noticing that the offset of the published message is bigger than it thought the upcoming offset was, which implies someone else has been writing to the log... but by then it's too late, and the commit log is already corrupt. With a 'conditional produce', one of those processes will have its publish request
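The 'conditional produce' semantics discussed in this thread (an append that succeeds only when the producer's expected offset matches the partition's log-end offset) can be sketched with a toy in-memory log. This is a hypothetical illustration of the KIP-27 idea, not Kafka's broker implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class ConditionalPublishDemo {

    /** Toy partition log: an append carries the offset the producer expects
     *  to write at, and fails if another writer got there first. */
    static class ConditionalLog {
        private final List<String> messages = new ArrayList<String>();

        synchronized boolean tryAppend(long expectedOffset, String message) {
            if (expectedOffset != messages.size()) {
                return false; // offset mismatch: the log moved under us
            }
            messages.add(message);
            return true;
        }
    }

    public static void main(String[] args) {
        ConditionalLog log = new ConditionalLog();
        // Two writers (an old partition owner and a new one) both believe
        // the next offset is 0, as in the zombie-writer scenario above.
        System.out.println(log.tryAppend(0, "old-owner"));
        System.out.println(log.tryAppend(0, "new-owner")); // rejected, log not corrupted
        // The losing writer re-reads the log-end offset and retries.
        System.out.println(log.tryAppend(1, "new-owner"));
    }
}
```

The point of the sketch is that the losing writer gets an explicit rejection instead of silently interleaving its writes, which is exactly the detection mechanism the commit-log example needs.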
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description (updated) --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... topics) API to the new consumer Diffs (updated) - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
Re: Review Request 36590: Patch for KAFKA-2275
On July 19, 2015, 1:11 a.m., Jason Gustafson wrote: Jason, thanks for your review! I looked into the ConsumerNetworkClient/NetworkClient, Metadata and Cluster classes. On receiving a metadata update, the cluster instance in metadata is updated. However, when a topic is added by the consumer, it is added to metadata.topics. After considering various options, I have updated the patch with what I think are the least obtrusive changes. So, we still keep metadata.topics as the list of topics we are interested in maintaining state for, but we can choose to get metadata for all topics by setting metadata.needMetadataForAllTopics. One thing to notice is that in the current implementation there is no caching for allTopics metadata, which might be OK depending on how we are planning to use it. We can discuss further once you take a look at the latest patch. - Ashish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92194 ---
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92289 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1046) https://reviews.apache.org/r/36590/#comment146352 Same here. Can be simplified to: List<String> missingTopics = new ArrayList<>(); - Edward Ribeiro
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92292 --- clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java (line 196) https://reviews.apache.org/r/36590/#comment146355 It's considered a best practice in Java to rewrite this for loop as: for (Map.Entry<String, List<PartitionInfo>> e : partitions.entrySet()) { map.put(e.getKey(), e.getValue()); } - Edward Ribeiro
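Edward's suggested rewrite iterates the map's entries once instead of looking each key up again. A minimal, self-contained sketch of the pattern (the class name and value types here are illustrative stand-ins, not the actual MockConsumer code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EntrySetCopy {
    // Copy a map by iterating entries once, instead of keySet() plus a get() per key.
    public static Map<String, List<Integer>> copy(Map<String, List<Integer>> partitions) {
        Map<String, List<Integer>> map = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : partitions.entrySet()) {
            map.put(e.getKey(), e.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> src = new HashMap<>();
        src.put("topic-a", List.of(0, 1));
        System.out.println(copy(src).equals(src)); // true
    }
}
```

Iterating `entrySet()` avoids a second hash lookup per key, which is the reason it is usually preferred over iterating `keySet()` and calling `get()`.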
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92299 --- clients/src/main/java/org/apache/kafka/common/Cluster.java (line 192) https://reviews.apache.org/r/36590/#comment146363 Also, I didn't get why yet another method variable is declared as final. Defensive programming? I mean, what does it bring to the table? - Edward Ribeiro
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92297 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (lines 1065 - 1069) https://reviews.apache.org/r/36590/#comment146360 It's not a big deal, but you could move this block into the above if statement. clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1071) https://reviews.apache.org/r/36590/#comment146365 I'm not sure, but I think there might be an asynchronous issue here. Since we are using the same Cluster object in Metadata, could a pending normal metadata request (for the subscribed topics) inadvertently override our request for all metadata? clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (lines 1074 - 1077) https://reviews.apache.org/r/36590/#comment146361 Is it an actual problem if we return this topic to the user? - Jason Gustafson
Re: [DISCUSS] KIP-27 - Conditional Publish
The per-key generalization is useful. As Jiangjie mentioned in KAFKA-2260, one thing that we need to sort out is what happens if a produce request has messages with different keys and some of the messages have expected offsets while some others don't. Currently, the produce response has an error code per partition, not per message. One way is to just define the semantics as: the produce request will only go through if all keys in the request pass the offset test. Thanks, Jun On Sat, Jul 18, 2015 at 10:37 AM, Ben Kirwin b...@kirw.in wrote: Just wanted to flag a little discussion that happened on the ticket: https://issues.apache.org/jira/browse/KAFKA-2260?focusedCommentId=14632259page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14632259 In particular, Yasuhiro Matsuda proposed an interesting variant on this that performs the offset check on the message key (instead of just the partition), with bounded space requirements, at the cost of potentially some spurious failures. (ie. the produce request may fail even if that particular key hasn't been updated recently.) This addresses a couple of the drawbacks of the per-key approach mentioned at the bottom of the KIP. On Fri, Jul 17, 2015 at 6:47 PM, Ben Kirwin b...@kirw.in wrote: Hi all, So, perhaps it's worth adding a couple specific examples of where this feature is useful, to make this a bit more concrete: - Suppose I'm using Kafka as a commit log for a partitioned KV store, like Samza or Pistachio (?) do. We bootstrap the process state by reading from that partition, and log all state updates to that partition when we're running. Now imagine that one of my processes locks up -- GC or similar -- and the system transitions that partition over to another node. When the GC is finished, the old 'owner' of that partition might still be trying to write to the commit log at the same time as the new one is.
A process might detect this by noticing that the offset of the published message is bigger than it thought the upcoming offset was, which implies someone else has been writing to the log... but by then it's too late, and the commit log is already corrupt. With a 'conditional produce', one of those processes will have its publish request refused -- so we've avoided corrupting the state. - Envision some copycat-like system, where we have some sharded postgres setup and we're tailing each shard into its own partition. Normally, it's fairly easy to avoid duplicates here: we can track which offset in the WAL corresponds to which offset in Kafka, and we know how many messages we've written to Kafka already, so the state is very simple. However, it is possible that for a moment -- due to rebalancing or operator error or some other thing -- two different nodes are tailing the same postgres shard at once! Normally this would introduce duplicate messages, but by specifying the expected offset, we can avoid this. So perhaps it's better to say that this is useful when a single producer is *expected*, but multiple producers are *possible*? (In the same way that the high-level consumer normally has 1 consumer in a group reading from a partition, but there are small windows where more than one might be reading at the same time.) This is also the spirit of the 'runtime cost' comment -- in the common case, where there is little to no contention, there's no performance overhead either. I mentioned this a little in the Motivation section -- maybe I should flesh that out a little bit? For me, the motivation to work this up was that I kept running into cases, like the above, where the existing API was almost-but-not-quite enough to give the guarantees I was looking for -- and the extension needed to handle those cases too was pretty small and natural-feeling. On Fri, Jul 17, 2015 at 4:49 PM, Ashish Singh asi...@cloudera.com wrote: Good concept. I have a question though.
Say there are two producers A and B. Both producers are producing to the same partition. - A sends a message with expected offset, x1 - Broker accepts it and sends an ack - B sends a message with expected offset, x1 - Broker rejects it, sends a nack - B sends the message again with expected offset, x1+1 - Broker accepts it and sends an ack I guess this is what this KIP suggests, right? If yes, then how does this ensure that the same message will not be written twice when two producers are producing to the same partition? A producer on receiving a nack will try again with the next offset and will keep doing so till the message is accepted. Am I missing something? Also, you have mentioned on the KIP that it imposes little to no runtime cost in memory or time; I think that is not true for time. With this approach, producers' performance will degrade proportionally to the number of producers writing to the same partition. Please correct me if I am missing out on something. On Fri, Jul
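Ashish's A/B exchange can be modeled with a toy in-memory log that accepts a write only when the caller's expected offset matches the next offset. This is a sketch of the semantics under discussion, not Kafka's actual API; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class ConditionalLog {
    private final List<String> log = new ArrayList<>();

    // Append only if the caller's expected offset equals the next offset.
    // Returns the assigned offset on success, or -1 to signal a mismatch (nack).
    public synchronized long tryAppend(String msg, long expectedOffset) {
        if (expectedOffset != log.size()) return -1L;
        log.add(msg);
        return expectedOffset;
    }

    public synchronized int size() { return log.size(); }

    public static void main(String[] args) {
        ConditionalLog partition = new ConditionalLog();
        long a = partition.tryAppend("from-A", 0); // A: acked at offset 0
        long b = partition.tryAppend("from-B", 0); // B: nacked, offset 0 is taken
        System.out.println(a + " " + b);           // 0 -1
    }
}
```

As the sequence shows, the check only guarantees at most one write per offset; whether a nacked producer blindly retries at the next offset (reintroducing the duplicate Ashish describes) or first re-reads the log to decide is left to the application.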
Re: Permission to edit KIP pages
Thanks Jun. Thanks, Mayuresh On Mon, Jul 20, 2015 at 10:03 AM, Jun Rao j...@confluent.io wrote: Added. Thanks, Jun On Mon, Jul 20, 2015 at 9:48 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: My username is : mgharat On Mon, Jul 20, 2015 at 9:46 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I wanted to edit a KIP page and would like to get permission for that. Currently I don't have edit authorization. It does not show me an option to edit. Can one of the committers grant me permission? Thanks. -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: New Producer and acks configuration
acks=0 is a one-way send; the client doesn't need to wait on the response. Whether this is useful sort of depends on the client implementation. The new java producer does all sends async, so waiting on a response isn't really a thing. For a client that lacks this, though, as some of them do, acks=0 will be a lot faster. It also makes some sense in terms of what is completed when the request is considered satisfied: acks = 0 - message is written to the network (buffer) acks = 1 - message is written to the leader log acks = -1 - message is committed -Jay On Sat, Jul 18, 2015 at 10:50 PM, Gwen Shapira gshap...@cloudera.com wrote: Hi, I was looking into the difference between acks = 0 and acks = 1 in the new producer, and was a bit surprised at what I found. Basically, if I understand correctly, the only difference is that with acks = 0, if the leader fails to append locally, it closes the network connection silently, and with acks = 1, it sends an actual error message. Which seems to mean that with acks = 0, any failed produce will lead to a metadata refresh and a retry (because of the network error), while acks = 1 will lead to either retries or an abort, depending on the error. Not only does this not match the documentation, it doesn't even make much sense... acks = 0 was supposed to somehow make things less safe but faster, and it doesn't seem to be doing that any more. I'm not even sure there's any case where the acks = 0 behavior is desirable. Is it my misunderstanding, or did we somehow screw up the logic here? Gwen
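The three levels Jay lists map onto the new producer's `acks` configuration key. A minimal sketch of building that configuration (the broker address is a placeholder, and only the `Properties` object is constructed here, not an actual producer):

```java
import java.util.Properties;

public class ProducerAckConfig {
    // Build new-producer properties; "acks" selects the durability level:
    // "0"  - satisfied once the message is written to the network (buffer),
    // "1"  - satisfied once the message is written to the leader log,
    // "all" (equivalent to "-1") - satisfied once the message is committed.
    public static Properties configFor(String acks) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", acks);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(configFor("all").getProperty("acks")); // all
    }
}
```

These properties would normally be passed to `new KafkaProducer<>(props)`; that step is omitted so the sketch runs without a broker.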
Re: Permission to edit KIP pages
Added. Thanks, Jun On Mon, Jul 20, 2015 at 9:48 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: My username is : mgharat On Mon, Jul 20, 2015 at 9:46 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I wanted to edit a KIP page and would like to get permission for that. Currently I don't have edit authorization. It does not show me an option to edit. Can one of the committers grant me permission? Thanks. -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: Review Request 36570: Patch for KAFKA-2337
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92276 --- core/src/main/scala/kafka/admin/TopicCommand.scala (line 89) https://reviews.apache.org/r/36570/#comment146340 Probably typo? best to either - Ashish Singh On July 17, 2015, 4:17 p.m., Grant Henke wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/ --- (Updated July 17, 2015, 4:17 p.m.) Review request for kafka. Bugs: KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337 Repository: kafka Description --- KAFKA-2337: Verify that metric names will not collide when creating new topics Diffs - core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6 core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805 core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8 core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961 Diff: https://reviews.apache.org/r/36570/diff/ Testing --- Thanks, Grant Henke
Re: Review Request 36570: Patch for KAFKA-2337
On July 20, 2015, 5:27 p.m., Ashish Singh wrote: LGTM, just a small comment. - Ashish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92276 ---
[GitHub] kafka pull request: KAFKA-2169: Moving to zkClient 0.5 release.
Github user Parth-Brahmbhatt closed the pull request at: https://github.com/apache/kafka/pull/61 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Permission to edit KIP pages
Hi, I wanted to edit a KIP page and would like to get permission for that. Currently I don't have edit authorization. It does not show me an option to edit. Can one of the committers grant me permission? Thanks. -- -Regards, Mayuresh R. Gharat (862) 250-7125
Re: Failing kafka-trunk-git-pr builds now fixed
On Mon, Jul 20, 2015 at 3:38 PM, Stevo Slavić ssla...@gmail.com wrote: Can you please trigger the build for all of the currently opened pull requests? E.g. my PR https://github.com/apache/kafka/pull/85 last automatically added comment is that the build has failed, while it should have been a success - only javadoc changes are included in the PR. Unfortunately I can't do it. I think it has to be a committer, but I sent an email to the builds mailing list to figure out the details. I will report back. This is one of the last remaining points that we need to figure out before we can vote on moving to the new flow. Best, Ismael
[jira] [Commented] (KAFKA-2236) offset request reply racing with segment rolling
[ https://issues.apache.org/jira/browse/KAFKA-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633748#comment-14633748 ] William Thurston commented on KAFKA-2236: - https://github.com/apache/kafka/pull/86 offset request reply racing with segment rolling Key: KAFKA-2236 URL: https://issues.apache.org/jira/browse/KAFKA-2236 Project: Kafka Issue Type: Bug Components: core Affects Versions: 0.8.2.0 Environment: Linux x86_64, java.1.7.0_72, discovered using librdkafka based client. Reporter: Alfred Landrum Assignee: Jason Gustafson Priority: Critical Labels: newbie My use case with kafka involves an aggressive retention policy that rolls segment files frequently. My librdkafka based client sees occasional errors to offset requests, showing up in the broker log like: [2015-06-02 02:33:38,047] INFO Rolled new log segment for 'receiver-93b40462-3850-47c1-bcda-8a3e221328ca-50' in 1 ms. (kafka.log.Log) [2015-06-02 02:33:38,049] WARN [KafkaApi-0] Error while responding to offset request (kafka.server.KafkaApis) java.lang.ArrayIndexOutOfBoundsException: 3 at kafka.server.KafkaApis.fetchOffsetsBefore(KafkaApis.scala:469) at kafka.server.KafkaApis.fetchOffsets(KafkaApis.scala:449) at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:411) at kafka.server.KafkaApis$$anonfun$17.apply(KafkaApis.scala:402) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.immutable.Map$Map1.foreach(Map.scala:109) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.AbstractTraversable.map(Traversable.scala:105) at kafka.server.KafkaApis.handleOffsetRequest(KafkaApis.scala:402) at kafka.server.KafkaApis.handle(KafkaApis.scala:61) at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:59) at java.lang.Thread.run(Thread.java:745) quoting Guozhang Wang's reply to my 
query on the users list: I checked the 0.8.2 code and probably found a bug related to your issue. Basically, segsArray.last.size is called multiple times during the handling of offset requests, while segsArray.last could get concurrent appends. Hence it is possible that in line 461, if(segsArray.last.size > 0) returns false while later in line 468, if(segsArray.last.size > 0) could return true. http://mail-archives.apache.org/mod_mbox/kafka-users/201506.mbox/%3CCAHwHRrUK-3wdoEAaFbsD0E859Ea0gXixfxgDzF8E3%3D_8r7K%2Bpw%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
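The check-then-act race Guozhang describes can be reduced to a small deterministic sketch: reading a concurrently growing size twice can yield contradictory answers within one request, while snapshotting it once cannot. The names below are illustrative, not the actual broker code; the "concurrent" append is simulated inline to make the interleaving reproducible:

```java
import java.util.concurrent.atomic.AtomicLong;

public class CheckThenActDemo {
    // Stands in for segsArray.last.size, which grows as messages are appended.
    static final AtomicLong lastSegmentSize = new AtomicLong(0);

    // Buggy shape: the emptiness check is evaluated twice against the live value,
    // so an append between the two reads makes them disagree (returns false).
    static boolean doubleReadConsistent() {
        boolean emptyAtCheck = lastSegmentSize.get() == 0; // read #1 sees 0
        lastSegmentSize.incrementAndGet();                 // concurrent append lands here
        boolean emptyAtUse = lastSegmentSize.get() == 0;   // read #2 sees 1
        return emptyAtCheck == emptyAtUse;
    }

    // Fixed shape: snapshot the size once; later appends can't change the snapshot.
    static boolean snapshotReadConsistent() {
        long size = lastSegmentSize.get();                 // single read
        boolean emptyAtCheck = size == 0;
        lastSegmentSize.incrementAndGet();                 // append is now irrelevant
        boolean emptyAtUse = size == 0;
        return emptyAtCheck == emptyAtUse;
    }

    public static void main(String[] args) {
        lastSegmentSize.set(0);
        System.out.println(doubleReadConsistent());   // false
        lastSegmentSize.set(0);
        System.out.println(snapshotReadConsistent()); // true
    }
}
```

This is why the usual fix for this class of bug is to capture the segment list and size once per request and work only with the local copies.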
Re: Permission to edit KIP pages
My username is : mgharat On Mon, Jul 20, 2015 at 9:46 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I wanted to edit a KIP page and would like to get permission for that. Currently I don't have edit authorization. It does not show me an option to edit. Can one of the committers grant me permission? Thanks. -- -Regards, Mayuresh R. Gharat (862) 250-7125
[GitHub] kafka pull request: Adding rack-aware replication option.
Github user jmlvanre closed the pull request at: https://github.com/apache/kafka/pull/16 ---
Re: [DISCUSS] KIP-27 - Conditional Publish
Hi, Ben, Thanks for the write-up. The single producer use case you mentioned makes sense. It would be useful to include that in the KIP wiki. A couple questions on the design details. 1. What happens when the leader of the partition changes in the middle of a produce request? In this case, the producer client is not sure whether the request succeeds or not. If there is only a single message in the request, the producer can just resend the request. If it sees an OffsetMismatch error, it knows that the previous send actually succeeded and can proceed with the next write. This is nice since it not only allows the producer to proceed during transient failures in the broker, it also avoids duplicates during producer resend. One caveat is when there are multiple messages in the same partition in a produce request. The issue is that in our current replication protocol, it's possible for some, but not all messages in the request to be committed. This makes resend a bit harder to deal with since on receiving an OffsetMismatch error, it's not clear which messages have been committed. One possibility is to expect that compression is enabled, in which case multiple messages are compressed into a single message. I was thinking that another possibility is for the broker to return the current high watermark when sending an OffsetMismatch error. Based on this info, the producer can resend the subset of messages that have not been committed. However, this may not work in a compacted topic since there can be holes in the offset. 2. Is this feature only intended to be used with ack = all? The client doesn't get the offset with ack = 0. With ack = 1, it's possible for a previously acked message to be lost during leader transition, which will make the client logic more complicated. 3. How does the producer client know the offset to send the first message? Do we need to expose an API in producer to get the current high watermark? 
We plan to have a KIP discussion meeting tomorrow at 11am PST. Perhaps you can describe this KIP a bit then? Thanks, Jun On Sat, Jul 18, 2015 at 10:37 AM, Ben Kirwin b...@kirw.in wrote: Just wanted to flag a little discussion that happened on the ticket: https://issues.apache.org/jira/browse/KAFKA-2260?focusedCommentId=14632259page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14632259 In particular, Yasuhiro Matsuda proposed an interesting variant on this that performs the offset check on the message key (instead of just the partition), with bounded space requirements, at the cost of potentially some spurious failures. (ie. the produce request may fail even if that particular key hasn't been updated recently.) This addresses a couple of the drawbacks of the per-key approach mentioned at the bottom of the KIP. On Fri, Jul 17, 2015 at 6:47 PM, Ben Kirwin b...@kirw.in wrote: Hi all, So, perhaps it's worth adding a couple specific examples of where this feature is useful, to make this a bit more concrete: - Suppose I'm using Kafka as a commit log for a partitioned KV store, like Samza or Pistachio (?) do. We bootstrap the process state by reading from that partition, and log all state updates to that partition when we're running. Now imagine that one of my processes locks up -- GC or similar -- and the system transitions that partition over to another node. When the GC is finished, the old 'owner' of that partition might still be trying to write to the commit log at the same time as the new one is. A process might detect this by noticing that the offset of the published message is bigger than it thought the upcoming offset was, which implies someone else has been writing to the log... but by then it's too late, and the commit log is already corrupt. With a 'conditional produce', one of those processes will have its publish request refused -- so we've avoided corrupting the state.
- Envision some copycat-like system, where we have some sharded postgres setup and we're tailing each shard into its own partition. Normally, it's fairly easy to avoid duplicates here: we can track which offset in the WAL corresponds to which offset in Kafka, and we know how many messages we've written to Kafka already, so the state is very simple. However, it is possible that for a moment -- due to rebalancing or operator error or some other thing -- two different nodes are tailing the same postgres shard at once! Normally this would introduce duplicate messages, but by specifying the expected offset, we can avoid this. So perhaps it's better to say that this is useful when a single producer is *expected*, but multiple producers are *possible*? (In the same way that the high-level consumer normally has 1 consumer in a group reading from a partition, but there are small windows where more than one might be reading at the same time.) This is also the spirit of the 'runtime cost' comment -- in the common case,
[jira] [Commented] (KAFKA-2260) Allow specifying expected offset on produce
[ https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633721#comment-14633721 ] Jay Kreps commented on KAFKA-2260: -- Yes, exactly. Allow specifying expected offset on produce --- Key: KAFKA-2260 URL: https://issues.apache.org/jira/browse/KAFKA-2260 Project: Kafka Issue Type: Improvement Reporter: Ben Kirwin Assignee: Ewen Cheslack-Postava Priority: Minor Attachments: expected-offsets.patch I'd like to propose a change that adds a simple CAS-like mechanism to the Kafka producer. This update has a small footprint, but enables a bunch of interesting uses in stream processing or as a commit log for process state. h4. Proposed Change In short: - Allow the user to attach a specific offset to each message produced. - The server assigns offsets to messages in the usual way. However, if the expected offset doesn't match the actual offset, the server should fail the produce request instead of completing the write. This is a form of optimistic concurrency control, like the ubiquitous check-and-set -- but instead of checking the current value of some state, it checks the current offset of the log. h4. Motivation Much like check-and-set, this feature is only useful when there's very low contention. Happily, when Kafka is used as a commit log or as a stream-processing transport, it's common to have just one producer (or a small number) for a given partition -- and in many of these cases, predicting offsets turns out to be quite useful. - We get the same benefits as the 'idempotent producer' proposal: a producer can retry a write indefinitely and be sure that at most one of those attempts will succeed; and if two producers accidentally write to the end of the partition at once, we can be certain that at least one of them will fail. 
- It's possible to 'bulk load' Kafka this way -- you can write a list of n messages consecutively to a partition, even if the list is much larger than the buffer size or the producer has to be restarted. - If a process is using Kafka as a commit log -- reading from a partition to bootstrap, then writing any updates to that same partition -- it can be sure that it's seen all of the messages in that partition at the moment it does its first (successful) write. There's a bunch of other similar use-cases here, but they all have roughly the same flavour. h4. Implementation The major advantage of this proposal over other suggested transaction / idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a currently-unused field, adds no new APIs, and requires very little new code or additional work from the server. - Produced messages already carry an offset field, which is currently ignored by the server. This field could be used for the 'expected offset', with a sigil value for the current behaviour. (-1 is a natural choice, since it's already used to mean 'next available offset'.) - We'd need a new error and error code for a 'CAS failure'. - The server assigns offsets to produced messages in {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this change, this method would assign offsets in the same way -- but if they don't match the offset in the message, we'd return an error instead of completing the write. - To avoid breaking existing clients, this behaviour would need to live behind some config flag. (Possibly global, but probably more useful per-topic?) I understand all this is unsolicited and possibly strange: happy to answer questions, and if this seems interesting, I'd be glad to flesh this out into a full KIP or patch. (And apologies if this is the wrong venue for this sort of thing!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
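The server-side change the ticket describes can be sketched roughly as follows. This is an invented illustration of the check, not the actual ByteBufferMessageSet.validateMessagesAndAssignOffsets code:

```java
// Rough sketch of the proposed offset-assignment check: assign offsets
// sequentially as today, but reject the write if any message carries a
// non-sigil expected offset that disagrees with the log's next offset.
// Invented for illustration; not actual Kafka server code.
final class OffsetCheck {
    static final long ANY_OFFSET = -1L;  // sigil: keep the current behaviour

    /** Returns the next offset after the batch, or throws on a CAS failure. */
    static long assignOffsets(long nextOffset, long[] expectedOffsets) {
        for (long expected : expectedOffsets) {
            if (expected != ANY_OFFSET && expected != nextOffset) {
                throw new IllegalStateException(
                    "expected offset " + expected + " but log is at " + nextOffset);
            }
            nextOffset++;  // message accepted at nextOffset - 1
        }
        return nextOffset;
    }
}
```

Because -1 preserves today's "next available offset" meaning, old clients that never set the field would be unaffected, which is what makes the config-flag question mostly one of caution rather than compatibility.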
Kafka KIP meeting at 11am PST (Jul 21)
Hi, Everyone, We plan to have a Kafka KIP meeting tomorrow at 11am PST. If you want to attend, but haven't received an invitation, please let me know. The following is the agenda. Agenda: KIP-27: Conditional publish Go through jira backlogs: https://issues.apache.org/jira/issues/?jql=project%20%3D%20KAFKA%20AND%20status%20%3D%20%22Patch%20Available%22%20ORDER%20BY%20updated%20DESC Thanks, Jun
Re: Permission to edit KIP pages
Mayuresh, You should already have the permissions. Guozhang On Mon, Jul 20, 2015 at 9:48 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: My username is : mgharat On Mon, Jul 20, 2015 at 9:46 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: Hi, I wanted to edit a KIP page and would like to get permission for that. Currently I don't have edit authorization. It does not show me an option to edit. Can one of the committers grant me permission? Thanks. -- -Regards, Mayuresh R. Gharat (862) 250-7125 -- -Regards, Mayuresh R. Gharat (862) 250-7125 -- -- Guozhang
Re: Review Request 36570: Patch for KAFKA-2337
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/ --- (Updated July 20, 2015, 5:37 p.m.) Review request for kafka. Bugs: KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337 Repository: kafka Description --- KAFKA-2337: Verify that metric names will not collide when creating new topics Diffs (updated) - core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6 core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805 core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8 core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961 Diff: https://reviews.apache.org/r/36570/diff/ Testing --- Thanks, Grant Henke
[jira] [Updated] (KAFKA-2337) Verify that metric names will not collide when creating new topics
[ https://issues.apache.org/jira/browse/KAFKA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke updated KAFKA-2337: --- Attachment: KAFKA-2337_2015-07-20_12:36:41.patch Verify that metric names will not collide when creating new topics -- Key: KAFKA-2337 URL: https://issues.apache.org/jira/browse/KAFKA-2337 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Grant Henke Attachments: KAFKA-2337.patch, KAFKA-2337_2015-07-17_11:17:30.patch, KAFKA-2337_2015-07-20_12:36:41.patch When creating a new topic, convert the proposed topic name to the name that will be used in metrics and validate that there are no collisions with existing names. See this discussion for context: http://s.apache.org/snW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2337) Verify that metric names will not collide when creating new topics
[ https://issues.apache.org/jira/browse/KAFKA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633880#comment-14633880 ] Grant Henke commented on KAFKA-2337: Updated reviewboard https://reviews.apache.org/r/36570/diff/ against branch origin/trunk Verify that metric names will not collide when creating new topics -- Key: KAFKA-2337 URL: https://issues.apache.org/jira/browse/KAFKA-2337 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Grant Henke Attachments: KAFKA-2337.patch, KAFKA-2337_2015-07-17_11:17:30.patch, KAFKA-2337_2015-07-20_12:36:41.patch When creating a new topic, convert the proposed topic name to the name that will be used in metrics and validate that there are no collisions with existing names. See this discussion for context: http://s.apache.org/snW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KAFKA-824) java.lang.NullPointerException in commitOffsets
[ https://issues.apache.org/jira/browse/KAFKA-824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Parth Brahmbhatt reassigned KAFKA-824: -- Assignee: Parth Brahmbhatt java.lang.NullPointerException in commitOffsets Key: KAFKA-824 URL: https://issues.apache.org/jira/browse/KAFKA-824 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 0.7.2, 0.8.2.0 Reporter: Yonghui Zhao Assignee: Parth Brahmbhatt Labels: newbie Attachments: ZkClient.0.3.txt, ZkClient.0.4.txt, screenshot-1.jpg Neha Narkhede Yes, I have. Unfortunately, I never quite got around to fixing it. My guess is that it is caused due to a race condition between the rebalance thread and the offset commit thread when a rebalance is triggered or the client is being shutdown. Do you mind filing a bug ? 2013/03/25 12:08:32.020 WARN [ZookeeperConsumerConnector] [] 0_lu-ml-test10.bj-1364184411339-7c88f710 exception during commitOffsets java.lang.NullPointerException at org.I0Itec.zkclient.ZkConnection.writeData(ZkConnection.java:111) at org.I0Itec.zkclient.ZkClient$10.call(ZkClient.java:813) at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:809) at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:777) at kafka.utils.ZkUtils$.updatePersistentPath(ZkUtils.scala:103) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:251) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:248) at scala.collection.Iterator$class.foreach(Iterator.scala:631) at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549) at scala.collection.IterableLike$class.foreach(IterableLike.scala:79) at scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:570) at 
kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:248) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:246) at scala.collection.Iterator$class.foreach(Iterator.scala:631) at kafka.utils.Pool$$anon$1.foreach(Pool.scala:53) at scala.collection.IterableLike$class.foreach(IterableLike.scala:79) at kafka.utils.Pool.foreach(Pool.scala:24) at kafka.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:246) at kafka.consumer.ZookeeperConsumerConnector.autoCommit(ZookeeperConsumerConnector.scala:232) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$1.apply$mcV$sp(ZookeeperConsumerConnector.scala:126) at kafka.utils.Utils$$anon$2.run(Utils.scala:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-824) java.lang.NullPointerException in commitOffsets
[ https://issues.apache.org/jira/browse/KAFKA-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14633776#comment-14633776 ] Parth Brahmbhatt commented on KAFKA-824: [~techwhizbang] I upgraded to zkClient-0.5 so I will verify this is fixed and update the jira. java.lang.NullPointerException in commitOffsets Key: KAFKA-824 URL: https://issues.apache.org/jira/browse/KAFKA-824 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 0.7.2, 0.8.2.0 Reporter: Yonghui Zhao Assignee: Parth Brahmbhatt Labels: newbie Attachments: ZkClient.0.3.txt, ZkClient.0.4.txt, screenshot-1.jpg Neha Narkhede Yes, I have. Unfortunately, I never quite got around to fixing it. My guess is that it is caused due to a race condition between the rebalance thread and the offset commit thread when a rebalance is triggered or the client is being shutdown. Do you mind filing a bug ? 2013/03/25 12:08:32.020 WARN [ZookeeperConsumerConnector] [] 0_lu-ml-test10.bj-1364184411339-7c88f710 exception during commitOffsets java.lang.NullPointerException at org.I0Itec.zkclient.ZkConnection.writeData(ZkConnection.java:111) at org.I0Itec.zkclient.ZkClient$10.call(ZkClient.java:813) at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:809) at org.I0Itec.zkclient.ZkClient.writeData(ZkClient.java:777) at kafka.utils.ZkUtils$.updatePersistentPath(ZkUtils.scala:103) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:251) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2$$anonfun$apply$4.apply(ZookeeperConsumerConnector.scala:248) at scala.collection.Iterator$class.foreach(Iterator.scala:631) at scala.collection.JavaConversions$JIteratorWrapper.foreach(JavaConversions.scala:549) at scala.collection.IterableLike$class.foreach(IterableLike.scala:79) at 
scala.collection.JavaConversions$JCollectionWrapper.foreach(JavaConversions.scala:570) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:248) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$commitOffsets$2.apply(ZookeeperConsumerConnector.scala:246) at scala.collection.Iterator$class.foreach(Iterator.scala:631) at kafka.utils.Pool$$anon$1.foreach(Pool.scala:53) at scala.collection.IterableLike$class.foreach(IterableLike.scala:79) at kafka.utils.Pool.foreach(Pool.scala:24) at kafka.consumer.ZookeeperConsumerConnector.commitOffsets(ZookeeperConsumerConnector.scala:246) at kafka.consumer.ZookeeperConsumerConnector.autoCommit(ZookeeperConsumerConnector.scala:232) at kafka.consumer.ZookeeperConsumerConnector$$anonfun$1.apply$mcV$sp(ZookeeperConsumerConnector.scala:126) at kafka.utils.Utils$$anon$2.run(Utils.scala:58) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] kafka pull request: Remove non-functional variable definition in l...
Github user rocketraman closed the pull request at: https://github.com/apache/kafka/pull/36 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: [DISCUSS] KIP-27 - Conditional Publish
I'm with you on the races that could happen in the scenarios you describe, but I'm still not convinced that conditionally updating is the best call. Instead of conditionally updating, the broker could fence off the old owner to avoid spurious writes, and that's valid for all attempts. The advantage of fencing is that the broker does not accept requests from others at all, while the conditional update is a bit fragile to protect streams of publishes. -Flavio On Fri, Jul 17, 2015 at 11:47 PM, Ben Kirwin b...@kirw.in wrote: Hi all, So, perhaps it's worth adding a couple specific examples of where this feature is useful, to make this a bit more concrete: - Suppose I'm using Kafka as a commit log for a partitioned KV store, like Samza or Pistachio (?) do. We bootstrap the process state by reading from that partition, and log all state updates to that partition when we're running. Now imagine that one of my processes locks up -- GC or similar -- and the system transitions that partition over to another node. When the GC is finished, the old 'owner' of that partition might still be trying to write to the commit log at the same time as the new one is. A process might detect this by noticing that the offset of the published message is bigger than it thought the upcoming offset was, which implies someone else has been writing to the log... but by then it's too late, and the commit log is already corrupt. With a 'conditional produce', one of those processes will have its publish request refused -- so we've avoided corrupting the state. - Envision some copycat-like system, where we have some sharded postgres setup and we're tailing each shard into its own partition. Normally, it's fairly easy to avoid duplicates here: we can track which offset in the WAL corresponds to which offset in Kafka, and we know how many messages we've written to Kafka already, so the state is very simple. 
However, it is possible that for a moment -- due to rebalancing or operator error or some other thing -- two different nodes are tailing the same postgres shard at once! Normally this would introduce duplicate messages, but by specifying the expected offset, we can avoid this. So perhaps it's better to say that this is useful when a single producer is *expected*, but multiple producers are *possible*? (In the same way that the high-level consumer normally has 1 consumer in a group reading from a partition, but there are small windows where more than one might be reading at the same time.) This is also the spirit of the 'runtime cost' comment -- in the common case, where there is little to no contention, there's no performance overhead either. I mentioned this a little in the Motivation section -- maybe I should flesh that out a little bit? For me, the motivation to work this up was that I kept running into cases, like the above, where the existing API was almost-but-not-quite enough to give the guarantees I was looking for -- and the extension needed to handle those cases too was pretty small and natural-feeling. On Fri, Jul 17, 2015 at 4:49 PM, Ashish Singh asi...@cloudera.com wrote: Good concept. I have a question though. Say there are two producers A and B. Both producers are producing to same partition. - A sends a message with expected offset, x1 - Broker accepts it and sends an Ack - B sends a message with expected offset, x1 - Broker rejects it, sends nack - B sends message again with expected offset, x1+1 - Broker accepts it and sends Ack I guess this is what this KIP suggests, right? If yes, then how does this ensure that same message will not be written twice when two producers are producing to same partition? Producer on receiving a nack will try again with next offset and will keep doing so till the message is accepted. Am I missing something? 
Also, you have mentioned on KIP, it imposes little to no runtime cost in memory or time, I think that is not true for time. With this approach producers' performance will reduce proportionally to number of producers writing to same partition. Please correct me if I am missing out something. On Fri, Jul 17, 2015 at 11:32 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: If we have 2 producers producing to a partition, they can be out of order, then how does one producer know what offset to expect as it does not interact with other producer? Can you give an example flow that explains how it works with single producer and with multiple producers? Thanks, Mayuresh On Fri, Jul 17, 2015 at 10:28 AM, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: I like this feature, it reminds me of conditional updates in zookeeper. I'm not sure if it'd be best to have some mechanism for fencing rather than a conditional write like you're proposing. The reason I'm saying this is that the conditional write applies to requests individually, while it sounds
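Ashish's two-producer flow above can be replayed against the conditional-write semantics with a small simulation (invented classes, not Kafka code) to make the concern concrete:

```java
// Simulation of the two-producer flow discussed above: B's naive
// "retry with expected+1" policy gets its message in even though A
// already won the first slot -- which is exactly the duplicate /
// interleaving concern raised in the thread. Invented illustration.
import java.util.ArrayList;
import java.util.List;

class TwoProducers {
    private final List<String> log = new ArrayList<>();

    synchronized boolean tryAppend(String msg, long expected) {
        if (expected != log.size()) return false;  // nack: offset mismatch
        log.add(msg);
        return true;                               // ack
    }

    public static void main(String[] args) {
        TwoProducers p = new TwoProducers();
        boolean a1 = p.tryAppend("A:x1", 0);  // A: accepted at x1
        boolean b1 = p.tryAppend("B:x1", 0);  // B: rejected (nack)
        boolean b2 = p.tryAppend("B:x1", 1);  // B retries at x1+1: accepted
        System.out.println(a1 + " " + b1 + " " + b2);  // prints "true false true"
    }
}
```

The CAS prevents the *same offset* from being written twice, but a producer that blindly bumps its expected offset on a nack still lands its message; avoiding a semantic duplicate requires B to re-read the log after the nack and decide whether its write is still wanted.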
[jira] [Created] (KAFKA-2350) Add KafkaConsumer pause capability
Jason Gustafson created KAFKA-2350: -- Summary: Add KafkaConsumer pause capability Key: KAFKA-2350 URL: https://issues.apache.org/jira/browse/KAFKA-2350 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... partitions); void unpause(String... partitions); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
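The semantics proposed in the ticket can be sketched with a toy model: poll() skips fetches for paused topics but keeps the subscription (and hence group membership) alive. The pause/unpause names come from the proposal above; nothing here is the actual KafkaConsumer API:

```java
// Toy illustration of the proposed pause semantics: paused topics stay
// subscribed (no rebalance is triggered) but are excluded from fetches.
// Invented sketch, not the KafkaConsumer implementation.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PausableConsumer {
    private final Set<String> subscribed = new HashSet<>();
    private final Set<String> paused = new HashSet<>();

    void subscribe(String topic)    { subscribed.add(topic); }
    void pause(String... topics)    { paused.addAll(Arrays.asList(topics)); }
    void unpause(String... topics)  { paused.removeAll(Arrays.asList(topics)); }

    /** The topics a poll() call would actually fetch from. */
    Set<String> fetchableTopics() {
        Set<String> result = new HashSet<>(subscribed);
        result.removeAll(paused);
        return result;
    }
}
```

In the stream-join use case, the consumer would pause the faster topic, keep calling poll() so group membership is maintained, and unpause once the slower topic catches up.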
[jira] [Commented] (KAFKA-2260) Allow specifying expected offset on produce
[ https://issues.apache.org/jira/browse/KAFKA-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634117#comment-14634117 ] Flavio Junqueira commented on KAFKA-2260: - I like the use of an array to increase the degree of concurrency. This is actually a common trick in concurrent data structures, so suitable here. But, in this case, unless I'm missing the point, isn't it the case that you can't guarantee that two publishers end up succeeding when publishing concurrently, which is one of the use cases that [~bkirwi] says he is trying to avoid? Could you guys clarify this, please? Allow specifying expected offset on produce --- Key: KAFKA-2260 URL: https://issues.apache.org/jira/browse/KAFKA-2260 Project: Kafka Issue Type: Improvement Reporter: Ben Kirwin Assignee: Ewen Cheslack-Postava Priority: Minor Attachments: expected-offsets.patch I'd like to propose a change that adds a simple CAS-like mechanism to the Kafka producer. This update has a small footprint, but enables a bunch of interesting uses in stream processing or as a commit log for process state. h4. Proposed Change In short: - Allow the user to attach a specific offset to each message produced. - The server assigns offsets to messages in the usual way. However, if the expected offset doesn't match the actual offset, the server should fail the produce request instead of completing the write. This is a form of optimistic concurrency control, like the ubiquitous check-and-set -- but instead of checking the current value of some state, it checks the current offset of the log. h4. Motivation Much like check-and-set, this feature is only useful when there's very low contention. Happily, when Kafka is used as a commit log or as a stream-processing transport, it's common to have just one producer (or a small number) for a given partition -- and in many of these cases, predicting offsets turns out to be quite useful. 
- We get the same benefits as the 'idempotent producer' proposal: a producer can retry a write indefinitely and be sure that at most one of those attempts will succeed; and if two producers accidentally write to the end of the partition at once, we can be certain that at least one of them will fail. - It's possible to 'bulk load' Kafka this way -- you can write a list of n messages consecutively to a partition, even if the list is much larger than the buffer size or the producer has to be restarted. - If a process is using Kafka as a commit log -- reading from a partition to bootstrap, then writing any updates to that same partition -- it can be sure that it's seen all of the messages in that partition at the moment it does its first (successful) write. There's a bunch of other similar use-cases here, but they all have roughly the same flavour. h4. Implementation The major advantage of this proposal over other suggested transaction / idempotency mechanisms is its minimality: it gives the 'obvious' meaning to a currently-unused field, adds no new APIs, and requires very little new code or additional work from the server. - Produced messages already carry an offset field, which is currently ignored by the server. This field could be used for the 'expected offset', with a sigil value for the current behaviour. (-1 is a natural choice, since it's already used to mean 'next available offset'.) - We'd need a new error and error code for a 'CAS failure'. - The server assigns offsets to produced messages in {{ByteBufferMessageSet.validateMessagesAndAssignOffsets}}. After this change, this method would assign offsets in the same way -- but if they don't match the offset in the message, we'd return an error instead of completing the write. - To avoid breaking existing clients, this behaviour would need to live behind some config flag. (Possibly global, but probably more useful per-topic?) 
I understand all this is unsolicited and possibly strange: happy to answer questions, and if this seems interesting, I'd be glad to flesh this out into a full KIP or patch. (And apologies if this is the wrong venue for this sort of thing!) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Kafka Unit Test Failures on a Mac
Thanks Ismael! I agree clear failures or no failures is optimal. I did some hacky analysis of the open files by running the tests and utilizing the lsof command. In one run of the core tests I found the following:

- 4584 regular files (REG)
  - 376 .jar files
    - Not much one can/should do here. Many are from gradle itself.
  - 2392 kafka .log files
    - why are these being leaked? after a single test no file handles should remain
  - 1162 kafka .log.deleted files
    - why are these being leaked?
  - 469 kafka .index files
    - This is due to Java's handling of MappedByteBuffer: "A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected."
    - http://bugs.java.com/view_bug.do?bug_id=4724038
    - http://stackoverflow.com/questions/2972986/how-to-unmap-a-file-from-memory-mapped-using-filechannel-in-java
    - Perhaps setting mmap to null when kafka.log.OffsetIndex.close is called would help ensure this gets GC'd asap.
- 943 of types PIPE and KQUEUE
  - 629 PIPE
  - 314 KQUEUE
  - should do some analysis sometime
- 47 of other types (TCP, unix, IPv6, ...)

On Sun, Jul 19, 2015 at 3:16 PM, Ismael Juma ism...@juma.me.uk wrote: Hello Grant, Thanks for figuring this out. I have also run into this issue when running the tests on OS X Yosemite. Ideally the tests would fail in a way that would make it clear what the issue is. That may be complicated, so we should at least document it as you suggest. I'll let you know if the issue goes away for me too with this change. Best, Ismael On Sun, Jul 19, 2015 at 4:24 PM, Grant Henke ghe...@cloudera.com wrote: When running all Kafka tests I had been getting failures almost every time. Usually in the SocketServerTest class. However, when I would run individual tests, there were no failures. After a bit of digging I found this is due to the small default open files limit in Mac Yosemite. I am posting how to increase the limit here in case anyone else has been running into the issue. 
Let me know if this helped you too. If it is fairly common we can put something on the wiki.

*Adjusting Open File Limits in Yosemite:*
Note: You can choose your own limits as appropriate

1. Write the following xml to /Library/LaunchDaemons/limit.maxfiles.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>limit.maxfiles</string>
    <key>ProgramArguments</key>
    <array>
      <string>launchctl</string>
      <string>limit</string>
      <string>maxfiles</string>
      <string>65536</string>
      <string>65536</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>ServiceIPC</key>
    <false/>
  </dict>
</plist>

2. Then write the following to /Library/LaunchDaemons/limit.maxproc.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>limit.maxproc</string>
    <key>ProgramArguments</key>
    <array>
      <string>launchctl</string>
      <string>limit</string>
      <string>maxproc</string>
      <string>2048</string>
      <string>2048</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>ServiceIPC</key>
    <false/>
  </dict>
</plist>

3. Add the following to your bashrc or bash profile:

ulimit -n 65536
ulimit -u 2048

4. Restart your computer. After restart validate settings by executing: launchctl limit

*Adjusting Open File Limits in Older Versions of OS X:*
Note: You can choose your own limits as appropriate

1. Add the following command to /etc/launchd.conf: limit maxfiles 32768 65536
2. Restart your computer. After restart validate settings by executing: launchctl limit

-- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
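The per-type lsof tallies earlier in the thread can be reproduced with a pipeline along these lines. The pgrep pattern is illustrative -- point it at whichever test JVM you are inspecting:

```shell
# Tally open file descriptors by TYPE (REG, PIPE, KQUEUE, ...) for a
# running JVM, similar to the per-type counts above. Column 5 of lsof's
# default output is the TYPE column.
pid=$(pgrep -f 'GradleWorkerMain' | head -n 1)
lsof -p "$pid" | awk 'NR > 1 { count[$5]++ } END { for (t in count) print count[t], t }' | sort -rn
```

Re-running this after each test class is a quick way to spot which tests leak .log or .index handles.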
Re: [DISCUSS] KIP-27 - Conditional Publish
It would be worth fleshing out the use cases a bit more and thinking through the overlap with the other proposals for transactions and idempotence (since likely we will end up with both). The advantage of this proposal is that it is really simple. If we go through use cases: 1. Stream processing: I suspect in this case data is partitioned over multiple partitions/topics by multiple writers so it needs a more general atomicity across partitions. 2. Copycat: This is the case where you're publishing data from an external system. For some external systems I think this mechanism could provide an exactly-once publication mechanism however there are some details about retries to think through. 3. Key-value store/event sourcing: This is the case where you are building a log-centric key-value store or an event sourced application. I think this could potentially use this feature but it needs thinking through. One subtlety to think through is the relationship with request pipelining and retries. -Jay On Mon, Jul 20, 2015 at 12:05 PM, Ben Kirwin b...@kirw.in wrote: Hi Jun, Thanks for the close reading! Responses inline. Thanks for the write-up. The single producer use case you mentioned makes sense. It would be useful to include that in the KIP wiki. Great -- I'll make sure that the wiki is clear about this. 1. What happens when the leader of the partition changes in the middle of a produce request? In this case, the producer client is not sure whether the request succeeds or not. If there is only a single message in the request, the producer can just resend the request. If it sees an OffsetMismatch error, it knows that the previous send actually succeeded and can proceed with the next write. This is nice since it not only allows the producer to proceed during transient failures in the broker, it also avoids duplicates during producer resend. One caveat is when there are multiple messages in the same partition in a produce request. 
The issue is that in our current replication protocol, it's possible for some, but not all messages in the request to be committed. This makes resend a bit harder to deal with since on receiving an OffsetMismatch error, it's not clear which messages have been committed. One possibility is to expect that compression is enabled, in which case multiple messages are compressed into a single message. I was thinking that another possibility is for the broker to return the current high watermark when sending an OffsetMismatch error. Based on this info, the producer can resend the subset of messages that have not been committed. However, this may not work in a compacted topic since there can be holes in the offset. This is an excellent question. It's my understanding that at least a *prefix* of messages will be committed (right?) -- which seems to be enough for many cases. I'll try and come up with a more concrete answer here. 2. Is this feature only intended to be used with ack = all? The client doesn't get the offset with ack = 0. With ack = 1, it's possible for a previously acked message to be lost during leader transition, which will make the client logic more complicated. It's true that acks = 0 doesn't seem to be particularly useful; in all the cases I've come across, the client eventually wants to know about the mismatch error. However, it seems like there are some cases where acks = 1 would be fine -- eg. in a bulk load of a fixed dataset, losing messages during a leader transition just means you need to rewind / restart the load, which is not especially catastrophic. For many other interesting cases, acks = all is probably preferable. 3. How does the producer client know the offset to send the first message? Do we need to expose an API in producer to get the current high watermark? You're right, it might be irritating to have to go through the consumer API just for this. 
There are some cases where the offsets are already available -- like the commit-log-for-KV-store example -- but in general, being able to get the offsets from the producer interface does sound convenient. We plan to have a KIP discussion meeting tomorrow at 11am PST. Perhaps you can describe this KIP a bit then? Sure, happy to join. Thanks, Jun On Sat, Jul 18, 2015 at 10:37 AM, Ben Kirwin b...@kirw.in wrote: Just wanted to flag a little discussion that happened on the ticket: https://issues.apache.org/jira/browse/KAFKA-2260?focusedCommentId=14632259page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14632259 In particular, Yasuhiro Matsuda proposed an interesting variant on this that performs the offset check on the message key (instead of just the partition), with bounded space requirements, at the cost of potentially some spurious failures. (ie. the produce request may fail even if that particular key hasn't been updated recently.) This addresses a
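The mismatch-plus-high-watermark idea discussed above can be sketched in a few lines. This is a toy model under stated assumptions: `ConditionalLog` and `Resender` are illustrative names, not actual KIP-27 or broker code, and the sketch ignores the compacted-topic case (holes in the offset) that Jun raises.

```java
import java.util.ArrayList;
import java.util.List;

// Toy broker-side log with an expected-offset check (illustrative, not Kafka code).
class ConditionalLog {
    private final List<String> log = new ArrayList<>();

    // Appends the batch only when expectedOffset is the next offset.
    // On a mismatch, returns the current high watermark (Jun's suggestion)
    // so the producer can tell how much of its batch was committed; returns
    // -1 on success.
    synchronized long tryAppend(long expectedOffset, List<String> batch) {
        if (expectedOffset != log.size()) {
            return log.size(); // OffsetMismatch + high watermark
        }
        log.addAll(batch);
        return -1; // ack
    }
}

// Producer-side retry helper: given the offset the batch was sent at and the
// high watermark from the error, keep only the uncommitted suffix for resend.
// This relies on Ben's prefix-commit assumption (a prefix of the batch commits).
class Resender {
    static List<String> uncommittedSuffix(List<String> batch, long sentAtOffset, long highWatermark) {
        int committed = (int) Math.max(0, highWatermark - sentAtOffset);
        return batch.subList(Math.min(committed, batch.size()), batch.size());
    }
}
```

For example, a three-message batch sent at expected offset 5 that comes back with high watermark 7 had its first two messages committed, so only the third needs to be resent.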
[jira] [Commented] (KAFKA-313) Add JSON/CSV output and looping options to ConsumerGroupCommand
[ https://issues.apache.org/jira/browse/KAFKA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634069#comment-14634069 ] Gwen Shapira commented on KAFKA-313: [~nehanarkhede] - mind if I review? Add JSON/CSV output and looping options to ConsumerGroupCommand --- Key: KAFKA-313 URL: https://issues.apache.org/jira/browse/KAFKA-313 Project: Kafka Issue Type: Improvement Reporter: Dave DeMaagd Assignee: Ashish K Singh Priority: Minor Labels: newbie, patch Fix For: 0.8.3 Attachments: KAFKA-313-2012032200.diff, KAFKA-313.1.patch, KAFKA-313.patch, KAFKA-313_2015-02-23_18:11:32.patch, KAFKA-313_2015-06-24_11:14:24.patch Adds: * '--loop N' - causes the program to loop forever, sleeping for up to N seconds between loops (loop time minus collection time, unless that's less than 0, at which point it will just run again immediately) * '--asjson' - display as a JSON string instead of the more human readable output format. Neither of the above depend on each other (you can loop in the human readable output, or do a single shot execution with JSON output). Existing behavior/output maintained if neither of the above are used. Diff Attached. Impacted files: core/src/main/scala/kafka/admin/ConsumerGroupCommand.scala -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36590: Patch for KAFKA-2275
On July 19, 2015, 1:11 a.m., Jason Gustafson wrote: Ashish Singh wrote: Jason, thanks for your review! I looked into ConsumerNetworkClient/NetworkClient, Metadata and Cluster classes. On receiving metadataUpdate, cluster instance in metadata is updated. However, when a topic is added by consumer, it is added to metadata.topics. After considering various options, I have updated the patch with what I think are the least obtrusive changes. So, we still keep metadata.topics as the list of topics we are interested in maintaining the state for, however we can choose to get metadata for all topics by setting metadata.needMetadataForAllTopics. One thing to notice is that in the current implementation there is no caching for allTopics metadata, which might be OK depending on how we are planning to use it. We can discuss further once you take a look at the latest patch. Jason Gustafson wrote: Hey Ashish, that makes sense and I agree that it seems less obtrusive. One concern I have is that we're using the same Cluster object in Metadata for representing both the set of all metadata and for just a subset (those topics that have been added through subscriptions). It seems like there might be potential for conflict there. Additionally I'm not sure how we'll be able to extend this to handle regex subscriptions. Basically we need to be able to listen for metadata changes and update our subscriptions based on any topic changes. We could block to get all metadata, but it's probably best if we can do this asynchronously. Do you have any thoughts on this? {quote} One concern I have is that we're using the same Cluster object in Metadata for representing both the set of all metadata and for just a subset (those topics that have been added through subscriptions). It seems like there might be potential for conflict there. {quote} Maybe I should move the flag, indicating cluster has metadata for all topics or subset of topics, to Cluster. Makes sense? 
{quote} Additionally I'm not sure how we'll be able to extend this to handle regex subscriptions. Basically we need to be able to listen for metadata changes and update our subscriptions based on any topic changes. We could block to get all metadata, but it's probably best if we can do this asynchronously. Do you have any thoughts on this? {quote} I do not think there is a way to directly subscribe to metadata changes as of now. Correct me if my understanding is wrong. One would have to periodically poll to get metadata updates. Now, the question becomes where should this polling be done? With the current modification, the regex subscriber will have to manage the polling logic. We can definitely push the polling logic down to, say, NetworkClient, but then the question will be is it required? Let me know your thoughts. - Ashish --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92194 --- On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... 
topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
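The approach described in this patch can be modeled in miniature. This is an illustrative sketch, not the real `org.apache.kafka.clients.Metadata` class: `metadata.topics` tracks the subscribed topics, and a `needMetadataForAllTopics` flag widens the next metadata request to every topic (following the convention that an empty topic list in a metadata request means "all topics").

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the needMetadataForAllTopics approach (illustrative names).
class ToyMetadata {
    private final Set<String> topics = new HashSet<>();
    private boolean needMetadataForAllTopics = false;

    // Topics the consumer wants to maintain state for.
    void add(String topic) {
        topics.add(topic);
    }

    void needMetadataForAllTopics(boolean need) {
        this.needMetadataForAllTopics = need;
    }

    // Topic set for the next metadata request; an empty set stands for
    // "fetch metadata for all topics".
    Set<String> topicsForRequest() {
        return needMetadataForAllTopics ? new HashSet<>() : new HashSet<>(topics);
    }
}
```

As Ashish notes, nothing here caches the all-topics response; whether that matters depends on how often callers request the full topic list.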
[jira] [Commented] (KAFKA-2299) kafka-patch-review tool does not correctly capture testing done
[ https://issues.apache.org/jira/browse/KAFKA-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634068#comment-14634068 ] Gwen Shapira commented on KAFKA-2299: - [~nehanarkhede] - mind if I review? kafka-patch-review tool does not correctly capture testing done --- Key: KAFKA-2299 URL: https://issues.apache.org/jira/browse/KAFKA-2299 Project: Kafka Issue Type: Bug Reporter: Ashish K Singh Assignee: Ashish K Singh Attachments: KAFKA-2299.patch kafka-patch-review tool does not correctly capture testing done when specified with -t or --testing-done. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer
[ https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634080#comment-14634080 ] Ashish K Singh commented on KAFKA-2275: --- I guess it will be better to keep the design level discussion on the JIRA and not on RB. Copying relevant conversation from RB here. {quote} Jason: Adding the topic to the Metadata object means that from this point forward, we will always fetch the associated metadata for whatever topics were used in partitionsFor, even if we don't actually care about them anymore. Seems a little unfortunate, though I doubt it's much of an issue since users would probably only call this method for subscribed topics. Ashish: Jason, thanks for your review! I looked into ConsumerNetworkClient/NetworkClient, Metadata and Cluster classes. On receiving metadataUpdate, cluster instance in metadata is updated. However, when a topic is added by consumer, it is added to metadata.topics. After considering various options, I have updated the patch with what I think are the least obtrusive changes. So, we still keep metadata.topics as the list of topics we are interested in maintaining the state for, however we can choose to get metadata for all topics by setting metadata.needMetadataForAllTopics. One thing to notice is that in the current implementation there is no caching for allTopics metadata, which might be OK depending on how we are planning to use it. We can discuss further once you take a look at the latest patch. Jason Gustafson 1 hour, 50 minutes ago (July 20, 2015, 7:12 p.m.) Hey Ashish, that makes sense and I agree that it seems less obtrusive. One concern I have is that we're using the same Cluster object in Metadata for representing both the set of all metadata and for just a subset (those topics that have been added through subscriptions). It seems like there might be potential for conflict there. 
Additionally I'm not sure how we'll be able to extend this to handle regex subscriptions. Basically we need to be able to listen for metadata changes and update our subscriptions based on any topic changes. We could block to get all metadata, but it's probably best if we can do this asynchronously. Do you have any thoughts on this? Ashish Singh 4 minutes ago (July 20th, 2015, 8:58 p.m.) One concern I have is that we're using the same Cluster object in Metadata for representing both the set of all metadata and for just a subset (those topics that have been added through subscriptions). It seems like there might be potential for conflict there. Maybe I should move the flag, indicating cluster has metadata for all topics or subset of topics, to Cluster. Makes sense? Additionally I'm not sure how we'll be able to extend this to handle regex subscriptions. Basically we need to be able to listen for metadata changes and update our subscriptions based on any topic changes. We could block to get all metadata, but it's probably best if we can do this asynchronously. Do you have any thoughts on this? I do not think there is a way to directly subscribe to metadata changes as of now. Correct me if my understanding is wrong. One would have to periodically poll to get metadata updates. Now, the question becomes where should this polling be done? With the current modification, the regex subscriber will have to manage the polling logic. We can definitely push the polling logic down to, say, NetworkClient, but then the question will be is it required? Let me know your thoughts. {quote} [~hachikuji], [~guozhang] thoughts? 
Add a ListTopics() API to the new consumer -- Key: KAFKA-2275 URL: https://issues.apache.org/jira/browse/KAFKA-2275 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Ashish K Singh Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-2275.patch, KAFKA-2275_2015-07-17_21:39:27.patch, KAFKA-2275_2015-07-20_10:44:19.patch With regex subscription like {code} consumer.subscribe(topic*) {code} The partition assignment is automatically done at the Kafka side, while there are some use cases where consumers want regex subscriptions but not Kafka-side partition assignment, rather with their own specific partition assignment. With ListTopics() they can periodically check for topic list changes and specifically subscribe to the partitions of the new topics. For implementation, it involves sending a TopicMetadataRequest to a random broker and parse the response. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
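The polling pattern described in the JIRA (periodically list topics, detect new matches, subscribe to their partitions) can be sketched client-side. This is illustrative only: `listTopics` here is a stand-in `Supplier` for the proposed consumer API, and `TopicWatcher` is a hypothetical helper, not part of Kafka.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;
import java.util.regex.Pattern;

// Sketch of the periodic topic-list check described in KAFKA-2275
// (illustrative names, not Kafka client code).
class TopicWatcher {
    private final Pattern pattern;
    private final Set<String> known = new HashSet<>();

    TopicWatcher(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    // One polling round: returns the matching topics seen for the first time;
    // the caller would then subscribe to their partitions itself, keeping
    // partition assignment on the client side as the JIRA describes.
    Set<String> poll(Supplier<List<String>> listTopics) {
        Set<String> added = new HashSet<>();
        for (String topic : listTopics.get()) {
            if (pattern.matcher(topic).matches() && known.add(topic)) {
                added.add(topic);
            }
        }
        return added;
    }
}
```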
Re: Review Request 36570: Patch for KAFKA-2337
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92323 --- Ship it! LGTM. Just a small nit. core/src/main/scala/kafka/admin/AdminUtils.scala (lines 249 - 251) https://reviews.apache.org/r/36570/#comment146413 Nit: Our code standard includes not using curly brackets on a single line if block. core/src/main/scala/kafka/admin/TopicCommand.scala (lines 88 - 90) https://reviews.apache.org/r/36570/#comment146412 Nit: Our code standard includes not using curly brackets on a single line if block. - Gwen Shapira On July 20, 2015, 5:37 p.m., Grant Henke wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/ --- (Updated July 20, 2015, 5:37 p.m.) Review request for kafka. Bugs: KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337 Repository: kafka Description --- KAFKA-2337: Verify that metric names will not collide when creating new topics Diffs - core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6 core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805 core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8 core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961 Diff: https://reviews.apache.org/r/36570/diff/ Testing --- Thanks, Grant Henke
[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634124#comment-14634124 ] Jason Gustafson commented on KAFKA-2350: Sure, I just meant that if you fail to call poll() periodically (in order to pause consumption), then no heartbeats can be sent, which will cause the coordinator to rebalance. This only applies if you are using assignment from the coordinator. Add KafkaConsumer pause capability -- Key: KAFKA-2350 URL: https://issues.apache.org/jira/browse/KAFKA-2350 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... topics); void unpause(String... topics); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
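The proposed pause()/unpause() semantics above can be illustrated with a toy fetcher: poll() initiates no new fetches for paused topics but keeps returning data for the rest. This is a minimal sketch of the proposal, not the KafkaConsumer implementation, and `PausableFetcher` is an invented name.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the proposed KafkaConsumer pause capability (illustrative).
class PausableFetcher {
    private final Set<String> paused = new HashSet<>();

    void pause(String... topics) {
        for (String t : topics) paused.add(t);
    }

    void unpause(String... topics) {
        for (String t : topics) paused.remove(t);
    }

    // `pending` maps topic -> queued records; poll drains only unpaused
    // topics, leaving paused topics' records in place for a later poll.
    List<String> poll(Map<String, List<String>> pending) {
        List<String> records = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : pending.entrySet()) {
            if (!paused.contains(e.getKey())) {
                records.addAll(e.getValue());
                e.getValue().clear();
            }
        }
        return records;
    }
}
```

Note that, per Jason's comment, the real consumer would still need poll() to be called periodically even while paused so heartbeats keep flowing and the coordinator does not rebalance.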
Re: Review Request 36570: Patch for KAFKA-2337
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/#review92327 --- Ship it! Ship It! - Grant Henke On July 20, 2015, 9:48 p.m., Grant Henke wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/ --- (Updated July 20, 2015, 9:48 p.m.) Review request for kafka. Bugs: KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337 Repository: kafka Description --- KAFKA-2337: Verify that metric names will not collide when creating new topics Diffs - core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6 core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805 core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8 core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961 Diff: https://reviews.apache.org/r/36570/diff/ Testing --- Thanks, Grant Henke
[jira] [Commented] (KAFKA-2337) Verify that metric names will not collide when creating new topics
[ https://issues.apache.org/jira/browse/KAFKA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634136#comment-14634136 ] Grant Henke commented on KAFKA-2337: Updated reviewboard https://reviews.apache.org/r/36570/diff/ against branch origin/trunk Verify that metric names will not collide when creating new topics -- Key: KAFKA-2337 URL: https://issues.apache.org/jira/browse/KAFKA-2337 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Grant Henke Attachments: KAFKA-2337.patch, KAFKA-2337_2015-07-17_11:17:30.patch, KAFKA-2337_2015-07-20_12:36:41.patch, KAFKA-2337_2015-07-20_16:48:25.patch When creating a new topic, convert the proposed topic name to the name that will be used in metrics and validate that there are no collisions with existing names. See this discussion for context: http://s.apache.org/snW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [DISCUSS] KIP-27 - Conditional Publish
I'm with you on the races that could happen in the scenarios you describe, but I'm still not convinced that conditionally updating is the best call. Instead of conditionally updating, the broker could fence off the old owner to avoid spurious writes, and that's valid for all attempts. The advantage of fencing is that the broker does not accept requests from others at all, while the conditional update is a bit fragile to protect streams of publishes. -Flavio On Fri, Jul 17, 2015 at 11:47 PM, Ben Kirwin b...@kirw.in wrote: Hi all, So, perhaps it's worth adding a couple of specific examples of where this feature is useful, to make this a bit more concrete: - Suppose I'm using Kafka as a commit log for a partitioned KV store, like Samza or Pistachio (?) do. We bootstrap the process state by reading from that partition, and log all state updates to that partition when we're running. Now imagine that one of my processes locks up -- GC or similar -- and the system transitions that partition over to another node. When the GC is finished, the old 'owner' of that partition might still be trying to write to the commit log at the same time as the new one is. A process might detect this by noticing that the offset of the published message is bigger than it thought the upcoming offset was, which implies someone else has been writing to the log... but by then it's too late, and the commit log is already corrupt. With a 'conditional produce', one of those processes will have its publish request refused -- so we've avoided corrupting the state. - Envision some copycat-like system, where we have some sharded postgres setup and we're tailing each shard into its own partition. Normally, it's fairly easy to avoid duplicates here: we can track which offset in the WAL corresponds to which offset in Kafka, and we know how many messages we've written to Kafka already, so the state is very simple. 
However, it is possible that for a moment -- due to rebalancing or operator error or some other thing -- two different nodes are tailing the same postgres shard at once! Normally this would introduce duplicate messages, but by specifying the expected offset, we can avoid this. So perhaps it's better to say that this is useful when a single producer is *expected*, but multiple producers are *possible*? (In the same way that the high-level consumer normally has 1 consumer in a group reading from a partition, but there are small windows where more than one might be reading at the same time.) This is also the spirit of the 'runtime cost' comment -- in the common case, where there is little to no contention, there's no performance overhead either. I mentioned this a little in the Motivation section -- maybe I should flesh that out a little bit? For me, the motivation to work this up was that I kept running into cases, like the above, where the existing API was almost-but-not-quite enough to give the guarantees I was looking for -- and the extension needed to handle those cases too was pretty small and natural-feeling. On Fri, Jul 17, 2015 at 4:49 PM, Ashish Singh asi...@cloudera.com wrote: Good concept. I have a question though. Say there are two producers A and B. Both producers are producing to same partition. - A sends a message with expected offset, x1 - Broker accepts it and sends an Ack - B sends a message with expected offset, x1 - Broker rejects it, sends nack - B sends message again with expected offset, x1+1 - Broker accepts it and sends Ack I guess this is what this KIP suggests, right? If yes, then how does this ensure that same message will not be written twice when two producers are producing to same partition? Producer on receiving a nack will try again with next offset and will keep doing so till the message is accepted. Am I missing something? 
Also, you have mentioned on KIP, it imposes little to no runtime cost in memory or time, I think that is not true for time. With this approach producers' performance will degrade in proportion to the number of producers writing to the same partition. Please correct me if I am missing something. On Fri, Jul 17, 2015 at 11:32 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: If we have 2 producers producing to a partition, they can be out of order, then how does one producer know what offset to expect as it does not interact with the other producer? Can you give an example flow that explains how it works with single producer and with multiple producers? Thanks, Mayuresh On Fri, Jul 17, 2015 at 10:28 AM, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: I like this feature, it reminds me of conditional updates in zookeeper. I'm not sure if it'd be best to have some mechanism for fencing rather than a conditional write like you're proposing. The reason I'm saying this is that the conditional write applies to requests individually, while it sounds
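To make Ashish's concern concrete, his A/B flow can be replayed against a toy in-memory log (illustrative names, not Kafka code). Blindly retrying the same payload at the next offset does write it twice, suggesting the producer should treat a mismatch as a signal to re-check the log's state rather than simply bump the expected offset.

```java
import java.util.ArrayList;
import java.util.List;

// Toy replay of the two-producer flow above (illustrative, not Kafka code).
class ToyLog {
    final List<String> entries = new ArrayList<>();

    // Accepts the append (ack) only when expectedOffset is the next offset;
    // otherwise refuses it (nack).
    boolean tryAppend(long expectedOffset, String msg) {
        if (expectedOffset != entries.size()) return false; // nack
        entries.add(msg);
        return true; // ack
    }
}
```

Replaying the flow: A appends at offset 0 and is acked; B's append at 0 is nacked; if B then blindly retries the same message at offset 1 it is acked, and the log now contains the payload twice.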
[jira] [Updated] (KAFKA-2350) Add KafkaConsumer pause capability
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson updated KAFKA-2350: --- Description: There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... topics); void unpause(String... topics); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. was: There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... partitions); void unpause(String... partitions); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. Add KafkaConsumer pause capability -- Key: KAFKA-2350 URL: https://issues.apache.org/jira/browse/KAFKA-2350 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. 
For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... topics); void unpause(String... topics); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634114#comment-14634114 ] Gwen Shapira commented on KAFKA-2350: - Cool feature :) Can you clarify: If you skip poll() ... then a rebalance will be triggered When does a delay count as skipping? Are we obligated to do the next poll() immediately after the first one ended? I expect to use the consumer to do something like: poll until I get N messages, write those messages elsewhere, poll again. If the write messages elsewhere takes longer than expected (DB is busy kinda scenario), the consumer will lose the partitions? (sorry if I missed important discussion elsewhere, feel free to refer me to another JIRA or thread) Add KafkaConsumer pause capability -- Key: KAFKA-2350 URL: https://issues.apache.org/jira/browse/KAFKA-2350 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... topics); void unpause(String... topics); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Kafka High level consumer rebalancing
Hi all, Is there any way I can force Zookeeper/Kafka to rebalance new consumers only for a subset of the total number of partitions? I have a situation where, out of 120 partitions, 60 have already been consumed, but the zookeeper also assigns these empty/inactive partitions for the re-balancing. I want my resources to be used only for the partitions which still have some messages left to read. Thanks -Pranay
[jira] [Commented] (KAFKA-2350) Add KafkaConsumer pause capability
[ https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634126#comment-14634126 ] Gwen Shapira commented on KAFKA-2350: - oh, for some reason I expected heartbeats to be handled in a separate consumer thread. Not sure why though, so never mind :) Add KafkaConsumer pause capability -- Key: KAFKA-2350 URL: https://issues.apache.org/jira/browse/KAFKA-2350 Project: Kafka Issue Type: Improvement Reporter: Jason Gustafson Assignee: Jason Gustafson There are some use cases in stream processing where it is helpful to be able to pause consumption of a topic. For example, when joining two topics, you may need to delay processing of one topic while you wait for the consumer of the other topic to catch up. The new consumer currently doesn't provide a nice way to do this. If you skip poll() or if you unsubscribe, then a rebalance will be triggered and your partitions will be reassigned. One way to achieve this would be to add two new methods to KafkaConsumer: {code} void pause(String... topics); void unpause(String... topics); {code} When a topic is paused, a call to KafkaConsumer.poll will not initiate any new fetches for that topic. After it is unpaused, fetches will begin again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KAFKA-2337) Verify that metric names will not collide when creating new topics
[ https://issues.apache.org/jira/browse/KAFKA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Henke updated KAFKA-2337: --- Attachment: KAFKA-2337_2015-07-20_16:48:25.patch Verify that metric names will not collide when creating new topics -- Key: KAFKA-2337 URL: https://issues.apache.org/jira/browse/KAFKA-2337 Project: Kafka Issue Type: Bug Reporter: Gwen Shapira Assignee: Grant Henke Attachments: KAFKA-2337.patch, KAFKA-2337_2015-07-17_11:17:30.patch, KAFKA-2337_2015-07-20_12:36:41.patch, KAFKA-2337_2015-07-20_16:48:25.patch When creating a new topic, convert the proposed topic name to the name that will be used in metrics and validate that there are no collisions with existing names. See this discussion for context: http://s.apache.org/snW -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36570: Patch for KAFKA-2337
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36570/ --- (Updated July 20, 2015, 9:48 p.m.) Review request for kafka. Bugs: KAFKA-2337 https://issues.apache.org/jira/browse/KAFKA-2337 Repository: kafka Description --- KAFKA-2337: Verify that metric names will not collide when creating new topics Diffs (updated) - core/src/main/scala/kafka/admin/AdminUtils.scala f06edf41c732a7b794e496d0048b0ce6f897e72b core/src/main/scala/kafka/admin/TopicCommand.scala a90aa8787ff21b963765a547980154363c1c93c6 core/src/main/scala/kafka/common/Topic.scala 32595d6fe432141119db26d3b5ebe229aac40805 core/src/test/scala/unit/kafka/admin/AdminTest.scala 252ac813c8df1780c2dc5fa9e698fb43bb6d5cf8 core/src/test/scala/unit/kafka/common/TopicTest.scala 79532c89c41572ba953c4dc3319a05354927e961 Diff: https://reviews.apache.org/r/36570/diff/ Testing --- Thanks, Grant Henke
[jira] [Commented] (KAFKA-2275) Add a ListTopics() API to the new consumer
[ https://issues.apache.org/jira/browse/KAFKA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634151#comment-14634151 ] Jason Gustafson commented on KAFKA-2275: [~singhashish], thanks for the response. Comments below. {quote} Maybe I should move the flag, indicating cluster has metadata for all topics or subset of topics, to Cluster. Makes sense? {quote} Yeah, that might work. I've also wondered if we could just keep a separate Cluster object when querying for all metadata, but it feels like overkill. I actually sort of think that we need to be able to send metadata requests through NetworkClient without it hijacking the response. Then we wouldn't need to worry about partitionsFor polluting the state of the consumer with metadata we don't care about. Perhaps this could be done by having NetworkClient peek at the in-flight requests to see if there is a pending metadata request instead of just consuming the response directly. {quote} I do not think there is a way to directly subscribe to metadata changes as of now. Correct me if my understanding is wrong. One would have to periodically poll to get metadata updates. Now, the question becomes where should this polling be done? With the current modification, the regex subscriber will have to manage the polling logic. We can definitely push the polling logic down to say Network client, but then the question will be is it required? Let me know your thoughts. {quote} I think we can manage the polling with a background task (sort of like how heartbeats and auto-commits are done). But if we're sort of concurrently sending out requests for all topics and for only a subset of the topics, we'd have to get a little lucky that the right metadata is available when the task runs. Does that make sense? 
Add a ListTopics() API to the new consumer -- Key: KAFKA-2275 URL: https://issues.apache.org/jira/browse/KAFKA-2275 Project: Kafka Issue Type: Sub-task Components: consumer Reporter: Guozhang Wang Assignee: Ashish K Singh Priority: Critical Fix For: 0.8.3 Attachments: KAFKA-2275.patch, KAFKA-2275_2015-07-17_21:39:27.patch, KAFKA-2275_2015-07-20_10:44:19.patch With regex subscription like {code} consumer.subscribe("topic*") {code} the partition assignment is automatically done at the Kafka side, while there are some use cases where consumers want regex subscriptions but not Kafka-side partition assignment, rather with their own specific partition assignment. With ListTopics() they can periodically check for topic list changes and specifically subscribe to the partitions of the new topics. For implementation, it involves sending a TopicMetadataRequest to a random broker and parsing the response. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
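The periodic check-for-new-topics flow that the ticket describes can be sketched without any Kafka dependency. `TopicWatcher` and its method names below are hypothetical illustrations of the client-side diffing logic, not the proposed consumer API:

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical sketch: each time the application fetches the full topic
// list (e.g. via a future listTopics()), it diffs the result against the
// topics it already knows, so it can do its own partition assignment for
// newly created topics matching the regex.
public class TopicWatcher {
    private final Pattern pattern;
    private final Set<String> known = new HashSet<>();

    public TopicWatcher(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    /** Returns matching topics seen for the first time in this snapshot. */
    public Set<String> newMatches(Collection<String> allTopics) {
        Set<String> fresh = new TreeSet<>();
        for (String t : allTopics) {
            if (pattern.matcher(t).matches() && known.add(t)) {
                fresh.add(t);
            }
        }
        return fresh;
    }
}
```

Calling `newMatches` on successive metadata snapshots yields only the topics that appeared since the previous poll, which is exactly the signal needed to subscribe to their partitions.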
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92307 --- clients/src/main/java/org/apache/kafka/common/Cluster.java (line 188) https://reviews.apache.org/r/36590/#comment146380 This method name is sort of a misnomer: ``pruneCluster`` for what? Firstly, it doesn't specify what it is pruning (the topics? the partitionInfo? Both?). Secondly, it is not modifying the current cluster object, but returning a new instance with only topics that have one or more ``partitionInfo``. I don't know which name would be better (pruneEmptyPartitionTopics?), but we can come up with something a bit more descriptive, I guess. :) - Edward Ribeiro On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String...
topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
Re: [DISCUSS] KIP-27 - Conditional Publish
Up to Ben to clarify, but I'd think that in this case, it is up to the logic of B to decide what to do. B knows that the offset isn't what it expects, so it can react accordingly. If it chooses to try again, then it should not violate any application invariant. -Flavio On Fri, Jul 17, 2015 at 9:49 PM, Ashish Singh asi...@cloudera.com wrote: Good concept. I have a question though. Say there are two producers A and B. Both producers are producing to the same partition. - A sends a message with expected offset, x1 - Broker accepts it and sends an Ack - B sends a message with expected offset, x1 - Broker rejects it, sends nack - B sends message again with expected offset, x1+1 - Broker accepts it and sends Ack I guess this is what this KIP suggests, right? If yes, then how does this ensure that the same message will not be written twice when two producers are producing to the same partition? A producer on receiving a nack will try again with the next offset and will keep doing so till the message is accepted. Am I missing something? Also, you have mentioned on the KIP that it imposes little to no runtime cost in memory or time; I think that is not true for time. With this approach producers' performance will degrade in proportion to the number of producers writing to the same partition. Please correct me if I am missing something. On Fri, Jul 17, 2015 at 11:32 AM, Mayuresh Gharat gharatmayures...@gmail.com wrote: If we have 2 producers producing to a partition, they can be out of order, then how does one producer know what offset to expect as it does not interact with the other producer? Can you give an example flow that explains how it works with a single producer and with multiple producers? Thanks, Mayuresh On Fri, Jul 17, 2015 at 10:28 AM, Flavio Junqueira fpjunque...@yahoo.com.invalid wrote: I like this feature, it reminds me of conditional updates in zookeeper. I'm not sure if it'd be best to have some mechanism for fencing rather than a conditional write like you're proposing.
The reason I'm saying this is that the conditional write applies to requests individually, while it sounds like you want to make sure that there is a single client writing over multiple requests. -Flavio On 17 Jul 2015, at 07:30, Ben Kirwin b...@kirw.in wrote: Hi there, I just added a KIP for a 'conditional publish' operation: a simple CAS-like mechanism for the Kafka producer. The wiki page is here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-27+-+Conditional+Publish And there's some previous discussion on the ticket and the users list: https://issues.apache.org/jira/browse/KAFKA-2260 https://mail-archives.apache.org/mod_mbox/kafka-users/201506.mbox/%3CCAAeOB6ccyAA13YNPqVQv2o-mT5r=c9v7a+55sf2wp93qg7+...@mail.gmail.com%3E As always, comments and suggestions are very welcome. Thanks, Ben -- -Regards, Mayuresh R. Gharat (862) 250-7125 -- Regards, Ashish
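The accept/reject flow from Ashish's example can be modeled with a toy log. `ConditionalLog` below is an illustration of the KIP-27 semantics under discussion (append succeeds only if the expected offset equals the current log end offset), not broker code:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of conditional publish: an append is accepted only when
// expectedOffset equals the current end offset, so concurrent producers
// race on each offset and the loser gets a "nack".
public class ConditionalLog {
    private final List<String> records = new ArrayList<>();

    /** Appends iff expectedOffset matches the log end offset; returns success. */
    public synchronized boolean tryAppend(long expectedOffset, String record) {
        if (expectedOffset != records.size()) {
            return false; // nack: another producer won the race for this offset
        }
        records.add(record);
        return true;
    }

    public synchronized long endOffset() {
        return records.size();
    }
}
```

Note that, as raised in the thread, a blind retry at the next offset still writes B's record after A's; the conditional write alone does not deduplicate across producers, so B must re-check its application state before retrying.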
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92316 --- clients/src/main/java/org/apache/kafka/common/Cluster.java (line 194) https://reviews.apache.org/r/36590/#comment146392 Sorry, a correction: partitionInfos.addAll(partitions); - Edward Ribeiro On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92309 --- clients/src/main/java/org/apache/kafka/clients/NetworkClient.java (line 478) https://reviews.apache.org/r/36590/#comment146382 ``topics`` is a ``Set<String>``. Also, it's best practice to use ``Collections.<String>emptySet()`` instead of ``Collections.EMPTY_SET``. - Edward Ribeiro On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
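A quick illustration of why the review prefers the typed factory method: `Collections.<String>emptySet()` returns a correctly parameterized, immutable set, whereas the raw `Collections.EMPTY_SET` loses the element type and triggers unchecked warnings. The `noTopics` method name below is illustrative only:

```java
import java.util.Collections;
import java.util.Set;

// Typed vs. raw empty set: the explicit type witness gives a Set<String>
// without an unchecked conversion, backed by the same immutable singleton.
public class EmptySetDemo {
    static Set<String> noTopics() {
        return Collections.<String>emptySet(); // typed, immutable
    }
}
```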
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92313 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1056) https://reviews.apache.org/r/36590/#comment146385 Is there any reason NOT to reuse parts here? I mean, ``topicAndPartitionInfoMap.put(topic, parts)`` instead of calling ``cluster.partitionsForTopic(topic)`` again? Maybe because the partitionInfo can change under our feet between the executions of lines L#1051 and L#1056? - Edward Ribeiro On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String...
topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
Re: Review Request 36590: Patch for KAFKA-2275
On July 19, 2015, 1:11 a.m., Jason Gustafson wrote: Ashish Singh wrote: Jason, thanks for your review! I looked into the ConsumerNetworkClient/NetworkClient, Metadata and Cluster classes. On receiving a metadata update, the cluster instance in metadata is updated. However, when a topic is added by the consumer, it is added to metadata.topics. After considering various options, I have updated the patch with what I think are the least obtrusive changes. So, we still keep metadata.topics as the list of topics we are interested in maintaining the state for; however, we can choose to get metadata for all topics by setting metadata.needMetadataForAllTopics. One thing to notice is that in the current implementation there is no caching for allTopics metadata, which might be OK depending on how we are planning to use it. We can discuss further once you take a look at the latest patch. Hey Ashish, that makes sense and I agree that it seems less obtrusive. One concern I have is that we're using the same Cluster object in Metadata for representing both the set of all metadata and for just a subset (those topics that have been added through subscriptions). It seems like there might be potential for conflict there. Additionally I'm not sure how we'll be able to extend this to handle regex subscriptions. Basically we need to be able to listen for metadata changes and update our subscriptions based on any topic changes. We could block to get all metadata, but it's probably best if we can do this asynchronously. Do you have any thoughts on this? - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92194 --- On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka.
Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String... topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
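Jason's suggestion of handling metadata refresh with a background task (like heartbeats and auto-commits) could look roughly like the sketch below. `MetadataRefresher` and the `fetchAllTopics` callback are hypothetical names, and this deliberately ignores the harder problem raised in the thread of coordinating full-cluster and per-subscription metadata:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Minimal sketch of a heartbeat-style background task: run a metadata
// fetch callback on a fixed schedule instead of blocking the caller.
public class MetadataRefresher implements AutoCloseable {
    private final ScheduledExecutorService exec =
            Executors.newSingleThreadScheduledExecutor();

    /** Schedules fetchAllTopics to run immediately and then every periodMs. */
    public void start(Runnable fetchAllTopics, long periodMs) {
        exec.scheduleAtFixedRate(fetchAllTopics, 0, periodMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        exec.shutdownNow();
    }
}
```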
Re: Review Request 36590: Patch for KAFKA-2275
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/#review92315 --- clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java (line 1081) https://reviews.apache.org/r/36590/#comment146387 Not a big deal here, but it would be nice to return ``topicAndPartitionInfoMap`` wrapped in a ``Collections.unmodifiableMap``. Same would be nice for the original ``partitionsFor`` (a ``Collections.unmodifiableList`` in that case). - Edward Ribeiro On July 20, 2015, 5:44 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36590/ --- (Updated July 20, 2015, 5:44 p.m.) Review request for kafka. Bugs: KAFKA-2275 https://issues.apache.org/jira/browse/KAFKA-2275 Repository: kafka Description --- Add logic to get all topics when needMetadataForAllTopics is set on metadata Return metadata for all topics if empty list is passed to partitionsFor KAFKA-2275: Add a Map<String, List<PartitionInfo>> partitionsFor(String...
topics) API to the new consumer Diffs - clients/src/main/java/org/apache/kafka/clients/Metadata.java 0387f2602c93a62cd333f1b3c569ca6b66b5b779 clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 48fe7961e2215372d8033ece4af739ea06c6457b clients/src/main/java/org/apache/kafka/clients/consumer/Consumer.java 252b759c0801f392e3526b0f31503b4b8fbf1c8a clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java bea3d737c51be77d5b5293cdd944d33b905422ba clients/src/main/java/org/apache/kafka/clients/consumer/MockConsumer.java c14eed1e95f2e682a235159a366046f00d1d90d6 clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerNetworkClient.java 9517d9d0cd480d5ba1d12f1fde7963e60528d2f8 clients/src/main/java/org/apache/kafka/common/Cluster.java 60594a7dce90130911a626ea80cf80d815aeb46e core/src/test/scala/integration/kafka/api/ConsumerTest.scala 3eb5f95731a3f06f662b334ab2b3d0ad7fa9e1ca Diff: https://reviews.apache.org/r/36590/diff/ Testing --- Thanks, Ashish Singh
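The wrapping that the review suggests can be sketched as follows. The `partitionsByTopic` method and its string partition labels are illustrative stand-ins, not the actual KafkaConsumer code; the point is only that `Collections.unmodifiableMap`/`unmodifiableList` make the returned view read-only for callers:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: build the result internally in mutable collections,
// then hand callers an unmodifiable view so they cannot corrupt it.
public class UnmodifiableDemo {
    static Map<String, List<String>> partitionsByTopic() {
        Map<String, List<String>> m = new HashMap<>();
        m.put("topicA", Collections.unmodifiableList(Arrays.asList("p0", "p1")));
        return Collections.unmodifiableMap(m);
    }
}
```

Any mutating call on the returned map or its lists throws `UnsupportedOperationException`, while reads behave normally.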
[jira] [Commented] (KAFKA-2339) broker becomes unavailable if bad data is passed through the protocol
[ https://issues.apache.org/jira/browse/KAFKA-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633756#comment-14633756 ] Joe Stein commented on KAFKA-2339: -- I haven't had a chance to try to reproduce this yet more exactly. I will see about doing that in the next day or so. broker becomes unavailable if bad data is passed through the protocol - Key: KAFKA-2339 URL: https://issues.apache.org/jira/browse/KAFKA-2339 Project: Kafka Issue Type: Bug Reporter: Joe Stein Assignee: Timothy Chen Priority: Critical Fix For: 0.8.3 I ran into a situation where a non-integer value got passed for the partition and the brokers went bonkers. reproducible {code} ah=1..2 echo don't do this in production|kafkacat -b localhost:9092 -p $ah {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 34492: Patch for KAFKA-2210
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34492/#review92345 --- core/src/main/scala/kafka/security/auth/Acl.scala (line 71) https://reviews.apache.org/r/34492/#comment146470 Disclaimer: I am not claiming that you should change the code commented here. Okay, for getting rid of the dreaded ``collection.mutable.HashSet[Acl]()``, you have two options, afaik: 1. use ``(for (i <- list) yield i).toSet``. In the current code it would be something like: ``` val acls = (for (item <- aclSet) yield { val principals: List[KafkaPrincipal] = item(PrincipalKey).asInstanceOf[List[String]].map(principal => KafkaPrincipal.fromString(principal)) val permissionType: PermissionType = PermissionType.fromString(item(PermissionTypeKey).asInstanceOf[String]) val operations: List[Operation] = item(OperationKey).asInstanceOf[List[String]].map(operation => Operation.fromString(operation)) val hosts: List[String] = item(HostsKey).asInstanceOf[List[String]] new Acl(principals.toSet, permissionType, hosts.toSet, operations.toSet) }).toSet ``` The surrounding parentheses around the ``for`` comprehension are important, as ``yield`` would otherwise return the same collection type as aclSet (a List in this case). 2. To use a (private) helper recursive function like, for example: ``` private def listToSet(list: List[Map[String, Any]]): Set[Acl] = { list match { case item::tail => { // L#72 - L#75 processing over `item` Set(new Acl(...)) ++ listToSet(tail) } case Nil => Set.empty[Acl] } } ``` You can call it from ``fromJson`` on ``aclSet`` instead of doing a ``foreach``. In fact, most of lines L#72 - L#75 are composed of Lists that eventually get converted to sets (principals, hosts, operations and acls), so you could generify the helper function above so that you could pass a 'conversion' function, but here I am wary of the complexity of the code starting to outweigh the benefits (?) of not using mutable data structures...
Nevertheless, it would look like: ``` def listToSet[T,K](list: List[T], f: T => K): Set[K] = { list match { case head::tail => Set(f(head)) ++ listToSet(tail, f) case Nil => Set.empty[K] } } ``` - Edward Ribeiro On July 20, 2015, 11:42 p.m., Parth Brahmbhatt wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34492/ --- (Updated July 20, 2015, 11:42 p.m.) Review request for kafka. Bugs: KAFKA-2210 https://issues.apache.org/jira/browse/KAFKA-2210 Repository: kafka Description --- Addressing review comments from Jun. Adding CREATE check for offset topic only if the topic does not exist already. Addressing some more comments. Removing acl.json file Moving PermissionType to trait instead of enum. Following the convention for defining constants. Adding authorizer.config.path back. Addressing more comments from Jun. Diffs - core/src/main/scala/kafka/api/OffsetRequest.scala f418868046f7c99aefdccd9956541a0cb72b1500 core/src/main/scala/kafka/common/AuthorizationException.scala PRE-CREATION core/src/main/scala/kafka/common/ErrorMapping.scala c75c68589681b2c9d6eba2b440ce5e58cddf6370 core/src/main/scala/kafka/security/auth/Acl.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Authorizer.scala PRE-CREATION core/src/main/scala/kafka/security/auth/KafkaPrincipal.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Operation.scala PRE-CREATION core/src/main/scala/kafka/security/auth/PermissionType.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Resource.scala PRE-CREATION core/src/main/scala/kafka/security/auth/ResourceType.scala PRE-CREATION core/src/main/scala/kafka/server/KafkaApis.scala 18f5b5b895af1469876b2223841fd90a2dd255e0 core/src/main/scala/kafka/server/KafkaConfig.scala dbe170f87331f43e2dc30165080d2cb7dfe5fdbf core/src/main/scala/kafka/server/KafkaServer.scala 18917bc4464b9403b16d85d20c3fd4c24893d1d3 core/src/test/scala/unit/kafka/security/auth/AclTest.scala PRE-CREATION
core/src/test/scala/unit/kafka/security/auth/KafkaPrincipalTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/OperationTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/PermissionTypeTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/ResourceTest.scala PRE-CREATION
Re: Review Request 34492: Patch for KAFKA-2210
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34492/#review92347 --- core/src/main/scala/kafka/security/auth/PermissionType.scala (line 47) https://reviews.apache.org/r/34492/#comment146474 The ``return`` here is redundant. In fact, L#46 - L#48 could be rewritten as either: ``` def values() : List[PermissionType] = { List(Allow, Deny) } ``` or just ``` def values = List(Allow, Deny) ``` Please check with the committers what they prefer. In any case the return is unnecessary. ;) - Edward Ribeiro On July 20, 2015, 11:42 p.m., Parth Brahmbhatt wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34492/ --- (Updated July 20, 2015, 11:42 p.m.) Review request for kafka. Bugs: KAFKA-2210 https://issues.apache.org/jira/browse/KAFKA-2210 Repository: kafka Description --- Addressing review comments from Jun. Adding CREATE check for offset topic only if the topic does not exist already. Addressing some more comments. Removing acl.json file Moving PermissionType to trait instead of enum. Following the convention for defining constants. Adding authorizer.config.path back. Addressing more comments from Jun.
Diffs - core/src/main/scala/kafka/api/OffsetRequest.scala f418868046f7c99aefdccd9956541a0cb72b1500 core/src/main/scala/kafka/common/AuthorizationException.scala PRE-CREATION core/src/main/scala/kafka/common/ErrorMapping.scala c75c68589681b2c9d6eba2b440ce5e58cddf6370 core/src/main/scala/kafka/security/auth/Acl.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Authorizer.scala PRE-CREATION core/src/main/scala/kafka/security/auth/KafkaPrincipal.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Operation.scala PRE-CREATION core/src/main/scala/kafka/security/auth/PermissionType.scala PRE-CREATION core/src/main/scala/kafka/security/auth/Resource.scala PRE-CREATION core/src/main/scala/kafka/security/auth/ResourceType.scala PRE-CREATION core/src/main/scala/kafka/server/KafkaApis.scala 18f5b5b895af1469876b2223841fd90a2dd255e0 core/src/main/scala/kafka/server/KafkaConfig.scala dbe170f87331f43e2dc30165080d2cb7dfe5fdbf core/src/main/scala/kafka/server/KafkaServer.scala 18917bc4464b9403b16d85d20c3fd4c24893d1d3 core/src/test/scala/unit/kafka/security/auth/AclTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/KafkaPrincipalTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/OperationTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/PermissionTypeTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/ResourceTest.scala PRE-CREATION core/src/test/scala/unit/kafka/security/auth/ResourceTypeTest.scala PRE-CREATION core/src/test/scala/unit/kafka/server/KafkaConfigConfigDefTest.scala 04a02e08a54139ee1a298c5354731bae009efef3 Diff: https://reviews.apache.org/r/34492/diff/ Testing --- Thanks, Parth Brahmbhatt