[jira] [Commented] (KAFKA-9338) Incremental fetch sessions do not maintain or use leader epoch for fencing purposes
[ https://issues.apache.org/jira/browse/KAFKA-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016601#comment-17016601 ] ASF GitHub Bot commented on KAFKA-9338: --- hachikuji commented on pull request #7970: KAFKA-9338; Fetch session should cache request leader epoch URL: https://github.com/apache/kafka/pull/7970 Since the leader epoch was not maintained in the fetch session cache, no validation would be done except for the initial (full) fetch request. This patch adds the leader epoch to the session cache and addresses the testing gaps. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Incremental fetch sessions do not maintain or use leader epoch for fencing purposes > --- > > Key: KAFKA-9338 > URL: https://issues.apache.org/jira/browse/KAFKA-9338 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 2.1.0, 2.2.0, 2.3.0, 2.4.0 > Reporter: Lucas Bradstreet > Assignee: Jason Gustafson > Priority: Major > > KIP-320 adds the ability to fence replicas by detecting stale leader epochs from followers, and to help consumers handle unclean truncation. Unfortunately, the incremental fetch session handling does not maintain or use the leader epoch in the fetch session cache. As a result, it does not appear that the leader epoch is used for fencing a majority of the time. I'm not sure if this is only the case after incremental fetch sessions are established - it may be the case that the first "full" fetch session is safe.
> Optional.empty is returned for the FetchRequest.PartitionData here: > [https://github.com/apache/kafka/blob/a4cbdc6a7b3140ccbcd0e2339e28c048b434974e/core/src/main/scala/kafka/server/FetchSession.scala#L111] > I believe this affects brokers from 2.1.0, when fencing was improved on the replica fetcher side, and consumers from 2.3.0, when client-side truncation detection was added. -- This message was sent by Atlassian Jira (v8.3.4#803005)
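The idea behind the fix can be sketched in plain Java (illustrative only, not Kafka's actual FetchSession code; names like FetchSessionSketch are hypothetical): the per-partition state cached between incremental fetches must retain the leader epoch from the original request, so the broker can fence later fetches carrying a stale epoch. Caching Optional.empty() — the pre-fix behaviour — disables fencing entirely.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch of the fix's idea, not Kafka's actual FetchSession code.
class FetchSessionSketch {
    // topic-partition -> leader epoch supplied in the initial (full) fetch
    private final Map<String, Optional<Integer>> cachedEpochs = new HashMap<>();

    void onFullFetch(String topicPartition, Optional<Integer> leaderEpoch) {
        cachedEpochs.put(topicPartition, leaderEpoch); // the fix: cache the epoch
    }

    /** True if the cached epoch is stale relative to the current leader epoch. */
    boolean isFenced(String topicPartition, int currentLeaderEpoch) {
        Optional<Integer> cached = cachedEpochs.getOrDefault(topicPartition, Optional.empty());
        // Optional.empty() (the pre-fix behaviour) disables fencing for the partition.
        return cached.map(e -> e < currentLeaderEpoch).orElse(false);
    }
}
```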
[jira] [Commented] (KAFKA-9410) Make groupId Optional in KafkaConsumer
[ https://issues.apache.org/jira/browse/KAFKA-9410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016553#comment-17016553 ] ASF GitHub Bot commented on KAFKA-9410: --- hachikuji commented on pull request #7943: KAFKA-9410: Make groupId Optional in KafkaConsumer URL: https://github.com/apache/kafka/pull/7943 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Make groupId Optional in KafkaConsumer > -- > > Key: KAFKA-9410 > URL: https://issues.apache.org/jira/browse/KAFKA-9410 > Project: Kafka > Issue Type: Improvement > Components: clients > Reporter: David Mollitor > Priority: Minor > > ... because it is. > > [https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
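The idea behind the ticket can be sketched with a hypothetical helper (not the actual KafkaConsumer change): model group.id as an Optional<String>, so "this consumer is not part of a group" is explicit in the type rather than signalled by a null or empty string.

```java
import java.util.Optional;

// Hypothetical sketch of treating group.id as Optional, per the ticket's idea.
class GroupIdSketch {
    static Optional<String> groupId(String configured) {
        // An unset or blank group.id means the consumer belongs to no group.
        return Optional.ofNullable(configured).filter(id -> !id.isEmpty());
    }
}
```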
[jira] [Commented] (KAFKA-7317) Use collections subscription for main consumer to reduce metadata
[ https://issues.apache.org/jira/browse/KAFKA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016522#comment-17016522 ] ASF GitHub Bot commented on KAFKA-7317: --- ableegoldman commented on pull request #7969: KAFKA-7317: Use collections subscription for main consumer to reduce metadata URL: https://github.com/apache/kafka/pull/7969 Also addresses [KAFKA-8821](https://issues.apache.org/jira/browse/KAFKA-8821) Note that we still have to fall back to using pattern subscription if the user has added any regex-based source nodes to the topology. Includes some minor cleanup on the side This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Use collections subscription for main consumer to reduce metadata > - > > Key: KAFKA-7317 > URL: https://issues.apache.org/jira/browse/KAFKA-7317 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: Matthias J. Sax > Assignee: Sophie Blee-Goldman > Priority: Major > > In KAFKA-4633 we switched from "collection subscription" to "pattern subscription" for `Consumer#subscribe()` to avoid triggering auto topic creation on the broker. In KAFKA-5291, the metadata request was extended to overwrite the broker config within the request itself. However, this feature is only used in `KafkaAdminClient`. KAFKA-7320 adds this feature for the consumer client, too. > This ticket proposes to use the new feature within Kafka Streams to allow the usage of collection-based subscription in consumer and admin clients, to reduce the metadata response size, which can be very large for a large number of partitions in the cluster.
> Note that Streams needs to be able to distinguish whether it connects to older brokers that do not support the new metadata request, and still use pattern subscription in that case. -- This message was sent by Atlassian Jira (v8.3.4#803005)
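The fallback conditions described above can be sketched as a single predicate (method and parameter names here are hypothetical, not Streams' actual internals): collection-based subscription is only safe when the topology has no regex source nodes and the broker supports the newer metadata request that suppresses auto topic creation.

```java
// Sketch of the fallback logic described in this ticket and its comment.
class SubscriptionSketch {
    static boolean usePatternSubscription(boolean topologyHasRegexSources,
                                          boolean brokerSupportsNewMetadataRequest) {
        // Regex source nodes force pattern subscription; so do older brokers,
        // which would auto-create topics on a plain metadata request.
        return topologyHasRegexSources || !brokerSupportsNewMetadataRequest;
    }
}
```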
[jira] [Commented] (KAFKA-9259) suppress() for windowed-Serdes does not work with default serdes
[ https://issues.apache.org/jira/browse/KAFKA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016489#comment-17016489 ] John Roesler commented on KAFKA-9259: - Yep! That sounds right. > suppress() for windowed-Serdes does not work with default serdes > > > Key: KAFKA-9259 > URL: https://issues.apache.org/jira/browse/KAFKA-9259 > Project: Kafka > Issue Type: Bug > Components: streams > Affects Versions: 2.1.0 > Reporter: Matthias J. Sax > Assignee: Omkar Mestry > Priority: Major > Labels: newbie > > The suppress() operator either inherits serdes from its upstream operator or falls back to default serdes from the config. > If the upstream operator is a windowed aggregation, the window-aggregation operator wraps the user passed-in serde with a window-serde and pushes it into suppress() – however, if default serdes are used, the window-aggregation operator cannot push anything into suppress(). At runtime, it just creates a default serde and wraps it accordingly. For this case, suppress() also falls back to default serdes; however, it does not wrap the serde, and thus a ClassCastException is thrown when the serde is used later. > suppress() is already aware whether the upstream aggregation is time/session windowed or not, and thus should use this information to wrap default serdes accordingly. > The current workaround for windowed-suppress is to overwrite the default serde upstream to suppress(), such that suppress() inherits serdes and does not fall back to default serdes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
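A minimal model of the fix the ticket asks for (the Serde types below are stand-ins, not the real org.apache.kafka.common.serialization classes): when suppress() falls back to a default serde and the upstream aggregation is windowed, it must wrap that serde in a windowed serde; returning the raw default serde is what later triggers the ClassCastException.

```java
// Stand-in types modeling the serde-wrapping rule described in the ticket.
interface Serde {}

class DefaultSerde implements Serde {}

class TimeWindowedSerde implements Serde {
    final Serde inner;
    TimeWindowedSerde(Serde inner) { this.inner = inner; }
}

class SuppressSerdeSketch {
    // Wrap the fallback serde when the upstream aggregation is time-windowed;
    // otherwise pass it through unchanged.
    static Serde resolveKeySerde(boolean upstreamIsTimeWindowed, Serde defaultSerde) {
        return upstreamIsTimeWindowed ? new TimeWindowedSerde(defaultSerde) : defaultSerde;
    }
}
```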
[jira] [Commented] (KAFKA-9338) Incremental fetch sessions do not maintain or use leader epoch for fencing purposes
[ https://issues.apache.org/jira/browse/KAFKA-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016488#comment-17016488 ] Jason Gustafson commented on KAFKA-9338: [~lucasbradstreet] Thanks, good find. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9439) Add more public API tests for KafkaProducer
[ https://issues.apache.org/jira/browse/KAFKA-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016479#comment-17016479 ] Xiang Zhang commented on KAFKA-9439: I am interested in this. As a newbie, could I get some advice on this? Thanks. > Add more public API tests for KafkaProducer > --- > > Key: KAFKA-9439 > URL: https://issues.apache.org/jira/browse/KAFKA-9439 > Project: Kafka > Issue Type: Improvement > Reporter: Boyang Chen > Priority: Major > Labels: newbie > > While working on KIP-447, we realized a lack of test coverage on the KafkaProducer public APIs. For example, `commitTransaction` and `sendOffsetsToTransaction` are not even called in `KafkaProducerTest.java`, and the code coverage is only 75%. > Adding more unit tests here will be pretty valuable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9338) Incremental fetch sessions do not maintain or use leader epoch for fencing purposes
[ https://issues.apache.org/jira/browse/KAFKA-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson reassigned KAFKA-9338: -- Assignee: Jason Gustafson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016438#comment-17016438 ] Sophie Blee-Goldman commented on KAFKA-9323: I can help you figure out the upgrade matrix though. Might need some input on the much, much older versions, which may not be compatible for upgrades at all, but here are the general rules (the first matching upgrade path is the correct one):
1. Cooperative upgrade: an upgrade from EAGER to COOPERATIVE rebalancing. The cooperative protocol was introduced in 2.4 and turned on by default, so this includes upgrades where from_version < 2.4 and to_version >= 2.4
2. Version probing upgrade: an upgrade across a metadata version bump (see below), where from_version >= 2.0 (version probing was introduced in 2.0)
3. Metadata upgrade: an upgrade across a metadata version bump, where from_version < 2.0
4. Simple upgrade: normal upgrade (no change in rebalance protocol or metadata version)
The Streams versions corresponding to each metadata version so far are:
1 --> 0.10.0
2 --> 0.10.1, 0.10.2, 0.11.0, 1.0, 1.1
3 --> 2.0
4 --> 2.1, 2.2, 2.3
5 --> 2.4
> Refactor Streams' upgrade system tests > --- > > Key: KAFKA-9323 > URL: https://issues.apache.org/jira/browse/KAFKA-9323 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: Sophie Blee-Goldman > Assignee: Nikolay Izhikov > Priority: Major > > With the introduction of version probing in 2.0 and cooperative rebalancing in 2.4, the specific upgrade path depends heavily on the to & from version. This can be a complex operation, and we should make sure to test a realistic upgrade scenario across all possible combinations. The current system tests have gaps however, which have allowed bugs in the upgrade path to slip by unnoticed for several versions. > Our current system tests include a "simple upgrade downgrade" test, a metadata upgrade test, a version probing test, and a cooperative upgrade test.
This has a few drawbacks and current issues: > a) only the version probing test tests "forwards compatibility" (upgrade from latest to future version) > b) nothing tests version probing "backwards compatibility" (upgrade from older version to latest), except: > c) the cooperative rebalancing test actually happens to involve a version probing step, and so coincidentally DOES test VP (but only starting with 2.4) > d) each test itself tries to test the upgrade across different versions, meaning there may be overlap and/or unnecessary tests. For example, currently both the metadata_upgrade and cooperative_upgrade tests will test the upgrade of 1.0 -> 2.4 > e) worse, a number of (to, from) pairs are not tested according to the correct upgrade path at all, which has led to easily reproducible bugs slipping past for several versions. > f) we have a test_simple_upgrade_downgrade test which does not actually do a downgrade, and for some reason only tests upgrading within the range [0.10.1 - 1.1] > g) as new versions are released, it is unclear to those not directly involved in these tests and/or projects whether and what needs to be updated (eg should this new version be added to the cooperative test? should the old version be added to the metadata test?) > We should definitely fill in the testing gap here, but how to do so is of course up for discussion. > I would propose to refactor the upgrade tests, and rather than maintain different lists of versions to pass as input to each different test, we should have a single matrix that we update with each new version that specifies which upgrade path that version combination actually requires. We can then loop through each version combination and test only the actual upgrade path that users would actually need to follow. This way we can be sure we are not missing anything, as each and every possible upgrade would be tested. -- This message was sent by Atlassian Jira (v8.3.4#803005)
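The first-match-wins rules from the comment above can be sketched as a small lookup (a sketch, not the system-test code; the metadata-version table is reconstructed from that comment, with 1.0 and 1.1 sharing metadata version 2):

```java
import java.util.Map;

// Sketch of the upgrade-path rules for Kafka Streams system tests.
class UpgradePathSketch {
    // Streams release -> rebalance metadata version, per the comment above.
    static final Map<String, Integer> METADATA_VERSION = Map.ofEntries(
        Map.entry("0.10.0", 1),
        Map.entry("0.10.1", 2), Map.entry("0.10.2", 2), Map.entry("0.11.0", 2),
        Map.entry("1.0", 2), Map.entry("1.1", 2),
        Map.entry("2.0", 3),
        Map.entry("2.1", 4), Map.entry("2.2", 4), Map.entry("2.3", 4),
        Map.entry("2.4", 5));

    // First matching rule wins: cooperative, then version probing,
    // then metadata upgrade, then simple.
    static String upgradePath(String from, String to) {
        boolean metadataBump = !METADATA_VERSION.get(from).equals(METADATA_VERSION.get(to));
        if (compare(from, "2.4") < 0 && compare(to, "2.4") >= 0) return "cooperative";
        if (metadataBump && compare(from, "2.0") >= 0) return "version_probing";
        if (metadataBump) return "metadata";
        return "simple";
    }

    // Compare dotted version strings numerically, e.g. "0.10.1" < "2.0".
    static int compare(String a, String b) {
        String[] x = a.split("\\."), y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }
}
```

For example, 1.0 -> 2.4 resolves to the cooperative path, matching the observation that both the metadata and cooperative tests currently cover that combination.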
[jira] [Comment Edited] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016420#comment-17016420 ] Sophie Blee-Goldman edited comment on KAFKA-9323 at 1/16/20 12:58 AM: -- Hey [~nizhikov], thanks for picking this up! The relevant files for this are streams_cooperative_rebalance_upgrade_test.py and streams_upgrade_test.py. We don't have this upgrade matrix written down anywhere; at the moment this information is pretty scattered around the system tests and the upgrade guide. This makes it difficult for maintainers to understand and update the tests correctly, and difficult for users to discover how to upgrade between specific versions. We should really add this to the docs in addition to the system tests... Besides compiling this matrix, we definitely need to add a new version probing test that tests upgrading TO the current version. And obviously, we should only run this test for upgrades that actually require version probing, but _don't_ require a cooperative upgrade (since that will include VP) – note this means we can probably just copy over much of the code from the first half of the cooperative test. I'd personally want to call this new test test_version_probing_upgrade and then rename the existing VP test to test_future_version_probing_upgrade (will refer to these like this going forward). Then, we should refactor the tests to leverage a single source of truth (the upgrade matrix) to determine which upgrade path should be followed for a given to/from version. Each version combination should be run by _exactly one_ system test. I think how exactly to go about this is up for some discussion (and I'm sure [~mjsax] has strong opinions on it :P ) but I would propose to have a single test_upgrade system test parametrized by all versions (with to > from) that just calls the appropriate upgrade path by looking it up in the upgrade matrix.
This way it can share all the setup, etc. and the actual individual upgrades can isolate and focus on the specific upgrade path it's testing. Each entry would point to one of the following functions, which contain just the upgrade logic extracted from the corresponding system test. * existing test_simple_upgrade_downgrade -> do_simple_upgrade(to, from): * cooperative_rebalance_upgrade_test -> do_cooperative_upgrade(to, from): * test_metadata_upgrade -> do_metadata_upgrade(to, from): * test_version_probing_upgrade (the new one, not the "future" one) -> do_version_probing_upgrade(to, from): We can list all the "rules" for determining which upgrade path above the upgrade matrix, so maintainers will only need to look at one place to update as new versions come out. This way we are guaranteed to test all possible upgrades, and do so according to the correct upgrade path. We also make it _much_ easier and faster for developers working on new features to add an upgrade path to the system tests, and be sure it will be followed (when appropriate)
[jira] [Updated] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman updated KAFKA-9323: --- Description: With the introduction of version probing in 2.0 and cooperative rebalancing in 2.4, the specific upgrade path depends heavily on the to & from version. This can be a complex operation, and we should make sure to test a realistic upgrade scenario across all possible combinations. The current system tests have gaps however, which have allowed bugs in the upgrade path to slip by unnoticed for several versions. Our current system tests include a "simple upgrade downgrade" test, a metadata upgrade test, a version probing test, and a cooperative upgrade test. This has a few drawbacks and current issues: a) only the version probing test tests "forwards compatibility" (upgrade from latest to future version) b) nothing tests version probing "backwards compatibility" (upgrade from older version to latest), except: c) the cooperative rebalancing test actually happens to involve a version probing step, and so coincidentally DOES test VP (but only starting with 2.4) d) each test itself tries to test the upgrade across different versions, meaning there may be overlap and/or unnecessary tests. For example, currently both the metadata_upgrade and cooperative_upgrade tests will test the upgrade of 1.0 -> 2.4 e) worse, a number of (to, from) pairs are not tested according to the correct upgrade path at all, which has led to easily reproducible bugs slipping past for several versions. f) we have a test_simple_upgrade_downgrade test which does not actually do a downgrade, and for some reason only tests upgrading within the range [0.10.1 - 1.1] g) as new versions are released, it is unclear to those not directly involved in these tests and/or projects whether and what needs to be updated (eg should this new version be added to the cooperative test? should the old version be added to the metadata test?)
We should definitely fill in the testing gap here, but how to do so is of course up for discussion. I would propose to refactor the upgrade tests, and rather than maintain different lists of versions to pass as input to each different test, we should have a single matrix that we update with each new version that specifies which upgrade path that version combination actually requires. We can then loop through each version combination and test only the actual upgrade path that users would actually need to follow. This way we can be sure we are not missing anything, as each and every possible upgrade would be tested.
> Refactor Streams' upgrade system tests > --- > > Key: KAFKA-9323 > URL: https://issues.apache.org/jira/browse/KAFKA-9323 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: Sophie Blee-Goldman >
[jira] [Created] (KAFKA-9442) Kafka connect REST API times out when trying to create a connector
Kedar Shenoy created KAFKA-9442: --- Summary: Kafka connect REST API times out when trying to create a connector Key: KAFKA-9442 URL: https://issues.apache.org/jira/browse/KAFKA-9442 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 2.3.1 Environment: Kafka - Amazon MSK running version 2.3.1 with only Plaintext configured Connect - Docker image version: latest (5.4.0) running on an Amazon ECS cluster with tasks configured. I have attached a snapshot of the environment variables passed to the image. Reporter: Kedar Shenoy Attachments: 404.txt, Startup-log.txt, environment-variables.png, get call1.jpg, get-wrong-connector-name.png, post-s3.png REST API calls for some resources result in a timeout, and there is no exception in the logs. Summary of what was done so far: * Kafka Connect Docker image running as an ECS service with 2 tasks ** Service starts up and creates the 3 topics as expected. ** There are a few warnings during start up but no errors - logs attached * REST API ** Root resource / returns the Connect version. ** GET call to /connectors returns an empty list as expected. ** A wrong resource, e.g. /connectors1, returns a 404. ** GET call with a wrong id results in a timeout, e.g. /connectors/abc where abc does not exist. ** POST call to create a connector config results in a timeout. ** POST call to create a connector with some wrong data returns a 400, e.g. providing different values for attribute "name" in top level vs config entry -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016395#comment-17016395 ] Matthias J. Sax commented on KAFKA-7591: SGTM. > Changelog retention period doesn't synchronise with window-store size > - > > Key: KAFKA-7591 > URL: https://issues.apache.org/jira/browse/KAFKA-7591 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Jon Bates >Priority: Major > > When a new windowed state store is created, the associated changelog topic's > `retention.ms` value is set to `window-size + > CHANGELOG_ADDITIONAL_RETENTION_MS` > h3. Expected Behaviour > If the window-size is updated, the changelog topic's `retention.ms` config > should be updated to reflect the new size > h3. Actual Behaviour > The changelog-topic's `retention.ms` setting is not amended, resulting in > possible loss of data upon application restart > > n.b. Although it is easy to update changelog topic config, I logged this as > `major` due to the potential for data-loss for any user of Kafka-Streams who > may not be intimately aware of the relationship between a windowed store and > the changelog config -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9441) Refactor commit logic
Matthias J. Sax created KAFKA-9441: -- Summary: Refactor commit logic Key: KAFKA-9441 URL: https://issues.apache.org/jira/browse/KAFKA-9441 Project: Kafka Issue Type: Sub-task Components: streams Reporter: Matthias J. Sax With a producer per thread in combination with EOS, it is no longer possible to commit individual tasks independently (as done currently). We need to refactor StreamsThread to commit all tasks at the same time for the new model. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9441) Refactor commit logic
[ https://issues.apache.org/jira/browse/KAFKA-9441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax reassigned KAFKA-9441: -- Assignee: Matthias J. Sax > Refactor commit logic > - > > Key: KAFKA-9441 > URL: https://issues.apache.org/jira/browse/KAFKA-9441 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax >Priority: Major > > With a producer per thread in combination with EOS, it is no longer possible to > commit individual tasks independently (as done currently). > We need to refactor StreamsThread to commit all tasks at the same time for > the new model. -- This message was sent by Atlassian Jira (v8.3.4#803005)
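The constraint described in this ticket can be illustrated with a small sketch (a hypothetical task/transaction model, not the real StreamsThread code): with one producer per thread under EOS, the committed offsets for all tasks on that thread go into the same transaction, so tasks can no longer be committed one at a time.

```python
# Toy model of "commit all tasks at once": the transaction log is shared,
# so every task's offsets must be appended between one begin/commit pair.
class Task:
    def __init__(self, name):
        self.name = name
        self.pending_offsets = {}
        self.committed = False

def commit_all(txn_log, tasks):
    """Commit every task's pending offsets in a single shared transaction."""
    txn_log.append("begin")
    for task in tasks:
        txn_log.append(("offsets", task.name, dict(task.pending_offsets)))
    txn_log.append("commit")
    for task in tasks:
        task.pending_offsets.clear()
        task.committed = True

txn_log = []
tasks = [Task("0_0"), Task("0_1")]
tasks[0].pending_offsets = {"input-0": 42}
commit_all(txn_log, tasks)
```

Committing a single task independently would require its own begin/commit pair, which a shared per-thread producer cannot provide.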
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016372#comment-17016372 ] Sophie Blee-Goldman commented on KAFKA-7591: That's fair; I suppose if it _was_ the window size that changed then all bets are off anyway, while users who just changed the retention period will be able to benefit. We should just change it and log that we did, maybe in the meantime including a warning about what is/isn't a compatible change. Tangentially, we should compile a list of compatible changes that can be made dynamically vs incompatible changes that require a reset, and document that somewhere. This seems like a common question/source of confusion. > Changelog retention period doesn't synchronise with window-store size -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-3596) Kafka Streams: Window expiration needs to consider more than event time
[ https://issues.apache.org/jira/browse/KAFKA-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016366#comment-17016366 ] Sophie Blee-Goldman commented on KAFKA-3596: I see, maybe I misunderstood the original ticket or previous behavior then. I'm not sure if this was different for Streams in the past, but as John said we don't have "future" events, only late ones, with the grace period determining the tolerance. I'll close this as "not a problem" > Kafka Streams: Window expiration needs to consider more than event time > --- > > Key: KAFKA-3596 > URL: https://issues.apache.org/jira/browse/KAFKA-3596 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 0.10.0.0 >Reporter: Henry Cai >Priority: Minor > Labels: architecture > > Currently in Kafka Streams, the way the windows are expired in RocksDB is > triggered by new event insertion. When a window is created at T0 with 10 > minutes retention, when we saw a new record coming with event timestamp T0 + > 10 +1, we will expire that window (remove it) out of RocksDB. > In the real world, it's very easy to see event coming with future timestamp > (or out-of-order events coming with big time gaps between events), this way > of retiring a window based on one event's event timestamp is dangerous. I > think at least we need to consider both the event's event time and > server/stream time elapse. -- This message was sent by Atlassian Jira (v8.3.4#803005)
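The "only late events, with the grace period determining the tolerance" semantics mentioned above can be sketched as follows (a simplified model with made-up numbers, not the actual Streams implementation): stream-time is the maximum record timestamp seen so far, and a record is dropped only once stream-time has passed its window's end plus the grace period.

```python
# Simplified grace-period model: a record is "late" (dropped) only when
# stream-time has advanced past window_end + grace for its window.
def process(records, window_size, grace):
    stream_time = 0
    accepted, dropped = [], []
    for ts in records:
        stream_time = max(stream_time, ts)  # stream-time never moves backward
        window_start = (ts // window_size) * window_size
        window_end = window_start + window_size
        if stream_time > window_end + grace:
            dropped.append(ts)   # window already closed: late record
        else:
            accepted.append(ts)
    return accepted, dropped

# A record at ts=3 arriving after stream-time jumped to 25 is late:
# its window [0,10) closed once stream-time passed 10 + grace 5 = 15.
accepted, dropped = process([1, 25, 3], window_size=10, grace=5)
```

Under this model there is no notion of a "future" event being rejected; a large timestamp simply advances stream-time, which in turn may make subsequent small-timestamp records late.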
[jira] [Resolved] (KAFKA-3596) Kafka Streams: Window expiration needs to consider more than event time
[ https://issues.apache.org/jira/browse/KAFKA-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman resolved KAFKA-3596. Resolution: Not A Problem > Kafka Streams: Window expiration needs to consider more than event time -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-3596) Kafka Streams: Window expiration needs to consider more than event time
[ https://issues.apache.org/jira/browse/KAFKA-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016358#comment-17016358 ] Matthias J. Sax commented on KAFKA-3596: I don't think that the behavior changed in the 2.1 release: Note that if a "future event" arrives (ie, we have a big jump between current stream-time and the timestamp of the new record), stream-time would jump ahead and thus "still-live windows" would be closed as their grace-period and retention time pass, and those windows would be dropped. However, I think that this is correct behavior, and changing how we compute stream-time or how we handle grace-period or retention time would be wrong. Hence, I would close this ticket as "not a problem" (not as "fixed", because there is nothing to be fixed IMHO). To actually address the problem of wrongly timestamped input records (ie, records that jump ahead in time) there are only two correct approaches: 1. If the records are indeed correctly timestamped, it implies a larger degree of out-of-order data, and increasing the grace-period and retention time configs is the right way to address it. 2. If the record timestamp is incorrect, it should be fixed by a custom TimestampExtractor that can compare the current partition-time (that is passed into the extractor) to the record timestamp and either return a different (corrected) timestamp, or filter out the record by returning -1 (ie, the extractor can enforce a bound on how much time can be advanced by a single record before the "jump" is considered incorrect). > Kafka Streams: Window expiration needs to consider more than event time -- This message was sent by Atlassian Jira (v8.3.4#803005)
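Approach (2) above — bounding how far a single record may advance partition-time — could look like the following sketch. The threshold is a made-up assumption; in Streams this logic would live in a custom Java TimestampExtractor, where returning a negative timestamp causes the record to be dropped.

```python
# Sketch of a timestamp-extractor-style bound on per-record time jumps.
# MAX_JUMP_MS is an assumed, application-specific tolerance.
MAX_JUMP_MS = 60_000

def extract(record_ts, partition_time):
    """Return the timestamp to use for the record, or -1 to drop it."""
    if partition_time >= 0 and record_ts - partition_time > MAX_JUMP_MS:
        return -1            # jump too large: treat the timestamp as corrupt
    return record_ts         # normal case (also covers the very first record)
```

A record whose timestamp is far ahead of the current partition-time is filtered out rather than being allowed to advance stream-time and close still-live windows.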
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016352#comment-17016352 ] Matthias J. Sax commented on KAFKA-7591: Personally, I don't see why updating the config and enabling this feature would make the situation worse, or why we would need to block this feature. We can't detect the discussed case today, and hence enabling this feature does not change the current "level of protection" (not better, but also not worse); it would just make Kafka Streams more feature rich (ie, better). I don't see what we gain by logging a WARN if the expected config does not match the actual config: if the reason is indeed a change of the window size, the program is already starting up, and if we just log a WARN and proceed, the state will get "corrupted", so we don't really gain anything from my point of view. > Changelog retention period doesn't synchronise with window-store size -- This message was sent by Atlassian Jira (v8.3.4#803005)
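The expected-vs-actual check being debated in this thread amounts to a simple comparison, sketched below (simplified: the 24-hour constant is the default of `windowstore.changelog.additional.retention.ms`, and whether to warn or to alter the topic config is exactly the open question here).

```python
# Expected changelog retention for a windowed store, per the issue
# description: retention.ms = store retention + additional retention.
CHANGELOG_ADDITIONAL_RETENTION_MS = 24 * 60 * 60 * 1000  # 24h default

def expected_retention_ms(store_retention_ms):
    return store_retention_ms + CHANGELOG_ADDITIONAL_RETENTION_MS

def check_changelog(actual_retention_ms, store_retention_ms):
    """Return a warning string on mismatch, or None if configs agree."""
    expected = expected_retention_ms(store_retention_ms)
    if actual_retention_ms != expected:
        return (f"changelog retention.ms={actual_retention_ms} does not match "
                f"expected {expected}; was the window size or retention changed?")
    return None
```

Such a check can only detect that *something* changed; it cannot tell a (compatible) retention change apart from an (incompatible) window-size change, which is the crux of the discussion above.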
[jira] [Commented] (KAFKA-9358) Explicitly resign controller leadership and broker znode
[ https://issues.apache.org/jira/browse/KAFKA-9358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016325#comment-17016325 ] ASF GitHub Bot commented on KAFKA-9358: --- mumrah commented on pull request #7968: KAFKA-9358 Explicitly unregister brokers and controller from ZK URL: https://github.com/apache/kafka/pull/7968 Instead of closing the zkClient immediately after controller shutdown, the controller now resigns on shutdown rather than relying on the zkClient shutdown to implicitly do so. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Explicitly resign controller leadership and broker znode > > > Key: KAFKA-9358 > URL: https://issues.apache.org/jira/browse/KAFKA-9358 > Project: Kafka > Issue Type: Improvement > Components: controller, core >Reporter: Lucas Bradstreet >Assignee: Lucas Bradstreet >Priority: Major > > When shutting down the controller the broker shuts down the controller and > then closes the zookeeper connection. Closing the zookeeper connection > results in ephemeral nodes being removed. It is currently critical that the > zkClient is closed after the controller is shutdown, otherwise a controller > election will not occur if the broker being shutdown is currently the > controller. > We should consider resigning leadership explicitly in the controller rather > than relying on the zookeeper client being closed. This would ensure that any > changes in shutdown order cannot lead to periods where a broker's controller > component is stopped while also maintaining leadership until the zkClient is > closed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-6629) SegmentedCacheFunctionTest does not cover session window serdes
[ https://issues.apache.org/jira/browse/KAFKA-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sophie Blee-Goldman reassigned KAFKA-6629: -- Assignee: (was: Manasvi Gupta) > SegmentedCacheFunctionTest does not cover session window serdes > --- > > Key: KAFKA-6629 > URL: https://issues.apache.org/jira/browse/KAFKA-6629 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Guozhang Wang >Priority: Major > Labels: newbie, unit-test > > The SegmentedCacheFunctionTest.java only covers time window serdes, but not > session window serdes. We should fill in this coverage gap. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-6629) SegmentedCacheFunctionTest does not cover session window serdes
[ https://issues.apache.org/jira/browse/KAFKA-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016306#comment-17016306 ] Sophie Blee-Goldman commented on KAFKA-6629: Unassigned since there has been no activity for over a year > SegmentedCacheFunctionTest does not cover session window serdes -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-3596) Kafka Streams: Window expiration needs to consider more than event time
[ https://issues.apache.org/jira/browse/KAFKA-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016296#comment-17016296 ] Sophie Blee-Goldman commented on KAFKA-3596: [~vvcephei] Can we close this? I think I agree with your assessment, and this is no longer an issue. Is 2.1 the correct fix version? > Kafka Streams: Window expiration needs to consider more than event time -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-7591) Changelog retention period doesn't synchronise with window-store size
[ https://issues.apache.org/jira/browse/KAFKA-7591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016294#comment-17016294 ] Sophie Blee-Goldman commented on KAFKA-7591: Right, that is why I say if we can't distinguish between a window size change (incompatible) and a retention period change (fine) we should _not_ alter the configs, and just log a warning if there is a mismatch. Once KAFKA-8307 is fixed then we can consider allowing Streams to change existing topic configs. For now, I think it would still be valuable to check whether there are contradictory configs and raise it to the user if so. > Changelog retention period doesn't synchronise with window-store size -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9440) Add ConsumerGroupCommand to delete static members
Boyang Chen created KAFKA-9440: -- Summary: Add ConsumerGroupCommand to delete static members Key: KAFKA-9440 URL: https://issues.apache.org/jira/browse/KAFKA-9440 Project: Kafka Issue Type: Improvement Reporter: Boyang Chen We introduced a new AdminClient API, removeMembersFromConsumerGroup, in 2.4. It would be good to expose the API through the ConsumerGroupCommand for easy command line usage. This change requires a new KIP; posting it here in case anyone who uses static membership would like to pick it up. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-6144) Allow serving interactive queries from in-sync Standbys
[ https://issues.apache.org/jira/browse/KAFKA-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016273#comment-17016273 ] ASF GitHub Bot commented on KAFKA-6144: --- vvcephei commented on pull request #7962: KAFKA-6144: option to query restoring and standby URL: https://github.com/apache/kafka/pull/7962 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Allow serving interactive queries from in-sync Standbys > --- > > Key: KAFKA-6144 > URL: https://issues.apache.org/jira/browse/KAFKA-6144 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Antony Stubbs >Assignee: Navinder Brar >Priority: Major > Labels: kip-535 > Attachments: image-2019-10-09-20-33-37-423.png, > image-2019-10-09-20-47-38-096.png > > > Currently when expanding the KS cluster, the new node's partitions will be > unavailable during the rebalance, which for large states can take a very long > time, or for small state stores even more than a few ms can be a deal-breaker > for micro service use cases. > One workaround is to allow stale data to be read from the state stores when > the use case allows. Adding the use case from KAFKA-8994 as it is more > descriptive. > "Consider the following scenario in a three node Streams cluster with node A, > node B and node R, executing a stateful sub-topology/topic group with 1 > partition and `_num.standby.replicas=1_` > * *t0*: A is the active instance owning the partition, B is the standby that > keeps replicating A's state into its local disk, R just routes streams > IQs to the active instance using StreamsMetadata > * *t1*: IQs pick node R as router, R forwards the query to A, A responds back to > R which reverse forwards back the results. > * *t2:* Active A instance is killed and rebalance begins. 
IQs start failing > to A > * *t3*: Rebalance assignment happens and standby B is now promoted as active > instance. IQs continue to fail > * *t4*: B fully catches up to changelog tail and rewinds offsets to A's last > commit position, IQs continue to fail > * *t5*: IQs to R, get routed to B, which is now ready to serve results. IQs > start succeeding again > > Depending on Kafka consumer group session/heartbeat timeouts, step t2,t3 can > take few seconds (~10 seconds based on defaults values). Depending on how > laggy the standby B was prior to A being killed, t4 can take few > seconds-minutes. > While this behavior favors consistency over availability at all times, the > long unavailability window might be undesirable for certain classes of > applications (e.g simple caches or dashboards). > This issue aims to also expose information about standby B to R, during each > rebalance such that the queries can be routed by an application to a standby > to serve stale reads, choosing availability over consistency." -- This message was sent by Atlassian Jira (v8.3.4#803005)
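The availability/consistency trade-off described in the t0–t5 timeline above can be sketched as a routing decision (a hypothetical router, not the KIP-535 API): when the active instance for a key is unavailable, the query may fall back to an in-sync standby and serve a possibly stale read instead of failing.

```python
# Toy router: prefer the active instance; optionally fall back to a
# live standby for a stale-but-available read. Names are illustrative.
def route_query(active, standbys, alive, allow_stale=True):
    """Pick an instance to serve the query, or None if nothing can."""
    if active in alive:
        return active                 # consistent read from the active
    if allow_stale:
        for node in standbys:
            if node in alive:
                return node           # stale read from a standby
    return None                       # query fails (the t2-t4 window)
```

With `allow_stale=False` this reproduces the current behavior, where queries fail for the whole t2–t4 window; with it enabled, the t2 failure window shrinks to the time needed to notice that A is down.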
[jira] [Created] (KAFKA-9439) Add more public API tests for KafkaProducer
Boyang Chen created KAFKA-9439: -- Summary: Add more public API tests for KafkaProducer Key: KAFKA-9439 URL: https://issues.apache.org/jira/browse/KAFKA-9439 Project: Kafka Issue Type: Improvement Reporter: Boyang Chen While working on KIP-447, we realized a lack of test coverage on the KafkaProducer public APIs. For example, `commitTransaction` and `sendOffsetsToTransaction` are not even called in the `KafkaProducerTest.java` and the code coverage is only 75%. Adding more unit tests here will be pretty valuable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId
[ https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016236#comment-17016236 ] Raman Gupta commented on KAFKA-8803: Looks like this is not just a dev-ops issue, from what I'm seeing here, it looks like *the EXACTLY_ONCE guarantee of the stream that failed to start with this error is being violated by Kafka*. After restarting all my brokers I was expecting the stream to start where it left off and process all input messages. However, the stream offset was already past a message that had never been processed. This is a huge problem. > Stream will not start due to TimeoutException: Timeout expired after > 60000milliseconds while awaiting InitProducerId > > > Key: KAFKA-8803 > URL: https://issues.apache.org/jira/browse/KAFKA-8803 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Raman Gupta >Assignee: Boyang Chen >Priority: Major > Attachments: logs.txt.gz, screenshot-1.png > > > One streams app is consistently failing at startup with the following > exception: > {code} > 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] > org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout > exception caught when initializing transactions for task 0_36. This might > happen if the broker is slow to respond, if the network connection to the > broker was interrupted, or if similar circumstances arise. You can increase > producer parameter `max.block.ms` to increase this timeout. > org.apache.kafka.common.errors.TimeoutException: Timeout expired after > 60000milliseconds while awaiting InitProducerId > {code} > These same brokers are used by many other streams without any issue, > including some in the very same processes for the stream which consistently > throws this exception. > *UPDATE 08/16:* > The very first instance of this error is August 13th 2019, 17:03:36.754 and > it happened for 4 different streams. For 3 of these streams, the error only > happened once, and then the stream recovered. 
For the 4th stream, the error > has continued to happen, and continues to happen now. > I looked up the broker logs for this time, and see that at August 13th 2019, > 16:47:43, two of four brokers started reporting messages like this, for > multiple partitions: > [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, > fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader > reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread) > The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, > here is a view of the count of these messages over time: > !screenshot-1.png! > However, as noted, the stream task timeout error continues to happen. > I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 > broker. The broker has a patch for KAFKA-8773. -- This message was sent by Atlassian Jira (v8.3.4#803005)
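The exception above suggests raising the producer's `max.block.ms`. In Streams, producer properties can be overridden with the `producer.` prefix; a sketch of such a configuration is shown below as a Python dict for illustration (in a real app these would be Java Properties, and the values here are placeholders, not recommendations — this works around the timeout symptom only and does not address the underlying UNKNOWN_LEADER_EPOCH broker issue).

```python
# Illustrative Streams configuration raising the producer's max.block.ms
# above its 60s default; application id and broker address are made up.
streams_config = {
    "application.id": "my-streams-app",
    "bootstrap.servers": "broker:9092",
    "processing.guarantee": "exactly_once",
    # give InitProducerId more time than the 60000 ms default:
    "producer.max.block.ms": 180_000,
}
```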
[jira] [Commented] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016167#comment-17016167 ] Nikolay Izhikov commented on KAFKA-9323: Hello, [~ableegoldman] Can you please confirm my analysis of the current source code of the streams system tests: We have 4 python scripts with the upgrade tests: * streams_broker_compatibility_test.py * streams_cooperative_rebalance_upgrade_test.py * streams_static_membership_test.py * streams_upgrade_test.py > a) only the version probing test tests "forwards compatibility" (upgrade from > latest to future version) streams_upgrade_test.py#test_version_probing_upgrade > b) nothing tests version probing "backwards compatibility" (upgrade from > older version to latest), except: a new test that should be created. > c) the cooperative rebalancing test actually happens to involve a version > probing step, and so coincidentally DOES test VP (but only starting with 2.4) streams_cooperative_rebalance_upgrade_test.py#test_upgrade_to_cooperative_rebalance > we should have a single matrix that we update with each new version that > specifies which upgrade path that version combination actually requires Do we have this matrix written somewhere? As I understand your proposal, we should refactor all scripts from the mentioned files. Is that correct? > Refactor Streams' upgrade system tests > --- > > Key: KAFKA-9323 > URL: https://issues.apache.org/jira/browse/KAFKA-9323 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Sophie Blee-Goldman >Assignee: Nikolay Izhikov >Priority: Major > > With the introduction of version probing in 2.0 and cooperative rebalancing > in 2.4, the specific upgrade path depends heavily on the to & from version. > This can be a complex operation, and we should make sure to test a realistic > upgrade scenario across all possible combinations. 
The current system tests > have gaps however, which have allowed bugs in the upgrade path to slip by > unnoticed for several versions. > Our current system tests include a metadata upgrade test, a version probing > test, and a cooperative upgrade test. This has a few drawbacks: > a) only the version probing test tests "forwards compatibility" (upgrade from > latest to future version) > b) nothing tests version probing "backwards compatibility" (upgrade from > older version to latest), except: > c) the cooperative rebalancing test actually happens to involve a version > probing step, and so coincidentally DOES test VP (but only starting with 2.4) > d) each test itself tries to test the upgrade across different versions, > meaning there may be overlap and/or unnecessary tests > e) as new versions are released, it is unclear to those not directly involved > in these tests and/or projects whether and what needs to be updated (eg > should this new version be added to the cooperative test? should the old > version be added to the metadata test?) > We should definitely fill in the testing gap here, but how to do so is of > course up for discussion. > I would propose to refactor the upgrade tests, and rather than maintain > different lists of versions to pass as input to each different test, we > should have a single matrix that we update with each new version that > specifies which upgrade path that version combination actually requires. We > can then loop through each version combination and test only the actual > upgrade path that users would actually need to follow. This way we can be > sure we are not missing anything, as each and every possible upgrade would be > tested. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-9438) Issue with mm2 active/active replication
[ https://issues.apache.org/jira/browse/KAFKA-9438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman updated KAFKA-9438: - Description: Hi, i am trying to configure active/active with the new kafka 2.4.0 and MM2. I have 2 kafka clusters of 3 kafkas, one in, let's say, BO and one in IN. In each cluster there are 3 kafkas. Topics are replicated properly so in BO i see {quote}topics in.topics {quote} in IN i see {quote}topics bo.topic {quote} That should be according to documentation. But when I stop the replication process on one data center and start it up, the replication replicates the topics with the same prefix twice, bo.bo.topics or in.in.topics, depending on which connector i restart. I have also blacklisted the topics but they are still replicating. bo.properties file {quote}name = in-bo #topics = .* topics.blacklist = "bo.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = in target.cluster.alias = bo source.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 target.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,BO-kafka3:9092 use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} in.properties {quote}name = bo-in #topics = .* topics.blacklist = "in.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = bo target.cluster.alias = in target.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 source.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,BO-kafka3:9092 use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} was: 
Hi, i am trying to configure the the active/active with new kafka 2.4.0 and MM2. I have 2 kafka clusters of 3 kafkas, one in lets say BO and on in IN. In each cluster there are 3 kafkas. Topics are replicated properly so in BO i see {quote}topics in.topics {quote} in IN i see {quote}topics bo.topic {quote} That should be according to documentation. But when I stop the replication process on one data center and start it up, the replication replicate the topics with the same prefix twice bo.bo.topics or in.in.topics depending on what connector i restart. I have also blacklisted the topics but they are still replicating. bo.properties file {quote}name = in-bo #topics = .* topics.blacklist = "bo.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = in target.cluster.alias = bo source.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 target.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,bo-kafka39092 # use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} in.properties {quote}name = bo-in #topics = .* topics.blacklist = "in.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = bo target.cluster.alias = in target.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 source.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,BO-kafka3:9092 # use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} > Issue with mm2 active/active replication > > > Key: KAFKA-9438 > URL: https://issues.apache.org/jira/browse/KAFKA-9438 > Project: Kafka > Issue Type: Bug > 
Components: mirrormaker >Affects Versions: 2.4.0 >Reporter: Roman >Priority: Minor > > Hi, > > I am trying to configure active/active replication with the new Kafka 2.4.0 and MM2. > I have 2 Kafka clusters of 3 brokers each, one in, let's say, BO and one in IN. > Topics are replicated properly, so in BO I see > {quote}topics > in.topics > {quote} > > in IN I see > {quote}topics > bo.topic > {quote} > > That is as the documentation describes. > > But when I stop the replication process in one data center and start it up again, > the replication replicates the topics with the same prefix twice, bo.bo.topics > or in.in.topics, depending on which connector I restart. > I have also blacklisted the
[jira] [Created] (KAFKA-9438) Issue with mm2 active/active replication
Roman created KAFKA-9438: Summary: Issue with mm2 active/active replication Key: KAFKA-9438 URL: https://issues.apache.org/jira/browse/KAFKA-9438 Project: Kafka Issue Type: Bug Components: mirrormaker Affects Versions: 2.4.0 Reporter: Roman Hi, I am trying to configure active/active replication with the new Kafka 2.4.0 and MM2. I have 2 Kafka clusters of 3 brokers each, one in, let's say, BO and one in IN. Topics are replicated properly, so in BO I see {quote}topics in.topics {quote} in IN I see {quote}topics bo.topic {quote} That is as the documentation describes. But when I stop the replication process in one data center and start it up again, the replication replicates the topics with the same prefix twice, bo.bo.topics or in.in.topics, depending on which connector I restart. I have also blacklisted the topics but they are still replicating. bo.properties file {quote}name = in-bo #topics = .* topics.blacklist = "bo.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = in target.cluster.alias = bo source.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 target.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,bo-kafka39092 # use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} in.properties {quote}name = bo-in #topics = .* topics.blacklist = "in.*" #groups = .* connector.class = org.apache.kafka.connect.mirror.MirrorSourceConnector tasks.max = 10 source.cluster.alias = bo target.cluster.alias = in target.cluster.bootstrap.servers = IN-kafka1:9092,IN-kafka2:9092,IN-kafka3:9092 source.cluster.bootstrap.servers = BO-kafka1:9092,BO-kafka2:9092,BO-kafka3:9092 # use ByteArrayConverter to ensure that records are not re-encoded key.converter = org.apache.kafka.connect.converters.ByteArrayConverter 
value.converter = org.apache.kafka.connect.converters.ByteArrayConverter {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005)
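[Editorial note on the configs quoted above] One detail may be worth double-checking (this is an assumption about the cause, not a confirmed diagnosis): in Java `.properties` files the surrounding double quotes are not stripped, so `topics.blacklist = "bo.*"` produces a regex that contains literal quote characters and can never match a topic name such as `bo.topics`, which would explain why the blacklist appears to have no effect. A minimal sketch of the difference:

```python
import re

# The value read from a .properties file keeps its quotes verbatim,
# so the configured blacklist pattern contains literal '"' characters.
quoted_pattern = '"bo.*"'   # what topics.blacklist = "bo.*" actually produces
plain_pattern = 'bo.*'      # what was presumably intended

topic = 'bo.topics'

print(re.fullmatch(quoted_pattern, topic))   # None: blacklist never applies
print(re.fullmatch(plain_pattern, topic))    # match: topic would be excluded
```

If the quotes are indeed the problem, writing `topics.blacklist = bo.*` (no quotes) should make the exclusion take effect after a connector restart.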
[jira] [Commented] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId
[ https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016128#comment-17016128 ] Raman Gupta commented on KAFKA-8803: This happened to me again today. Recycling 3 of 4 brokers fixed the issue. > Stream will not start due to TimeoutException: Timeout expired after > 60000milliseconds while awaiting InitProducerId > > > Key: KAFKA-8803 > URL: https://issues.apache.org/jira/browse/KAFKA-8803 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Raman Gupta >Assignee: Boyang Chen >Priority: Major > Attachments: logs.txt.gz, screenshot-1.png > > > One streams app is consistently failing at startup with the following > exception: > {code} > 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] > org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout > exception caught when initializing transactions for task 0_36. This might > happen if the broker is slow to respond, if the network connection to the > broker was interrupted, or if similar circumstances arise. You can increase > producer parameter `max.block.ms` to increase this timeout. > org.apache.kafka.common.errors.TimeoutException: Timeout expired after > 60000milliseconds while awaiting InitProducerId > {code} > These same brokers are used by many other streams without any issue, > including some in the very same processes for the stream which consistently > throws this exception. > *UPDATE 08/16:* > The very first instance of this error is August 13th 2019, 17:03:36.754 and > it happened for 4 different streams. For 3 of these streams, the error only > happened once, and then the stream recovered. For the 4th stream, the error > has continued to happen, and continues to happen now. 
> I looked up the broker logs for this time, and see that at August 13th 2019, > 16:47:43, two of four brokers started reporting messages like this, for > multiple partitions: > [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, > fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader > reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread) > The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, > here is a view of the count of these messages over time: > !screenshot-1.png! > However, as noted, the stream task timeout error continues to happen. > I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 > broker. The broker has a patch for KAFKA-8773. -- This message was sent by Atlassian Jira (v8.3.4#803005)
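[Editorial note on the timeout above] The error text itself points at `max.block.ms` as the relevant knob. While the broker-side cause (the UNKNOWN_LEADER_EPOCH storm) is investigated, the embedded producer's timeout can be raised via the `producer.` prefix that Kafka Streams uses to route settings to its internal clients. Sketched here as a plain mapping; the value is illustrative, not a recommendation:

```python
# Illustrative Kafka Streams config overrides. The "producer." prefix
# routes a setting to the embedded producer that issues InitProducerId.
streams_overrides = {
    # Default is 60000 ms, which matches the timeout in the stack trace.
    "producer.max.block.ms": 120_000,
}

print(streams_overrides["producer.max.block.ms"])  # 120000
```

Note this only buys time for a slow broker response; it does not address why InitProducerId stalls in the first place.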
[jira] [Commented] (KAFKA-9428) Expose standby information in KafkaStreams via queryMetadataForKey API
[ https://issues.apache.org/jira/browse/KAFKA-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016094#comment-17016094 ] ASF GitHub Bot commented on KAFKA-9428: --- vvcephei commented on pull request #7960: KAFKA-9428: Add KeyQueryMetadata APIs to KafkaStreams URL: https://github.com/apache/kafka/pull/7960 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Expose standby information in KafkaStreams via queryMetadataForKey API > -- > > Key: KAFKA-9428 > URL: https://issues.apache.org/jira/browse/KAFKA-9428 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Major > Fix For: 2.5.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-8847) Deprecate and remove usage of supporting classes in kafka.security.auth
[ https://issues.apache.org/jira/browse/KAFKA-8847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016093#comment-17016093 ] ASF GitHub Bot commented on KAFKA-8847: --- rajinisivaram commented on pull request #7966: KAFKA-8847; Deprecate and remove usage of supporting classes in kafka.security.auth URL: https://github.com/apache/kafka/pull/7966 Removes references to the old scala Acl classes from `kafka.security.auth` (Acl, Operation, ResourceType, Resource etc.) and replaces these with the Java API. Only the old `SimpleAclAuthorizer`, `AuthorizerWrapper` used to wrap legacy authorizer instances and tests using `SimpleAclAuthorizer` continue to use the old API. Deprecated the old scala API. ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Deprecate and remove usage of supporting classes in kafka.security.auth > --- > > Key: KAFKA-8847 > URL: https://issues.apache.org/jira/browse/KAFKA-8847 > Project: Kafka > Issue Type: Sub-task > Components: security >Reporter: Rajini Sivaram >Assignee: Rajini Sivaram >Priority: Major > Fix For: 2.5.0 > > > Deprecate Acl, Resource etc. from `kafka.security.auth` and replace > references to these with the equivalent Java classes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9323) Refactor Streams' upgrade system tests
[ https://issues.apache.org/jira/browse/KAFKA-9323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikolay Izhikov reassigned KAFKA-9323: -- Assignee: Nikolay Izhikov > Refactor Streams' upgrade system tests > --- > > Key: KAFKA-9323 > URL: https://issues.apache.org/jira/browse/KAFKA-9323 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Sophie Blee-Goldman >Assignee: Nikolay Izhikov >Priority: Major > > With the introduction of version probing in 2.0 and cooperative rebalancing > in 2.4, the specific upgrade path depends heavily on the to & from version. > This can be a complex operation, and we should make sure to test a realistic > upgrade scenario across all possible combinations. The current system tests > have gaps however, which have allowed bugs in the upgrade path to slip by > unnoticed for several versions. > Our current system tests include a metadata upgrade test, a version probing > test, and a cooperative upgrade test. This has a few drawbacks: > a) only the version probing test tests "forwards compatibility" (upgrade from > latest to future version) > b) nothing tests version probing "backwards compatibility" (upgrade from > older version to latest), except: > c) the cooperative rebalancing test actually happens to involve a version > probing step, and so coincidentally DOES test VP (but only starting with 2.4) > d) each test itself tries to test the upgrade across different versions, > meaning there may be overlap and/or unnecessary tests > e) as new versions are released, it is unclear to those not directly involved > in these tests and/or projects whether and what needs to be updated (e.g. > should this new version be added to the cooperative test? should the old > version be added to the metadata test?) > We should definitely fill in the testing gap here, but how to do so is of > course up for discussion. 
> I would propose to refactor the upgrade tests, and rather than maintain > different lists of versions to pass as input to each different test, we > should have a single matrix that we update with each new version that > specifies which upgrade path that version combination actually requires. We > can then loop through each version combination and test only the actual > upgrade path that users would actually need to follow. This way we can be > sure we are not missing anything, as each and every possible upgrade would be > tested. -- This message was sent by Atlassian Jira (v8.3.4#803005)
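[Editorial note on the proposal above] The single-matrix idea can be sketched as a mapping from (from-version, to-version) pairs to the upgrade path that combination requires; the test loop then derives its cases from the matrix instead of maintaining per-test version lists. The versions and path names here are placeholders, not the project's actual compatibility data:

```python
# Hypothetical upgrade matrix: (from_version, to_version) -> required path.
UPGRADE_PATH = {
    ("2.0", "2.4"): "two-rolling-bounce",     # needs a version-probing step
    ("2.3", "2.4"): "single-rolling-bounce",  # already cooperative-ready
}

def upgrade_test_cases():
    """Yield one system-test case per supported version combination."""
    for (from_v, to_v), path in sorted(UPGRADE_PATH.items()):
        yield from_v, to_v, path
```

Because every combination appears exactly once in the matrix, adding a new release means adding its rows, and no upgrade path can be silently left untested.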
[jira] [Resolved] (KAFKA-9414) sink-task-metrics.sink-record-lag-max metric is not exposed
[ https://issues.apache.org/jira/browse/KAFKA-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aidar Makhmutov resolved KAFKA-9414. Resolution: Not A Bug. The metric has not been implemented yet. > sink-task-metrics.sink-record-lag-max metric is not exposed > --- > > Key: KAFKA-9414 > URL: https://issues.apache.org/jira/browse/KAFKA-9414 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 2.2.0 >Reporter: Aidar Makhmutov >Priority: Minor > > In group of metrics > {{kafka.connect:type=sink-task-metrics,connector="\{connector}",task="\{task}"}} > the following metric is not exposed by JMX (but is present in the documentation): > {{sink-record-lag-max}} > > Details: > * Docker image: confluentinc/cp-kafka-connect:5.2.1 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9437) KIP-559: Make the Kafka Protocol Friendlier with L7 Proxies
David Jacot created KAFKA-9437: -- Summary: KIP-559: Make the Kafka Protocol Friendlier with L7 Proxies Key: KAFKA-9437 URL: https://issues.apache.org/jira/browse/KAFKA-9437 Project: Kafka Issue Type: Improvement Reporter: David Jacot Assignee: David Jacot -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (KAFKA-9436) New Kafka Connect SMT for plainText => Struct(or Map)
[ https://issues.apache.org/jira/browse/KAFKA-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] whsoul updated KAFKA-9436: -- Comment: was deleted (was: [https://github.com/apache/kafka/pull/7965]) > New Kafka Connect SMT for plainText => Struct(or Map) > - > > Key: KAFKA-9436 > URL: https://issues.apache.org/jira/browse/KAFKA-9436 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Reporter: whsoul >Priority: Major > > I'd like to parse and convert plain text rows to struct(or map) data, and > load it into document databases such as MongoDB or Elasticsearch with an SMT. > > For example, > > a plain text Apache log: > {code:java} > "111.61.73.113 - - [08/Aug/2019:18:15:29 +0900] \"OPTIONS > /api/v1/service_config HTTP/1.1\" 200 - 101989 \"http://local.test.com/\; > \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, > like Gecko) Chrome/75.0.3770.142 Safari/537.36\"" > {code} > The SMT connect config with the regular expression below can easily transform plain > text into struct (or map) data. > > {code:java} > "transforms": "TimestampTopic, RegexTransform", > "transforms.RegexTransform.type": > "org.apache.kafka.connect.transforms.ToStructByRegexTransform$Value", > "transforms.RegexTransform.struct.field": "message", > "transforms.RegexTransform.regex": "^([\\d.]+) (\\S+) (\\S+) > \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(GET|POST|OPTIONS|HEAD|PUT|DELETE|PATCH) > (.+?) (.+?)\" (\\d{3}) ([0-9|-]+) ([0-9|-]+) \"([^\"]+)\" \"([^\"]+)\"" > "transforms.RegexTransform.mapping": > "IP,RemoteUser,AuthedRemoteUser,DateTime,Method,Request,Protocol,Response,BytesSent,Ms:NUMBER,Referrer,UserAgent" > {code} > > I have a PR for this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9436) New Kafka Connect SMT for plainText => Struct(or Map)
[ https://issues.apache.org/jira/browse/KAFKA-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17015833#comment-17015833 ] whsoul commented on KAFKA-9436: --- [https://github.com/apache/kafka/pull/7965] > New Kafka Connect SMT for plainText => Struct(or Map) > - > > Key: KAFKA-9436 > URL: https://issues.apache.org/jira/browse/KAFKA-9436 > Project: Kafka > Issue Type: Improvement > Components: KafkaConnect >Reporter: whsoul >Priority: Major > > I'd like to parse and convert plain text rows to struct(or map) data, and > load it into document databases such as MongoDB or Elasticsearch with an SMT. > > For example, > > a plain text Apache log: > {code:java} > "111.61.73.113 - - [08/Aug/2019:18:15:29 +0900] \"OPTIONS > /api/v1/service_config HTTP/1.1\" 200 - 101989 \"http://local.test.com/\; > \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, > like Gecko) Chrome/75.0.3770.142 Safari/537.36\"" > {code} > The SMT connect config with the regular expression below can easily transform plain > text into struct (or map) data. > > {code:java} > "transforms": "TimestampTopic, RegexTransform", > "transforms.RegexTransform.type": > "org.apache.kafka.connect.transforms.ToStructByRegexTransform$Value", > "transforms.RegexTransform.struct.field": "message", > "transforms.RegexTransform.regex": "^([\\d.]+) (\\S+) (\\S+) > \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(GET|POST|OPTIONS|HEAD|PUT|DELETE|PATCH) > (.+?) (.+?)\" (\\d{3}) ([0-9|-]+) ([0-9|-]+) \"([^\"]+)\" \"([^\"]+)\"" > "transforms.RegexTransform.mapping": > "IP,RemoteUser,AuthedRemoteUser,DateTime,Method,Request,Protocol,Response,BytesSent,Ms:NUMBER,Referrer,UserAgent" > {code} > > I have a PR for this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9436) New Kafka Connect SMT for plainText => Struct(or Map)
whsoul created KAFKA-9436: - Summary: New Kafka Connect SMT for plainText => Struct(or Map) Key: KAFKA-9436 URL: https://issues.apache.org/jira/browse/KAFKA-9436 Project: Kafka Issue Type: Improvement Components: KafkaConnect Reporter: whsoul I'd like to parse and convert plain text rows to struct(or map) data, and load it into document databases such as MongoDB or Elasticsearch with an SMT. For example, a plain text Apache log: {code:java} "111.61.73.113 - - [08/Aug/2019:18:15:29 +0900] \"OPTIONS /api/v1/service_config HTTP/1.1\" 200 - 101989 \"http://local.test.com/\; \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36\"" {code} The SMT connect config with the regular expression below can easily transform plain text into struct (or map) data. {code:java} "transforms": "TimestampTopic, RegexTransform", "transforms.RegexTransform.type": "org.apache.kafka.connect.transforms.ToStructByRegexTransform$Value", "transforms.RegexTransform.struct.field": "message", "transforms.RegexTransform.regex": "^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] \"(GET|POST|OPTIONS|HEAD|PUT|DELETE|PATCH) (.+?) (.+?)\" (\\d{3}) ([0-9|-]+) ([0-9|-]+) \"([^\"]+)\" \"([^\"]+)\"" "transforms.RegexTransform.mapping": "IP,RemoteUser,AuthedRemoteUser,DateTime,Method,Request,Protocol,Response,BytesSent,Ms:NUMBER,Referrer,UserAgent" {code} I have a PR for this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
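[Editorial note on the proposal above] As a rough model of what the proposed transform does (plain Python, not the connector code; the field names follow the issue's `mapping` config, and the regex is simplified from the one in the issue):

```python
import re

# Simplified version of the access-log regex from the issue.
LOG_RE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] '
    r'"(\S+) (\S+) (\S+)" (\d{3}) (\S+) (\S+) "([^"]*)" "([^"]*)"$'
)
# One name per capture group, as in the SMT's "mapping" setting.
FIELDS = ["IP", "RemoteUser", "AuthedRemoteUser", "DateTime", "Method",
          "Request", "Protocol", "Response", "BytesSent", "Ms",
          "Referrer", "UserAgent"]

line = ('111.61.73.113 - - [08/Aug/2019:18:15:29 +0900] '
        '"OPTIONS /api/v1/service_config HTTP/1.1" 200 - 101989 '
        '"http://local.test.com/" "Mozilla/5.0"')

# Zip group values with field names to get the struct/map payload.
record = dict(zip(FIELDS, LOG_RE.match(line).groups()))
print(record["Method"], record["Response"])  # OPTIONS 200
```

The connector would additionally coerce typed fields (the `Ms:NUMBER` mapping) and emit a Connect Struct rather than a plain dict.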
[jira] [Created] (KAFKA-9435) Replace DescribeLogDirs request/response with automated protocol
Tom Bentley created KAFKA-9435: -- Summary: Replace DescribeLogDirs request/response with automated protocol Key: KAFKA-9435 URL: https://issues.apache.org/jira/browse/KAFKA-9435 Project: Kafka Issue Type: Sub-task Reporter: Tom Bentley Assignee: Tom Bentley -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9434) Replace AlterReplicaLogDirs request/response with automated protocol
Tom Bentley created KAFKA-9434: -- Summary: Replace AlterReplicaLogDirs request/response with automated protocol Key: KAFKA-9434 URL: https://issues.apache.org/jira/browse/KAFKA-9434 Project: Kafka Issue Type: Sub-task Reporter: Tom Bentley Assignee: Tom Bentley -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9433) Replace AlterConfigs request/response with automated protocol
Tom Bentley created KAFKA-9433: -- Summary: Replace AlterConfigs request/response with automated protocol Key: KAFKA-9433 URL: https://issues.apache.org/jira/browse/KAFKA-9433 Project: Kafka Issue Type: Sub-task Reporter: Tom Bentley Assignee: Tom Bentley -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9432) Replace DescribeConfigs request/response with automated protocol
Tom Bentley created KAFKA-9432: -- Summary: Replace DescribeConfigs request/response with automated protocol Key: KAFKA-9432 URL: https://issues.apache.org/jira/browse/KAFKA-9432 Project: Kafka Issue Type: Sub-task Reporter: Tom Bentley Assignee: Tom Bentley -- This message was sent by Atlassian Jira (v8.3.4#803005)