[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
[ https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128198#comment-16128198 ] ASF GitHub Bot commented on KAFKA-5731: --- Github user rhauch closed the pull request at: https://github.com/apache/kafka/pull/3672
> Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Reporter: Jason Gustafson
> Assignee: Randall Hauch
> Fix For: 0.11.0.1
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that offset commits are handled in the right order (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field is overridden regardless of this sequence check as long as the response had no error (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be inconsistent. To fix this, we should only override {{lastCommittedOffsets}} after sequence validation as part of the {{onCommitCompleted(...)}} method.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
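For readers following the fix: a minimal sketch of the ordering guard described above. Class and field names are illustrative, not the committed patch; see PRs 3662/3672 for the real change.
{code:java}
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Illustrative sketch: lastCommittedOffsets is updated only inside
// onCommitCompleted(...), and only when the completing request is the most
// recent one, so stale async completions can no longer clobber it.
class CommitState {
    private int commitSeqno;
    private Map<TopicPartition, OffsetAndMetadata> lastCommittedOffsets;

    synchronized int nextCommitSeqno() {
        return ++commitSeqno;
    }

    synchronized void onCommitCompleted(Exception error, int seqno,
            Map<TopicPartition, OffsetAndMetadata> committedOffsets) {
        if (seqno != commitSeqno)
            return; // out-of-order completion: ignore it entirely
        if (error == null && committedOffsets != null)
            lastCommittedOffsets = committedOffsets;
    }
}
{code}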
[jira] [Resolved] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
[ https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gustafson resolved KAFKA-5731. Resolution: Fixed Fix Version/s: (was: 1.0.0) Issue resolved by pull request 3672 [https://github.com/apache/kafka/pull/3672]
> Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Reporter: Jason Gustafson
> Assignee: Randall Hauch
> Fix For: 0.11.0.1
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that offset commits are handled in the right order (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field is overridden regardless of this sequence check as long as the response had no error (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be inconsistent. To fix this, we should only override {{lastCommittedOffsets}} after sequence validation as part of the {{onCommitCompleted(...)}} method.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5737) KafkaAdminClient thread should be daemon
[ https://issues.apache.org/jira/browse/KAFKA-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128051#comment-16128051 ] ASF GitHub Bot commented on KAFKA-5737: --- GitHub user cmccabe opened a pull request: https://github.com/apache/kafka/pull/3674 KAFKA-5737. KafkaAdminClient thread should be daemon
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/cmccabe/kafka KAFKA-5737
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3674.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3674
commit 35dcae186e254e871def6939d4cac9a213d724d6
Author: Colin P. Mccabe
Date: 2017-08-15T23:20:56Z
KAFKA-5737. KafkaAdminClient thread should be daemon
> KafkaAdminClient thread should be daemon
>
> Key: KAFKA-5737
> URL: https://issues.apache.org/jira/browse/KAFKA-5737
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.11.0.0
> Reporter: Colin P. McCabe
> Assignee: Colin P. McCabe
>
> The admin client thread should be daemon, for consistency with the consumer and producer threads.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5737) KafkaAdminClient thread should be daemon
Colin P. McCabe created KAFKA-5737: -- Summary: KafkaAdminClient thread should be daemon Key: KAFKA-5737 URL: https://issues.apache.org/jira/browse/KAFKA-5737 Project: Kafka Issue Type: Bug Affects Versions: 0.11.0.0 Reporter: Colin P. McCabe Assignee: Colin P. McCabe The admin client thread should be daemon, for consistency with the consumer and producer threads. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
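For reference, the producer and consumer mark their I/O threads as daemon via KafkaThread, so presumably the fix is the same one-liner in KafkaAdminClient; a sketch (the thread name and runnable below are illustrative):
{code:java}
import org.apache.kafka.common.utils.KafkaThread;

public class DaemonThreadExample {
    public static void main(String[] args) {
        Runnable networkLoop = () -> { /* service requests until close() */ };
        // The third argument marks the thread daemon, so it cannot keep the JVM alive.
        Thread ioThread = new KafkaThread("kafka-admin-client-thread", networkLoop, true);
        ioThread.start();
    }
}
{code}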
[jira] [Comment Edited] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead
[ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128011#comment-16128011 ] Guozhang Wang edited comment on KAFKA-5440 at 8/15/17 10:54 PM:
[~twbecker] it is already cherry-picked as in https://github.com/apache/kafka/pull/3432 to 0.11.0, we plan to have a bug fix release 0.11.0.1 soon. Please feel free to resolve this once you validated in 0.11.0 branch.
was (Author: guozhang): [~twbecker] it is already cherry-picked as in https://github.com/apache/kafka/pull/3432 to 0.11.0, we plan to have a bug fix release 0.11.0.1 soon.
> Kafka Streams report state RUNNING even if all threads are dead
>
> Key: KAFKA-5440
> URL: https://issues.apache.org/jira/browse/KAFKA-5440
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.1, 0.11.0.0
> Reporter: Matthias J. Sax
> Fix For: 0.11.0.1, 1.0.0
>
> From the mailing list:
> {quote}
> Hi All,
> We recently implemented a health check for a Kafka Streams based application. The health check is simply checking the state of Kafka Streams by calling KafkaStreams.state(). It reports healthy if it’s not in PENDING_SHUTDOWN or NOT_RUNNING states.
> We truly appreciate having the possibility to easily check the state of Kafka Streams, but to our surprise we noticed that KafkaStreams.state() returns RUNNING even though all StreamThreads have crashed and reached NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically it seems weird to me that KafkaStreams would say it’s RUNNING when it is in fact not consuming anything since all underlying working threads have crashed.
> If this is intended behaviour I would appreciate an explanation of why that is the case. Also in that case, how could I determine if the consumption from Kafka hasn’t crashed?
> If this is not intended behaviour, how fast could I expect it to be fixed? I wouldn’t mind fixing it myself but I’m not sure if this is considered trivial or big enough to require a JIRA. Also, if I would implement a fix I’d like your input on what would be a reasonable solution. By just inspecting the code I have an idea but I’m not sure I understand all the implications, so I’d be happy to hear your thoughts first.
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5736) Improve error message in Connect when all kafka brokers are down
Konstantine Karantasis created KAFKA-5736: -
Summary: Improve error message in Connect when all kafka brokers are down
Key: KAFKA-5736
URL: https://issues.apache.org/jira/browse/KAFKA-5736
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 0.11.0.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
Fix For: 0.11.0.1
Currently, when all the Kafka brokers are down, Kafka Connect fails with a pretty unintuitive message when it tries to, for instance, reconfigure tasks. Example output:
{code:java}
[2017-08-15 19:12:09,959] ERROR Failed to reconfigure connector's tasks, retrying after backoff: (org.apache.kafka.connect.runtime.distributed.DistributedHerder)
java.lang.IllegalArgumentException: CircularIterator can only be used on non-empty lists
at org.apache.kafka.common.utils.CircularIterator.<init>(CircularIterator.java:29)
at org.apache.kafka.clients.consumer.RoundRobinAssignor.assign(RoundRobinAssignor.java:61)
at org.apache.kafka.clients.consumer.internals.AbstractPartitionAssignor.assign(AbstractPartitionAssignor.java:68)
at ... (connector code)
at org.apache.kafka.connect.runtime.Worker.connectorTaskConfigs(Worker.java:230)
{code}
The error message needs to be improved, since its root cause is the absence of Kafka brokers available for assignment.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
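One possible shape for the improvement — a hedged sketch, not the actual patch — is to check for an empty member list up front and raise a descriptive error before the assignor trips over it (the class and method names below are hypothetical):
{code:java}
import java.util.List;
import org.apache.kafka.connect.errors.ConnectException;

class TaskReconfigurationGuard {
    // Hypothetical pre-check run before the round-robin assignment.
    static void ensureAssignable(List<String> consumerIds) {
        if (consumerIds.isEmpty())
            throw new ConnectException(
                "Cannot reconfigure connector tasks: no Kafka brokers are "
                + "reachable, so no consumers could join the group");
    }
}
{code}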
[jira] [Updated] (KAFKA-5735) Client-ids are not handled consistently by clients and broker
[ https://issues.apache.org/jira/browse/KAFKA-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajini Sivaram updated KAFKA-5735: -- Description:
At the moment, Java clients expect client-ids to use a limited set of characters so that they can be used without quoting in metrics. kafka-configs.sh allows quotas to be defined only for that limited set. But the broker does not validate client-ids. And the documentation does not mention any limitations. Existing non-Java clients do not place any restrictions on client-ids, and hence introducing restrictions on the broker side now would be a breaking change. So we should allow any characters and treat them consistently in the same way as we handle user principals. Changes required:
1. Client-id in metrics should be sanitized using URL-encoding similar to the encoding used for user principal in quota metrics. This leaves metrics for client-ids using the current limited set of characters as-is, but will allow arbitrary characters in encoded form. To avoid sanitizing multiple times and to avoid unsanitized ids being used by mistake in some metrics, it may be better to introduce a ClientId class that stores the sanitized id and uses appropriate methods to retrieve the id for metrics etc.
2. Quota metrics and sensors as well as ZooKeeper quota configuration paths should use sanitized ids for client-ids (they already do for user principal).
3. Remove client-id validation in kafka-configs.sh and allow any characters for client-id similar to usernames, URL-encoding the names to generate the ZK path.
was: At the moment, Java clients expect client-ids to use a limited set of characters so that they can be used without quoting in metrics. kafka-configs.sh allows quotas to be defined only for that limited set. But the broker does not validate client-ids. And the documentation does not mention any limitations. We need to either limit characters used in client-ids, document and validate them, or we should allow any characters and treat them consistently in the same way as we handle user principals.
> Client-ids are not handled consistently by clients and broker
>
> Key: KAFKA-5735
> URL: https://issues.apache.org/jira/browse/KAFKA-5735
> Project: Kafka
> Issue Type: Bug
> Components: clients, core
> Affects Versions: 0.11.0.0
> Reporter: Rajini Sivaram
> Assignee: Rajini Sivaram
> Fix For: 1.0.0
>
> At the moment, Java clients expect client-ids to use a limited set of characters so that they can be used without quoting in metrics. kafka-configs.sh allows quotas to be defined only for that limited set. But the broker does not validate client-ids. And the documentation does not mention any limitations. Existing non-Java clients do not place any restrictions on client-ids, and hence introducing restrictions on the broker side now would be a breaking change. So we should allow any characters and treat them consistently in the same way as we handle user principals.
> Changes required:
> 1. Client-id in metrics should be sanitized using URL-encoding similar to the encoding used for user principal in quota metrics. This leaves metrics for client-ids using the current limited set of characters as-is, but will allow arbitrary characters in encoded form. To avoid sanitizing multiple times and to avoid unsanitized ids being used by mistake in some metrics, it may be better to introduce a ClientId class that stores the sanitized id and uses appropriate methods to retrieve the id for metrics etc.
> 2. Quota metrics and sensors as well as ZooKeeper quota configuration paths should use sanitized ids for client-ids (they already do for user principal).
> 3. Remove client-id validation in kafka-configs.sh and allow any characters for client-id similar to usernames, URL-encoding the names to generate the ZK path.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
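To make point 1 above concrete, a rough sketch of URL-encoding-based sanitization; the real implementation may differ (for instance, Kafka's existing principal sanitization also special-cases '*' and '+'):
{code:java}
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public final class ClientIdSanitizer {
    // Ids already limited to [a-zA-Z0-9._-] come back unchanged, so existing
    // metric names are preserved; anything else survives in encoded form.
    public static String sanitize(String clientId) {
        try {
            return URLEncoder.encode(clientId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static String desanitize(String sanitizedClientId) {
        try {
            return URLDecoder.decode(sanitizedClientId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
{code}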
[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
[ https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127960#comment-16127960 ] ASF GitHub Bot commented on KAFKA-5731: --- GitHub user rhauch opened a pull request: https://github.com/apache/kafka/pull/3672 KAFKA-5731 Corrected how the sink task worker updates the last committed offsets (0.11.0)
Prior to this change, it was possible for the synchronous consumer commit request to be handled before previously-submitted asynchronous commit requests. If that happened, the out-of-order handlers improperly set the last committed offsets, which then became inconsistent with the offsets the connector task is working with. This change ensures that the last committed offsets are updated only for the most recent commit request, even if the consumer reorders the calls to the callbacks.
**This is for the `0.11.0` branch; see #3662 for the equivalent and already-approved PR for `trunk`.**
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/rhauch/kafka kafka-5731-0.11.0
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3672.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3672
commit 526dbcb776effcc1661e51293a6d03256b19d0a6
Author: Randall Hauch
Date: 2017-08-12T00:42:06Z
KAFKA-5731 Corrected how the sink task worker updates the last committed offsets
Prior to this change, it was possible for the synchronous consumer commit request to be handled before previously-submitted asynchronous commit requests. If that happened, the out-of-order handlers improperly set the last committed offsets, which then became inconsistent with the offsets the connector task is working with. This change ensures that the last committed offsets are updated only for the most recent commit request, even if the consumer reorders the calls to the callbacks.
# Conflicts:
# connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
commit 80965cb5f8771e63b0dad095287cb9a29dea47f6
Author: Randall Hauch
Date: 2017-08-14T19:11:08Z
KAFKA-5731 Corrected mock consumer behavior during rebalance
Corrects the test case added in the previous commit to properly revoke the existing partition assignments before adding new partition assignments.
commit 37687c544513566c7d728273137a12751702ad41
Author: Randall Hauch
Date: 2017-08-14T19:11:45Z
KAFKA-5731 Added expected call that was missing in another test
commit bfac0688ab64935bc4ac9c11e0a6251ca03e1043
Author: Randall Hauch
Date: 2017-08-14T22:24:35Z
KAFKA-5731 Improved log messages related to offset commits
# Conflicts:
# connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
commit bbc316dfa4353a7e914d11ceaca2b60c1bdaf291
Author: Randall Hauch
Date: 2017-08-15T14:47:05Z
KAFKA-5731 More cleanup of log messages related to offset commits
commit 00b17ebbb5effb7f8aa171ea69b0227c7b009e97
Author: Randall Hauch
Date: 2017-08-15T16:21:52Z
KAFKA-5731 More improvements to the log messages in WorkerSinkTask
# Conflicts:
# connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
commit 3a5c31f1b912e06aded4b74524c73dfe1033e76c
Author: Randall Hauch
Date: 2017-08-15T16:31:28Z
KAFKA-5731 Removed unnecessary log message
commit 8b91f93e8c4c7b6b8e1aa6721f86ff01f8ecf40e
Author: Randall Hauch
Date: 2017-08-15T17:54:16Z
KAFKA-5731 Additional tweaks to debug and trace log messages to ensure clarity and usefulness
# Conflicts:
# connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java
commit bea03f69055524e9005302200f3a560a9cad2c3f
Author: Randall Hauch
Date: 2017-08-15T19:30:09Z
KAFKA-5731 Use the correct value in trace messages
> Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Reporter: Jason Gustafson
> Assignee: Randall Hauch
> Fix For: 0.11.0.1, 1.0.0
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that offset commits are handled in the right order
[jira] [Commented] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
[ https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127882#comment-16127882 ] ASF GitHub Bot commented on KAFKA-5731: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/3662
> Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Reporter: Jason Gustafson
> Assignee: Randall Hauch
> Fix For: 0.11.0.1, 1.0.0
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that offset commits are handled in the right order (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field is overridden regardless of this sequence check as long as the response had no error (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be inconsistent. To fix this, we should only override {{lastCommittedOffsets}} after sequence validation as part of the {{onCommitCompleted(...)}} method.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5663) LogDirFailureTest system test fails
[ https://issues.apache.org/jira/browse/KAFKA-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127744#comment-16127744 ] Dong Lin commented on KAFKA-5663: - [~cmccabe] It appears that the bug in ConsoleConsumer was hot-fixed by Jason yesterday. The test should pass now.
> LogDirFailureTest system test fails
>
> Key: KAFKA-5663
> URL: https://issues.apache.org/jira/browse/KAFKA-5663
> Project: Kafka
> Issue Type: Bug
> Reporter: Apurva Mehta
> Assignee: Dong Lin
> Fix For: 1.0.0
>
> The recently added JBOD system test failed last night.
> {noformat}
> Producer failed to produce messages for 20s.
> Traceback (most recent call last):
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
>     data = self.run_test()
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
>     return self.test_context.function(self.test)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/tests/kafkatest/tests/core/log_dir_failure_test.py", line 166, in test_replication_with_disk_failure
>     self.start_producer_and_consumer()
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 75, in start_producer_and_consumer
>     self.producer_start_timeout_sec)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py", line 36, in wait_until
>     raise TimeoutError(err_msg)
> TimeoutError: Producer failed to produce messages for 20s.
> {noformat}
> Complete logs here:
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2017-07-26--001.1501074756--apache--trunk--91c207c/LogDirFailureTest/test_replication_with_disk_failure/bounce_broker=False.security_protocol=PLAINTEXT.broker_type=follower/48.tgz
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5714) Allow whitespaces in the principal name
[ https://issues.apache.org/jira/browse/KAFKA-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127737#comment-16127737 ] Manikumar commented on KAFKA-5714: -- By default, the SSL user name will be of the form "CN=writeuser,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown", without any spaces. Are you sure there are spaces in your SSL username?
> Allow whitespaces in the principal name
>
> Key: KAFKA-5714
> URL: https://issues.apache.org/jira/browse/KAFKA-5714
> Project: Kafka
> Issue Type: Improvement
> Components: security
> Affects Versions: 0.10.2.1
> Reporter: Alla Tumarkin
> Assignee: Manikumar
>
> Request
> Improve parser behavior to allow whitespaces in the principal name in the config file, as in:
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown
> {code}
> Background
> The current implementation requires that there are no whitespaces after commas, i.e.
> {code}
> super.users=User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> {code}
> Note: having a semicolon at the end doesn't help, i.e. this does not work either
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown;
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5714) Allow whitespaces in the principal name
[ https://issues.apache.org/jira/browse/KAFKA-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127711#comment-16127711 ] Alla Tumarkin commented on KAFKA-5714: -- If I have super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown (with spaces), there is an error in the authorizer log:
{code}
[2017-08-11 12:37:26,560] DEBUG No acl found for resource Cluster:kafka-cluster, authorized = false (kafka.authorizer.logger)
[2017-08-11 12:37:26,560] DEBUG Principal = User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown is Denied Operation = ClusterAction from host = 127.0.0.1 on resource = Cluster:kafka-cluster (kafka.authorizer.logger)
{code}
But if I use super.users=User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown (without spaces), there is no such error. Why is the behavior different?
> Allow whitespaces in the principal name
>
> Key: KAFKA-5714
> URL: https://issues.apache.org/jira/browse/KAFKA-5714
> Project: Kafka
> Issue Type: Improvement
> Components: security
> Affects Versions: 0.10.2.1
> Reporter: Alla Tumarkin
> Assignee: Manikumar
>
> Request
> Improve parser behavior to allow whitespaces in the principal name in the config file, as in:
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown
> {code}
> Background
> The current implementation requires that there are no whitespaces after commas, i.e.
> {code}
> super.users=User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> {code}
> Note: having a semicolon at the end doesn't help, i.e. this does not work either
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown;
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
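The mismatch above is purely textual: the principal built from the SSL session contains no spaces after the commas, so a super.users entry written with spaces never compares equal. A hedged sketch of the kind of normalization the parser could apply (real DN parsing would also need to handle quoted RDN values containing commas):
{code:java}
import java.util.Arrays;
import java.util.stream.Collectors;

class DnNormalizer {
    // Trim whitespace around each RDN so "CN=Unknown, OU=Unknown" and
    // "CN=Unknown,OU=Unknown" normalize to the same string.
    static String normalize(String dn) {
        return Arrays.stream(dn.split(","))
                .map(String::trim)
                .collect(Collectors.joining(","));
    }
}
{code}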
[jira] [Resolved] (KAFKA-2283) scheduler exception on non-controller node when shutdown
[ https://issues.apache.org/jira/browse/KAFKA-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar resolved KAFKA-2283. -- Resolution: Fixed
> scheduler exception on non-controller node when shutdown
>
> Key: KAFKA-2283
> URL: https://issues.apache.org/jira/browse/KAFKA-2283
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8.2.1
> Environment: linux debian
> Reporter: allenlee
> Assignee: Neha Narkhede
> Priority: Minor
>
> When a broker shuts down, there is an error log about 'Kafka scheduler has not been started'. It only appears on non-controller nodes; if the broker is the controller, it shuts down without the warning log.
> IMHO, *autoRebalanceScheduler.shutdown()* should only be valid for the controller, right?
> {quote}
> [2015-06-17 22:32:51,814] INFO Shutdown complete. (kafka.log.LogManager)
> [2015-06-17 22:32:51,815] WARN Kafka scheduler has not been started (kafka.utils.Utils$)
> java.lang.IllegalStateException: Kafka scheduler has not been started
> at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114)
> at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86)
> at kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350)
> at kafka.controller.KafkaController.shutdown(KafkaController.scala:664)
> at kafka.server.KafkaServer$$anonfun$shutdown$8.apply$mcV$sp(KafkaServer.scala:285)
> at kafka.utils.Utils$.swallow(Utils.scala:172)
> at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> at kafka.utils.Utils$.swallowWarn(Utils.scala:45)
> at kafka.utils.Logging$class.swallow(Logging.scala:94)
> at kafka.utils.Utils$.swallow(Utils.scala:45)
> at kafka.server.KafkaServer.shutdown(KafkaServer.scala:285)
> at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
> at kafka.Kafka$$anon$1.run(Kafka.scala:42)
> [2015-06-17 22:32:51,818] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-2220) Improvement: Could we support rewind by time ?
[ https://issues.apache.org/jira/browse/KAFKA-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar resolved KAFKA-2220. -- Resolution: Fixed This got fixed in KAFKA-4743 / KIP-122.
> Improvement: Could we support rewind by time ?
>
> Key: KAFKA-2220
> URL: https://issues.apache.org/jira/browse/KAFKA-2220
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.8.1.1
> Reporter: Li Junjun
> Attachments: screenshot.png
>
> Improvement: Support rewind by time!
> My scenario is as follows:
> A program reads records from Kafka, processes them, then writes to a dir in HDFS like /hive/year=/month=xx/day=xx/hour=10. If the program goes down, I can restart it, so it reads from the last offset.
> But what if the program was configured with wrong params? Then I need to remove the dir hour=10, reconfigure my program, and find the offset where hour=10 starts — but right now I can't do this.
> And there are many scenarios like this.
> So, can we add a time partition, so we can rewind by time?
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
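For anyone landing here now: programmatically this is covered by the timestamp-based offset lookup from KIP-79 (KIP-122 adds tooling on top). A sketch of rewinding a consumer to a point in time:
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

class RewindByTime {
    // Seek every assigned partition to the earliest offset whose timestamp
    // is >= rewindToMs; partitions with no such message are left alone.
    static void rewind(KafkaConsumer<?, ?> consumer, long rewindToMs) {
        Map<TopicPartition, Long> query = new HashMap<>();
        for (TopicPartition tp : consumer.assignment())
            query.put(tp, rewindToMs);
        Map<TopicPartition, OffsetAndTimestamp> result = consumer.offsetsForTimes(query);
        for (Map.Entry<TopicPartition, OffsetAndTimestamp> e : result.entrySet()) {
            if (e.getValue() != null)
                consumer.seek(e.getKey(), e.getValue().offset());
        }
    }
}
{code}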
[jira] [Comment Edited] (KAFKA-5072) Kafka topics should allow custom metadata configs within some config namespace
[ https://issues.apache.org/jira/browse/KAFKA-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127680#comment-16127680 ] Soumabrata Chakraborty edited comment on KAFKA-5072 at 8/15/17 6:31 PM:
Hi [~ondrej.tomcik] - Thanks for reaching out. There's a linked PR on this JIRA (it would also need some documentation changes). It's been some time since the JIRA and the PR were created. There hasn't been much attention on this JIRA - I just assumed that not many people came across the need to be able to tag topics with custom metadata. Good to know that you too are looking for a similar solution. Perhaps if you vote for the JIRA - that would help it get some attention. Feel free to reach out for further details.
With this change you could add custom metadata using kafka-configs.sh as shown below. The property name would need to start with "metadata." Let me know if this answers your question.
{code:java}
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --alter --add-config 'metadata.contact.info=soumabr...@gmail.com'
Completed Updating config for entity: topic 'demo'.
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --describe
Configs for topic 'demo' are metadata.contact.info=soumabr...@gmail.com
{code}
was (Author: soumabrata):
Hi [~ondrej.tomcik] - Thanks for reaching out. There's a linked PR on this JIRA (it would also need some documentation changes). It's been some time since the JIRA and the PR were created. There hasn't been much attention on this JIRA - I just assumed that not many people came across the need to be able to tag topics with custom metadata. Good to know that you too are looking for a similar solution. Perhaps if you vote for the JIRA - that would help it get some attention. Feel free to reach out for further details.
With this change you could add custom metadata using kafka-configs.sh as shown below. Let me know if this answers your question.
{code:java}
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --alter --add-config 'metadata.contact.info=soumabr...@gmail.com'
Completed Updating config for entity: topic 'demo'.
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --describe
Configs for topic 'demo' are metadata.contact.info=soumabr...@gmail.com
{code}
> Kafka topics should allow custom metadata configs within some config namespace
>
> Key: KAFKA-5072
> URL: https://issues.apache.org/jira/browse/KAFKA-5072
> Project: Kafka
> Issue Type: Improvement
> Components: config
> Affects Versions: 0.10.2.0
> Reporter: Soumabrata Chakraborty
> Assignee: Soumabrata Chakraborty
> Priority: Minor
>
> Kafka topics should allow custom metadata configs. Such config properties may have some fixed namespace, e.g. metadata* or custom*.
> This is handy for governance. For example, in large organizations sharing a Kafka cluster, it might be helpful to be able to configure properties like metadata.contact.info, metadata.project, metadata.description on a topic.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5072) Kafka topics should allow custom metadata configs within some config namespace
[ https://issues.apache.org/jira/browse/KAFKA-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127680#comment-16127680 ] Soumabrata Chakraborty commented on KAFKA-5072: ---
Hi [~ondrej.tomcik] - Thanks for reaching out. There's a linked PR on this JIRA (it would also need some documentation changes). It's been some time since the JIRA and the PR were created. There hasn't been much attention on this JIRA - I just assumed that not many people came across the need to be able to tag topics with custom metadata. Good to know that you too are looking for a similar solution. Perhaps if you vote for the JIRA - that would help it get some attention. Feel free to reach out for further details.
With this change you could add custom metadata using kafka-configs.sh as shown below. Let me know if this answers your question.
{code:java}
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --alter --add-config 'metadata.contact.info=soumabr...@gmail.com'
Completed Updating config for entity: topic 'demo'.
[soumabrata@Krishna bin]$ ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name demo --describe
Configs for topic 'demo' are metadata.contact.info=soumabr...@gmail.com
{code}
> Kafka topics should allow custom metadata configs within some config namespace
>
> Key: KAFKA-5072
> URL: https://issues.apache.org/jira/browse/KAFKA-5072
> Project: Kafka
> Issue Type: Improvement
> Components: config
> Affects Versions: 0.10.2.0
> Reporter: Soumabrata Chakraborty
> Assignee: Soumabrata Chakraborty
> Priority: Minor
>
> Kafka topics should allow custom metadata configs. Such config properties may have some fixed namespace, e.g. metadata* or custom*.
> This is handy for governance. For example, in large organizations sharing a Kafka cluster, it might be helpful to be able to configure properties like metadata.contact.info, metadata.project, metadata.description on a topic.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-2206) Add AlterConfig and DescribeConfig requests to Kafka
[ https://issues.apache.org/jira/browse/KAFKA-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127676#comment-16127676 ] Manikumar commented on KAFKA-2206: -- This looks like a duplicate of KIP-133 / KAFKA-3267. If so, we can close this JIRA. cc [~ijuma]
> Add AlterConfig and DescribeConfig requests to Kafka
>
> Key: KAFKA-2206
> URL: https://issues.apache.org/jira/browse/KAFKA-2206
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Aditya Auradkar
> Assignee: Aditya Auradkar
>
> See https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration#KIP-21-DynamicConfiguration-ConfigAPI
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5727) Add the archetype project along with "write applications" web docs.
[ https://issues.apache.org/jira/browse/KAFKA-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang updated KAFKA-5727: - Fix Version/s: 0.11.0.1
> Add the archetype project along with "write applications" web docs.
>
> Key: KAFKA-5727
> URL: https://issues.apache.org/jira/browse/KAFKA-5727
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Guozhang Wang
> Assignee: Guozhang Wang
> Fix For: 0.11.0.1, 1.0.0
>
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead
[ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127637#comment-16127637 ] Tommy Becker edited comment on KAFKA-5440 at 8/15/17 5:56 PM:
To answer my own question, from what I can tell the PR for KAFKA-5372 will solve this. It would be nice to get a backport to a patch release before 1.0.
was (Author: twbecker): To answer my own question, from what I can tell the PR for KAFKA-5372 will solve this. It would be nice to get a backport to patch release before 1.0.
> Kafka Streams report state RUNNING even if all threads are dead
>
> Key: KAFKA-5440
> URL: https://issues.apache.org/jira/browse/KAFKA-5440
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.1, 0.11.0.0
> Reporter: Matthias J. Sax
>
> From the mailing list:
> {quote}
> Hi All,
> We recently implemented a health check for a Kafka Streams based application. The health check is simply checking the state of Kafka Streams by calling KafkaStreams.state(). It reports healthy if it’s not in PENDING_SHUTDOWN or NOT_RUNNING states.
> We truly appreciate having the possibility to easily check the state of Kafka Streams, but to our surprise we noticed that KafkaStreams.state() returns RUNNING even though all StreamThreads have crashed and reached NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically it seems weird to me that KafkaStreams would say it’s RUNNING when it is in fact not consuming anything since all underlying working threads have crashed.
> If this is intended behaviour I would appreciate an explanation of why that is the case. Also in that case, how could I determine if the consumption from Kafka hasn’t crashed?
> If this is not intended behaviour, how fast could I expect it to be fixed? I wouldn’t mind fixing it myself but I’m not sure if this is considered trivial or big enough to require a JIRA. Also, if I would implement a fix I’d like your input on what would be a reasonable solution. By just inspecting the code I have an idea but I’m not sure I understand all the implications, so I’d be happy to hear your thoughts first.
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead
[ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127637#comment-16127637 ] Tommy Becker commented on KAFKA-5440: -
To answer my own question, from what I can tell the PR for KAFKA-5372 will solve this. It would be nice to get a backport to patch release before 1.0.
> Kafka Streams report state RUNNING even if all threads are dead
>
> Key: KAFKA-5440
> URL: https://issues.apache.org/jira/browse/KAFKA-5440
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.1, 0.11.0.0
> Reporter: Matthias J. Sax
>
> From the mailing list:
> {quote}
> Hi All,
> We recently implemented a health check for a Kafka Streams based application. The health check is simply checking the state of Kafka Streams by calling KafkaStreams.state(). It reports healthy if it’s not in PENDING_SHUTDOWN or NOT_RUNNING states.
> We truly appreciate having the possibility to easily check the state of Kafka Streams, but to our surprise we noticed that KafkaStreams.state() returns RUNNING even though all StreamThreads have crashed and reached NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically it seems weird to me that KafkaStreams would say it’s RUNNING when it is in fact not consuming anything since all underlying working threads have crashed.
> If this is intended behaviour I would appreciate an explanation of why that is the case. Also in that case, how could I determine if the consumption from Kafka hasn’t crashed?
> If this is not intended behaviour, how fast could I expect it to be fixed? I wouldn’t mind fixing it myself but I’m not sure if this is considered trivial or big enough to require a JIRA. Also, if I would implement a fix I’d like your input on what would be a reasonable solution. By just inspecting the code I have an idea but I’m not sure I understand all the implications, so I’d be happy to hear your thoughts first.
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
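Until the state machine is fixed, a common workaround is to watch for dying threads directly. A sketch — the health-flag wiring is assumed to exist in the application:
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.streams.KafkaStreams;

class StreamsHealth {
    private final AtomicBoolean healthy = new AtomicBoolean(true);

    // Register before streams.start(): a StreamThread death surfaces in the
    // uncaught-exception handler even while KafkaStreams.state() still
    // reports RUNNING.
    void install(KafkaStreams streams) {
        streams.setUncaughtExceptionHandler((thread, throwable) -> healthy.set(false));
        streams.setStateListener((newState, oldState) -> {
            if (newState == KafkaStreams.State.NOT_RUNNING
                    || newState == KafkaStreams.State.PENDING_SHUTDOWN)
                healthy.set(false);
        });
    }

    boolean isHealthy() {
        return healthy.get();
    }
}
{code}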
[jira] [Resolved] (KAFKA-1832) Async Producer will cause 'java.net.SocketException: Too many open files' when broker host does not exist
[ https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar resolved KAFKA-1832. -- Resolution: Fixed Fixed in KAFKA-1041
> Async Producer will cause 'java.net.SocketException: Too many open files' when broker host does not exist
>
> Key: KAFKA-1832
> URL: https://issues.apache.org/jira/browse/KAFKA-1832
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Affects Versions: 0.8.1, 0.8.1.1
> Environment: linux
> Reporter: barney
> Assignee: Jun Rao
>
> h3. How to reproduce the problem:
> * producer configuration:
> ** producer.type=async
> ** metadata.broker.list=not.existed.com:9092
> Make sure the host '*not.existed.com*' does not exist in the DNS server or /etc/hosts;
> * send a lot of messages continuously using the above producer
> It will cause '*java.net.SocketException: Too many open files*' after a while, or you can use '*lsof -p $pid|wc -l*' to check the count of open files, which will keep increasing as time goes by until it reaches the system limit (check with '*ulimit -n*').
> h3. Problem cause:
> {code:title=kafka.network.BlockingChannel|borderStyle=solid}
> channel.connect(new InetSocketAddress(host, port))
> {code}
> This line will throw a '*java.nio.channels.UnresolvedAddressException*' when the broker host does not exist, and at the same time the field '*connected*' stays false;
> In *kafka.producer.SyncProducer*, '*disconnect()*' will not invoke '*blockingChannel.disconnect()*' because '*blockingChannel.isConnected*' is false, which means the FileDescriptor is created but never closed;
> h3. More:
> When the broker is a non-existent ip (for example: metadata.broker.list=1.1.1.1:9092) instead of a non-existent host, the problem does not appear;
> In *SocketChannelImpl.connect()*, '*Net.checkAddress()*' is not in the try-catch block but '*Net.connect()*' is, which makes the difference;
> h3. Temporary Solution:
> {code:title=kafka.network.BlockingChannel|borderStyle=solid}
> try
> {
>   channel.connect(new InetSocketAddress(host, port))
> }
> catch
> {
>   case e: UnresolvedAddressException =>
>   {
>     disconnect();
>     throw e
>   }
> }
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-1821) Example shell scripts broken
[ https://issues.apache.org/jira/browse/KAFKA-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manikumar resolved KAFKA-1821. -- Resolution: Fixed It is working in newer Kafka versions.
> Example shell scripts broken
>
> Key: KAFKA-1821
> URL: https://issues.apache.org/jira/browse/KAFKA-1821
> Project: Kafka
> Issue Type: Bug
> Components: packaging, tools
> Affects Versions: 0.8.1.1
> Environment: Ubuntu 14.04, Linux 75477193b766 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux. Scala: 2.8.0
> Reporter: Yong Fu
> Priority: Minor
>
> After running ./gradlew jarAll to generate all jars, including those for the examples, I tried to run the producer-consumer demo from the shell scripts. It doesn't work and throws ClassNotFoundException. It seems the shell scripts (java-producer-consumer-demo and java-simple-consumer-demo) still assume the library structure from sbt, so they cannot find jar files under the new structure produced by gradle.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5603) Streams should not abort transaction when closing zombie task
[ https://issues.apache.org/jira/browse/KAFKA-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damian Guy updated KAFKA-5603: -- Fix Version/s: (was: 0.11.0.1) 0.11.0.2
> Streams should not abort transaction when closing zombie task
>
> Key: KAFKA-5603
> URL: https://issues.apache.org/jira/browse/KAFKA-5603
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Critical
> Fix For: 0.11.0.2
>
> The contract of the transactional producer API is to not call any transactional method after a {{ProducerFenced}} exception was thrown. Streams, however, does an unconditional call within {{StreamTask#close()}} to {{abortTransaction()}} in case of unclean shutdown. We need to distinguish between a {{ProducerFenced}} and other unclean shutdown cases.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5054) ChangeLoggingKeyValueByteStore delete and putIfAbsent should be synchronized
[ https://issues.apache.org/jira/browse/KAFKA-5054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damian Guy updated KAFKA-5054: -- Fix Version/s: (was: 0.11.0.1) 1.0.0
> ChangeLoggingKeyValueByteStore delete and putIfAbsent should be synchronized
>
> Key: KAFKA-5054
> URL: https://issues.apache.org/jira/browse/KAFKA-5054
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Damian Guy
> Assignee: Damian Guy
> Priority: Critical
> Fix For: 1.0.0
>
> {{putIfAbsent}} and {{delete}} should be synchronized as they involve at least 2 operations on the underlying store and may result in inconsistent results if someone were to query via IQ.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5535) Transformations - tranformations for value broken on tombstone events
[ https://issues.apache.org/jira/browse/KAFKA-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127574#comment-16127574 ] Randall Hauch commented on KAFKA-5535: -- [~hachikuji], can you please backport this to the {{0.11.0}} branch and update the fix version of this issue to include 0.11.0.1? IMO it is low-risk, and a few people have been running into it with connectors that produce tombstone events. Thanks!
> Transformations - tranformations for value broken on tombstone events
>
> Key: KAFKA-5535
> URL: https://issues.apache.org/jira/browse/KAFKA-5535
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 0.10.2.1
> Environment: Ubuntu 14.04, Java 8
> Reporter: Yelei Wu
> Assignee: Ewen Cheslack-Postava
> Labels: newbie
> Fix For: 1.0.0
>
> I'm trying to use the transformations for Kafka Connect and running into issues. The transformation configuration is:
> {code}
> "transforms": "GetAfter",
> "transforms.GetAfter.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
> "transforms.GetAfter.field": "after",
> {code}
> And I got the following errors occasionally:
> {code}
> org.apache.kafka.connect.errors.DataException: Only Map objects supported in absence of schema for [field extraction], found: null
> at org.apache.kafka.connect.transforms.util.Requirements.requireMap(Requirements.java:38)
> at org.apache.kafka.connect.transforms.ExtractField.apply(ExtractField.java:57)
> at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:39)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:408)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:249)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:179)
> at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
> at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:139)
> at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:182)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It seems that tombstone events break the transformation, but after checking the source code for transformations on Value (ExtractField$Value, ValueToKey, MaskField$Value, ReplaceField$Value), none of them handles tombstone events explicitly, and none of them passes tombstone events through either. A null check in those transformations may be necessary.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
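The shape of the fix is simply an early return for null values. A hedged sketch with a hypothetical hook method — not the committed patch:
{code:java}
import org.apache.kafka.connect.connector.ConnectRecord;
import org.apache.kafka.connect.transforms.Transformation;

// Short-circuit tombstones so a value-oriented transformation never
// dereferences a null value.
public abstract class TombstoneSafeTransformation<R extends ConnectRecord<R>>
        implements Transformation<R> {

    @Override
    public R apply(R record) {
        if (record.value() == null)
            return record;            // tombstone: pass through untouched
        return applyToValue(record);  // hypothetical hook for the real logic
    }

    // Subclasses implement the actual value transformation here.
    protected abstract R applyToValue(R record);
}
{code}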
[jira] [Updated] (KAFKA-5731) Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
[ https://issues.apache.org/jira/browse/KAFKA-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Randall Hauch updated KAFKA-5731: - Fix Version/s: 0.11.0.1
> Connect WorkerSinkTask out of order offset commit can lead to inconsistent state
>
> Key: KAFKA-5731
> URL: https://issues.apache.org/jira/browse/KAFKA-5731
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Reporter: Jason Gustafson
> Assignee: Randall Hauch
> Fix For: 0.11.0.1, 1.0.0
>
> In Connect's WorkerSinkTask, we do sequence number validation to ensure that offset commits are handled in the right order (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L199).
> Unfortunately, for asynchronous commits, the {{lastCommittedOffsets}} field is overridden regardless of this sequence check as long as the response had no error (https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L284):
> {code:java}
> OffsetCommitCallback cb = new OffsetCommitCallback() {
>     @Override
>     public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception error) {
>         if (error == null) {
>             lastCommittedOffsets = offsets;
>         }
>         onCommitCompleted(error, seqno);
>     }
> };
> {code}
> Hence if we get an out of order commit, then the internal state will be inconsistent. To fix this, we should only override {{lastCommittedOffsets}} after sequence validation as part of the {{onCommitCompleted(...)}} method.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5714) Allow whitespaces in the principal name
[ https://issues.apache.org/jira/browse/KAFKA-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127513#comment-16127513 ] Manikumar commented on KAFKA-5714: -- ZK-based topic creation/deletion doesn't go through ACL authorization, so I'm not sure how these are related. You can enable the authorizer logs to verify any denied operations.
> Allow whitespaces in the principal name
>
> Key: KAFKA-5714
> URL: https://issues.apache.org/jira/browse/KAFKA-5714
> Project: Kafka
> Issue Type: Improvement
> Components: security
> Affects Versions: 0.10.2.1
> Reporter: Alla Tumarkin
> Assignee: Manikumar
>
> Request
> Improve parser behavior to allow whitespaces in the principal name in the config file, as in:
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown
> {code}
> Background
> The current implementation requires that there are no whitespaces after commas, i.e.
> {code}
> super.users=User:CN=Unknown,OU=Unknown,O=Unknown,L=Unknown,ST=Unknown,C=Unknown
> {code}
> Note: having a semicolon at the end doesn't help, i.e. this does not work either
> {code}
> super.users=User:CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown;
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KAFKA-5679) Add logging to distinguish between internally and externally initiated shutdown of Kafka
[ https://issues.apache.org/jira/browse/KAFKA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apurva Mehta reassigned KAFKA-5679: --- Assignee: Rajini Sivaram (was: Apurva Mehta)
> Add logging to distinguish between internally and externally initiated shutdown of Kafka
>
> Key: KAFKA-5679
> URL: https://issues.apache.org/jira/browse/KAFKA-5679
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.11.0.0
> Reporter: Apurva Mehta
> Assignee: Rajini Sivaram
> Fix For: 1.0.0
>
> Currently, if there is an internal error that triggers a shutdown of the Kafka server, the {{Exit}} class is used, which begins the shutdown procedure. The other way a shutdown is triggered is by {{SIGTERM}} or some other signal.
> We would like to distinguish between shutdown due to internal errors and external signals. This helps when debugging. Particularly, a natural question when a broker shuts down unexpectedly is: "did the deployment system send the signal, or is there some unlogged fatal error in the broker?"
> Today, we rely on callers of {{Exit}} to log the error before making the call. However, this won't always have 100% coverage. It would be good to add a log message in {{Exit}} to record that an exit method was invoked explicitly.
> We could also add a signal handler to log when {{SIGTERM}}, {{SIGKILL}} etc. are received.
> This would make operating Kafka a bit easier.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5679) Add logging to distinguish between internally and externally initiated shutdown of Kafka
[ https://issues.apache.org/jira/browse/KAFKA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127451#comment-16127451 ] Apurva Mehta commented on KAFKA-5679: - Assigning to [~rsivaram] since she already has a patch for this.
> Add logging to distinguish between internally and externally initiated shutdown of Kafka
>
> Key: KAFKA-5679
> URL: https://issues.apache.org/jira/browse/KAFKA-5679
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.11.0.0
> Reporter: Apurva Mehta
> Assignee: Rajini Sivaram
> Fix For: 1.0.0
>
> Currently, if there is an internal error that triggers a shutdown of the Kafka server, the {{Exit}} class is used, which begins the shutdown procedure. The other way a shutdown is triggered is by {{SIGTERM}} or some other signal.
> We would like to distinguish between shutdown due to internal errors and external signals. This helps when debugging. Particularly, a natural question when a broker shuts down unexpectedly is: "did the deployment system send the signal, or is there some unlogged fatal error in the broker?"
> Today, we rely on callers of {{Exit}} to log the error before making the call. However, this won't always have 100% coverage. It would be good to add a log message in {{Exit}} to record that an exit method was invoked explicitly.
> We could also add a signal handler to log when {{SIGTERM}}, {{SIGKILL}} etc. are received.
> This would make operating Kafka a bit easier.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
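A rough sketch of what such logging could look like — illustrative only; the class name, exit-status convention, and signal list are assumptions, and sun.misc.Signal is a JDK-internal API that a real patch might avoid:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import sun.misc.Signal;

public final class LoggedExit {
    private static final Logger log = LoggerFactory.getLogger(LoggedExit.class);

    // Internally initiated shutdowns always leave a trace, even when the
    // caller forgot to log the reason before invoking exit.
    public static void exit(int statusCode) {
        log.warn("Exiting Kafka with status code {} (internally initiated shutdown)", statusCode);
        System.exit(statusCode);
    }

    // Externally initiated shutdowns are logged as they arrive; exiting with
    // 128 + signal number mimics the JVM's default behaviour. (SIGKILL
    // cannot be caught, so only TERM/INT/HUP are registered here.)
    public static void registerSignalLogger() {
        for (String name : new String[] {"TERM", "INT", "HUP"}) {
            Signal.handle(new Signal(name), signal -> {
                log.warn("Received signal {} (externally initiated shutdown)", signal.getName());
                System.exit(128 + signal.getNumber());
            });
        }
    }
}
{code}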
[jira] [Commented] (KAFKA-5723) Refactor BrokerApiVersionsCommand to use the new AdminClient
[ https://issues.apache.org/jira/browse/KAFKA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127340#comment-16127340 ] ASF GitHub Bot commented on KAFKA-5723: --- GitHub user adyach opened a pull request: https://github.com/apache/kafka/pull/3671 KAFKA-5723[WIP]: Refactor BrokerApiVersionsCommand to use the new AdminClient
This PR refactors BrokerApiVersionsCommand to the new AdminClient Java class. The code was not tested, because I just want to make sure that I am going in the right direction with the implementation; tests will be in place at the end. There is also no javadoc, for the same reason. I took a look at #3514 to be more consistent with the implementation for the whole topic of AdminClient refactoring, so I grabbed argparse4j.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/adyach/kafka kafka-5723
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3671.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3671
commit cc7b0eb5055ae44467fd41699ca53bf5a91da371
Author: Andrey Dyachkov
Date: 2017-08-14T20:46:27Z
kafka-5723: preparation phase
commit 8c2864cfda8e12db836de98ed7a59047f2311f85
Author: Andrey Dyachkov
Date: 2017-08-15T14:43:07Z
kafka-5723: draft impl for ListBrokersVersionInfo command
> Refactor BrokerApiVersionsCommand to use the new AdminClient
>
> Key: KAFKA-5723
> URL: https://issues.apache.org/jira/browse/KAFKA-5723
> Project: Kafka
> Issue Type: Improvement
> Reporter: Viktor Somogyi
> Assignee: Viktor Somogyi
>
> Currently it uses the deprecated AdminClient, and in order to remove usages of that client, this class needs to be refactored.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead
[ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127316#comment-16127316 ] Tommy Becker edited comment on KAFKA-5440 at 8/15/17 2:41 PM:
Any update on this? We're seeing the same behavior, and though there have been some changes in this area in trunk, I don't see anything that would fix this particular problem. org.apache.kafka.streams.KafkaStreams.StreamStateListener, which is called back on the state changes of the threads, still doesn't seem to care about transitions to terminal states. This is quite frustrating because we too were using the KafkaStreams state to determine the health of the topology.
was (Author: twbecker): Any update on this? We're seeing the same behavior, and though there have been some changes in this area in trunk, I don't see anything that would fix this particular problem. org.apache.kafka.streams.KafkaStreams.StreamStateListener, which is called back on the state changes of the threads, still doesn't seem to care about transitions to terminal states.
> Kafka Streams report state RUNNING even if all threads are dead
>
> Key: KAFKA-5440
> URL: https://issues.apache.org/jira/browse/KAFKA-5440
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.10.2.1, 0.11.0.0
> Reporter: Matthias J. Sax
>
> From the mailing list:
> {quote}
> Hi All,
> We recently implemented a health check for a Kafka Streams based application. The health check is simply checking the state of Kafka Streams by calling KafkaStreams.state(). It reports healthy if it’s not in PENDING_SHUTDOWN or NOT_RUNNING states.
> We truly appreciate having the possibility to easily check the state of Kafka Streams, but to our surprise we noticed that KafkaStreams.state() returns RUNNING even though all StreamThreads have crashed and reached NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically it seems weird to me that KafkaStreams would say it’s RUNNING when it is in fact not consuming anything since all underlying working threads have crashed.
> If this is intended behaviour I would appreciate an explanation of why that is the case. Also in that case, how could I determine if the consumption from Kafka hasn’t crashed?
> If this is not intended behaviour, how fast could I expect it to be fixed? I wouldn’t mind fixing it myself but I’m not sure if this is considered trivial or big enough to require a JIRA. Also, if I would implement a fix I’d like your input on what would be a reasonable solution. By just inspecting the code I have an idea but I’m not sure I understand all the implications, so I’d be happy to hear your thoughts first.
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5440) Kafka Streams report state RUNNING even if all threads are dead
[ https://issues.apache.org/jira/browse/KAFKA-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127316#comment-16127316 ] Tommy Becker commented on KAFKA-5440: - Any update on this? We're seeing the same behavior, and though there have been some changes in this area in trunk, I don't see anything that would fix this particular problem. org.apache.kafka.streams.KafkaStreams.StreamStateListener, which is called back on thread state changes, still doesn't seem to care about transitions to terminal states. > Kafka Streams report state RUNNING even if all threads are dead > --- > > Key: KAFKA-5440 > URL: https://issues.apache.org/jira/browse/KAFKA-5440 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.2.1, 0.11.0.0 >Reporter: Matthias J. Sax > > From the mailing list: > {quote} > Hi All, > We recently implemented a health check for a Kafka Streams based application. > The health check simply checks the state of Kafka Streams by calling > KafkaStreams.state(). It reports healthy if it’s not in the PENDING_SHUTDOWN or > NOT_RUNNING states. > We truly appreciate having the possibility to easily check the state of Kafka > Streams, but to our surprise we noticed that KafkaStreams.state() returns > RUNNING even though all StreamThreads have crashed and reached the NOT_RUNNING > state. Is this intended behaviour or is it a bug? Semantically it seems weird > to me that KafkaStreams would say it’s RUNNING when it is in fact not > consuming anything, since all underlying worker threads have crashed. > If this is intended behaviour, I would appreciate an explanation of why that > is the case. Also in that case, how could I determine if the consumption from > Kafka hasn’t crashed? > If this is not intended behaviour, how fast could I expect it to be fixed? I > wouldn’t mind fixing it myself, but I’m not sure if this is considered trivial > or big enough to require a JIRA. Also, if I were to implement a fix I’d like > your input on what would be a reasonable solution. By just inspecting the code > I have an idea, but I’m not sure I understand all the implications, so I’d be > happy to hear your thoughts first. > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
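For reference, the kind of health check described in the mailing-list quote can be wired up against the public API like this (a minimal sketch; treating PENDING_SHUTDOWN/NOT_RUNNING as unhealthy mirrors the quote, and as this issue points out, such a check is currently undermined because the instance-level state can stay RUNNING even after every StreamThread has died):
{code:java}
import org.apache.kafka.streams.KafkaStreams;

public class StreamsHealthCheck {
    private final KafkaStreams streams;
    private volatile boolean sawTerminalState = false;

    public StreamsHealthCheck(KafkaStreams streams) {
        this.streams = streams;
        // Instance-level listener: only fires on KafkaStreams state transitions,
        // not on individual StreamThread deaths (the gap this ticket describes)
        streams.setStateListener((newState, oldState) -> {
            if (newState == KafkaStreams.State.NOT_RUNNING
                    || newState == KafkaStreams.State.PENDING_SHUTDOWN) {
                sawTerminalState = true;
            }
        });
    }

    public boolean isHealthy() {
        KafkaStreams.State state = streams.state();
        return !sawTerminalState
                && state != KafkaStreams.State.NOT_RUNNING
                && state != KafkaStreams.State.PENDING_SHUTDOWN;
    }
}
{code}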
[jira] [Created] (KAFKA-5735) Client-ids are not handled consistently by clients and broker
Rajini Sivaram created KAFKA-5735: - Summary: Client-ids are not handled consistently by clients and broker Key: KAFKA-5735 URL: https://issues.apache.org/jira/browse/KAFKA-5735 Project: Kafka Issue Type: Bug Components: clients, core Affects Versions: 0.11.0.0 Reporter: Rajini Sivaram Assignee: Rajini Sivaram Fix For: 1.0.0 At the moment, Java clients expect client-ids to use a limited set of characters so that they can be used without quoting in metrics. kafka-configs.sh allows quotas to be defined only for that limited set. But the broker does not validate client-ids, and the documentation does not mention any limitations. We need to either restrict the characters allowed in client-ids (documenting and validating the restriction), or allow any characters and treat them consistently, in the same way we handle user principals. A sketch of the first option follows below. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
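For illustration only, a hypothetical validator enforcing the limited character set mentioned above (the regex mirrors the metric-safe characters the tooling has historically assumed; the class and method are made-up names, not an existing Kafka API):
{code:java}
import java.util.regex.Pattern;

public class ClientIdValidator {
    // Assumed legal set: letters, digits, '.', '_' and '-' (illustrative, not normative)
    private static final Pattern LEGAL_CHARS = Pattern.compile("[a-zA-Z0-9._-]*");

    public static void validate(String clientId) {
        if (!LEGAL_CHARS.matcher(clientId).matches()) {
            throw new IllegalArgumentException("Invalid client-id '" + clientId
                    + "': only characters matching [a-zA-Z0-9._-] are allowed");
        }
    }
}
{code}
The alternative direction the ticket mentions is to accept arbitrary client-ids and sanitize them wherever they surface (metric names, quota configs), as is done for user principals.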
[jira] [Commented] (KAFKA-5718) Better document what LogAppendTime means
[ https://issues.apache.org/jira/browse/KAFKA-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127086#comment-16127086 ] Umesh Chaudhary commented on KAFKA-5718: [~cotedm] The Message Format section seems like the correct place for this info. Will send the PR soon! > Better document what LogAppendTime means > > > Key: KAFKA-5718 > URL: https://issues.apache.org/jira/browse/KAFKA-5718 > Project: Kafka > Issue Type: Improvement > Components: documentation >Affects Versions: 0.11.0.0 >Reporter: Dustin Cote >Priority: Trivial > > There isn't a good description of LogAppendTime in the documentation. It > would be nice to add this somewhere, saying something like: > LogAppendTime is some time between when the partition leader receives the > request and when it writes it to its local log. > There are two important distinctions that trip people up: > 1) This timestamp is not when the consumer could have first consumed the > message. That additionally requires min.insync.replicas to have been satisfied. > 2) This is not precisely when the leader wrote to its log; there can be > delays along the path between receiving the request and writing to the log. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
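For reference, the timestamp type is a per-topic config, {{message.timestamp.type}}, which defaults to CreateTime; a minimal sketch creating a topic with LogAppendTime via the new admin client (topic name, partition/replica counts, and bootstrap address are placeholders):
{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateLogAppendTimeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // The broker stamps each record on append instead of trusting the producer clock
            NewTopic topic = new NewTopic("events", 3, (short) 2)
                    .configs(Collections.singletonMap("message.timestamp.type", "LogAppendTime"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
{code}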
[jira] [Assigned] (KAFKA-5718) Better document what LogAppendTime means
[ https://issues.apache.org/jira/browse/KAFKA-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Chaudhary reassigned KAFKA-5718: -- Assignee: Umesh Chaudhary > Better document what LogAppendTime means > > > Key: KAFKA-5718 > URL: https://issues.apache.org/jira/browse/KAFKA-5718 > Project: Kafka > Issue Type: Improvement > Components: documentation >Affects Versions: 0.11.0.0 >Reporter: Dustin Cote >Assignee: Umesh Chaudhary >Priority: Trivial > > There isn't a good description of LogAppendTime in the documentation. It > would be nice to add this somewhere, saying something like: > LogAppendTime is some time between when the partition leader receives the > request and when it writes it to its local log. > There are two important distinctions that trip people up: > 1) This timestamp is not when the consumer could have first consumed the > message. That additionally requires min.insync.replicas to have been satisfied. > 2) This is not precisely when the leader wrote to its log; there can be > delays along the path between receiving the request and writing to the log. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5734) Heap (Old generation space) gradually increase
[ https://issues.apache.org/jira/browse/KAFKA-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127078#comment-16127078 ] Manikumar commented on KAFKA-5734: -- We can use the jmap command to output a histogram of the Java object heap. This will help us analyze the heap memory usage. Take periodic outputs and compare them. {quote}jdk/bin/jmap -histo:live PID{quote} > Heap (Old generation space) gradually increase > -- > > Key: KAFKA-5734 > URL: https://issues.apache.org/jira/browse/KAFKA-5734 > Project: Kafka > Issue Type: Bug > Components: metrics >Affects Versions: 0.10.2.0 > Environment: ubuntu 14.04 / java 1.7.0 >Reporter: jang > Attachments: heap-log.xlsx > > > I set up a kafka server on ubuntu with 4GB RAM. > Heap (Old generation space) size is increasing gradually, as shown in the attached > Excel file, which records GC info at 1 minute intervals. > Finally OU occupies 2.6GB and GC spends too much time (and an out of memory > exception occurs). > The kafka process arguments are below. > _java -Xmx3000M -Xms2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 > -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC > -Djava.awt.headless=true > -Xloggc:/usr/local/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc > -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dkafka.logs.dir=/usr/local/kafka/bin/../logs > -Dlog4j.configuration=file:/usr/local/kafka/bin/../config/log4j.properties_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5734) Heap (Old generation space) gradually increase
jang created KAFKA-5734: --- Summary: Heap (Old generation space) gradually increase Key: KAFKA-5734 URL: https://issues.apache.org/jira/browse/KAFKA-5734 Project: Kafka Issue Type: Bug Components: metrics Affects Versions: 0.10.2.0 Environment: ubuntu 14.04 / java 1.7.0 Reporter: jang Attachments: heap-log.xlsx I set up a kafka server on ubuntu with 4GB RAM. Heap (Old generation space) size is increasing gradually, as shown in the attached Excel file, which records GC info at 1 minute intervals. Finally OU occupies 2.6GB and GC spends too much time (and an out of memory exception occurs). The kafka process arguments are below. _java -Xmx3000M -Xms2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/usr/local/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/usr/local/kafka/bin/../logs -Dlog4j.configuration=file:/usr/local/kafka/bin/../config/log4j.properties_ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5726) KafkaConsumer.subscribe() overload that takes just Pattern without ConsumerRebalanceListener
[ https://issues.apache.org/jira/browse/KAFKA-5726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126987#comment-16126987 ] ASF GitHub Bot commented on KAFKA-5726: --- GitHub user attilakreiner opened a pull request: https://github.com/apache/kafka/pull/3669 KAFKA-5726: KafkaConsumer.subscribe() overload that takes just Pattern - changed the interface & implementations - updated tests to use the new method where applicable You can merge this pull request into a Git repository by running: $ git pull https://github.com/attilakreiner/kafka KAFKA-5726 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3669.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3669 commit 8e7b6e741f0ad042421ada39e0c9c1546d231bfb Author: Attila Kreiner Date: 2017-08-14T20:56:21Z KAFKA-5726: KafkaConsumer.subscribe() overload that takes just Pattern > KafkaConsumer.subscribe() overload that takes just Pattern without > ConsumerRebalanceListener > > > Key: KAFKA-5726 > URL: https://issues.apache.org/jira/browse/KAFKA-5726 > Project: Kafka > Issue Type: Improvement > Components: consumer >Affects Versions: 0.11.0.0 >Reporter: Yeva Byzek >Priority: Minor > Labels: needs-kip, newbie, usability > > Request: provide a {{subscribe(Pattern pattern)}} overload, similar to > {{subscribe(Collection topics)}}. > Today, for a consumer to subscribe to topics based on a regular expression > (i.e. {{Pattern}}), the only available method also requires passing in a > {{ConsumerRebalanceListener}}. Requiring this second argument is not > user-friendly; it seems {{new NoOpConsumerRebalanceListener()}} has to be > used. > Use case: multi datacenter, allowing easier subscription to multiple topics > prefixed with datacenter names, just by using a pattern subscription. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
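For illustration, today's workaround versus the overload this PR adds ({{NoOpConsumerRebalanceListener}} lives in the clients' internals package; the broker address, group id, and topic pattern are placeholders):
{code:java}
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.internals.NoOpConsumerRebalanceListener;

public class PatternSubscribeExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "dc-aggregator");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Today: a listener must be supplied even when no rebalance hooks are needed
            consumer.subscribe(Pattern.compile("dc[0-9]+-orders"), new NoOpConsumerRebalanceListener());

            // With this PR, the listener argument becomes optional:
            // consumer.subscribe(Pattern.compile("dc[0-9]+-orders"));
        }
    }
}
{code}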
[jira] [Commented] (KAFKA-5679) Add logging to distinguish between internally and externally initiated shutdown of Kafka
[ https://issues.apache.org/jira/browse/KAFKA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126907#comment-16126907 ] ASF GitHub Bot commented on KAFKA-5679: --- GitHub user rajinisivaram opened a pull request: https://github.com/apache/kafka/pull/3668 KAFKA-5679: Add logging for broker termination due to SIGTERM or SIGINT You can merge this pull request into a Git repository by running: $ git pull https://github.com/rajinisivaram/kafka KAFKA-5679 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3668.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3668 commit e6f02a6eb90c242d850d3aab2e59d74162d88626 Author: rajini Date: 2017-08-15T07:29:49Z KAFKA-5679: Add logging for broker termination due to SIGTERM or SIGINT > Add logging to distinguish between internally and externally initiated > shutdown of Kafka > > > Key: KAFKA-5679 > URL: https://issues.apache.org/jira/browse/KAFKA-5679 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.11.0.0 >Reporter: Apurva Mehta >Assignee: Apurva Mehta > Fix For: 1.0.0 > > > Currently, if there is an internal error that triggers a shutdown of the > Kafka server, the {{Exit}} class is used, which begins the shutdown > procedure. The other way a shutdown is triggered is by {{SIGTERM}} or some > other signal. > We would like to distinguish between shutdowns due to internal errors and > external signals. This helps when debugging. In particular, a natural question > when a broker shuts down unexpectedly is: "did the deployment system send > the signal, or is there some unlogged fatal error in the broker?" > Today, we rely on callers of {{Exit}} to log the error before making the > call. However, this won't always have 100% coverage. It would be good to add > a log message in {{Exit}} to record that an exit method was invoked > explicitly. > We could also add a signal handler to log when {{SIGTERM}}, {{SIGINT}} etc. > are received (SIGKILL cannot be caught). > This would make operating Kafka a bit easier. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
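A minimal sketch of the signal-logging idea ({{sun.misc.Signal}} is a JDK-internal but long-stable API; the class name, logger, message wording, and chaining behavior are assumptions, not the merged implementation):
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import sun.misc.Signal;
import sun.misc.SignalHandler;

public class LoggingSignalHandlerSketch {
    private static final Logger log = LoggerFactory.getLogger(LoggingSignalHandlerSketch.class);

    public static void register() {
        for (String name : new String[] {"TERM", "INT", "HUP"}) {
            final SignalHandler[] previous = new SignalHandler[1];
            // Signal.handle returns the old handler, so we can log and then chain
            previous[0] = Signal.handle(new Signal(name), signal -> {
                log.info("Terminating process due to signal {}", signal);
                previous[0].handle(signal); // preserve the JVM's default exit semantics
            });
        }
    }
}
{code}
Logging before chaining to the previous handler answers the "deployment system or internal error?" question without changing shutdown behavior.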