[jira] [Commented] (KAFKA-8719) kafka-console-consumer bypassing sentry evaluations while specifying --partition option

2019-08-29 Thread huxihx (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919137#comment-16919137
 ] 

huxihx commented on KAFKA-8719:
---

Which version did you use? The group and partition options should not be 
specified together. Besides, I could not reproduce this issue using the latest 
version (2.3).
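
For illustration, a minimal sketch of the two consumption paths the console 
consumer maps onto (assuming the standard Java consumer API; the topic, group, 
and broker values are just the ones from the report below). With --partition 
the tool uses manual assignment, so the group-membership path, and any 
authorization hooked into it, is never exercised:
{code:java}
// Sketch, not the console consumer's actual code: the two paths it chooses between.
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignVsSubscribe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "spark-kafka-111");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Without --partition: subscribe() joins the consumer group, so the
            // group coordinator (and group-level authorization) is involved.
            consumer.subscribe(Collections.singletonList("booktopic1"));

            // With --partition 0: assign() skips the group protocol entirely;
            // only the topic fetch path is exercised.
            // consumer.assign(Collections.singletonList(new TopicPartition("booktopic1", 0)));

            consumer.poll(Duration.ofSeconds(1));
        }
    }
}
{code}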

> kafka-console-consumer bypassing sentry evaluations while specifying 
> --partition option
> ---
>
> Key: KAFKA-8719
> URL: https://issues.apache.org/jira/browse/KAFKA-8719
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, tools
>Reporter: Sathish
>Priority: Major
>  Labels: kafka-console-cons
>
> While specifying the --partition option on kafka-console-consumer, it 
> bypasses the Sentry evaluations and allows users to consume messages 
> freely. Even though a consumer group does not have access to consume messages 
> from a topic, the --partition option bypasses the evaluation.
> Example command used:
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111 --partition 0
> This succeeds even though spark-kafka-111 does not have any access to 
> topic booktopic1,
> whereas 
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111
> fails with topic authorisation issues.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8732) specifying a non-existent broker to ./bin/kafka-reassign-partitions.sh leads to reassignment never getting completed.

2019-08-29 Thread huxihx (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919121#comment-16919121
 ] 

huxihx commented on KAFKA-8732:
---

This issue was already fixed in newer versions, where ReassignPartitionsCommand 
checks the existence of the to-be-reassigned brokers before executing.
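
For illustration, a hedged sketch of that kind of pre-flight check using the 
public AdminClient (the actual tool's internal code differs; the broker IDs 
are the ones from the report below):
{code:java}
// Sketch: refuse a reassignment whose target replicas include unknown brokers.
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

public class ValidateReassignment {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Replica list from the reassignment JSON in the report below.
        List<Integer> targetReplicas = Arrays.asList(1011131, 101067, 98, 101240);

        try (AdminClient admin = AdminClient.create(props)) {
            Set<Integer> liveBrokers = admin.describeCluster().nodes().get()
                    .stream().map(Node::id).collect(Collectors.toSet());
            for (int broker : targetReplicas) {
                if (!liveBrokers.contains(broker)) {
                    throw new IllegalArgumentException("Unknown broker id " + broker
                            + "; refusing to start the reassignment");
                }
            }
        }
    }
}
{code}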

> specifying a non-existent broker to ./bin/kafka-reassign-partitions.sh leads 
> to reassignment never getting completed.
> -
>
> Key: KAFKA-8732
> URL: https://issues.apache.org/jira/browse/KAFKA-8732
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, tools
>Affects Versions: 0.10.1.1
> Environment: Ubuntu-VERSION="14.04.5 LTS"
>Reporter: Ron1994
>Priority: Critical
>  Labels: bin, tools
>
> Specifying a non-existent broker to ./bin/kafka-reassign-partitions.sh leads 
> to reassignment never getting completed. 
> My reassignment gets stuck if I provide a non-existent broker ID. My 
> Kafka version is 0.10.1.1.
>  
>  
> {code:java}
> ./kafka-reassign-partitions.sh --zookeeper zk:2181 --reassignment-json-file 
> le.json --execute
> Current partition replica assignment
> {"version":1,"partitions":[{"topic":"cv-topic","partition":0,"replicas":[1011131,101067,98,101240]}]}
> Save this to use as the --reassignment-json-file option during rollback
> Successfully started reassignment of partitions.
> {code}
> Here, 98 is the non-existent broker. Deleting the reassign_partitions znode 
> does not help either: when I describe the same topic, broker 98 is out of 
> sync.
>  
>  
> {code:java}
> Topic:cv-topic PartitionCount:1 ReplicationFactor:4 Configs:
> Topic: cv-topic Partition: 0 Leader: 1011131 Replicas: 
> 1011131,101067,98,101240 Isr: 1011131,101067,101240
> {code}
> Now 98 will always be out of sync.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Issue Comment Deleted] (KAFKA-8719) kafka-console-consumer bypassing sentry evaluations while specifying --partition option

2019-08-29 Thread huxihx (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huxihx updated KAFKA-8719:
--
Comment: was deleted

(was: This issue was already fixed in newer versions, where 
ReassignPartitionsCommand checks the existence of the to-be-reassigned brokers 
before executing.)

> kafka-console-consumer bypassing sentry evaluations while specifying 
> --partition option
> ---
>
> Key: KAFKA-8719
> URL: https://issues.apache.org/jira/browse/KAFKA-8719
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, tools
>Reporter: Sathish
>Priority: Major
>  Labels: kafka-console-cons
>
> While specifying the --partition option on kafka-console-consumer, it 
> bypasses the Sentry evaluations and allows users to consume messages 
> freely. Even though a consumer group does not have access to consume messages 
> from a topic, the --partition option bypasses the evaluation.
> Example command used:
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111 --partition 0
> This succeeds even though spark-kafka-111 does not have any access to 
> topic booktopic1,
> whereas 
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111
> fails with topic authorisation issues.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8719) kafka-console-consumer bypassing sentry evaluations while specifying --partition option

2019-08-29 Thread huxihx (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919120#comment-16919120
 ] 

huxihx commented on KAFKA-8719:
---

This issue was already fixed in newer versions, where ReassignPartitionsCommand 
checks the existence of the to-be-reassigned brokers before executing.

> kafka-console-consumer bypassing sentry evaluations while specifying 
> --partition option
> ---
>
> Key: KAFKA-8719
> URL: https://issues.apache.org/jira/browse/KAFKA-8719
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, tools
>Reporter: Sathish
>Priority: Major
>  Labels: kafka-console-cons
>
> While specifying the --partition option on kafka-console-consumer, it 
> bypasses the Sentry evaluations and allows users to consume messages 
> freely. Even though a consumer group does not have access to consume messages 
> from a topic, the --partition option bypasses the evaluation.
> Example command used:
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111 --partition 0
> This succeeds even though spark-kafka-111 does not have any access to 
> topic booktopic1,
> whereas 
> #kafka-console-consumer  --topic booktopic1 --consumer.config 
> consumer.properties --bootstrap-server :9092 --from-beginning 
> --consumer-property group.id=spark-kafka-111
> fails with topic authorisation issues.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8718) Not authorized to access topics: [__consumer_offsets] with Apache Kafka 2.3.0

2019-08-29 Thread huxihx (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919109#comment-16919109
 ] 

huxihx commented on KAFKA-8718:
---

Option `blacklist` was removed by 
[KAFKA-2983|https://issues.apache.org/jira/browse/KAFKA-2983]. Did you enable 
security for the source cluster?
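
Since --blacklist is gone, the usual workaround is a whitelist regex that 
excludes the internal topics. A small sketch of the pattern (an assumption on 
my part; the legacy MirrorMaker's --whitelist is documented to take Java-style 
regular expressions, so a negative lookahead should work):
{code:java}
// Demonstrates that a negative-lookahead whitelist matches normal topics but
// not internal ones such as __consumer_offsets.
import java.util.regex.Pattern;

public class WhitelistCheck {
    public static void main(String[] args) {
        Pattern whitelist = Pattern.compile("^(?!__).*");
        System.out.println(whitelist.matcher("my-topic").matches());           // true
        System.out.println(whitelist.matcher("__consumer_offsets").matches()); // false
    }
}
{code}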

> Not authorized to access topics: [__consumer_offsets] with Apache Kafka 2.3.0
> -
>
> Key: KAFKA-8718
> URL: https://issues.apache.org/jira/browse/KAFKA-8718
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, mirrormaker, producer 
>Affects Versions: 2.3.0
>Reporter: Bala Bharath Reddy Resapu
>Priority: Critical
>
> Hi Team,
> I am trying to replicate all topics from one instance to another using 
> Kafka MirrorMaker. When I specify copying all topics using the whitelist 
> option, it fails with the error below. A few blogs suggest putting the 
> offsets topic in the blacklist, but when I tried that, it failed saying the 
> option is not recognized. Please advise whether this is a bug or whether 
> there is a fix for it.
> /usr/src/mirror-maker/kafka_2.12-2.3.0/bin/kafka-mirror-maker.sh 
> --consumer.config sourceClusterConsumer.properties --producer.config 
> targetClusterProducer.properties --num.streams 4 --whitelist=".*"
> ERROR Error when sending message to topic __consumer_offsets with key: 62 
> bytes, value: 28 bytes with error: 
> (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
> org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to 
> access topics: [__consumer_offsets]
>  
> --blacklist "__consumer_offsets"
>  ERROR Exception when starting mirror maker. (kafka.tools.MirrorMaker$)
> joptsimple.UnrecognizedOptionException: blacklist is not a recognized option



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8793) StickyTaskAssignor throws java.lang.ArithmeticException

2019-08-29 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919067#comment-16919067
 ] 

Guozhang Wang commented on KAFKA-8793:
--

[~rocketraman] I think I found the root cause of this issue. It is correlated 
with version probing: if all clients are excluded as future consumers, the set 
of clients involved in the actual assignment is empty, and hence the summed 
total capacity is zero. I will file a PR to fix this issue ASAP.
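
To make the failure mode concrete, an illustrative sketch (the names mirror 
the shape of the computation, not the actual StickyTaskAssignor fields): if 
every client is excluded during version probing, the summed capacity is zero 
and the per-client quota division fails.
{code:java}
// Illustrative only: totalCapacity is the sum of client thread counts; when
// all clients are filtered out it is 0 and the unguarded division throws.
import java.util.List;

public class CapacityQuota {
    static int tasksPerClient(int numTasks, List<Integer> clientCapacities) {
        int totalCapacity = clientCapacities.stream().mapToInt(Integer::intValue).sum();
        if (totalCapacity == 0) {
            // The guard the fix needs in some form: no assignable clients.
            throw new IllegalStateException("no clients available for assignment");
        }
        return numTasks / totalCapacity; // the "/ by zero" site when unguarded
    }

    public static void main(String[] args) {
        System.out.println(tasksPerClient(12, List.of(2, 2, 2))); // prints 2
        System.out.println(tasksPerClient(12, List.of()));        // throws
    }
}
{code}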

> StickyTaskAssignor throws java.lang.ArithmeticException
> ---
>
> Key: KAFKA-8793
> URL: https://issues.apache.org/jira/browse/KAFKA-8793
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Raman Gupta
>Assignee: Guozhang Wang
>Priority: Critical
>
> Occasionally, when starting a streams consumer that uses the static consumer 
> group protocol, I get the following error:
> {code:java}
> 2019-08-13 06:06:43,527 ERROR --- [691d2-StreamThread-1] 
> org.apa.kaf.str.pro.int.StreamThread  : stream-thread 
> [prod-cisSegmenter-777489d8-6cc5-48b4-8771-868d873691d2-StreamThread-1] 
> Encountered the following er
> ror during processing:
> EXCEPTION: java.lang.ArithmeticException: / by zero
> at 
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignor.assignActive(StickyTaskAssignor.java:76)
>  ~[kafka-streams-2.3.0.jar:?]
> at 
> org.apache.kafka.streams.processor.internals.assignment.StickyTaskAssignor.assign(StickyTaskAssignor.java:52)
>  ~[kafka-streams-2.3.0.jar:?]
> at 
> org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.assign(StreamsPartitionAssignor.java:634)
>  ~[kafka-streams-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:424)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:622)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1100(AbstractCoordinator.java:107)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:544)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:527)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:978)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:958)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:578)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:388)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:294)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:212)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:415)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:358)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:353)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1251)
>  ~[kafka-clients-2.3.0.jar:?]
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1216) 
> 

[jira] [Comment Edited] (KAFKA-8522) Tombstones can survive forever

2019-08-29 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918980#comment-16918980
 ] 

Richard Yu edited comment on KAFKA-8522 at 8/29/19 10:06 PM:
-

[~junrao] I think I've hit a caveat with your approach. The problem I've 
encountered here is that the partitions that are "assigned" to a LogCleaner 
could fluctuate after the LogCleaner instance is constructed. This has some 
implications because new TopicPartitions could be added to or removed from this 
"assignment". The consequences are that files are created and removed far more 
often than comfortable under certain conditions (not completely sure here).  

For details, I noticed that in the LogCleanerManager constructor, the {{logs}} 
parameter (the equivalent of the "assignment") is essentially a ConcurrentMap 
which can have its contents change after initialization. That means files also 
have to be repeatedly created and destroyed. Your thoughts on this?


was (Author: yohan123):
[~junrao] I think I've hit a caveat with your approach. The problem I've 
encountered here is that the partitions that are "assigned" to a LogCleaner 
could fluctuate after the LogCleaner instance is constructed. This has some 
implications because new TopicPartitions could be added to or removed from this 
"assignment". The consequences are that files are created and removed far more 
often than comfortable under certain conditions. 

For details, I noticed that in the LogCleanerManager constructor, the {{logs}} 
parameter (the equivalent of the "assignment") is essentially a ConcurrentMap 
which can have its contents change after initialization. That means files also 
have to be repeatedly created and destroyed. Your thoughts on this?

> Tombstones can survive forever
> --
>
> Key: KAFKA-8522
> URL: https://issues.apache.org/jira/browse/KAFKA-8522
> Project: Kafka
>  Issue Type: Improvement
>  Components: log cleaner
>Reporter: Evelyn Bayes
>Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly 
> unintended behaviour.
>  
> Under specific conditions tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest 
> segment. Old records get compacted away, but the new records continuously 
> update the timestamp of the oldest segment, resetting the countdown for 
> deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be 
> undesirable as a sudden change in throughput could cause a dangerous number 
> of segments to be created.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8522) Tombstones can survive forever

2019-08-29 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918985#comment-16918985
 ] 

Richard Yu commented on KAFKA-8522:
---

I have some thoughts on how to get around this. The best approach I can think 
of: once a topic partition and its respective checkpoint file are created, we 
don't remove the entry from the partition-to-checkpoint map. Only once the 
LogCleanerManager instance is destroyed (with the exception of when 
{{partitions}} are having their checkpoint files moved from one directory 
to another) do we remove all the files. 
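
A rough sketch of that idea (hypothetical names; the real LogCleanerManager is 
Scala and more involved): entries are only ever added to the 
partition-to-checkpoint map while running, and the files are removed in one 
pass on destruction.
{code:java}
// Hypothetical sketch of the proposal above, not LogCleanerManager itself.
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CheckpointRegistry implements AutoCloseable {
    private final ConcurrentMap<String, File> checkpoints = new ConcurrentHashMap<>();

    public File checkpointFor(String topicPartition) {
        // Created once per partition; the entry stays even if the partition
        // later drops out of the cleaner's "assignment".
        return checkpoints.computeIfAbsent(topicPartition,
                tp -> new File("/tmp/cleaner-checkpoints", tp + ".checkpoint"));
    }

    @Override
    public void close() {
        // Only on destruction are the files removed (modulo the
        // directory-move exception mentioned above).
        checkpoints.values().forEach(File::delete);
        checkpoints.clear();
    }
}
{code}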

> Tombstones can survive forever
> --
>
> Key: KAFKA-8522
> URL: https://issues.apache.org/jira/browse/KAFKA-8522
> Project: Kafka
>  Issue Type: Improvement
>  Components: log cleaner
>Reporter: Evelyn Bayes
>Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly 
> unintended behaviour.
>  
> Under specific conditions tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest 
> segment. Old records get compacted away, but the new records continuously 
> update the timestamp of the oldest segment, resetting the countdown for 
> deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be 
> undesirable as a sudden change in throughput could cause a dangerous number 
> of segments to be created.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (KAFKA-8522) Tombstones can survive forever

2019-08-29 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918980#comment-16918980
 ] 

Richard Yu edited comment on KAFKA-8522 at 8/29/19 10:01 PM:
-

[~junrao] I think I've hit a caveat with your approach. The problem I've 
encountered here is that the partitions that are "assigned" to a LogCleaner 
could fluctuate after the LogCleaner instance is constructed. This has some 
implications because new TopicPartitions could be added to or removed from this 
"assignment". The consequences are that files are created and removed far more 
often than comfortable under certain conditions. 

For details, I noticed that in the LogCleanerManager constructor, the {{logs}} 
parameter (the equivalent of the "assignment") is essentially a ConcurrentMap 
which can have its contents change after initialization. That means files also 
have to be repeatedly created and destroyed. Your thoughts on this?


was (Author: yohan123):
@Jun Rao I think I've hit a caveat with your approach. The problem I've 
encountered here is that the partitions that are "assigned" to a LogCleaner 
could fluctuate after the LogCleaner instance is constructed. This has some 
implications because new TopicPartitions could be added to or removed from this 
"assignment". The consequences are that files are created and removed far more 
often than comfortable under certain conditions. 

For details, I noticed that in the LogCleanerManager constructor, the {{logs}} 
parameter (the equivalent of the "assignment") is essentially a ConcurrentMap 
which can have its contents change after initialization. That means files also 
have to be repeatedly created and destroyed. Your thoughts on this?

> Tombstones can survive forever
> --
>
> Key: KAFKA-8522
> URL: https://issues.apache.org/jira/browse/KAFKA-8522
> Project: Kafka
>  Issue Type: Improvement
>  Components: log cleaner
>Reporter: Evelyn Bayes
>Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly 
> unintended behaviour.
>  
> Under specific conditions tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest 
> segment. Old records get compacted away, but the new records continuously 
> update the timestamp of the oldest segment, resetting the countdown for 
> deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be 
> undesirable as a sudden change in throughput could cause a dangerous number 
> of segments to be created.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8522) Tombstones can survive forever

2019-08-29 Thread Richard Yu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918980#comment-16918980
 ] 

Richard Yu commented on KAFKA-8522:
---

@Jun Rao I think I've hit a caveat with your approach. The problem I've 
encountered here is that the partitions that are "assigned" to a LogCleaner 
could fluctuate after the LogCleaner instance is constructed. This has some 
implications because new TopicPartitions could be added to or removed from this 
"assignment". The consequences are that files are created and removed far more 
often than comfortable under certain conditions. 

For details, I noticed that in the LogCleanerManager constructor, the {{logs}} 
parameter (the equivalent of the "assignment") is essentially a ConcurrentMap 
which can have its contents change after initialization. That means files also 
have to be repeatedly created and destroyed. Your thoughts on this?

> Tombstones can survive forever
> --
>
> Key: KAFKA-8522
> URL: https://issues.apache.org/jira/browse/KAFKA-8522
> Project: Kafka
>  Issue Type: Improvement
>  Components: log cleaner
>Reporter: Evelyn Bayes
>Priority: Minor
>
> This is a bit of a grey area as to whether it's a "bug", but it is certainly 
> unintended behaviour.
>  
> Under specific conditions tombstones effectively survive forever:
>  * Small amount of throughput;
>  * min.cleanable.dirty.ratio near or at 0; and
>  * Other parameters at default.
> What happens is that all the data continuously gets cycled into the oldest 
> segment. Old records get compacted away, but the new records continuously 
> update the timestamp of the oldest segment, resetting the countdown for 
> deleting tombstones.
> So tombstones build up in the oldest segment forever.
>  
> While you could "fix" this by reducing the segment size, this can be 
> undesirable as a sudden change in throughput could cause a dangerous number 
> of segments to be created.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (KAFKA-8828) [BC Break] Global store returns a TimestampedKeyValueStore in 2.3

2019-08-29 Thread Marcos Passos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcos Passos closed KAFKA-8828.


> [BC Break] Global store returns a TimestampedKeyValueStore in 2.3
> -
>
> Key: KAFKA-8828
> URL: https://issues.apache.org/jira/browse/KAFKA-8828
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Marcos Passos
>Priority: Major
>
> Since 2.3, {{ProcessorContext}} returns a {{TimestampedKeyValueStore}} for 
> global stores, which is backward incompatible. This change makes the upgrade 
> path a lot more painful and involves creating a non-trivial adapter to hide the 
> timestamp-related functionality in cases where it is not needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (KAFKA-8828) [BC Break] Global store returns a TimestampedKeyValueStore in 2.3

2019-08-29 Thread Marcos Passos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcos Passos resolved KAFKA-8828.
--
Resolution: Invalid

> [BC Break] Global store returns a TimestampedKeyValueStore in 2.3
> -
>
> Key: KAFKA-8828
> URL: https://issues.apache.org/jira/browse/KAFKA-8828
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Marcos Passos
>Priority: Major
>
> Since 2.3, {{ProcessorContext}} returns a {{TimestampedKeyValueStore}} for 
> global stores, which is backward incompatible. This change makes the upgrade 
> path a lot more painful and involves creating a non-trivial adapter to hide the 
> timestamp-related functionality in cases where it is not needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8828) [BC Break] Global store returns a TimestampedKeyValueStore in 2.3

2019-08-29 Thread Marcos Passos (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918970#comment-16918970
 ] 

Marcos Passos commented on KAFKA-8828:
--

[~mjsax] Thank you very much for your support. I understand your point now, so 
I'm closing this issue.

> [BC Break] Global store returns a TimestampedKeyValueStore in 2.3
> -
>
> Key: KAFKA-8828
> URL: https://issues.apache.org/jira/browse/KAFKA-8828
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Marcos Passos
>Priority: Major
>
> Since 2.3, {{ProcessorContext}} returns a {{TimestampedKeyValueStore}} for 
> global stores, which is backward incompatible. This change makes the upgrade 
> path a lot more painful and involves creating a non-trivial adapter to hide the 
> timestamp-related functionality in cases where it is not needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8828) [BC Break] Global store returns a TimestampedKeyValueStore in 2.3

2019-08-29 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918969#comment-16918969
 ] 

Matthias J. Sax commented on KAFKA-8828:


{quote}only started using them since 2.3,
{quote}
{quote}since I wasn't upgrading
{quote}
Well, for this case it seems your code never worked (and hence it did not 
_break_) :)
{quote}much of the existing documentation is not current to this change.
{quote}
Can you elaborate? I don't think we have any example code that tries to access 
a (Global)KTable store in a Transformer, and I don't think that the 
implementation details of (Global)KTable stores are discussed in the docs. It 
would be helpful to understand which part of the docs you are referring to and 
what we need to update to avoid this misunderstanding.
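
For reference, a minimal sketch of the access pattern under discussion on 2.3 
(hypothetical store name and value types): the global store handed back is a 
{{TimestampedKeyValueStore}}, so reads come wrapped in {{ValueAndTimestamp}}.
{code:java}
// Minimal sketch (hypothetical store name/types) of reading a global store
// from a Processor/Transformer on 2.3.
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.TimestampedKeyValueStore;
import org.apache.kafka.streams.state.ValueAndTimestamp;

public class GlobalStoreRead {
    @SuppressWarnings("unchecked")
    static Long lookup(ProcessorContext context, String key) {
        TimestampedKeyValueStore<String, Long> store =
                (TimestampedKeyValueStore<String, Long>) context.getStateStore("global-store");
        ValueAndTimestamp<Long> vt = store.get(key);
        return vt == null ? null : vt.value(); // vt.timestamp() carries the record timestamp
    }
}
{code}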

> [BC Break] Global store returns a TimestampedKeyValueStore in 2.3
> -
>
> Key: KAFKA-8828
> URL: https://issues.apache.org/jira/browse/KAFKA-8828
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Marcos Passos
>Priority: Major
>
> Since 2.3, {{ProcessorContext}} returns a {{TimestampedKeyValueStore}} for 
> global stores, which is backward incompatible. This change makes the upgrade 
> path a lot more painful and involves creating a non-trivial adapter to hide the 
> timestamp-related functionality in cases where it is not needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8832) We should limit the maximum size read by a fetch request on the kafka server.

2019-08-29 Thread Stanislav Kozlovski (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918926#comment-16918926
 ] 

Stanislav Kozlovski commented on KAFKA-8832:


Do you happen to have the text files of the logs with the error? It is 
difficult to troubleshoot the issue through images.

> We should limit the maximum size read by a fetch request on the kafka server.
> -
>
> Key: KAFKA-8832
> URL: https://issues.apache.org/jira/browse/KAFKA-8832
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0, 2.2.1
>Reporter: ChenLin
>Priority: Major
>  Labels: needs-kip
> Attachments: image-2019-08-25-15-31-56-707.png, 
> image-2019-08-25-15-42-24-379.png, image-2019-08-29-11-01-04-147.png, 
> image-2019-08-29-11-01-17-347.png, image-2019-08-29-11-02-01-477.png, 
> image-2019-08-29-11-03-37-693.png, image-2019-08-29-11-21-49-998.png, 
> image-2019-08-29-11-23-53-155.png, image-2019-08-29-11-25-52-242.png
>
>
> I found that Kafka does not limit, on the server side, the amount of data 
> read per fetch request. This may cause the Kafka server to report an 
> OutOfMemory error: due to unreasonable client configuration, the 
> fetch.message.max.bytes configuration can be too large (such as 100 MB), and 
> because the Kafka server receives a lot of fetch requests at a certain moment, 
> the server reports an OutOfMemory error. So I think this is a bug.
>    !image-2019-08-29-11-25-52-242.png!
> !image-2019-08-25-15-42-24-379.png!
> !image-2019-08-25-15-31-56-707.png!
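
Until a server-side limit exists, the available bounds are client-side. A 
hedged sketch of the standard consumer settings that cap fetch response sizes 
(the values are examples only):
{code:java}
// Illustrative mitigation: bound what a single consumer fetch can pull back.
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class BoundedFetchConfig {
    public static Properties props() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Cap on the total data returned per fetch response (default 50 MB).
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 10 * 1024 * 1024);
        // Cap per partition per fetch (default 1 MB).
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1024 * 1024);
        return props;
    }
}
{code}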



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId

2019-08-29 Thread Bill Bejeck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918868#comment-16918868
 ] 

Bill Bejeck commented on KAFKA-8803:


[~rocketraman]

I think what is going on is that the broker is experiencing an 
{{UNKNOWN_LEADER_EPOCH}} error, but by the time the broker recovers and 
stabilizes, more than 60 seconds have elapsed. The {{initProducerId}} request is 
controlled by the {{max.block.ms}} configuration. Try bumping that value up to 
something higher (I don't have a great suggestion; maybe 5-10 minutes) and see 
if that helps.
{code:java}
// props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_BLOCK_MS_CONFIG), 600000);
{code}
HTH,

Bill

> Stream will not start due to TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> 
>
> Key: KAFKA-8803
> URL: https://issues.apache.org/jira/browse/KAFKA-8803
> Project: Kafka
>  Issue Type: Bug
>Reporter: Raman Gupta
>Priority: Major
> Attachments: logs.txt.gz, screenshot-1.png
>
>
> One streams app is consistently failing at startup with the following 
> exception:
> {code}
> 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] 
> org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout 
> exception caught when initializing transactions for task 0_36. This might 
> happen if the broker is slow to respond, if the network connection to the 
> broker was interrupted, or if similar circumstances arise. You can increase 
> producer parameter `max.block.ms` to increase this timeout.
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> {code}
> These same brokers are used by many other streams without any issue, 
> including some in the very same processes for the stream which consistently 
> throws this exception.
> *UPDATE 08/16:*
> The very first instance of this error is August 13th 2019, 17:03:36.754 and 
> it happened for 4 different streams. For 3 of these streams, the error only 
> happened once, and then the stream recovered. For the 4th stream, the error 
> has continued to happen, and continues to happen now.
> I looked up the broker logs for this time, and see that at August 13th 2019, 
> 16:47:43, two of four brokers started reporting messages like this, for 
> multiple partitions:
> [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, 
> fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader 
> reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread)
> The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, 
> here is a view of the count of these messages over time:
>  !screenshot-1.png! 
> However, as noted, the stream task timeout error continues to happen.
> I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 
> broker. The broker has a patch for KAFKA-8773.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId

2019-08-29 Thread Bill Bejeck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918868#comment-16918868
 ] 

Bill Bejeck edited comment on KAFKA-8803 at 8/29/19 6:44 PM:
-

[~rocketraman]

I think what is going on is that the broker is experiencing an 
{{UNKNOWN_LEADER_EPOCH}} error, but by the time the broker recovers and 
stabilizes, more than 60 seconds have elapsed. The {{initProducerId}} request is 
controlled by the {{max.block.ms}} configuration. Try bumping that value up to 
something higher (I don't have a great suggestion; maybe 5-10 minutes) and see 
if that helps.
{code:java}
props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_BLOCK_MS_CONFIG), 600000);
{code}
HTH,

Bill


was (Author: bbejeck):
[~rocketraman]

I think what is going on is that the broker is experiencing an 
{{UNKNOWN_LEADER_EPOCH}} error, but by the time the broker recovers and 
stabilizes, more than 60 seconds have elapsed. The {{initProducerId}} request is 
controlled by the {{max.block.ms}} configuration. Try bumping that value up to 
something higher (I don't have a great suggestion; maybe 5-10 minutes) and see 
if that helps.
{code:java}
// props.put(StreamsConfig.producerPrefix(ProducerConfig.MAX_BLOCK_MS_CONFIG), 600000);
{code}
HTH,

Bill

> Stream will not start due to TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> 
>
> Key: KAFKA-8803
> URL: https://issues.apache.org/jira/browse/KAFKA-8803
> Project: Kafka
>  Issue Type: Bug
>Reporter: Raman Gupta
>Priority: Major
> Attachments: logs.txt.gz, screenshot-1.png
>
>
> One streams app is consistently failing at startup with the following 
> exception:
> {code}
> 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] 
> org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout 
> exception caught when initializing transactions for task 0_36. This might 
> happen if the broker is slow to respond, if the network connection to the 
> broker was interrupted, or if similar circumstances arise. You can increase 
> producer parameter `max.block.ms` to increase this timeout.
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> {code}
> These same brokers are used by many other streams without any issue, 
> including some in the very same processes for the stream which consistently 
> throws this exception.
> *UPDATE 08/16:*
> The very first instance of this error is August 13th 2019, 17:03:36.754 and 
> it happened for 4 different streams. For 3 of these streams, the error only 
> happened once, and then the stream recovered. For the 4th stream, the error 
> has continued to happen, and continues to happen now.
> I looked up the broker logs for this time, and see that at August 13th 2019, 
> 16:47:43, two of four brokers started reporting messages like this, for 
> multiple partitions:
> [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, 
> fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader 
> reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread)
> The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, 
> here is a view of the count of these messages over time:
>  !screenshot-1.png! 
> However, as noted, the stream task timeout error continues to happen.
> I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 
> broker. The broker has a patch for KAFKA-8773.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8828) [BC Break] Global store returns a TimestampedKeyValueStore in 2.3

2019-08-29 Thread Adam Rinehart (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918840#comment-16918840
 ] 

Adam Rinehart commented on KAFKA-8828:
--

As I had said, I'm new to Kafka Streams and only started using them with 2.3, 
so I had not fully read the upgrade guide, since I wasn't upgrading. It would 
have been useful if this was in the Javadoc, but I won't argue that it wasn't 
documented.

Either way, this discussion has been helpful on increasing my understanding of 
KTables and GlobalKTables and I appreciate it. I'll leave it up to Marcos, the 
original poster, if he is satisfied but personally, I understand now.

> [BC Break] Global store returns a TimestampedKeyValueStore in 2.3
> -
>
> Key: KAFKA-8828
> URL: https://issues.apache.org/jira/browse/KAFKA-8828
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
>Reporter: Marcos Passos
>Priority: Major
>
> Since 2.3, {{ProcessorContext}} returns a {{TimestampedKeyValueStore}} for 
> global stores, which is backward incompatible. This change makes the upgrade 
> path a lot more painful and involves creating a non-trivial adapter to hide the 
> timestamp-related functionality in cases where it is not needed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8845) Detect and abort stalled transactions

2019-08-29 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918798#comment-16918798
 ] 

Jose Armando Garcia Sancio commented on KAFKA-8845:
---

[~hachikuji], could we have a case where a transaction may be left hanging 
because of https://issues.apache.org/jira/browse/KAFKA-8069?

> Detect and abort stalled transactions
> -
>
> Key: KAFKA-8845
> URL: https://issues.apache.org/jira/browse/KAFKA-8845
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: needs-discussion
>
> In some situations, a transaction may be left hanging indefinitely. For 
> example, this could happen due to an unclean leader election or a bug in the 
> coordinator. We need mechanisms to detect hanging transactions and abort 
> them. A bare minimum is probably a tool which lets a user manually abort a 
> failed transaction after detecting it through monitoring.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (KAFKA-8845) Detect and abort stalled transactions

2019-08-29 Thread Jose Armando Garcia Sancio (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918798#comment-16918798
 ] 

Jose Armando Garcia Sancio edited comment on KAFKA-8845 at 8/29/19 5:26 PM:


[~hachikuji], could we have a case where a transaction may be left hanging 
because of KAFKA-8069?


was (Author: jagsancio):
[~hachikuji], could we have a case where a transaction may be left hanging 
because of https://issues.apache.org/jira/browse/KAFKA-8069?

> Detect and abort stalled transactions
> -
>
> Key: KAFKA-8845
> URL: https://issues.apache.org/jira/browse/KAFKA-8845
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jason Gustafson
>Priority: Major
>  Labels: needs-discussion
>
> In some situations, a transaction may be left hanging indefinitely. For 
> example, this could happen due to an unclean leader election or a bug in the 
> coordinator. We need mechanisms to detect hanging transactions and abort 
> them. A bare minimum is probably a tool which lets a user manually abort a 
> failed transaction after detecting it through monitoring.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KAFKA-8848) Update system test to use new authorizer

2019-08-29 Thread Rajini Sivaram (Jira)
Rajini Sivaram created KAFKA-8848:
-

 Summary: Update system test to use new authorizer
 Key: KAFKA-8848
 URL: https://issues.apache.org/jira/browse/KAFKA-8848
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.4.0


We should run system tests with the new authorizer.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8843) Zookeeper migration tool support for TLS

2019-08-29 Thread Pere Urbon-Bayes (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918787#comment-16918787
 ] 

Pere Urbon-Bayes commented on KAFKA-8843:
-

Working on writing the related KIP right now.

> Zookeeper migration tool support for TLS
> 
>
> Key: KAFKA-8843
> URL: https://issues.apache.org/jira/browse/KAFKA-8843
> Project: Kafka
>  Issue Type: Bug
>Reporter: Pere Urbon-Bayes
>Assignee: Pere Urbon-Bayes
>Priority: Minor
>
> Currently the zookeeper-migration tool works based on SASL authentication, 
> which means only digest and Kerberos authentication are supported.
>  
> With the introduction of ZK 3.5, TLS was added, including a new X509 
> authentication provider. 
>  
> To support this great feature and utilise the TLS principals, the 
> zookeeper-migration-tool script should support X509 authentication as 
> well.
>  
> In my newbie view, this should mean adding a new parameter to allow other 
> ways of authentication around 
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ZkSecurityMigrator.scala#L65]
>  
> If I understand the process correctly, this will require a KIP, right?
>  
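
For orientation, a rough sketch of what the TLS side would involve (these 
system-property names are ZooKeeper 3.5's client TLS settings, to the best of 
my knowledge; how the migrator script would expose them is exactly what the 
KIP needs to decide):
{code:java}
// Sketch: switching the ZooKeeper 3.5 client to TLS via system properties.
public class ZkTlsClientProps {
    public static void enableTls(String keyStore, String keyStorePassword,
                                 String trustStore, String trustStorePassword) {
        System.setProperty("zookeeper.client.secure", "true");
        // TLS requires the Netty connection implementation.
        System.setProperty("zookeeper.clientCnxnSocket",
                "org.apache.zookeeper.ClientCnxnSocketNetty");
        System.setProperty("zookeeper.ssl.keyStore.location", keyStore);
        System.setProperty("zookeeper.ssl.keyStore.password", keyStorePassword);
        System.setProperty("zookeeper.ssl.trustStore.location", trustStore);
        System.setProperty("zookeeper.ssl.trustStore.password", trustStorePassword);
    }
}
{code}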



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (KAFKA-8843) Zookeeper migration tool support for TLS

2019-08-29 Thread Pere Urbon-Bayes (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pere Urbon-Bayes reassigned KAFKA-8843:
---

Assignee: Pere Urbon-Bayes

> Zookeeper migration tool support for TLS
> 
>
> Key: KAFKA-8843
> URL: https://issues.apache.org/jira/browse/KAFKA-8843
> Project: Kafka
>  Issue Type: Bug
>Reporter: Pere Urbon-Bayes
>Assignee: Pere Urbon-Bayes
>Priority: Minor
>
> Currently the zookeeper-migration tool works based on SASL authentication, 
> which means only digest and Kerberos authentication are supported.
>  
> With the introduction of ZK 3.5, TLS was added, including a new X509 
> authentication provider. 
>  
> To support this great feature and utilise the TLS principals, the 
> zookeeper-migration-tool script should support X509 authentication as 
> well.
>  
> In my newbie view, this should mean adding a new parameter to allow other 
> ways of authentication around 
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ZkSecurityMigrator.scala#L65]
>  
> If I understand the process correctly, this will require a KIP, right?
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KAFKA-8847) Deprecate and remove usage of supporting classes in kafka.security.auth

2019-08-29 Thread Rajini Sivaram (Jira)
Rajini Sivaram created KAFKA-8847:
-

 Summary: Deprecate and remove usage of supporting classes in 
kafka.security.auth
 Key: KAFKA-8847
 URL: https://issues.apache.org/jira/browse/KAFKA-8847
 Project: Kafka
  Issue Type: Sub-task
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.4.0


Deprecate Acl, Resource etc. from `kafka.security.auth` and replace references 
to these with the equivalent Java classes.
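
For orientation, a small sketch of the Java-side equivalents (constructing a 
literal ACL binding with the public classes; the principal and topic values 
are illustrative):
{code:java}
// The org.apache.kafka.common.acl / .resource classes that replace the Scala
// kafka.security.auth ones.
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

public class JavaAclExample {
    public static AclBinding exampleReadAcl() {
        ResourcePattern resource =
                new ResourcePattern(ResourceType.TOPIC, "example-topic", PatternType.LITERAL);
        AccessControlEntry entry = new AccessControlEntry(
                "User:alice", "*", AclOperation.READ, AclPermissionType.ALLOW);
        return new AclBinding(resource, entry);
    }
}
{code}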



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId

2019-08-29 Thread Raman Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918695#comment-16918695
 ] 

Raman Gupta commented on KAFKA-8803:


Thanks [~bbejeck]. Currently the stream is in the same state, so if additional 
debugging information is needed, I can probably still get it. However, very 
soon I'll need to reset the environment and move on, as this stream has been 
down for a long time.

> Stream will not start due to TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> 
>
> Key: KAFKA-8803
> URL: https://issues.apache.org/jira/browse/KAFKA-8803
> Project: Kafka
>  Issue Type: Bug
>Reporter: Raman Gupta
>Priority: Major
> Attachments: logs.txt.gz, screenshot-1.png
>
>
> One streams app is consistently failing at startup with the following 
> exception:
> {code}
> 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] 
> org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout 
> exception caught when initializing transactions for task 0_36. This might 
> happen if the broker is slow to respond, if the network connection to the 
> broker was interrupted, or if similar circumstances arise. You can increase 
> producer parameter `max.block.ms` to increase this timeout.
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> {code}
> These same brokers are used by many other streams without any issue, 
> including some in the very same processes for the stream which consistently 
> throws this exception.
> *UPDATE 08/16:*
> The very first instance of this error is August 13th 2019, 17:03:36.754 and 
> it happened for 4 different streams. For 3 of these streams, the error only 
> happened once, and then the stream recovered. For the 4th stream, the error 
> has continued to happen, and continues to happen now.
> I looked up the broker logs for this time, and see that at August 13th 2019, 
> 16:47:43, two of four brokers started reporting messages like this, for 
> multiple partitions:
> [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, 
> fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader 
> reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread)
> The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, 
> here is a view of the count of these messages over time:
>  !screenshot-1.png! 
> However, as noted, the stream task timeout error continues to happen.
> I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 
> broker. The broker has a patch for KAFKA-8773.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8803) Stream will not start due to TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId

2019-08-29 Thread Bill Bejeck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918675#comment-16918675
 ] 

Bill Bejeck commented on KAFKA-8803:


[~rocketraman] sorry I've been tied up with some other things, I'll take a look 
by COB today.

> Stream will not start due to TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> 
>
> Key: KAFKA-8803
> URL: https://issues.apache.org/jira/browse/KAFKA-8803
> Project: Kafka
>  Issue Type: Bug
>Reporter: Raman Gupta
>Priority: Major
> Attachments: logs.txt.gz, screenshot-1.png
>
>
> One streams app is consistently failing at startup with the following 
> exception:
> {code}
> 2019-08-14 17:02:29,568 ERROR --- [2ce1b-StreamThread-2] 
> org.apa.kaf.str.pro.int.StreamTask: task [0_36] Timeout 
> exception caught when initializing transactions for task 0_36. This might 
> happen if the broker is slow to respond, if the network connection to the 
> broker was interrupted, or if similar circumstances arise. You can increase 
> producer parameter `max.block.ms` to increase this timeout.
> org.apache.kafka.common.errors.TimeoutException: Timeout expired after 
> 60000milliseconds while awaiting InitProducerId
> {code}
> These same brokers are used by many other streams without any issue, 
> including some in the very same processes for the stream which consistently 
> throws this exception.
> *UPDATE 08/16:*
> The very first instance of this error is August 13th 2019, 17:03:36.754 and 
> it happened for 4 different streams. For 3 of these streams, the error only 
> happened once, and then the stream recovered. For the 4th stream, the error 
> has continued to happen, and continues to happen now.
> I looked up the broker logs for this time, and see that at August 13th 2019, 
> 16:47:43, two of four brokers started reporting messages like this, for 
> multiple partitions:
> [2019-08-13 20:47:43,658] INFO [ReplicaFetcher replicaId=3, leaderId=1, 
> fetcherId=0] Retrying leaderEpoch request for partition xxx-1 as the leader 
> reported an error: UNKNOWN_LEADER_EPOCH (kafka.server.ReplicaFetcherThread)
> The UNKNOWN_LEADER_EPOCH messages continued for some time, and then stopped, 
> here is a view of the count of these messages over time:
>  !screenshot-1.png! 
> However, as noted, the stream task timeout error continues to happen.
> I use the static consumer group protocol with Kafka 2.3.0 clients and 2.3.0 
> broker. The broker has a patch for KAFKA-8773.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-08-29 Thread Rostyslav Skliar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918558#comment-16918558
 ] 

Rostyslav Skliar commented on KAFKA-5998:
-

Hi guys! Any news about releases 2.3.1, 2.4.0, or 2.2.2? I can't find those 
versions in Maven Central.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Assignee: John Roesler
>Priority: Critical
> Fix For: 2.2.2, 2.4.0, 2.3.1
>
> Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, 
> exc.txt, props.txt, streams.txt
>
>
> I have one Kafka broker and one Kafka Streams instance running. They have 
> been running for two days under a load of around 2500 msgs per second. On the 
> third day, I am getting the exception below for some of the partitions. I 
> have 16 partitions; only 0_0 and 0_1 give this error:
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.<init>(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>