[jira] [Commented] (KAFKA-9572) Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some Records

2020-02-20 Thread Bruno Cadonna (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041635#comment-17041635
 ] 

Bruno Cadonna commented on KAFKA-9572:
--

No, it was on 2.4. 

> Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some 
> Records
> ---
>
> Key: KAFKA-9572
> URL: https://issues.apache.org/jira/browse/KAFKA-9572
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Bruno Cadonna
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: 7-changelog-1.txt, data-1.txt, streams22.log, 
> streams23.log, streams30.log, sum-1.txt
>
>
> System test {{StreamsEosTest.test_failure_and_recovery}} failed due to a 
> wrongly computed aggregation under exactly-once (EOS). The specific error is:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Result verification 
> failed for ConsumerRecord(topic = sum, partition = 1, leaderEpoch = 0, offset 
> = 2805, CreateTime = 1580719595164, serialized key size = 4, serialized value 
> size = 8, headers = RecordHeaders(headers = [], isReadOnly = false), key = 
> [B@6c779568, value = [B@f381794) expected <6069,17269> but was <6069,10698>
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verifySum(EosTestDriver.java:444)
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verify(EosTestDriver.java:196)
>   at 
> org.apache.kafka.streams.tests.StreamsEosTest.main(StreamsEosTest.java:69)
> {code} 
> That means the sum computed by the Streams app seems to be wrong for key 
> 6069. I checked the dumps of the log segments of the input topic partition 
> (attached: data-1.txt) and indeed two input records are not considered in the 
> sum. With those two missed records the sum would be correct. More concretely, 
> the input values for key 6069 are:
> # 147
> # 9250
> # 5340 
> # 1231
> # 1301
> The sum of these values is 17269, which is the expected sum stated in the 
> exception above. If you subtract values 3 and 4, i.e., 5340 and 1231, from 
> 17269, you get 10698, which is the actual sum in the exception above. Somehow 
> those two values are missing.
> In the log dump of the output topic partition (attached: sum-1.txt), the sum 
> is correct up to the 4th value, 1231 (i.e., 15968), but is then overwritten 
> with 10698.
> In the log dump of the changelog topic of the state store that stores the sum 
> (attached: 7-changelog-1.txt), the sum is also overwritten as in the output 
> topic.
> I attached the logs of the three Streams instances involved.
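
For reference, a quick check of the arithmetic in the quoted report (the values 
are copied from the list above):

{code:java}
// Quick check of the arithmetic quoted above: the expected sum for key 6069
// and the sum that results when values 3 and 4 (5340 and 1231) are dropped.
public class SumCheck {
    public static void main(final String[] args) {
        final int[] inputs = {147, 9250, 5340, 1231, 1301};
        int expected = 0;
        for (final int v : inputs) {
            expected += v;                           // 17269, the expected sum
        }
        final int observed = expected - 5340 - 1231; // 10698, the sum actually reported
        System.out.println("expected=" + expected + ", observed=" + observed);
    }
}
{code}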



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9588) Add rocksdb event listeners in KS

2020-02-20 Thread Navinder Brar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navinder Brar updated KAFKA-9588:
-
Description: RocksDB is coming up with support for event listeners (like 
onCompactionCompleted) in JNI, which would be really helpful in KS to trigger 
checkpointing when a flush completes due to memtables filling up, rather than 
doing it periodically, etc. This task is currently blocked on 
https://issues.apache.org/jira/browse/KAFKA-8897.  (was: RocksDB is coming up 
with support for event listeners (like onCompactionCompleted) in JNI, which 
would be really helpful in KS to trigger checkpointing when a flush completes 
due to memtables filling up, rather than doing it periodically, etc. This task 
is currently blocked on https://issues.apache.org/jira/browse/KAFKA-8897.

 

Linking this task to https://issues.apache.org/jira/browse/KAFKA-9450 as well 
for tracking.)
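
To illustrate the idea, here is a purely hypothetical sketch: the JNI listener 
API requested in RocksDB issue #6343 did not exist yet when this was filed, so 
the interface and class below are made up for illustration and are not the real 
org.rocksdb API.

{code:java}
// Hypothetical sketch only: FlushListener is NOT the real org.rocksdb API; it
// stands in for the kind of callback the JNI event-listener support would expose.
interface FlushListener {
    void onFlushCompleted(String storeName);
}

// What KS could do with such a callback: write the checkpoint when a memtable
// flush finishes, instead of checkpointing on a fixed schedule.
class CheckpointOnFlushListener implements FlushListener {
    @Override
    public void onFlushCompleted(final String storeName) {
        // e.g. persist the store's current changelog offset to the .checkpoint file here
        System.out.println("Flush completed for " + storeName + "; writing checkpoint");
    }
}
{code}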

> Add rocksdb event listeners in KS
> -
>
> Key: KAFKA-9588
> URL: https://issues.apache.org/jira/browse/KAFKA-9588
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Navinder Brar
>Priority: Major
>
> RocksDB is coming up with support for event listeners (like 
> onCompactionCompleted) in JNI, which would be really helpful in KS to trigger 
> checkpointing when a flush completes due to memtables filling up, rather than 
> doing it periodically, etc. This task is currently blocked on 
> https://issues.apache.org/jira/browse/KAFKA-8897.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9588) Add rocksdb event listeners in KS

2020-02-20 Thread Navinder Brar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navinder Brar updated KAFKA-9588:
-
Description: RocksDB is coming up with support for event listeners (like 
onCompactionCompleted) in JNI 
([https://github.com/facebook/rocksdb/issues/6343]), which would be really 
helpful in KS to trigger checkpointing when a flush completes due to memtables 
filling up, rather than doing it periodically, etc. This task is currently 
blocked on https://issues.apache.org/jira/browse/KAFKA-8897.  (was: RocksDB is 
coming up with support for event listeners (like onCompactionCompleted) in 
JNI, which would be really helpful in KS to trigger checkpointing when a flush 
completes due to memtables filling up, rather than doing it periodically, etc. 
This task is currently blocked on 
https://issues.apache.org/jira/browse/KAFKA-8897.)

> Add rocksdb event listeners in KS
> -
>
> Key: KAFKA-9588
> URL: https://issues.apache.org/jira/browse/KAFKA-9588
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Navinder Brar
>Priority: Major
>
> RocksDB is coming up with support for event listeners (like 
> onCompactionCompleted) in JNI 
> ([https://github.com/facebook/rocksdb/issues/6343]), which would be really 
> helpful in KS to trigger checkpointing when a flush completes due to 
> memtables filling up, rather than doing it periodically, etc. This task is 
> currently blocked on https://issues.apache.org/jira/browse/KAFKA-8897.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9588) Add rocksdb event listeners in KS

2020-02-20 Thread Navinder Brar (Jira)
Navinder Brar created KAFKA-9588:


 Summary: Add rocksdb event listeners in KS
 Key: KAFKA-9588
 URL: https://issues.apache.org/jira/browse/KAFKA-9588
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Reporter: Navinder Brar


RocksDB is coming up with support for event listeners (like 
onCompactionCompleted) in JNI, which would be really helpful in KS to trigger 
checkpointing when a flush completes due to memtables filling up, rather than 
doing it periodically, etc. This task is currently blocked on 
https://issues.apache.org/jira/browse/KAFKA-8897.

 

Linking this task to https://issues.apache.org/jira/browse/KAFKA-9450 as well 
for tracking.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9587) Producer configs are omitted in the documentation

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041551#comment-17041551
 ] 

ASF GitHub Bot commented on KAFKA-9587:
---

dongjinleekr commented on pull request #8150: KAFKA-9587: Producer configs are 
omitted in the documentation
URL: https://github.com/apache/kafka/pull/8150
 
 
   I found this glitch while working on another issue.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Producer configs are omitted in the documentation
> -
>
> Key: KAFKA-9587
> URL: https://issues.apache.org/jira/browse/KAFKA-9587
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 2.4.0
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
> Fix For: 2.5.0
>
>
> As of 2.4, [the KafkaProducer 
> documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  states:
> {quote}If the request fails, the producer can automatically retry, though 
> since we have specified retries as 0 it won't.
> {quote}
> {quote}... in the code snippet above, likely all 100 records would be sent in 
> a single request since we set our linger time to 1 millisecond.
> {quote}
> However, the code snippet (below) does not include any configuration for 
> '{{retries}}' or '{{linger.ms}}':
> {quote}Properties props = new Properties();
>  props.put("bootstrap.servers", "localhost:9092");
>  props.put("acks", "all");
>  props.put("key.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
>  props.put("value.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
> {quote}
> The same documentation in [version 
> 2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes the configs; however, 
> [2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  only includes '{{linger.ms}}' and 
> [2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes none. It seems like they were removed somewhere between these releases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9587) Producer configs are omitted in the documentation

2020-02-20 Thread Dongjin Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjin Lee updated KAFKA-9587:
---
Description: 
As of 2.4, [the KafkaProducer 
documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 states:
{quote}If the request fails, the producer can automatically retry, though since 
we have specified retries as 0 it won't.
{quote}
{quote}... in the code snippet above, likely all 100 records would be sent in a 
single request since we set our linger time to 1 millisecond.
{quote}
However, the code snippet (below) does not include any configuration for 
'{{retries}}' or '{{linger.ms}}':
{quote}Properties props = new Properties();
 props.put("bootstrap.servers", "localhost:9092");
 props.put("acks", "all");
 props.put("key.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
 props.put("value.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
{quote}
The same documentation in [version 
2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes the configs; however, 
[2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 only includes '{{linger.ms}}' and 
[2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes none. It seems like they were removed somewhere between these releases.

  was:
As of 2.4, [the Producer 
documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 states:

{quote}If the request fails, the producer can automatically retry, though since 
we have specified retries as 0 it won't.{quote}

{quote}... in the code snippet above, likely all 100 records would be sent in a 
single request since we set our linger time to 1 millisecond.{quote}

However, the code snippet (below) does not include any configuration for 
'{{retries}}' or '{{linger.ms}}':
{quote}Properties props = new Properties();
 props.put("bootstrap.servers", "localhost:9092");
 props.put("acks", "all");
 props.put("key.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
 props.put("value.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
{quote}
The same documentation in [version 
2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes the configs; however, 
[2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 only includes '{{linger.ms}}' and 
[2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes none. It seems like they were removed somewhere between these releases.


> Producer configs are omitted in the documentation
> -
>
> Key: KAFKA-9587
> URL: https://issues.apache.org/jira/browse/KAFKA-9587
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, documentation
>Affects Versions: 2.4.0
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
> Fix For: 2.5.0
>
>
> As of 2.4, [the KafkaProducer 
> documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  states:
> {quote}If the request fails, the producer can automatically retry, though 
> since we have specified retries as 0 it won't.
> {quote}
> {quote}... in the code snippet above, likely all 100 records would be sent in 
> a single request since we set our linger time to 1 millisecond.
> {quote}
> However, the code snippet (below) does not include any configuration for 
> '{{retries}}' or '{{linger.ms}}':
> {quote}Properties props = new Properties();
>  props.put("bootstrap.servers", "localhost:9092");
>  props.put("acks", "all");
>  props.put("key.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
>  props.put("value.serializer", 
> "org.apache.kafka.common.serialization.StringSerializer");
> {quote}
> The same documentation in [version 
> 2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes the configs; however, 
> [2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  only includes '{{linger.ms}}' and 
> [2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
>  includes none. It seems like they were removed somewhere between these releases.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9587) Producer configs are omitted in the documentation

2020-02-20 Thread Dongjin Lee (Jira)
Dongjin Lee created KAFKA-9587:
--

 Summary: Producer configs are omitted in the documentation
 Key: KAFKA-9587
 URL: https://issues.apache.org/jira/browse/KAFKA-9587
 Project: Kafka
  Issue Type: Improvement
  Components: clients, documentation
Affects Versions: 2.4.0
Reporter: Dongjin Lee
Assignee: Dongjin Lee
 Fix For: 2.5.0


As of 2.4, [the Producer 
documentation|https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 states:

{quote}If the request fails, the producer can automatically retry, though since 
we have specified retries as 0 it won't.{quote}

{quote}... in the code snippet above, likely all 100 records would be sent in a 
single request since we set our linger time to 1 millisecond.{quote}

However, the code snippet (below) does not include any configuration for 
'{{retries}}' or '{{linger.ms}}':
{quote}Properties props = new Properties();
 props.put("bootstrap.servers", "localhost:9092");
 props.put("acks", "all");
 props.put("key.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
 props.put("value.serializer", 
"org.apache.kafka.common.serialization.StringSerializer");
{quote}
The same documentation in [version 
2.0|https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes the configs; however, 
[2.1|https://kafka.apache.org/21/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 only includes '{{linger.ms}}' and 
[2.2|https://kafka.apache.org/22/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html]
 includes none. It seems like they were removed somewhere between these releases.
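
For reference, a version of the snippet with the two configs that the quoted 
sentences refer to put back in might look like the following (this is only a 
sketch; the values retries=0 and linger.ms=1 are taken from the prose above):

{code:java}
import java.util.Properties;

public class ProducerConfigSnippet {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("retries", 0);      // "we have specified retries as 0"
        props.put("linger.ms", 1);    // "we set our linger time to 1 millisecond"
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        System.out.println(props);
    }
}
{code}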



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9584) Removing headers causes ConcurrentModificationException

2020-02-20 Thread Micah Ramos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Ramos updated KAFKA-9584:
---
Affects Version/s: 0.10.0.0  (was: 2.0.0)

> Removing headers causes ConcurrentModificationException
> ---
>
> Key: KAFKA-9584
> URL: https://issues.apache.org/jira/browse/KAFKA-9584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0
>Reporter: Micah Ramos
>Priority: Minor
>
> The consumer record that is used during punctuate is static, which can cause 
> java.util.ConcurrentModificationException when modifying the headers. 
> Using a single instance of ConsumerRecord for all punctuates causes other 
> strange behavior:
>  # Headers are shared across partitions.
>  # A topology that adds a single header could append an infinite number of 
> headers (one per punctuate iteration), causing memory problems in the current 
> topology as well as for downstream consumers, since the headers are written 
> with the record when it is produced to a topic.  
>  
> I would expect that each invocation of punctuate would be initialized with a 
> new header object.
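
A minimal sketch of the pattern described above, using a hypothetical topology 
(this is not code from the report): a transformer whose punctuator touches the 
record headers exposed by the ProcessorContext.

{code:java}
import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;

// Because the ConsumerRecord backing punctuate calls is shared, the remove()
// below can throw ConcurrentModificationException, and the add() below can
// accumulate one header per punctuation, as described in the report.
public class HeaderTouchingTransformer implements Transformer<String, String, KeyValue<String, String>> {
    private ProcessorContext context;

    @Override
    public void init(final ProcessorContext context) {
        this.context = context;
        context.schedule(Duration.ofSeconds(10), PunctuationType.WALL_CLOCK_TIME, timestamp -> {
            context.headers().remove("trace-id");                 // may throw ConcurrentModificationException
            context.headers().add("punctuated", new byte[] {1});  // appended on every punctuation
        });
    }

    @Override
    public KeyValue<String, String> transform(final String key, final String value) {
        return KeyValue.pair(key, value);
    }

    @Override
    public void close() { }
}
{code}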



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9586) Fix errored json filename in ops documentation

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041540#comment-17041540
 ] 

ASF GitHub Bot commented on KAFKA-9586:
---

dongjinleekr commented on pull request #8149: KAFKA-9586: Fix errored json 
filename in ops documentation
URL: https://github.com/apache/kafka/pull/8149
 
 
   This PR is the counterpart of apache/kafka-site#253.
   
   cc/ @omkreddy
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix errored json filename in ops documentation
> --
>
> Key: KAFKA-9586
> URL: https://issues.apache.org/jira/browse/KAFKA-9586
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.5.0
>Reporter: Dongjin Lee
>Assignee: Dongjin Lee
>Priority: Minor
> Fix For: 2.5.0
>
>
> {quote}The --verify option can be used with the tool to check the status of 
> the partition reassignment. Note that the same 
> +expand-cluster-reassignment.json+ (used with the --execute option) should be 
> used with the --verify option:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 
> --reassignment-json-file +custom-reassignment.json+ --verify
> {quote}
> {{expand-cluster-reassignment.json}} (underline) should be 
> {{custom-reassignment.json}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9586) Fix errored json filename in ops documentation

2020-02-20 Thread Dongjin Lee (Jira)
Dongjin Lee created KAFKA-9586:
--

 Summary: Fix errored json filename in ops documentation
 Key: KAFKA-9586
 URL: https://issues.apache.org/jira/browse/KAFKA-9586
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Affects Versions: 2.5.0
Reporter: Dongjin Lee
Assignee: Dongjin Lee
 Fix For: 2.5.0


{quote}The --verify option can be used with the tool to check the status of the 
partition reassignment. Note that the same +expand-cluster-reassignment.json+ 
(used with the --execute option) should be used with the --verify option:

bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 
--reassignment-json-file +custom-reassignment.json+ --verify
{quote}
{{expand-cluster-reassignment.json}} (underline) should be 
{{custom-reassignment.json}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9580:
-
Description: Currently in the broker-side logs, when an exception due to 
offset out of range is thrown, the message is not informative about which 
offsets are actually in range, and that makes it harder to debug when Consumer 
issues come up in this situation. It would be much more helpful if we logged a 
message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.  (was: Currently in the broker-side logs, when an 
exception due to offset out of range is thrown, the message is not informative 
about which offsets are actually in range, and that makes it harder to debug 
when Consumer issues come up in this situation. It would be much more helpful 
if we logged a message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.)
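
For illustration only (this is not the actual broker or client code, and every 
value below is made up), the kind of message being asked for would include the 
valid range alongside the rejected offset:

{code:java}
public class OffsetOutOfRangeLogExample {
    public static void main(final String[] args) {
        // All values are hypothetical; a real message would use the partition's
        // actual log start/end offsets.
        final long fetchOffset = 101L;
        final long logStartOffset = 200L;
        final long logEndOffset = 500L;
        System.out.printf(
            "Fetch offset %d is out of range for partition %s; valid offsets are in the range [%d, %d]%n",
            fetchOffset, "my-topic-0", logStartOffset, logEndOffset);
    }
}
{code}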

> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in the broker-side logs, when an exception due to offset out of 
> range is thrown, the message is not informative about which offsets are 
> actually in range, and that makes it harder to debug when Consumer issues 
> come up in this situation. It would be much more helpful if we logged a 
> message similar to 
> [Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
>  to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9580:
-
Description: Currently in the broker-side logs, when an exception due to 
offset out of range is thrown, the message is not informative about which 
offsets are actually in range, and that makes it harder to debug when Consumer 
issues come up in this situation. It would be much more helpful if we logged a 
message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.  (was: Currently in 
[Fetcher::initializedCompletedFetch|[https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259]],
 it's not informative on what offsets are exactly in range, and it makes it 
harder to debug when Consumer issues come up in this situation. It would be 
much more helpful if we logged a message similar to 
[Log::read|[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]]
 to make it easier to debug.)

> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in the broker-side logs, when an exception due to offset out of 
> range is thrown, the message is not informative about which offsets are 
> actually in range, and that makes it harder to debug when Consumer issues 
> come up in this situation. It would be much more helpful if we logged a 
> message similar to 
> [Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
>  to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8770) Either switch to or add an option for emit-on-change

2020-02-20 Thread ghassan Yammine (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041511#comment-17041511
 ] 

ghassan Yammine commented on KAFKA-8770:


[~vvcephei] et al.

Emit-on-change would be very beneficial to us. Our data model is foreign-key 
heavy, with some relationships on the order of 400,000:1 (we rely heavily on 
KIP-213, the FK-join). Thus a single change, even if it does not modify 
the output, will generate a tremendous amount of unnecessary traffic for the 
downstream apps. This is due to the embedded #flatMap operation in the FK-join. 
Considering the size of some of the associations, a single, idempotent update 
can generate a large number of duplicate records.

Finally, we do in fact observe many idempotent/superfluous updates due to the 
lack of emit-on-change behavior. We could handle this ourselves by injecting a 
deduplicating Transformer() into each stream, but we would have to do it for 
every KStream app that we have, and it would not be as efficient as a "native" 
implementation. We are currently at 12 apps and planning for ultimately around 
100-200.

In summary, this KIP would save us from computing, storing, and transmitting 
anywhere from millions to billions of idempotent updates a day.

> Either switch to or add an option for emit-on-change
> 
>
> Key: KAFKA-8770
> URL: https://issues.apache.org/jira/browse/KAFKA-8770
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Priority: Major
>  Labels: needs-kip
>
> Currently, Streams offers two emission models:
> * emit-on-window-close: (using Suppression)
> * emit-on-update: (i.e., emit a new result whenever a new record is 
> processed, regardless of whether the result has changed)
> There is also an option to drop some intermediate results, either using 
> caching or suppression.
> However, there is no support for emit-on-change, in which results would be 
> forwarded only if the result has changed. This has been reported to be 
> extremely valuable as a performance optimization for some high-traffic 
> applications, and it reduces the computational burden both internally, for 
> downstream Streams operations, and for external systems that consume the 
> results and currently have to deal with a lot of "no-op" changes.
> It would be pretty straightforward to implement this, by loading the prior 
> results before a stateful operation and comparing with the new result before 
> persisting or forwarding. In many cases, we load the prior result anyway, so 
> it may not be a significant performance impact either.
> One design challenge is what to do with timestamps. If we get one record at 
> time 1 that produces a result, and then another at time 2 that produces a 
> no-op, what should be the timestamp of the result, 1 or 2? Emit-on-change 
> would require us to say 1.
> Clearly, we'd need to do some serious benchmarks to evaluate any potential 
> implementation of emit-on-change.
> Another design challenge is to decide if we should just automatically provide 
> emit-on-change for stateful operators, or if it should be configurable. 
> Configuration increases complexity, so unless the performance impact is high, 
> we may just want to change the emission model without a configuration.
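
A minimal sketch of the compare-before-forward idea described above, with a 
plain map and callback standing in for a state store and downstream forwarding 
(this is not Streams' actual API):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;

// Hypothetical emit-on-change sketch: always persist the new result, but only
// forward it downstream when it differs from the prior result.
public class EmitOnChangeSketch<K, V> {
    private final Map<K, V> store = new HashMap<>();

    public void process(final K key, final V input,
                        final BiFunction<V, V, V> aggregator,
                        final BiConsumer<K, V> forward) {
        final V oldValue = store.get(key);                     // load the prior result
        final V newValue = aggregator.apply(oldValue, input);  // compute the new result
        store.put(key, newValue);                              // persist unconditionally
        if (!Objects.equals(newValue, oldValue)) {             // compare before forwarding
            forward.accept(key, newValue);                     // emit only on change
        }
    }
}
{code}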



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9585) Flaky Test: LagFetchIntegrationTest#shouldFetchLagsDuringRebalancingWithOptimization

2020-02-20 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-9585:
--

 Summary: Flaky Test: 
LagFetchIntegrationTest#shouldFetchLagsDuringRebalancingWithOptimization
 Key: KAFKA-9585
 URL: https://issues.apache.org/jira/browse/KAFKA-9585
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.6.0
Reporter: Sophie Blee-Goldman


Failed for me locally with 
{noformat}
java.lang.AssertionError: Condition not met within timeout 12. Should 
obtain non-empty lag information eventually
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9584) Removing headers causes ConcurrentModificationException

2020-02-20 Thread Micah Ramos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Ramos updated KAFKA-9584:
---
Description: 
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds a single header could append an infinite number of 
headers (one per punctuate iteration), causing memory problems in the current 
topology as well as for downstream consumers, since the headers are written 
with the record when it is produced to a topic.  

 

I would expect that each invocation of punctuate would be initialized with a 
new header object.

  was:
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds a single header could append an infinite number of 
headers (one per punctuate iteration), causing memory problems in the current 
topology as well as for downstream consumers, since the headers are written 
with the record when it is produced to a topic.  


> Removing headers causes ConcurrentModificationException
> ---
>
> Key: KAFKA-9584
> URL: https://issues.apache.org/jira/browse/KAFKA-9584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Micah Ramos
>Priority: Minor
>
> The consumer record that is used during punctuate is static, which can cause 
> java.util.ConcurrentModificationException when modifying the headers. 
> Using a single instance of ConsumerRecord for all punctuates causes other 
> strange behavior:
>  # Headers are shared across partitions.
>  # A topology that adds a single header could append an infinite number of 
> headers (one per punctuate iteration), causing memory problems in the current 
> topology as well as for downstream consumers, since the headers are written 
> with the record when it is produced to a topic.  
>  
> I would expect that each invocation of punctuate would be initialized with a 
> new header object.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is invali

2020-02-20 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-6266.

Resolution: Fixed

Merged into 2.4 and trunk.

> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of 
> __consumer_offsets-xx to log start offset 203569 since the checkpointed 
> offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> --
>
> Key: KAFKA-6266
> URL: https://issues.apache.org/jira/browse/KAFKA-6266
> Project: Kafka
>  Issue Type: Bug
>  Components: offset manager
>Affects Versions: 1.0.0, 1.0.1
> Environment: CentOS 7, Apache kafka_2.12-1.0.0
>Reporter: VinayKumar
>Assignee: David Mao
>Priority: Major
> Fix For: 2.5.0, 2.4.1
>
>
> I upgraded Kafka from version 0.10.2.1 to 1.0.0. Since then, I see the below 
> warnings in the log.
>  I'm seeing these continuously in the log, and want these to be fixed so 
> that they won't repeat. Can someone please help me fix the below 
> warnings?
> {code}
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start 
> offset 3346 since the checkpointed offset 3332 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-23 to log start 
> offset 4 since the checkpointed offset 1 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-19 to log start 
> offset 203569 since the checkpointed offset 120955 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-35 to log start 
> offset 16957 since the checkpointed offset 7 is invalid. 
> (kafka.log.LogCleanerManager$)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is inval

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041456#comment-17041456
 ] 

ASF GitHub Bot commented on KAFKA-6266:
---

junrao commented on pull request #8136: KAFKA-6266: Repeated occurrence of WARN 
Resetting first dirty offset …
URL: https://github.com/apache/kafka/pull/8136
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of 
> __consumer_offsets-xx to log start offset 203569 since the checkpointed 
> offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> --
>
> Key: KAFKA-6266
> URL: https://issues.apache.org/jira/browse/KAFKA-6266
> Project: Kafka
>  Issue Type: Bug
>  Components: offset manager
>Affects Versions: 1.0.0, 1.0.1
> Environment: CentOS 7, Apache kafka_2.12-1.0.0
>Reporter: VinayKumar
>Assignee: David Mao
>Priority: Major
> Fix For: 2.5.0, 2.4.1
>
>
> I upgraded Kafka from version 0.10.2.1 to 1.0.0. Since then, I see the below 
> warnings in the log.
>  I'm seeing these continuously in the log, and want these to be fixed so 
> that they won't repeat. Can someone please help me fix the below 
> warnings?
> {code}
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start 
> offset 3346 since the checkpointed offset 3332 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-23 to log start 
> offset 4 since the checkpointed offset 1 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-19 to log start 
> offset 203569 since the checkpointed offset 120955 is invalid. 
> (kafka.log.LogCleanerManager$)
>  WARN Resetting first dirty offset of __consumer_offsets-35 to log start 
> offset 16957 since the checkpointed offset 7 is invalid. 
> (kafka.log.LogCleanerManager$)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9584) Removing headers causes ConcurrentModificationException

2020-02-20 Thread Micah Ramos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Ramos updated KAFKA-9584:
---
Description: 
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds a single header could append an infinite number of 
headers (one per punctuate iteration), causing memory problems in the current 
topology as well as for downstream consumers, since the headers are written 
with the record when it is produced to a topic.  

  was:
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds a single header could append an infinite number of 
headers (one per punctuate iteration), causing memory problems in the current 
topology as well as for downstream consumers, since the headers are written 
when the record is produced to a topic.  


> Removing headers causes ConcurrentModificationException
> ---
>
> Key: KAFKA-9584
> URL: https://issues.apache.org/jira/browse/KAFKA-9584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Micah Ramos
>Priority: Minor
>
> The consumer record that is used during punctuate is static, which can cause 
> java.util.ConcurrentModificationException when modifying the headers. 
> Using a single instance of ConsumerRecord for all punctuates causes other 
> strange behavior:
>  # Headers are shared across partitions.
>  # A topology that adds a single header could append an infinite number of 
> headers (one per punctuate iteration), causing memory problems in the current 
> topology as well as for downstream consumers, since the headers are written 
> with the record when it is produced to a topic.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9584) Removing headers causes ConcurrentModificationException

2020-02-20 Thread Micah Ramos (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Ramos updated KAFKA-9584:
---
Description: 
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds a single header could append an infinite number of 
headers (one per punctuate iteration), causing memory problems in the current 
topology as well as for downstream consumers, since the headers are written 
when the record is produced to a topic.  

  was:
The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds headers could append an infinite number of headers, 
causing memory problems in the current topology as well as for downstream 
consumers, since the headers are written when the record is produced to a 
topic.  


> Removing headers causes ConcurrentModificationException
> ---
>
> Key: KAFKA-9584
> URL: https://issues.apache.org/jira/browse/KAFKA-9584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.0.0
>Reporter: Micah Ramos
>Priority: Minor
>
> The consumer record that is used during punctuate is static, which can cause 
> java.util.ConcurrentModificationException when modifying the headers. 
> Using a single instance of ConsumerRecord for all punctuates causes other 
> strange behavior:
>  # Headers are shared across partitions.
>  # A topology that adds a single header could append an infinite number of 
> headers (one per punctuate iteration), causing memory problems in the current 
> topology as well as for downstream consumers, since the headers are written 
> when the record is produced to a topic.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9584) Removing headers causes ConcurrentModificationException

2020-02-20 Thread Micah Ramos (Jira)
Micah Ramos created KAFKA-9584:
--

 Summary: Removing headers causes ConcurrentModificationException
 Key: KAFKA-9584
 URL: https://issues.apache.org/jira/browse/KAFKA-9584
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.0.0
Reporter: Micah Ramos


The consumer record that is used during punctuate is static, which can cause 
java.util.ConcurrentModificationException when modifying the headers. 

Using a single instance of ConsumerRecord for all punctuates causes other 
strange behavior:
 # Headers are shared across partitions.
 # A topology that adds headers could append an infinite number of headers, 
causing memory problems in the current topology as well as for downstream 
consumers, since the headers are written when the record is produced to a 
topic.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9581) Deprecate rebalanceException on StreamThread to avoid infinite loop

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041410#comment-17041410
 ] 

ASF GitHub Bot commented on KAFKA-9581:
---

abbccdda commented on pull request #8145: KAFKA-9581: Remove rebalance 
exception withholding
URL: https://github.com/apache/kafka/pull/8145
 
 
   The rebalance exception withholding is no longer necessary, as we have a 
better mechanism for catching and wrapping these exceptions. Throwing them 
directly should be fine and simplifies our current error handling.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Deprecate rebalanceException on StreamThread to avoid infinite loop
> ---
>
> Key: KAFKA-9581
> URL: https://issues.apache.org/jira/browse/KAFKA-9581
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9572) Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some Records

2020-02-20 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041389#comment-17041389
 ] 

Guozhang Wang commented on KAFKA-9572:
--

PR #8058 has been merged to trunk. I'm not sure whether this is an issue in 
2.5.0, though, since the refactoring was only on 2.6.0. [~cadonna] did you get 
this failure in 2.5 or on trunk?

> Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some 
> Records
> ---
>
> Key: KAFKA-9572
> URL: https://issues.apache.org/jira/browse/KAFKA-9572
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Bruno Cadonna
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: 7-changelog-1.txt, data-1.txt, streams22.log, 
> streams23.log, streams30.log, sum-1.txt
>
>
> System test {{StreamsEosTest.test_failure_and_recovery}} failed due to a 
> wrongly computed aggregation under exactly-once (EOS). The specific error is:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Result verification 
> failed for ConsumerRecord(topic = sum, partition = 1, leaderEpoch = 0, offset 
> = 2805, CreateTime = 1580719595164, serialized key size = 4, serialized value 
> size = 8, headers = RecordHeaders(headers = [], isReadOnly = false), key = 
> [B@6c779568, value = [B@f381794) expected <6069,17269> but was <6069,10698>
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verifySum(EosTestDriver.java:444)
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verify(EosTestDriver.java:196)
>   at 
> org.apache.kafka.streams.tests.StreamsEosTest.main(StreamsEosTest.java:69)
> {code} 
> That means the sum computed by the Streams app seems to be wrong for key 
> 6069. I checked the dumps of the log segments of the input topic partition 
> (attached: data-1.txt) and indeed two input records are not considered in the 
> sum. With those two missed records the sum would be correct. More concretely, 
> the input values for key 6069 are:
> # 147
> # 9250
> # 5340 
> # 1231
> # 1301
> The sum of these values is 17269, which is the expected sum stated in the 
> exception above. If you subtract values 3 and 4, i.e., 5340 and 1231, from 
> 17269, you get 10698, which is the actual sum in the exception above. Somehow 
> those two values are missing.
> In the log dump of the output topic partition (attached: sum-1.txt), the sum 
> is correct up to the 4th value, 1231 (i.e., 15968), but is then overwritten 
> with 10698.
> In the log dump of the changelog topic of the state store that stores the sum 
> (attached: 7-changelog-1.txt), the sum is also overwritten as in the output 
> topic.
> I attached the logs of the three Streams instances involved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9481) Improve TaskMigratedException handling on Stream thread

2020-02-20 Thread Guozhang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-9481.
--
Fix Version/s: 2.6.0
   Resolution: Fixed

> Improve TaskMigratedException handling on Stream thread
> ---
>
> Key: KAFKA-9481
> URL: https://issues.apache.org/jira/browse/KAFKA-9481
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
> Fix For: 2.6.0
>
>
> Today we handle TaskMigratedException one task at a time, when 1) the 
> producer got fenced, 2) the consumer got fenced, or 3) records are added to 
> closed tasks.
> When 1) or 2) happens, all tasks hosted by that thread should have migrated; 
> and 3) only happens when we are closing a task but not clearing its 
> corresponding record buffer.
> So better exception handling would be to first fix 3) to also clear the 
> record buffer when closing a task (clean or dirty), and then for 1) and 2) we 
> can always treat it as all-tasks-are-migrated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9481) Improve TaskMigratedException handling on Stream thread

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041387#comment-17041387
 ] 

ASF GitHub Bot commented on KAFKA-9481:
---

guozhangwang commented on pull request #8058: KAFKA-9481: Graceful handling 
TaskMigrated and TaskCorrupted
URL: https://github.com/apache/kafka/pull/8058
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve TaskMigratedException handling on Stream thread
> ---
>
> Key: KAFKA-9481
> URL: https://issues.apache.org/jira/browse/KAFKA-9481
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Major
>
> Today we handle TaskMigratedException one task at a time, when 1) the 
> producer got fenced, 2) the consumer got fenced, or 3) records are added to 
> closed tasks.
> When 1) or 2) happens, all tasks hosted by that thread should have migrated; 
> and 3) only happens when we are closing a task but not clearing its 
> corresponding record buffer.
> So better exception handling would be to first fix 3) to also clear the 
> record buffer when closing a task (clean or dirty), and then for 1) and 2) we 
> can always treat it as all-tasks-are-migrated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9583) OffsetsForLeaderEpoch requests are sometimes not sent to the leader of partition

2020-02-20 Thread Andy Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Fang updated KAFKA-9583:
-
Description: 
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of 
topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPositions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.

I have submitted a PR, 

  was:
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of 
topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPositions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.
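
A simplified, hypothetical illustration of the bug pattern described above, 
with plain strings standing in for nodes and partitions (this is not the actual 
Fetcher code):

{code:java}
import java.util.List;
import java.util.Map;

public class GroupingBugSketch {
    // Partitions are grouped by leader node, but each node's request is then
    // built from the full partition list instead of that node's subset.
    static void sendRequests(final Map<String, List<String>> partitionsByNode,
                             final List<String> partitionsToValidate) {
        for (final Map.Entry<String, List<String>> node : partitionsByNode.entrySet()) {
            send(node.getKey(), partitionsToValidate);    // bug: all partitions go to every node
            // fix: send(node.getKey(), node.getValue()); // only the partitions this node leads
        }
    }

    static void send(final String node, final List<String> partitions) {
        System.out.println(node + " <- " + partitions);
    }
}
{code}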


> OffsetsForLeaderEpoch requests are sometimes not sent to the leader of 
> partition
> 
>
> Key: KAFKA-9583
> URL: https://issues.apache.org/jira/browse/KAFKA-9583
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.4.0, 2.3.1
>Reporter: Andy Fang
>Priority: Minor
>  Labels: newbie, patch, pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In 
> [{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
>  we group the requests by leader node for efficiency. The list of 
> topic-partitions is grouped from {{partitionsToValidate}} (all partitions) 
> to {{node}} => [{{fetchPositions}} (partitions by 
> node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].
> However, when actually sending the request with 
> {{OffsetsForLeaderEpochClient}}, we use 
> [{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
>  which is the list of all topic-partitions passed into 
> {{validateOffsetsAsync}}. This results in extra partitions being included in 
> the request sent to brokers that are potentially not the leader for those 
> partitions.
> I have submitted a PR, 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9583) OffsetsForLeaderEpoch requests are sometimes not sent to the leader of partition

2020-02-20 Thread Andy Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Fang updated KAFKA-9583:
-
Description: 
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of 
topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPositions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.

I have submitted a PR, 
[https://github.com/apache/kafka/pull/8077|https://github.com/apache/kafka/pull/8077],
 that fixes this issue.

  was:
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of 
topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPositions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.

I have submitted a PR, [https://github.com/apache/kafka/pull/8077], that fixes 
this issue.


> OffsetsForLeaderEpoch requests are sometimes not sent to the leader of 
> partition
> 
>
> Key: KAFKA-9583
> URL: https://issues.apache.org/jira/browse/KAFKA-9583
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.4.0, 2.3.1
>Reporter: Andy Fang
>Priority: Minor
>  Labels: newbie, patch, pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In 
> [{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
>  we group the requests by leader node for efficiency. The list of 
> topic-partitions is grouped from {{partitionsToValidate}} (all partitions) 
> to {{node}} => [{{fetchPositions}} (partitions by 
> node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].
> However, when actually sending the request with 
> {{OffsetsForLeaderEpochClient}}, we use 
> [{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
>  which is the list of all topic-partitions passed into 
> {{validateOffsetsAsync}}. This results in extra partitions being included in 
> the request sent to brokers that are potentially not the leader for those 
> partitions.
> I have submitted a PR, 
> [https://github.com/apache/kafka/pull/8077|https://github.com/apache/kafka/pull/8077],
>  that fixes this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9583) OffsetsForLeaderEpoch requests are sometimes not sent to the leader of partition

2020-02-20 Thread Andy Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Fang updated KAFKA-9583:
-
Description: 
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of 
topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPositions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.

I have submitted a PR, [https://github.com/apache/kafka/pull/8077], that fixes this issue.

  was:
In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPostitions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.

I have submitted a PR, 


> OffsetsForLeaderEpoch requests are sometimes not sent to the leader of 
> partition
> 
>
> Key: KAFKA-9583
> URL: https://issues.apache.org/jira/browse/KAFKA-9583
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.4.0, 2.3.1
>Reporter: Andy Fang
>Priority: Minor
>  Labels: newbie, patch, pull-request-available
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In 
> [{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
>  we group the requests by leader node for efficiency. The list of 
> topic-partitions are grouped from {{partitionsToValidate}} (all partitions) 
> to {{node}} => [{{fetchPostitions}} (partitions by 
> node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].
> However, when actually sending the request with 
> {{OffsetsForLeaderEpochClient}}, we use 
> [{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
>  which is the list of all topic-partitions passed into 
> {{validateOffsetsAsync}}. This results in extra partitions being included in 
> the request sent to brokers that are potentially not the leader for those 
> partitions.
> I have submitted a PR, [https://github.com/apache/kafka/pull/8077,] that 
> fixes this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9583) OffsetsForLeaderEpoch requests are sometimes not sent to the leader of partition

2020-02-20 Thread Andy Fang (Jira)
Andy Fang created KAFKA-9583:


 Summary: OffsetsForLeaderEpoch requests are sometimes not sent to 
the leader of partition
 Key: KAFKA-9583
 URL: https://issues.apache.org/jira/browse/KAFKA-9583
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.3.1, 2.4.0
Reporter: Andy Fang


In 
[{{validateOffsetsAsync}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L737],
 we group the requests by leader node for efficiency. The list of topic-partitions is grouped from {{partitionsToValidate}} (all partitions) to 
{{node}} => [{{fetchPostitions}} (partitions by 
node)|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L739].

However, when actually sending the request with 
{{OffsetsForLeaderEpochClient}}, we use 
[{{partitionsToValidate}}|https://github.com/apache/kafka/blob/2.3.1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L765],
 which is the list of all topic-partitions passed into 
{{validateOffsetsAsync}}. This results in extra partitions being included in 
the request sent to brokers that are potentially not the leader for those 
partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041344#comment-17041344
 ] 

ASF GitHub Bot commented on KAFKA-9580:
---

d8tltanc commented on pull request #8144: KAFKA-9580: Log clearer error 
messages when there is an offset out of range
URL: https://github.com/apache/kafka/pull/8144
 
 
   Log the partition start offset and last stable offset when the subscription has the default offset reset policy and the fetch offset is out of range.
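
A rough sketch of the kind of message this would produce (SLF4J style; the variable names below are placeholders, not the actual Fetcher fields):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustration only: variable names are placeholders, not actual Fetcher fields.
class OffsetOutOfRangeLogging {
    private static final Logger log = LoggerFactory.getLogger(OffsetOutOfRangeLogging.class);

    void logOutOfRange(String topicPartition, long fetchOffset,
                       long logStartOffset, long lastStableOffset, String resetPolicy) {
        log.info("Fetch offset {} is out of range for partition {} "
                + "(log start offset: {}, last stable offset: {}); "
                + "resetting offset using the '{}' reset policy.",
                fetchOffset, topicPartition, logStartOffset, lastStableOffset, resetPolicy);
    }
}
{code}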
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in 
> [Fetcher::initializedCompletedFetch|[https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259]],
>  it's not informative on what offsets are exactly in range, and it makes it 
> harder to debug when Consumer issues come up in this situation. It would be 
> much more helpful if we logged a message similar to 
> [Log::read|[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]]
>  to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9582) Ensure EOS task producer not through fatal error in unclean shutdown

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041338#comment-17041338
 ] 

ASF GitHub Bot commented on KAFKA-9582:
---

abbccdda commented on pull request #8143: KAFKA-9582: Do not abort transaction 
in unclean close
URL: https://github.com/apache/kafka/pull/8143
 
 
   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure EOS task producer not through fatal error in unclean shutdown
> 
>
> Key: KAFKA-9582
> URL: https://issues.apache.org/jira/browse/KAFKA-9582
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.5.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Cheng Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Tan reassigned KAFKA-9580:


Assignee: Cheng Tan

> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in 
> [Fetcher::initializedCompletedFetch|[https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259]],
>  it's not informative on what offsets are exactly in range, and it makes it 
> harder to debug when Consumer issues come up in this situation. It would be 
> much more helpful if we logged a message similar to 
> [Log::read|[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]]
>  to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9582) Ensure EOS task producer not through fatal error in unclean shutdown

2020-02-20 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9582:
---
Fix Version/s: 2.5.0

> Ensure EOS task producer not through fatal error in unclean shutdown
> 
>
> Key: KAFKA-9582
> URL: https://issues.apache.org/jira/browse/KAFKA-9582
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.5.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9572) Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some Records

2020-02-20 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041328#comment-17041328
 ] 

Guozhang Wang commented on KAFKA-9572:
--

It seems that when we injected the error, the local state was wiped since we were closing dirty, and then the tasks got migrated again while restoration had not yet completed. In this case we should just update the checkpoint file without committing at all. However, in trunk right now this is not done correctly; I will try to piggy-back this fix along with my ongoing PR for handling exceptions: https://github.com/apache/kafka/pull/8058
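
Roughly, the intended behaviour reads like the following sketch (purely illustrative; these are not the real Streams task or state-manager classes or methods):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Purely illustrative pseudocode of the behaviour described above; NOT the actual
// Kafka Streams StreamTask / state manager API.
class DirtyCloseSketch {
    private final Map<String, Long> restoredOffsets = new HashMap<>();

    void closeDirty(boolean restorationCompleted) {
        if (!restorationCompleted) {
            // Restoration was still in flight: do not commit anything, only record how
            // far restoration got so the next owner of the task can resume from there.
            writeCheckpoint(restoredOffsets);
        } else {
            // Normal path: commit, then write the checkpoint.
            commitAndCheckpoint();
        }
    }

    void writeCheckpoint(Map<String, Long> offsets) { /* persist offsets to the checkpoint file */ }

    void commitAndCheckpoint() { /* commit and then checkpoint */ }
}
{code}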

> Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some 
> Records
> ---
>
> Key: KAFKA-9572
> URL: https://issues.apache.org/jira/browse/KAFKA-9572
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Bruno Cadonna
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: 7-changelog-1.txt, data-1.txt, streams22.log, 
> streams23.log, streams30.log, sum-1.txt
>
>
> System test {{StreamsEosTest.test_failure_and_recovery}} failed due to a 
> wrongly computed aggregation under exactly-once (EOS). The specific error is:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Result verification 
> failed for ConsumerRecord(topic = sum, partition = 1, leaderEpoch = 0, offset 
> = 2805, CreateTime = 1580719595164, serialized key size = 4, serialized value 
> size = 8, headers = RecordHeaders(headers = [], isReadOnly = false), key = 
> [B@6c779568, value = [B@f381794) expected <6069,17269> but was <6069,10698>
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verifySum(EosTestDriver.java:444)
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verify(EosTestDriver.java:196)
>   at 
> org.apache.kafka.streams.tests.StreamsEosTest.main(StreamsEosTest.java:69)
> {code} 
> That means, the sum computed by the Streams app seems to be wrong for key 
> 6069. I checked the dumps of the log segments of the input topic partition 
> (attached: data-1.txt) and indeed two input records are not considered in the 
> sum. With those two missed records the sum would be correct. More concretely, 
> the input values for key 6069 are:
> # 147
> # 9250
> # 5340 
> # 1231
> # 1301
> The sum of this values is 17269 as stated in the exception above as expected 
> sum. If you subtract values 3 and 4, i.e., 5340 and 1231 from 17269, you get 
> 10698 , which is the actual sum in the exception above. Somehow those two 
> values are missing.
> In the log dump of the output topic partition (attached: sum-1.txt), the sum 
> is correct until the 4th value 1231 , i.e. 15968, then it is overwritten with 
> 10698.
> In the log dump of the changelog topic of the state store that stores the sum 
> (attached: 7-changelog-1.txt), the sum is also overwritten as in the output 
> topic.
> I attached the logs of the three Streams instances involved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9582) Ensure EOS task producer not through fatal error in unclean shutdown

2020-02-20 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-9582:
--

Assignee: Boyang Chen

> Ensure EOS task producer not through fatal error in unclean shutdown
> 
>
> Key: KAFKA-9582
> URL: https://issues.apache.org/jira/browse/KAFKA-9582
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9582) Ensure EOS task producer not through fatal error in unclean shutdown

2020-02-20 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-9582:
--

 Summary: Ensure EOS task producer not through fatal error in 
unclean shutdown
 Key: KAFKA-9582
 URL: https://issues.apache.org/jira/browse/KAFKA-9582
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9581) Deprecate rebalanceException on StreamThread to avoid infinite loop

2020-02-20 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-9581:
--

 Summary: Deprecate rebalanceException on StreamThread to avoid 
infinite loop
 Key: KAFKA-9581
 URL: https://issues.apache.org/jira/browse/KAFKA-9581
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9581) Deprecate rebalanceException on StreamThread to avoid infinite loop

2020-02-20 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-9581:
--

Assignee: Boyang Chen

> Deprecate rebalanceException on StreamThread to avoid infinite loop
> ---
>
> Key: KAFKA-9581
> URL: https://issues.apache.org/jira/browse/KAFKA-9581
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9562) Streams not making progress under heavy failures with EOS enabled on 2.5 branch

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041304#comment-17041304
 ] 

ASF GitHub Bot commented on KAFKA-9562:
---

guozhangwang commented on pull request #8116: KAFKA-9562: part 1: ignore 
exceptions while flushing stores in close(dirty)
URL: https://github.com/apache/kafka/pull/8116
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Streams not making progress under heavy failures with EOS enabled on 2.5 
> branch
> ---
>
> Key: KAFKA-9562
> URL: https://issues.apache.org/jira/browse/KAFKA-9562
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: John Roesler
>Assignee: Boyang Chen
>Priority: Blocker
> Fix For: 2.5.0
>
>
> During soak testing in preparation for the 2.5.0 release, we have discovered 
> a case in which Streams appears to stop making progress. Specifically, this 
> is a failure-resilience test in which we inject network faults separating the 
> instances from the brokers roughly every twenty minutes.
> On 2.4, Streams would obviously spend a lot of time rebalancing under this 
> scenario, but would still make progress. However, on the current 2.5 branch, 
> Streams effectively stops making progress except rarely.
> This appears to be a severe regression, so I'm filing this ticket as a 2.5.0 
> release blocker.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9580:


 Summary: Log clearer error messages when there is an offset out of 
range
 Key: KAFKA-9580
 URL: https://issues.apache.org/jira/browse/KAFKA-9580
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Sanjana Kaundinya


Currently in [Fetcher::initializedCompletedFetch|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259], the log message is not informative about which offsets are actually in range, and that makes it harder to debug when consumer issues come up in this situation. It would be much more helpful if we logged a message similar to [Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468] to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9552) Stream should handle OutOfSequence exception thrown from Producer

2020-02-20 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041255#comment-17041255
 ] 

Guozhang Wang commented on KAFKA-9552:
--

Normally `OutOfSequence` should not cause the producer to fall into the `fatal-error` state, but only into the `abortable-error` state, i.e. the producer's caller could still continue to use that producer by just aborting the txn. However, from the Streams point of view, if a txn is aborted it means that some records may not be processed, so EOS is still violated. So I think we should still treat it as fatal and kill the threads. But maybe we do not need to kill all threads: after the tasks are migrated to other threads and restarted from the last committed position, it is likely that the `OutOfSequence` error would go away.

> Stream should handle OutOfSequence exception thrown from Producer
> -
>
> Key: KAFKA-9552
> URL: https://issues.apache.org/jira/browse/KAFKA-9552
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Priority: Major
>
> As of today the stream thread could die from OutOfSequence error:
> {code:java}
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12 
> 15:14:35,185] ERROR 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] 
> stream-thread 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed 
> to commit stream task 3_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending 
> since an error caught with a previous record (timestamp 1581484094825) to 
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-49-changelog due 
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
>  Although this is fatal exception for Producer, stream should treat it as an 
> opportunity to reinitialize by doing a rebalance, instead of killing 
> computation resource.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8726) Producer can't abort a transaction aftersome send errors

2020-02-20 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041251#comment-17041251
 ] 

Guozhang Wang commented on KAFKA-8726:
--

OUT_OF_ORDER_SEQUENCE_NUMBER is a fatal error, and it seems in [~mbarbon]'s case it was caused by `MESSAGE_TOO_LARGE`. For a transactional producer, a record rejected with `MESSAGE_TOO_LARGE` cannot be retried, and hence the transaction it sits in has to be aborted.

But I think this error state is not a fatal one: note that inside the producer we have both a fatal error state and an abortable error state, and normally `OUT_OF_ORDER_SEQUENCE_NUMBER` would only cause us to transition to the abortable error state. In that case the producer does not need to be closed; we can still call `abortTransaction()` and then start a new transaction with that producer, instead of closing it and creating a new one.
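
As a concrete illustration of the two paths with the public producer API (a minimal sketch; the broker address, topic, and transactional.id below are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.ProducerFencedException;

// Minimal sketch: abortable errors are handled by aborting the transaction and
// reusing the same producer; only truly fatal errors require closing it.
public class AbortableErrorSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");       // placeholder
        props.put("transactional.id", "my-transactional-id");   // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
                producer.commitTransaction();
            } catch (ProducerFencedException fatal) {
                producer.close();            // fatal error state: the producer cannot be reused
            } catch (KafkaException abortable) {
                producer.abortTransaction();  // abortable error state: abort and retry with the same producer
            }
        }
    }
}
{code}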

> Producer can't abort a transaction aftersome send errors
> 
>
> Key: KAFKA-8726
> URL: https://issues.apache.org/jira/browse/KAFKA-8726
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 2.3.0
>Reporter: Mattia Barbon
>Priority: Major
>
> I am following the producer with transactions example in 
> [https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html,]
>  and on kafkaException, I use abortTransaction and retry.
>  
> In some cases, abortTransaction fails, with:
> ```
> org.apache.kafka.common.KafkaException: Cannot execute transactional method 
> because we are in an error state
> ```
> as far as I can tell, this is caused by
> ```
> org.apache.kafka.common.KafkaException: The client hasn't received 
> acknowledgment for some previously sent messages and can no longer retry 
> them. It isn't safe to continue.
> ```
>  
> Since both are KafkaException, the example seems to imply they are retriable, 
> but they seem not to be. Ideally, I would expect abortTransaction to succeed 
> in this case (the broker will abort the transaction anyway because it can't 
> be committed), but at the very least, I would expect to have a way to 
> determine that the producer is unusable and it can't recover.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9577) Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041249#comment-17041249
 ] 

ASF GitHub Bot commented on KAFKA-9577:
---

lbradstreet commented on pull request #8142: KAFKA-9577: 
SaslClientAuthenticator incorrectly negotiates supported SaslHandshakeRequest 
version
URL: https://github.com/apache/kafka/pull/8142
 
 
   The SaslClientAuthenticator incorrectly negotiates the supported SaslHandshakeRequest version and uses the maximum version supported by the broker whether or not the client supports it. This PR rolls back the recent SaslHandshake[Request,Response] bump, fixes the version negotiation, and adds a test to prevent anyone from accidentally bumping the version without a workaround (e.g. a new ApiKey).
   
   Tests:
   - Prevent SASL_HANDSHAKE schema version bump
   - Add test to return ApiVersions unsupported by client
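
   The gist of the negotiation being fixed, as a self-contained sketch (not the actual SaslClientAuthenticator code): the client should pick the highest handshake version that both it and the broker support, instead of blindly using the broker's maximum.

{code:java}
// Illustration only; not the actual SaslClientAuthenticator implementation.
final class HandshakeVersionSketch {

    static short negotiate(short clientMinVersion, short clientMaxVersion,
                           short brokerMinVersion, short brokerMaxVersion) {
        short version = (short) Math.min(clientMaxVersion, brokerMaxVersion);
        if (version < clientMinVersion || version < brokerMinVersion) {
            throw new IllegalStateException("No commonly supported SASL_HANDSHAKE version");
        }
        return version;
    }

    public static void main(String[] args) {
        // Broker advertises up to v2, but this client only supports up to v1 -> negotiate v1.
        System.out.println(negotiate((short) 0, (short) 1, (short) 0, (short) 2));
    }
}
{code}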
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk
> -
>
> Key: KAFKA-9577
> URL: https://issues.apache.org/jira/browse/KAFKA-9577
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.5.0
>Reporter: Lucas Bradstreet
>Assignee: Lucas Bradstreet
>Priority: Blocker
> Fix For: 2.5.0
>
>
> I am trying 2.5.0 with sasl turned on and my consumer and producer clients 
> receive:
> {noformat}
> org.apache.kafka.common.errors.UnsupportedVersionException: The 
> SASL_HANDSHAKE protocol does not support version 2
> {noformat}
> I believe this is due to 
> [https://github.com/apache/kafka/commit/0a2569e2b9907a1217dd50ccbc320f8ad0b42fd0]
>  which added flexible version support and bumped the protocol version.
> It appears that the SaslClientAuthenticator uses the max version for 
> SASL_HANDSHAKE received in the broker's AP_VERSIONS response, and then uses 
> that version even though it may not support it. See 
> [https://github.com/apache/kafka/blob/eb09efa9ac79efa484307bdcf03ac8eb8a3a94e2/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslClientAuthenticator.java#L290].
>  
> This may make it hard to ever evolve this schema. In the short term I suggest 
> we roll back the version bump and flexible schema until we figure out a path 
> forward.
> It appears that this may not have been a problem in the past because the 
> schema versions were the same and maybe we didn't validate the version number 
> [https://github.com/apache/kafka/commit/0cf7708007b01faac5012d939f3c50db274f858d#diff-7f65552a2e23aa7028500f8db06cbb30R47]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9572) Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some Records

2020-02-20 Thread Guozhang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041245#comment-17041245
 ] 

Guozhang Wang commented on KAFKA-9572:
--

Yup I can take a look at it today.

> Sum Computation with Exactly-Once Enabled and Injected Failures Misses Some 
> Records
> ---
>
> Key: KAFKA-9572
> URL: https://issues.apache.org/jira/browse/KAFKA-9572
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Bruno Cadonna
>Assignee: Guozhang Wang
>Priority: Blocker
> Fix For: 2.5.0
>
> Attachments: 7-changelog-1.txt, data-1.txt, streams22.log, 
> streams23.log, streams30.log, sum-1.txt
>
>
> System test {{StreamsEosTest.test_failure_and_recovery}} failed due to a 
> wrongly computed aggregation under exactly-once (EOS). The specific error is:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Result verification 
> failed for ConsumerRecord(topic = sum, partition = 1, leaderEpoch = 0, offset 
> = 2805, CreateTime = 1580719595164, serialized key size = 4, serialized value 
> size = 8, headers = RecordHeaders(headers = [], isReadOnly = false), key = 
> [B@6c779568, value = [B@f381794) expected <6069,17269> but was <6069,10698>
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verifySum(EosTestDriver.java:444)
>   at 
> org.apache.kafka.streams.tests.EosTestDriver.verify(EosTestDriver.java:196)
>   at 
> org.apache.kafka.streams.tests.StreamsEosTest.main(StreamsEosTest.java:69)
> {code} 
> That means, the sum computed by the Streams app seems to be wrong for key 
> 6069. I checked the dumps of the log segments of the input topic partition 
> (attached: data-1.txt) and indeed two input records are not considered in the 
> sum. With those two missed records the sum would be correct. More concretely, 
> the input values for key 6069 are:
> # 147
> # 9250
> # 5340 
> # 1231
> # 1301
> The sum of this values is 17269 as stated in the exception above as expected 
> sum. If you subtract values 3 and 4, i.e., 5340 and 1231 from 17269, you get 
> 10698 , which is the actual sum in the exception above. Somehow those two 
> values are missing.
> In the log dump of the output topic partition (attached: sum-1.txt), the sum 
> is correct until the 4th value 1231 , i.e. 15968, then it is overwritten with 
> 10698.
> In the log dump of the changelog topic of the state store that stores the sum 
> (attached: 7-changelog-1.txt), the sum is also overwritten as in the output 
> topic.
> I attached the logs of the three Streams instances involved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-8910) Incorrect javadoc at KafkaProducer.InterceptorCallback#onCompletion

2020-02-20 Thread Guozhang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-8910.
--
Fix Version/s: 2.6.0
   Resolution: Fixed

> Incorrect javadoc at KafkaProducer.InterceptorCallback#onCompletion
> ---
>
> Key: KAFKA-8910
> URL: https://issues.apache.org/jira/browse/KAFKA-8910
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sergey Ushakov
>Assignee: Dongjin Lee
>Priority: Major
> Fix For: 2.6.0
>
>
> h1. Problem
> Javadoc for org.apache.kafka.clients.producer.Callback states:
> {noformat}
> * @param metadata The metadata for the record that was sent (i.e. the 
> partition and offset). Null if an error
> *occurred. {noformat}
> In fact, metadata is never null since KAFKA-6180. See 
> org.apache.kafka.clients.producer.KafkaProducer.InterceptorCallback#onCompletion
>  for details.
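
In practice, the way to detect a failed send in the callback is to check the exception argument rather than testing metadata for null, roughly like this ({{producer}} is assumed to be an already-configured KafkaProducer<String, String>; the topic, key, and value are placeholders):

{code:java}
// producer is an already-configured KafkaProducer<String, String>; the topic is a placeholder.
producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
    if (exception != null) {
        // The send failed; do not rely on metadata fields here.
        System.err.println("Send failed: " + exception);
    } else {
        System.out.println("Sent to " + metadata.topic() + "-" + metadata.partition()
                + " at offset " + metadata.offset());
    }
});
{code}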



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8910) Incorrect javadoc at KafkaProducer.InterceptorCallback#onCompletion

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041238#comment-17041238
 ] 

ASF GitHub Bot commented on KAFKA-8910:
---

guozhangwang commented on pull request #7337: KAFKA-8910: Incorrect javadoc at 
KafkaProducer.InterceptorCallback#onCompletion
URL: https://github.com/apache/kafka/pull/7337
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Incorrect javadoc at KafkaProducer.InterceptorCallback#onCompletion
> ---
>
> Key: KAFKA-8910
> URL: https://issues.apache.org/jira/browse/KAFKA-8910
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sergey Ushakov
>Assignee: Dongjin Lee
>Priority: Major
>
> h1. Problem
> Javadoc for org.apache.kafka.clients.producer.Callback states:
> {noformat}
> * @param metadata The metadata for the record that was sent (i.e. the 
> partition and offset). Null if an error
> *occurred. {noformat}
> In fact, metadata is never null since KAFKA-6180. See 
> org.apache.kafka.clients.producer.KafkaProducer.InterceptorCallback#onCompletion
>  for details.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9579) RLM fetch implementation by adding respective purgatory

2020-02-20 Thread Satish Duggana (Jira)
Satish Duggana created KAFKA-9579:
-

 Summary: RLM fetch implementation by adding respective purgatory
 Key: KAFKA-9579
 URL: https://issues.apache.org/jira/browse/KAFKA-9579
 Project: Kafka
  Issue Type: Sub-task
Reporter: Satish Duggana
Assignee: Ying Zheng






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9578) Kafka Tiered Storage - System Tests

2020-02-20 Thread Harsha (Jira)
Harsha created KAFKA-9578:
-

 Summary: Kafka Tiered Storage - System  Tests
 Key: KAFKA-9578
 URL: https://issues.apache.org/jira/browse/KAFKA-9578
 Project: Kafka
  Issue Type: Sub-task
Reporter: Harsha
Assignee: Alexandre Dupriez


Initial test cases set up by [~Ying Zheng] 

 

[https://docs.google.com/spreadsheets/d/1gS0s1FOmcjpKYXBddejXAoJAjEZ7AdEzMU9wZc-JgY8/edit#gid=0]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9533) ValueTransform forwards `null` values

2020-02-20 Thread Bill Bejeck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-9533:
---
Affects Version/s: 0.10.0.0
   0.10.2.2
   0.11.0.3
   1.1.1
   2.0.1
   2.2.2
   2.3.1

> ValueTransform forwards `null` values
> -
>
> Key: KAFKA-9533
> URL: https://issues.apache.org/jira/browse/KAFKA-9533
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0, 0.10.2.2, 0.11.0.3, 1.1.1, 2.0.1, 2.2.2, 
> 2.4.0, 2.3.1
>Reporter: Michael Viamari
>Assignee: Michael Viamari
>Priority: Major
> Fix For: 2.2.3, 2.5.0, 2.3.2, 2.4.1, 2.6.0
>
>
> According to the documentation for `KStream#transformValues`, nulls returned 
> from `ValueTransformer#transform` are not forwarded. (see 
> [KStream#transformValues|https://kafka.apache.org/24/javadoc/org/apache/kafka/streams/kstream/KStream.html#transformValues-org.apache.kafka.streams.kstream.ValueTransformerSupplier-java.lang.String...-]
> However, this does not appear to be the case. In 
> `KStreamTransformValuesProcessor#process` the result of the transform is 
> forwarded directly.
> {code:java}
>  @Override
>  public void process(final K key, final V value) {
>  context.forward(key, valueTransformer.transform(key, value));
>  }
> {code}
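
For comparison, the documented contract corresponds to a null check before forwarding, roughly like the fragment below ({{VR}} denotes the transformed value type; this is a sketch of the documented behaviour, not necessarily the shape of the eventual fix):

{code:java}
 @Override
 public void process(final K key, final V value) {
     // Sketch: only forward non-null transform results, matching the documented contract.
     final VR result = valueTransformer.transform(key, value);
     if (result != null) {
         context.forward(key, result);
     }
 }
{code}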



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9533) ValueTransform forwards `null` values

2020-02-20 Thread Bill Bejeck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041131#comment-17041131
 ] 

Bill Bejeck commented on KAFKA-9533:


cherry-picked to 2.5, 2.4, 2.3 and 2.2

> ValueTransform forwards `null` values
> -
>
> Key: KAFKA-9533
> URL: https://issues.apache.org/jira/browse/KAFKA-9533
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.0.0, 0.10.2.2, 0.11.0.3, 1.1.1, 2.0.1, 2.2.2, 
> 2.4.0, 2.3.1
>Reporter: Michael Viamari
>Assignee: Michael Viamari
>Priority: Major
> Fix For: 2.2.3, 2.5.0, 2.3.2, 2.4.1, 2.6.0
>
>
> According to the documentation for `KStream#transformValues`, nulls returned 
> from `ValueTransformer#transform` are not forwarded. (see 
> [KStream#transformValues|https://kafka.apache.org/24/javadoc/org/apache/kafka/streams/kstream/KStream.html#transformValues-org.apache.kafka.streams.kstream.ValueTransformerSupplier-java.lang.String...-]
> However, this does not appear to be the case. In 
> `KStreamTransformValuesProcessor#process` the result of the transform is 
> forwarded directly.
> {code:java}
>  @Override
>  public void process(final K key, final V value) {
>  context.forward(key, valueTransformer.transform(key, value));
>  }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9533) ValueTransform forwards `null` values

2020-02-20 Thread Bill Bejeck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-9533:
---
Fix Version/s: 2.4.1
   2.3.2
   2.5.0
   2.2.3

> ValueTransform forwards `null` values
> -
>
> Key: KAFKA-9533
> URL: https://issues.apache.org/jira/browse/KAFKA-9533
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Michael Viamari
>Assignee: Michael Viamari
>Priority: Major
> Fix For: 2.2.3, 2.5.0, 2.3.2, 2.4.1, 2.6.0
>
>
> According to the documentation for `KStream#transformValues`, nulls returned 
> from `ValueTransformer#transform` are not forwarded. (see 
> [KStream#transformValues|https://kafka.apache.org/24/javadoc/org/apache/kafka/streams/kstream/KStream.html#transformValues-org.apache.kafka.streams.kstream.ValueTransformerSupplier-java.lang.String...-]
> However, this does not appear to be the case. In 
> `KStreamTransformValuesProcessor#process` the result of the transform is 
> forwarded directly.
> {code:java}
>  @Override
>  public void process(final K key, final V value) {
>  context.forward(key, valueTransformer.transform(key, value));
>  }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9533) ValueTransform forwards `null` values

2020-02-20 Thread Bill Bejeck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-9533:
---
Priority: Major  (was: Minor)

> ValueTransform forwards `null` values
> -
>
> Key: KAFKA-9533
> URL: https://issues.apache.org/jira/browse/KAFKA-9533
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Michael Viamari
>Assignee: Michael Viamari
>Priority: Major
> Fix For: 2.6.0
>
>
> According to the documentation for `KStream#transformValues`, nulls returned 
> from `ValueTransformer#transform` are not forwarded. (see 
> [KStream#transformValues|https://kafka.apache.org/24/javadoc/org/apache/kafka/streams/kstream/KStream.html#transformValues-org.apache.kafka.streams.kstream.ValueTransformerSupplier-java.lang.String...-]
> However, this does not appear to be the case. In 
> `KStreamTransformValuesProcessor#process` the result of the transform is 
> forwarded directly.
> {code:java}
>  @Override
>  public void process(final K key, final V value) {
>  context.forward(key, valueTransformer.transform(key, value));
>  }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9577) Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk

2020-02-20 Thread Lucas Bradstreet (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041087#comment-17041087
 ] 

Lucas Bradstreet commented on KAFKA-9577:
-

[~enether] I will have to check the system test results, but I don't think it's 
likely unless we have a SASL related test that uses an older client version.

> Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk
> -
>
> Key: KAFKA-9577
> URL: https://issues.apache.org/jira/browse/KAFKA-9577
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.5.0
>Reporter: Lucas Bradstreet
>Assignee: Lucas Bradstreet
>Priority: Blocker
> Fix For: 2.5.0
>
>
> I am trying 2.5.0 with sasl turned on and my consumer and producer clients 
> receive:
> {noformat}
> org.apache.kafka.common.errors.UnsupportedVersionException: The 
> SASL_HANDSHAKE protocol does not support version 2
> {noformat}
> I believe this is due to 
> [https://github.com/apache/kafka/commit/0a2569e2b9907a1217dd50ccbc320f8ad0b42fd0]
>  which added flexible version support and bumped the protocol version.
> It appears that the SaslClientAuthenticator uses the max version for 
> SASL_HANDSHAKE received in the broker's AP_VERSIONS response, and then uses 
> that version even though it may not support it. See 
> [https://github.com/apache/kafka/blob/eb09efa9ac79efa484307bdcf03ac8eb8a3a94e2/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslClientAuthenticator.java#L290].
>  
> This may make it hard to ever evolve this schema. In the short term I suggest 
> we roll back the version bump and flexible schema until we figure out a path 
> forward.
> It appears that this may not have been a problem in the past because the 
> schema versions were the same and maybe we didn't validate the version number 
> [https://github.com/apache/kafka/commit/0cf7708007b01faac5012d939f3c50db274f858d#diff-7f65552a2e23aa7028500f8db06cbb30R47]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9577) Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk

2020-02-20 Thread Stanislav Kozlovski (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041046#comment-17041046
 ] 

Stanislav Kozlovski commented on KAFKA-9577:


Do we have a system test that would catch this?

> Client encountering SASL_HANDSHAKE protocol version errors on 2.5 / trunk
> -
>
> Key: KAFKA-9577
> URL: https://issues.apache.org/jira/browse/KAFKA-9577
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.5.0
>Reporter: Lucas Bradstreet
>Assignee: Lucas Bradstreet
>Priority: Blocker
> Fix For: 2.5.0
>
>
> I am trying 2.5.0 with sasl turned on and my consumer and producer clients 
> receive:
> {noformat}
> org.apache.kafka.common.errors.UnsupportedVersionException: The 
> SASL_HANDSHAKE protocol does not support version 2
> {noformat}
> I believe this is due to 
> [https://github.com/apache/kafka/commit/0a2569e2b9907a1217dd50ccbc320f8ad0b42fd0]
>  which added flexible version support and bumped the protocol version.
> It appears that the SaslClientAuthenticator uses the max version for 
> SASL_HANDSHAKE received in the broker's AP_VERSIONS response, and then uses 
> that version even though it may not support it. See 
> [https://github.com/apache/kafka/blob/eb09efa9ac79efa484307bdcf03ac8eb8a3a94e2/clients/src/main/java/org/apache/kafka/common/security/authenticator/SaslClientAuthenticator.java#L290].
>  
> This may make it hard to ever evolve this schema. In the short term I suggest 
> we roll back the version bump and flexible schema until we figure out a path 
> forward.
> It appears that this may not have been a problem in the past because the 
> schema versions were the same and maybe we didn't validate the version number 
> [https://github.com/apache/kafka/commit/0cf7708007b01faac5012d939f3c50db274f858d#diff-7f65552a2e23aa7028500f8db06cbb30R47]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-8726) Producer can't abort a transaction aftersome send errors

2020-02-20 Thread Matthias J. Sax (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040906#comment-17040906
 ] 

Matthias J. Sax commented on KAFKA-8726:


From my understanding the behavior is by design. If the producer is in an internal error state, you can only `close()` it.

To resume, you would create a new producer instance. Using the same `transactional.id`, the pending transaction would be aborted when you call `initTransactions()` on the new producer (otherwise, as you mentioned, the transaction would eventually time out and be aborted by the broker). \cc [~bob-barrett] please correct me if I am wrong.
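
A minimal sketch of that recovery path (the configuration values below are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

// Sketch: after the old producer hits an unrecoverable error state, close it and
// create a new instance with the SAME transactional.id; initTransactions() on the
// new instance fences the old one and aborts any transaction it left pending.
public class TransactionalRecoverySketch {

    static KafkaProducer<String, String> recover(KafkaProducer<String, String> oldProducer) {
        oldProducer.close();

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");       // placeholder
        props.put("transactional.id", "my-transactional-id");   // must match the old producer's id
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();   // aborts/fences the pending transaction of the old instance
        producer.beginTransaction();   // resume sending from the last committed position
        return producer;
    }
}
{code}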

> Producer can't abort a transaction aftersome send errors
> 
>
> Key: KAFKA-8726
> URL: https://issues.apache.org/jira/browse/KAFKA-8726
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, producer 
>Affects Versions: 2.3.0
>Reporter: Mattia Barbon
>Priority: Major
>
> I am following the producer with transactions example in 
> [https://kafka.apache.org/23/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html,]
>  and on kafkaException, I use abortTransaction and retry.
>  
> In some cases, abortTransaction fails, with:
> ```
> org.apache.kafka.common.KafkaException: Cannot execute transactional method 
> because we are in an error state
> ```
> as far as I can tell, this is caused by
> ```
> org.apache.kafka.common.KafkaException: The client hasn't received 
> acknowledgment for some previously sent messages and can no longer retry 
> them. It isn't safe to continue.
> ```
>  
> Since both are KafkaException, the example seems to imply they are retriable, 
> but they seem not to be. Ideally, I would expect abortTransaction to succeed 
> in this case (the broker will abort the transaction anyway because it can't 
> be committed), but at the very least, I would expect to have a way to 
> determine that the producer is unusable and it can't recover.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9566) ProcessorContextImpl#forward throws NullPointerException if invoked from DeserializationExceptionHandler

2020-02-20 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-9566:
---
Priority: Minor  (was: Major)

> ProcessorContextImpl#forward throws NullPointerException if invoked from 
> DeserializationExceptionHandler
> 
>
> Key: KAFKA-9566
> URL: https://issues.apache.org/jira/browse/KAFKA-9566
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Tomas Mi
>Priority: Minor
>
> Hi, I am trying to implement custom DeserializationExceptionHandler which 
> would forward an exception to downstream processor(s), but 
> ProcessorContextImpl#forward throws a NullPointerException if invoked from 
> this custom handler.
> Handler implementation:
> {code:title=MyDeserializationExceptionHandler.java}
> public class MyDeserializationExceptionHandler implements 
> DeserializationExceptionHandler {
> @Override
> public void configure(Map configs) {
> }
> @Override
> public DeserializationHandlerResponse handle(ProcessorContext context, 
> ConsumerRecord record, Exception exception) {
> context.forward(null, exception, To.child("error-processor"));
> return DeserializationHandlerResponse.CONTINUE;
> }
> }
> {code}
> Handler is wired as default deserialization exception handler:
> {code}
> private TopologyTestDriver initializeTestDriver(StreamsBuilder 
> streamBuilder) {
> Topology topology = streamBuilder.build();
> Properties props = new Properties();
> props.setProperty(StreamsConfig.APPLICATION_ID_CONFIG, 
> "my-test-application");
> props.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, 
> "dummy:1234");
> props.setProperty(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, 
> StreamsConfig.EXACTLY_ONCE);
> 
> props.setProperty(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
>  MyDeserializationExceptionHandler.class.getName());
> return new TopologyTestDriver(topology, props);
> }
> {code}
>  
> Exception stacktrace:
> {noformat}
> org.apache.kafka.streams.errors.StreamsException: Fatal user code error in 
> deserialization error callback
> at 
> org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:76)
> at 
> org.apache.kafka.streams.processor.internals.RecordQueue.maybeUpdateTimestamp(RecordQueue.java:160)
> at 
> org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:101)
> at 
> org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:136)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:742)
> at 
> org.apache.kafka.streams.TopologyTestDriver.pipeInput(TopologyTestDriver.java:392)
> ...
> Caused by: java.lang.NullPointerException
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:165)
> at 
> MyDeserializationExceptionHandler.handle(NewExceptionHandlerTest.java:204)
> at 
> org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:70)
>  ... 33 more
> {noformat}
> Neither DeserializationExceptionHandler, nor ProcessorContext javadocs 
> mention that ProcessorContext#forward(...) must not be invoked from 
> DeserializationExceptionHandler, so I assume that this is a defect.
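
As a side note, one possible workaround until the forwarding question is settled is to not call {{forward}} from the handler at all and instead route the bad record to a dead-letter topic with a separate producer, e.g. (a sketch: the topic name is a placeholder and the producer setup is omitted):

{code:java}
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.processor.ProcessorContext;

// Workaround sketch: do not forward from the handler; send the poison record to a
// dead-letter topic instead. The producer is assumed to be built in configure().
public class DeadLetterDeserializationExceptionHandler implements DeserializationExceptionHandler {

    private KafkaProducer<byte[], byte[]> deadLetterProducer;

    @Override
    public void configure(final Map<String, ?> configs) {
        // Build deadLetterProducer from the supplied configs (omitted here).
    }

    @Override
    public DeserializationHandlerResponse handle(final ProcessorContext context,
                                                 final ConsumerRecord<byte[], byte[]> record,
                                                 final Exception exception) {
        deadLetterProducer.send(
            new ProducerRecord<>("my-dead-letter-topic", record.key(), record.value()));
        return DeserializationHandlerResponse.CONTINUE;
    }
}
{code}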



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-1194) The kafka broker cannot delete the old log files after the configured time

2020-02-20 Thread Swaroop Kumar Sahu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040878#comment-17040878
 ] 

Swaroop Kumar Sahu commented on KAFKA-1194:
---

Hi [~tqin],

The issue also exists in the latest stable version, 2.4.0.

Please update the Affects Versions.

 

Logs:

at kafka.cluster.Partition.delete(Partition.scala:470)
 at kafka.server.ReplicaManager.stopReplica(ReplicaManager.scala:360)
 at 
kafka.server.ReplicaManager.$anonfun$stopReplicas$2(ReplicaManager.scala:404)
 at scala.collection.immutable.HashSet.foreach(HashSet.scala:932)
 at kafka.server.ReplicaManager.stopReplicas(ReplicaManager.scala:402)
 at kafka.server.KafkaApis.handleStopReplicaRequest(KafkaApis.scala:235)
 at kafka.server.KafkaApis.handle(KafkaApis.scala:131)
 at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:70)
 at java.lang.Thread.run(Thread.java:748)
 *{color:#de350b}Suppressed: java.nio.file.AccessDeniedException: 
C:\kafka_2.13-2.4.0\data\kafka\second_topic-4 -> 
C:\kafka_2.13-2.4.0\data\kafka\second_topic-4.2f5254be07e947f7b0d999fa29a384f3-delete{color}*
 at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
 at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
 at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:301)
 at 
sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:287)
 at java.nio.file.Files.move(Files.java:1395)
 at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:792)
 ... 17 more
[2020-02-20 17:01:10,761] INFO [ReplicaFetcherManager on broker 0] Removed 
fetcher for partitions Set(first_topic-0, first_topic-1, first_topic-2) 
(kafka.server.ReplicaFetcherManager)
[2020-02-20 17:01:10,762] INFO [ReplicaAlterLogDirsManager on broker 0] Removed 
fetcher for partitions Set(first_topic-0, first_topic-1, first_topic-2) 
(kafka.server.ReplicaAlterLogDirsManager)
[2020-02-20 17:01:10,765] INFO [ReplicaManager broker=0] Broker 0 stopped 
fetcher for partitions first_topic-0,first_topic-1,first_topic-2 and stopped 
moving logs for partitions because they are in the failed log directory 
C:\kafka_2.13-2.4.0\data\kafka. (kafka.server.ReplicaManager)
[2020-02-20 17:01:10,766] INFO Stopping serving logs in dir 
C:\kafka_2.13-2.4.0\data\kafka (kafka.log.LogManager)
[2020-02-20 17:01:10,770] ERROR Shutdown broker because all log dirs in 
C:\kafka_2.13-2.4.0\data\kafka have failed (kafka.log.LogManager)

C:\kafka_2.13-2.4.0>

> The kafka broker cannot delete the old log files after the configured time
> --
>
> Key: KAFKA-1194
> URL: https://issues.apache.org/jira/browse/KAFKA-1194
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.0.0, 0.11.0.0, 1.0.0
> Environment: window
>Reporter: Tao Qin
>Priority: Critical
>  Labels: features, patch, windows
> Attachments: KAFKA-1194.patch, RetentionExpiredWindows.txt, 
> Untitled.jpg, image-2018-09-12-14-25-52-632.png, 
> image-2018-11-26-10-18-59-381.png, kafka-1194-v1.patch, kafka-1194-v2.patch, 
> kafka-bombarder.7z, screenshot-1.png
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> We tested it in windows environment, and set the log.retention.hours to 24 
> hours.
> # The minimum age of a log file to be eligible for deletion
> log.retention.hours=24
> After several days, the kafka broker still cannot delete the old log file. 
> And we get the following exceptions:
> [2013-12-19 01:57:38,528] ERROR Uncaught exception in scheduled task 
> 'kafka-log-retention' (kafka.utils.KafkaScheduler)
> kafka.common.KafkaStorageException: Failed to change the log file suffix from 
>  to .deleted for log segment 1516723
>  at kafka.log.LogSegment.changeFileSuffixes(LogSegment.scala:249)
>  at kafka.log.Log.kafka$log$Log$$asyncDeleteSegment(Log.scala:638)
>  at kafka.log.Log.kafka$log$Log$$deleteSegment(Log.scala:629)
>  at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
>  at kafka.log.Log$$anonfun$deleteOldSegments$1.apply(Log.scala:418)
>  at 
> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
>  at scala.collection.immutable.List.foreach(List.scala:76)
>  at kafka.log.Log.deleteOldSegments(Log.scala:418)
>  at 
> kafka.log.LogManager.kafka$log$LogManager$$cleanupExpiredSegments(LogManager.scala:284)
>  at 
> kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:316)
>  at 
> kafka.log.LogManager$$anonfun$cleanupLogs$3.apply(LogManager.scala:314)
>  at 
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:743)
>  at 

[jira] [Resolved] (KAFKA-9571) MirrorMaker task failing during pool

2020-02-20 Thread Nitish Goyal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nitish Goyal resolved KAFKA-9571.
-
Resolution: Fixed

> MirrorMaker task failing during pool
> 
>
> Key: KAFKA-9571
> URL: https://issues.apache.org/jira/browse/KAFKA-9571
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, mirrormaker
>Affects Versions: 2.4.0
>Reporter: Nitish Goyal
>Priority: Blocker
>
> I have setup kafka replication between source and target cluster
> I am observing Mirror Source task getting killed after certain time with the 
> following error
>  
> ```
> [[2020-02-17 22:39:57,344] ERROR Failure during poll. 
> (org.apache.kafka.connect.mirror.MirrorSourceTask:161)
>  [2020-02-17 22:39:57,346] ERROR 
> WorkerSourceTask\{id=MirrorSourceConnector-99} Task threw an uncaught and 
> unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
>  [2020-02-17 22:39:57,346] ERROR 
> WorkerSourceTask\{id=MirrorSourceConnector-99} Task is being killed and will 
> not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask:180)
> ```
>  
> What could be the possible reason for the above?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9571) MirrorMaker task failing during pool

2020-02-20 Thread Nitish Goyal (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040864#comment-17040864
 ] 

Nitish Goyal commented on KAFKA-9571:
-

It was an issue in our setup

 

Closing the issue

> MirrorMaker task failing during pool
> 
>
> Key: KAFKA-9571
> URL: https://issues.apache.org/jira/browse/KAFKA-9571
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, mirrormaker
>Affects Versions: 2.4.0
>Reporter: Nitish Goyal
>Priority: Blocker
>
> I have setup kafka replication between source and target cluster
> I am observing Mirror Source task getting killed after certain time with the 
> following error
>  
> ```
> [[2020-02-17 22:39:57,344] ERROR Failure during poll. 
> (org.apache.kafka.connect.mirror.MirrorSourceTask:161)
>  [2020-02-17 22:39:57,346] ERROR 
> WorkerSourceTask\{id=MirrorSourceConnector-99} Task threw an uncaught and 
> unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
>  [2020-02-17 22:39:57,346] ERROR 
> WorkerSourceTask\{id=MirrorSourceConnector-99} Task is being killed and will 
> not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask:180)
> ```
>  
> What could be the possible reason for the above?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9574) Contact page links to spam/ads site on search-hadoop[.]com domain

2020-02-20 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison reassigned KAFKA-9574:
-

Assignee: Gert van Dijk

> Contact page links to spam/ads site on search-hadoop[.]com domain
> -
>
> Key: KAFKA-9574
> URL: https://issues.apache.org/jira/browse/KAFKA-9574
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Reporter: Gert van Dijk
>Assignee: Gert van Dijk
>Priority: Major
>
> The current live page at [https://kafka.apache.org/contact] displays:
> {quote}A searchable archive of the mailing lists is available at 
> search-hadoop[.]com
> {quote}
> But this now shows as a scam/ads-serving site in my browser.
> Git history shows me that this link has been present for many years. WHOIS 
> domain info suggests that the domain was transferred on 2020-01-22, so I assume 
> this is now a different site than the one originally intended to be linked.
> Is there another site that allows searching through all Kafka archives? If 
> not, I suggest removing the whole paragraph.
> I did not find any other links to the domain on the kafka-site repo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9574) Contact page links to spam/ads site on search-hadoop[.]com domain

2020-02-20 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-9574.
---
Resolution: Fixed

> Contact page links to spam/ads site on search-hadoop[.]com domain
> -
>
> Key: KAFKA-9574
> URL: https://issues.apache.org/jira/browse/KAFKA-9574
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Reporter: Gert van Dijk
>Assignee: Gert van Dijk
>Priority: Major
>
> The current live page at [https://kafka.apache.org/contact] displays:
> {quote}A searchable archive of the mailing lists is available at 
> search-hadoop[.]com
> {quote}
> But this now shows as a scam/ads-serving site in my browser.
> Git history shows me that this link has been present for many years. WHOIS 
> domain info suggests that the domain was transferred on 2020-01-22, so I assume 
> this is now a different site than the one originally intended to be linked.
> Is there another site that allows searching through all Kafka archives? If 
> not, I suggest removing the whole paragraph.
> I did not find any other links to the domain on the kafka-site repo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9574) Contact page links to spam/ads site on search-hadoop[.]com domain

2020-02-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040846#comment-17040846
 ] 

ASF GitHub Bot commented on KAFKA-9574:
---

mimaison commented on pull request #254: KAFKA-9574: Remove paragraph about 
searchable archive of the mailing lists
URL: https://github.com/apache/kafka-site/pull/254

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Contact page links to spam/ads site on search-hadoop[.]com domain
> -
>
> Key: KAFKA-9574
> URL: https://issues.apache.org/jira/browse/KAFKA-9574
> Project: Kafka
>  Issue Type: Bug
>  Components: website
>Reporter: Gert van Dijk
>Priority: Major
>
> The current live page at [https://kafka.apache.org/contact] displays:
> {quote}A searchable archive of the mailing lists is available at 
> search-hadoop[.]com
> {quote}
> But this now shows as a scam/ads-serving site in my browser.
> Git history shows me that this link has been present for many years. WHOIS 
> domain info suggests that the domain was transferred on 2020-01-22, so I assume 
> this is now a different site than the one originally intended to be linked.
> Is there another site that allows searching through all Kafka archives? If 
> not, I suggest removing the whole paragraph.
> I did not find any other links to the domain on the kafka-site repo.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (KAFKA-9455) Consider using TreeMap for in-memory stores of Streams

2020-02-20 Thread highluck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

highluck updated KAFKA-9455:

Comment: was deleted

(was: @Guozhang

Thanks!
it was helpful!)

> Consider using TreeMap for in-memory stores of Streams
> --
>
> Key: KAFKA-9455
> URL: https://issues.apache.org/jira/browse/KAFKA-9455
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: highluck
>Priority: Major
>  Labels: newbie++
>
> From [~ableegoldman]: It's worth noting that it might be a good idea to 
> switch to TreeMap for different reasons. Right now the ConcurrentSkipListMap 
> allows us to safely perform range queries without copying over the entire 
> keyset, but the performance on point queries seems to scale noticeably worse 
> with the number of unique keys. Point queries are used by aggregations while 
> range queries are used by windowed joins, but of course both are available 
> within the PAPI and for interactive queries, so it's hard to say which we 
> should prefer. Maybe rather than make that tradeoff, we should have one 
> version for efficient range queries (a "JoinWindowStore") and one for 
> efficient point queries (an "AggWindowStore") - or something. I know we've had 
> similar thoughts about a different RocksDB store layout for Joins (although I 
> can't find that ticket anywhere..); it seems like the in-memory stores could 
> benefit from a special "Join" version as well. cc/ Guozhang Wang
> Here are some random thoughts:
> 1. For Kafka Streams processing logic (i.e., without IQ), it's better to have 
> all processing logic rely on point queries rather than range queries. 
> Right now the only processors that use range queries are, as mentioned above, 
> the windowed stream-stream joins. I think we should consider using a different 
> window implementation for this (and as a result also get rid of the 
> retainDuplicate flags) to refactor the windowed stream-stream join operation.
> 2. With 1), range queries would only be exposed through IQ. Depending on their 
> usage frequency, I think it makes a lot of sense to optimize for single-point queries.
> Of course, even without step 1) we should still consider using a TreeMap for 
> windowed in-memory stores to get a better scaling effect.
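
To make the trade-off described above concrete, here is a minimal, self-contained sketch using plain JDK collections (not the actual Streams store code; the class name, keys, and values are made up for illustration). It contrasts the point lookups that aggregations rely on with the range scans that windowed stream-stream joins rely on:

{code:java}
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class InMemoryStoreSketch {

    public static void main(String[] args) {
        // ConcurrentSkipListMap: what the in-memory stores use today. It allows
        // safe concurrent range scans without copying the keyset, but point
        // lookups scale noticeably worse with the number of unique keys.
        NavigableMap<Long, String> skipList = new ConcurrentSkipListMap<>();

        // TreeMap: cheaper point lookups, but not thread-safe, so concurrent
        // range scans (e.g. for IQ) would need copying or external locking.
        NavigableMap<Long, String> treeMap = new TreeMap<>();

        for (long ts = 0; ts < 10; ts++) {
            skipList.put(ts, "value-" + ts);
            treeMap.put(ts, "value-" + ts);
        }

        // Point query: the access pattern of aggregations (single-key lookup).
        String point = treeMap.get(5L);
        System.out.println("point lookup: " + point);

        // Range query: the access pattern of windowed stream-stream joins
        // (all entries whose key falls inside the join window).
        NavigableMap<Long, String> window = skipList.subMap(3L, true, 7L, true);
        window.forEach((ts, value) -> System.out.println(ts + " -> " + value));
    }
}
{code}

Whether the cheaper point lookups of a TreeMap outweigh losing lock-free range scans is exactly the question raised above.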



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9168) Integrate JNI direct buffer support to RocksDBStore

2020-02-20 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040766#comment-17040766
 ] 

Sagar Rao commented on KAFKA-9168:
--

Hey [~ableegoldman], you had asked me to create this ticket to track the closure 
of the particular RocksDB PR mentioned in the external link.

I couldn't find the relevant conversation we had on Slack, but this was in the 
context of a discussion we had a few months ago around adding prefix-scan 
behaviour for state stores (RocksDB for now). I have the changes ready, with 
relevant benchmarks, but was waiting for this PR in RocksDB to be resolved, 
which happened a couple of days ago.

So, coming back to this ticket: considering that the PR has been merged to 
master in the RocksDB codebase, is there anything that needs to be done on the 
Streams side to incorporate these changes?
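
For context, here is a minimal sketch of the kind of prefix scan being discussed, written directly against the RocksDB Java client rather than the Streams state-store interfaces. The database path, key layout, and the hasPrefix/bytes helpers are illustrative assumptions, and it uses the plain byte[] APIs rather than the direct-ByteBuffer overloads added by the linked PR:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

public class PrefixScanSketch {

    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/prefix-scan-sketch")) {

            db.put(bytes("user-1|click"), bytes("a"));
            db.put(bytes("user-1|view"), bytes("b"));
            db.put(bytes("user-2|click"), bytes("c"));

            byte[] prefix = bytes("user-1|");
            try (RocksIterator it = db.newIterator()) {
                // Seek to the first key >= prefix, then iterate until a key no
                // longer starts with the prefix (keys are sorted lexicographically).
                for (it.seek(prefix); it.isValid() && hasPrefix(it.key(), prefix); it.next()) {
                    System.out.println(new String(it.key(), StandardCharsets.UTF_8)
                            + " -> " + new String(it.value(), StandardCharsets.UTF_8));
                }
            }
        }
    }

    private static byte[] bytes(final String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    private static boolean hasPrefix(final byte[] key, final byte[] prefix) {
        return key.length >= prefix.length
            && Arrays.equals(Arrays.copyOfRange(key, 0, prefix.length), prefix);
    }
}
{code}

A direct-ByteBuffer variant would follow the same iteration pattern while avoiding byte[] copies across the JNI boundary, which is where the linked PR comes in.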

> Integrate JNI direct buffer support to RocksDBStore
> ---
>
> Key: KAFKA-9168
> URL: https://issues.apache.org/jira/browse/KAFKA-9168
> Project: Kafka
>  Issue Type: Task
>  Components: streams
>Reporter: Sagar Rao
>Priority: Minor
>  Labels: perfomance
>
> A PR has been created on the RocksDB Java client to support direct 
> ByteBuffers in Java. We can look at integrating it whenever it gets merged. 
> Link to PR: [https://github.com/facebook/rocksdb/pull/2283]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)