[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request appear more than once

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Description: Now that we have 
https://issues.apache.org/jira/browse/KAFKA-12234 merged into {{trunk}} and 
{{3.0}}, we have to ensure that we test for the case where we have the same 
group id appearing in the request more than once and handle it correctly in 
batched {{OffsetFetch}} requests. Generally speaking we wouldn't expect this to 
happen since we wouldn't allow this for requests being built from the Builder, 
but it would still be good to have a test for this.  (was: Now that we have 
https://issues.apache.org/jira/browse/KAFKA-12234 merged into {{trunk and 3.0, 
we have to ensure that we test for the case where we have the same group id 
appearing in the request more than once}})

> Add test for batched OffsetFetch requests where we have the same group in the 
> request appear more than once
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>
> Now that we have https://issues.apache.org/jira/browse/KAFKA-12234 merged 
> into {{trunk}} and {{3.0}}, we have to ensure that we test for the case where 
> we have the same group id appearing in the request more than once and handle 
> it correctly in batched {{OffsetFetch}} requests. Generally speaking we 
> wouldn't expect this to happen since we wouldn't allow this for requests 
> being built from the Builder, but it would still be good to have a test for 
> this.
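A minimal, self-contained sketch of the behaviour such a test would pin down, assuming the handler collapses duplicate group ids into a single response entry per distinct group. That assumption, and all class and method names below, are illustrative only; a real test would build the actual batched OffsetFetch request with the duplicate group id and assert on the broker/handler response.

{code:java}
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the behaviour the test needs to pin down: if the same group id
// appears more than once in a batched OffsetFetch request, the handler should
// still produce exactly one offsets entry per distinct group id (assumption --
// whether the broker deduplicates or rejects duplicates is exactly what the
// real test has to assert).
public class DuplicateGroupIdSketch {

    // Stand-in for the per-group offset lookup the broker would do.
    static Map<String, Long> fetchOffsets(List<String> requestedGroupIds) {
        Map<String, Long> responses = new LinkedHashMap<>();
        for (String groupId : requestedGroupIds) {
            // putIfAbsent collapses duplicates: the second occurrence of
            // "group-a" does not add or overwrite an entry.
            responses.putIfAbsent(groupId, lookupCommittedOffset(groupId));
        }
        return responses;
    }

    // Hypothetical lookup; the real test would go through the request/handler path.
    static long lookupCommittedOffset(String groupId) {
        return 42L;
    }

    public static void main(String[] args) {
        Map<String, Long> result = fetchOffsets(Arrays.asList("group-a", "group-b", "group-a"));
        if (result.size() != 2)
            throw new AssertionError("expected one entry per distinct group id, got " + result);
        System.out.println("duplicate group id collapsed as expected: " + result);
    }
}
{code}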



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Description: Now that we have 
https://issues.apache.org/jira/browse/KAFKA-12234 merged into {{trunk and 3.0, 
we have to ensure that we test for the case where we have the same group id 
appearing in the request more than once}}  (was: Now that we have 
https://issues.apache.org/jira/browse/KAFKA-12234 merged into {{trunk}})

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>
> Now that we have https://issues.apache.org/jira/browse/KAFKA-12234 merged 
> into {{trunk and 3.0, we have to ensure that we test for the case where we 
> have the same group id appearing in the request more than once}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request appear more than once

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Summary: Add test for batched OffsetFetch requests where we have the same 
group in the request appear more than once  (was: Add test for batched 
OffsetFetch requests where we have the same group in the request twice)

> Add test for batched OffsetFetch requests where we have the same group in the 
> request appear more than once
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>
> Now that we have https://issues.apache.org/jira/browse/KAFKA-12234 merged 
> into {{trunk and 3.0, we have to ensure that we test for the case where we 
> have the same group id appearing in the request more than once}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Description: Now that we have 
https://issues.apache.org/jira/browse/KAFKA-12234 merged into {{trunk}}

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>
> Now that we have https://issues.apache.org/jira/browse/KAFKA-12234 merged 
> into {{trunk}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice.

2021-07-07 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-13045:
-

 Summary: Add test for batched OffsetFetch requests where we have 
the same group in the request twice.
 Key: KAFKA-13045
 URL: https://issues.apache.org/jira/browse/KAFKA-13045
 Project: Kafka
  Issue Type: Test
Reporter: Sanjana Kaundinya






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Fix Version/s: 3.0.0

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya reassigned KAFKA-13045:
-

Assignee: Sanjana Kaundinya

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Affects Version/s: 3.0.0

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 3.0.0
>Reporter: Sanjana Kaundinya
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Summary: Add test for batched OffsetFetch requests where we have the same 
group in the request twice  (was: Add test for batched OffsetFetch requests 
where we have the same group in the request twice.)

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Reporter: Sanjana Kaundinya
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-13045) Add test for batched OffsetFetch requests where we have the same group in the request twice

2021-07-07 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-13045:
--
Priority: Minor  (was: Major)

> Add test for batched OffsetFetch requests where we have the same group in the 
> request twice
> ---
>
> Key: KAFKA-13045
> URL: https://issues.apache.org/jira/browse/KAFKA-13045
> Project: Kafka
>  Issue Type: Test
>Reporter: Sanjana Kaundinya
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-12234) Extend OffsetFetch requests to accept multiple group ids.

2021-05-04 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya reassigned KAFKA-12234:
-

Assignee: Sanjana Kaundinya

> Extend OffsetFetch requests to accept multiple group ids.
> -
>
> Key: KAFKA-12234
> URL: https://issues.apache.org/jira/browse/KAFKA-12234
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Tom Scott
>Assignee: Sanjana Kaundinya
>Priority: Minor
> Fix For: 3.0.0
>
>
> More details are in the KIP: 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=173084258



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9678) Introduce bounded exponential backoff in clients

2020-07-01 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya resolved KAFKA-9678.
--
Resolution: Duplicate

> Introduce bounded exponential backoff in clients
> 
>
> Key: KAFKA-9678
> URL: https://issues.apache.org/jira/browse/KAFKA-9678
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, consumer, producer 
>Reporter: Guozhang Wang
>Assignee: Sanjana Kaundinya
>Priority: Major
>  Labels: needs-kip
>
> In all clients (consumer, producer, admin, and streams) we have retry 
> mechanisms with fixed backoff to handle transient connection issues with 
> brokers. However, with a small backoff (many default to 100 ms) we could send 
> tens of requests per second to the broker, and if the connection issue is 
> prolonged it means a huge overhead.
> We should consider introducing upper-bounded exponential backoff universally 
> in those clients to reduce the number of retry requests during the period of 
> connection partitioning.
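For reference, a minimal sketch of the kind of upper-bounded exponential backoff with jitter being proposed here; the constants and jitter policy are illustrative assumptions, not the values the clients ended up using.

{code:java}
import java.util.concurrent.ThreadLocalRandom;

// Sketch of an upper-bounded exponential backoff: the wait doubles per failed
// attempt, is capped at a maximum, and gets a little jitter so retries from
// many clients do not synchronize. The constants here are illustrative only.
public class BoundedExponentialBackoff {
    private final long initialBackoffMs;
    private final long maxBackoffMs;
    private final double jitter;

    public BoundedExponentialBackoff(long initialBackoffMs, long maxBackoffMs, double jitter) {
        this.initialBackoffMs = initialBackoffMs;
        this.maxBackoffMs = maxBackoffMs;
        this.jitter = jitter;
    }

    // attempts = number of failed attempts so far (0 -> initial backoff).
    public long backoffMs(int attempts) {
        double exp = initialBackoffMs * Math.pow(2, attempts);
        double capped = Math.min(exp, maxBackoffMs);
        double rand = 1.0 + ThreadLocalRandom.current().nextDouble(-jitter, jitter);
        return (long) (capped * rand);
    }

    public static void main(String[] args) {
        BoundedExponentialBackoff backoff = new BoundedExponentialBackoff(100, 1000, 0.2);
        for (int attempt = 0; attempt < 6; attempt++)
            System.out.println("attempt " + attempt + " -> wait ~" + backoff.backoffMs(attempt) + " ms");
    }
}
{code}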



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (KAFKA-9678) Introduce bounded exponential backoff in clients

2020-07-01 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya reopened KAFKA-9678:
--

> Introduce bounded exponential backoff in clients
> 
>
> Key: KAFKA-9678
> URL: https://issues.apache.org/jira/browse/KAFKA-9678
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, consumer, producer 
>Reporter: Guozhang Wang
>Assignee: Sanjana Kaundinya
>Priority: Major
>  Labels: needs-kip
>
> In all clients (consumer, producer, admin, and streams) we have retry 
> mechanisms with fixed backoff to handle transient connection issues with 
> brokers. However, with a small backoff (many default to 100 ms) we could send 
> tens of requests per second to the broker, and if the connection issue is 
> prolonged it means a huge overhead.
> We should consider introducing upper-bounded exponential backoff universally 
> in those clients to reduce the number of retry requests during the period of 
> connection partitioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9893) Configurable TCP connection timeout and improve the initial metadata fetch

2020-07-01 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya resolved KAFKA-9893.
--
Resolution: Implemented

> Configurable TCP connection timeout and improve the initial metadata fetch
> --
>
> Key: KAFKA-9893
> URL: https://issues.apache.org/jira/browse/KAFKA-9893
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
>
> This issue has two parts:
>  # Support transportation layer connection timeout described in KIP-601
>  # Optimize the logic for NetworkClient.leastLoadedNode()
> Changes:
>  # Added a new common client configuration parameter 
> socket.connection.setup.timeout.ms to the NetworkClient. Handle potential 
> transportation layer timeouts using the same approach as handling potential 
> request timeouts.
>  # When no connected channel exists, leastLoadedNode() will now provide a 
> disconnected node that has the least number of failed attempts. 
>  # ClusterConnectionStates will keep the connecting node ids. Now it also has 
> several new public methods to provide per-connection relevant data.
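A small sketch of how a client would opt into the new transport-level timeout; the value shown is an illustrative assumption, not a recommended setting.

{code:java}
import java.util.Properties;

// Sketch of configuring the new connection setup timeout described above.
// Only socket.connection.setup.timeout.ms is named in this ticket; the value
// below is illustrative.
public class ConnectionSetupTimeoutConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        // Give up on a socket connection attempt that has not been established
        // within this many milliseconds.
        props.put("socket.connection.setup.timeout.ms", "10000");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
{code}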



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KAFKA-9800) [KIP-580] Client Exponential Backoff Implementation

2020-06-23 Thread Sanjana Kaundinya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124340#comment-17124340
 ] 

Sanjana Kaundinya edited comment on KAFKA-9800 at 6/24/20, 4:24 AM:


Here are my thoughts:
 1) There should be no changes to the underlying NetworkClient interface. More 
details on what interfaces should change are outlined in the KIP here: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients.
 As [~d8tltanc] mentioned, the key difference between the AdminClient and the 
other clients is that the AdminClient supports a per request timeout whereas 
the Producer and Consumer are using the NetworkClient to handle the timeouts. 
In our implementation we should add a per-request support for all clients and 
reuse the code existing in CallRetryContext to apply to all clients to do this.

2) I agree with [~ijuma], I don't see really much of a benefit for doing a 
different backoff strategy per client, plus that kind of improvement goes 
beyond the scope of this KIP.

 

[~d8tltanc] could you shed more light on how to refactor the AdminClient to 
take out the redundant logic? I want to understand how much this would change 
and if all this change would still be in the scope of the KIP.


was (Author: skaundinya):
Here are my thoughts:
 1) There should be no changes to the underlying NetworkClient interface. More 
details on what interfaces should change are outlined in the KIP 
[here|[https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients|https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients).]].
 As [~d8tltanc] mentioned, the key difference between the AdminClient and the 
other clients is that the AdminClient supports a per request timeout whereas 
the Producer and Consumer are using the NetworkClient to handle the timeouts. 
In our implementation we should add a per-request support for all clients and 
reuse the code existing in CallRetryContext to apply to all clients to do this.

2) I agree with [~ijuma], I don't see really much of a benefit for doing a 
different backoff strategy per client, plus that kind of improvement goes 
beyond the scope of this KIP.

 

[~d8tltanc] could you shed more light on how to refactor the AdminClient to 
take out the redundant logic? I want to understand how much this would change 
and if all this change would still be in the scope of the KIP.

> [KIP-580] Client Exponential Backoff Implementation
> ---
>
> Key: KAFKA-9800
> URL: https://issues.apache.org/jira/browse/KAFKA-9800
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
>  Labels: KIP-580
>
> Design:
> The main idea is to bookkeep the failed attempt. Currently, the retry backoff 
> has two main usage patterns:
>  # Synchronous retries in a blocking loop. The thread will sleep in each 
> iteration for the retry backoff ms.
>  # Async retries. In each poll, retries that have not yet met the backoff will 
> be filtered out. The data class often maintains a 1:1 mapping to a set of 
> requests which are logically associated. (i.e. a set contains only one 
> initial request and only its retries.)
> For type 1, we can utilize a local failure counter of a Java generic data 
> type.
> For case 2, I already wrapped the exponential backoff/timeout util class in 
> my KIP-601 
> [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28]
>  which takes the number of attempts and returns the backoff/timeout value at 
> the corresponding level. Thus, we can add a new class property to those 
> classes containing retriable data in order to record the number of failed 
> attempts.
>  
> Changes:
> KafkaProducer:
>  # Produce request (ApiKeys.PRODUCE). Currently, the backoff applies to each 
> ProducerBatch in Accumulator, which already has an attribute attempts 
> recording the number of failed attempts. So we can let the Accumulator 
> calculate the new retry backoff for each batch when it enqueues them, to avoid 
> instantiating the util class multiple times.
>  # Transaction request (ApiKeys..*TXN). TxnRequestHandler will have a new 
> class property of type `Long` to record the number of attempts.
> KafkaConsumer:
>  # Some synchronous retry use cases. Record the failed attempts in the 
> blocking loop.
>  # Partition request (ApiKeys.OFFSET_FOR_LEADER_EPOCH, ApiKeys.LIST_OFFSETS). 
> Though the actual requests are packed for each node, the current 
> implementation is applying backoff to each topic partition, where the backoff 
> value is kept 

[jira] [Comment Edited] (KAFKA-9800) [KIP-580] Client Exponential Backoff Implementation

2020-06-23 Thread Sanjana Kaundinya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124340#comment-17124340
 ] 

Sanjana Kaundinya edited comment on KAFKA-9800 at 6/24/20, 4:24 AM:


Here are my thoughts:
 1) There should be no changes to the underlying NetworkClient interface. More 
details on what interfaces should change are outlined in the KIP 
[here|https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients].
 As [~d8tltanc] mentioned, the key difference between the AdminClient and the 
other clients is that the AdminClient supports a per request timeout whereas 
the Producer and Consumer are using the NetworkClient to handle the timeouts. 
In our implementation we should add a per-request support for all clients and 
reuse the code existing in CallRetryContext to apply to all clients to do this.

2) I agree with [~ijuma], I don't see really much of a benefit for doing a 
different backoff strategy per client, plus that kind of improvement goes 
beyond the scope of this KIP.

 

[~d8tltanc] could you shed more light on how to refactor the AdminClient to 
take out the redundant logic? I want to understand how much this would change 
and if all this change would still be in the scope of the KIP.


was (Author: skaundinya):
Here are my thoughts:
1) There should be no changes to the underlying NetworkClient interface. More 
details on what interfaces should change are outlined in the KIP 
[here]([https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients)|https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients).].
 As [~d8tltanc] mentioned, the key difference between the AdminClient and the 
other clients is that the AdminClient supports a per request timeout whereas 
the Producer and Consumer are using the NetworkClient to handle the timeouts. 
In our implementation we should add a per-request support for all clients and 
reuse the code existing in CallRetryContext to apply to all clients to do this.

2) I agree with [~ijuma], I don't see really much of a benefit for doing a 
different backoff strategy per client, plus that kind of improvement goes 
beyond the scope of this KIP.

 

[~d8tltanc] could you shed more light on how to refactor the AdminClient to 
take out the redundant logic? I want to understand how much this would change 
and if all this change would still be in the scope of the KIP.

> [KIP-580] Client Exponential Backoff Implementation
> ---
>
> Key: KAFKA-9800
> URL: https://issues.apache.org/jira/browse/KAFKA-9800
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
>  Labels: KIP-580
>
> Design:
> The main idea is to bookkeep the failed attempt. Currently, the retry backoff 
> has two main usage patterns:
>  # Synchronous retries in a blocking loop. The thread will sleep in each 
> iteration for the retry backoff ms.
>  # Async retries. In each poll, retries that have not yet met the backoff will 
> be filtered out. The data class often maintains a 1:1 mapping to a set of 
> requests which are logically associated. (i.e. a set contains only one 
> initial request and only its retries.)
> For type 1, we can utilize a local failure counter of a Java generic data 
> type.
> For case 2, I already wrapped the exponential backoff/timeout util class in 
> my KIP-601 
> [implementation|https://github.com/apache/kafka/pull/8683/files#diff-9ca2b1294653dfa914b9277de62b52e3R28]
>  which takes the number of attempts and returns the backoff/timeout value at 
> the corresponding level. Thus, we can add a new class property to those 
> classes containing retriable data in order to record the number of failed 
> attempts.
>  
> Changes:
> KafkaProducer:
>  # Produce request (ApiKeys.PRODUCE). Currently, the backoff applies to each 
> ProducerBatch in Accumulator, which already has an attribute attempts 
> recording the number of failed attempts. So we can let the Accumulator 
> calculate the new retry backoff for each batch when it enqueues them, to avoid 
> instantiating the util class multiple times.
>  # Transaction request (ApiKeys..*TXN). TxnRequestHandler will have a new 
> class property of type `Long` to record the number of attempts.
> KafkaConsumer:
>  # Some synchronous retry use cases. Record the failed attempts in the 
> blocking loop.
>  # Partition request (ApiKeys.OFFSET_FOR_LEADER_EPOCH, ApiKeys.LIST_OFFSETS). 
> Though the actual requests are packed for each node, the current 
> implementation is applying backoff to each topic partition, where the backoff 
> value is 

[jira] [Resolved] (KAFKA-9678) Introduce bounded exponential backoff in clients

2020-06-23 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya resolved KAFKA-9678.
--
Resolution: Duplicate

> Introduce bounded exponential backoff in clients
> 
>
> Key: KAFKA-9678
> URL: https://issues.apache.org/jira/browse/KAFKA-9678
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, consumer, producer 
>Reporter: Guozhang Wang
>Assignee: Sanjana Kaundinya
>Priority: Major
>  Labels: needs-kip
>
> In all clients (consumer, producer, admin, and streams) we have retry 
> mechanisms with fixed backoff to handle transient connection issues with 
> brokers. However, with a small backoff (many default to 100 ms) we could send 
> tens of requests per second to the broker, and if the connection issue is 
> prolonged it means a huge overhead.
> We should consider introducing upper-bounded exponential backoff universally 
> in those clients to reduce the number of retry requests during the period of 
> connection partitioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9678) Introduce bounded exponential backoff in clients

2020-06-23 Thread Sanjana Kaundinya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17143505#comment-17143505
 ] 

Sanjana Kaundinya commented on KAFKA-9678:
--

The work for this is being done by [~d8tltanc] in KAFKA-9800, so I will close 
this ticket out as he has a PR in progress against that ticket.

> Introduce bounded exponential backoff in clients
> 
>
> Key: KAFKA-9678
> URL: https://issues.apache.org/jira/browse/KAFKA-9678
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, consumer, producer 
>Reporter: Guozhang Wang
>Assignee: Sanjana Kaundinya
>Priority: Major
>  Labels: needs-kip
>
> In all clients (consumer, producer, admin, and streams) we have retry 
> mechanisms with fixed backoff to handle transient connection issues with 
> brokers. However, with a small backoff (many default to 100 ms) we could send 
> tens of requests per second to the broker, and if the connection issue is 
> prolonged it means a huge overhead.
> We should consider introducing upper-bounded exponential backoff universally 
> in those clients to reduce the number of retry requests during the period of 
> connection partitioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9800) [KIP-580] Client Exponential Backoff Implementation

2020-06-02 Thread Sanjana Kaundinya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124340#comment-17124340
 ] 

Sanjana Kaundinya commented on KAFKA-9800:
--

Here are my thoughts:
1) There should be no changes to the underlying NetworkClient interface. More 
details on what interfaces should change are outlined in the KIP 
[here|https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients].
 As [~d8tltanc] mentioned, the key difference between the AdminClient and the 
other clients is that the AdminClient supports a per request timeout whereas 
the Producer and Consumer are using the NetworkClient to handle the timeouts. 
In our implementation we should add a per-request support for all clients and 
reuse the code existing in CallRetryContext to apply to all clients to do this.

2) I agree with [~ijuma], I don't see really much of a benefit for doing a 
different backoff strategy per client, plus that kind of improvement goes 
beyond the scope of this KIP.

 

[~d8tltanc] could you shed more light on how to refactor the AdminClient to 
take out the redundant logic? I want to understand how much this would change 
and if all this change would still be in the scope of the KIP.

> [KIP-580] Client Exponential Backoff Implementation
> ---
>
> Key: KAFKA-9800
> URL: https://issues.apache.org/jira/browse/KAFKA-9800
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Cheng Tan
>Assignee: Cheng Tan
>Priority: Major
>  Labels: KIP-580
>
> In {{KafkaAdminClient}}, we will have to modify the way the retry backoff is 
> calculated for the calls that have failed and need to be retried. From the 
> current static retry backoff, we have to introduce a mechanism for all calls 
> so that, upon failure, the next retry time is dynamically calculated.
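A toy sketch of the mechanism being described: each call keeps its own failed-attempt count and derives the next retry time from it instead of a fixed retry.backoff.ms. The class and field names are hypothetical, not KafkaAdminClient internals.

{code:java}
// Hypothetical per-call retry bookkeeping: the next retry time grows
// exponentially with the number of failures, bounded by maxBackoffMs.
public class RetriableCallSketch {
    private final long initialBackoffMs;
    private final long maxBackoffMs;
    private int failedAttempts = 0;
    private long nextRetryMs = 0;

    public RetriableCallSketch(long initialBackoffMs, long maxBackoffMs) {
        this.initialBackoffMs = initialBackoffMs;
        this.maxBackoffMs = maxBackoffMs;
    }

    // Called when the response is an error; schedules the next attempt
    // exponentially further out, capped at maxBackoffMs.
    public void onFailure(long nowMs) {
        long backoff = Math.min(maxBackoffMs, initialBackoffMs << Math.min(failedAttempts, 20));
        failedAttempts++;
        nextRetryMs = nowMs + backoff;
    }

    public boolean readyToRetry(long nowMs) {
        return nowMs >= nextRetryMs;
    }

    public static void main(String[] args) {
        RetriableCallSketch call = new RetriableCallSketch(100, 5000);
        for (int i = 1; i <= 5; i++) {
            call.onFailure(0);
            System.out.println("after failure " + i + ": nextRetryMs=" + call.nextRetryMs);
        }
    }
}
{code}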



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10021) When reading to the end of the config log, check if fetch.max.wait.ms is greater than worker.sync.timeout.ms

2020-05-19 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-10021:
-

 Summary: When reading to the end of the config log, check if 
fetch.max.wait.ms is greater than worker.sync.timeout.ms
 Key: KAFKA-10021
 URL: https://issues.apache.org/jira/browse/KAFKA-10021
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Sanjana Kaundinya


Currently in the Connect code in DistributedHerder.java, we see the following 
piece of code

 

{code:java}
if (!canReadConfigs && !readConfigToEnd(workerSyncTimeoutMs))
    return; // Safe to return and tick immediately because readConfigToEnd will do the backoff for us
{code}

where the workerSyncTimeoutMs passed in is the timeout given to read to the end 
of the config log. This is a bug, as we should check whether fetch.max.wait.ms is 
greater than worker.sync.timeout.ms and, if it is, use worker.sync.timeout.ms as 
the fetch.max.wait.ms. A better fix would be to use the AdminClient to read to 
the end of the log, but at a minimum we should check the configs.
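A rough sketch of the minimal check being proposed (not the actual DistributedHerder code): cap the effective fetch wait at the worker sync timeout and log when doing so.

{code:java}
// Illustrative config check: the consumer's fetch.max.wait.ms must not exceed
// the worker.sync.timeout.ms budget used when reading the config log to the end.
public class ConfigTimeoutCheckSketch {

    static long effectiveReadTimeoutMs(long fetchMaxWaitMs, long workerSyncTimeoutMs) {
        if (fetchMaxWaitMs > workerSyncTimeoutMs) {
            System.out.println("fetch.max.wait.ms (" + fetchMaxWaitMs + ") is greater than "
                    + "worker.sync.timeout.ms (" + workerSyncTimeoutMs + "); capping the wait "
                    + "so the read to the end of the config log can complete in time");
            return workerSyncTimeoutMs;
        }
        return fetchMaxWaitMs;
    }

    public static void main(String[] args) {
        // With the defaults (fetch.max.wait.ms=500, worker.sync.timeout.ms=3000) this is a
        // no-op; the problematic case is an operator raising fetch.max.wait.ms past the
        // sync timeout.
        System.out.println(effectiveReadTimeoutMs(5000, 3000));
    }
}
{code}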



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9988) Connect incorrectly reports task has failed when task takes too long to shutdown

2020-05-13 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9988:
-
Affects Version/s: 2.5.1
   2.4.2
   2.3.2
   2.2.3
   2.3.0
   2.4.0
   2.3.1
   2.5.0
   2.4.1

> Connect incorrectly reports task has failed when task takes too long to 
> shutdown
> 
>
> Key: KAFKA-9988
> URL: https://issues.apache.org/jira/browse/KAFKA-9988
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.3.0, 2.4.0, 2.3.1, 2.2.3, 2.5.0, 2.3.2, 2.4.1, 2.4.2, 
> 2.5.1
>Reporter: Sanjana Kaundinya
>Priority: Major
>
> If the OffsetStorageReader is closed while the task is trying to shutdown, 
> and the task is trying to access the offsets from the OffsetStorageReader, 
> then we see the following in the logs.
> {code:java}
> [2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task threw 
> an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
> at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
> at 
> org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader 
> closed while attempting to read offsets. This is likely because the task was 
> been scheduled to stop but has taken longer than the graceful shutdown period 
> to do so.
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
> ... 14 more
> [2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task is 
> being killed and will not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> This is a bit misleading, because the task is already on its way to being 
> shut down and doesn't actually need manual intervention to be restarted. We 
> can see this later in the logs, where it throws another 
> unrecoverable exception.
> {code:java}
> [2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=connector-18} Task threw 
> an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> If we know a task is on its way to shutting down, we should not throw a 
> ConnectException but instead log a warning so that we don't log false 
> negatives.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9988) Log incorrectly reports task has failed when task takes too long to shutdown

2020-05-13 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9988:
-
Description: 
If the OffsetStorageReader is closed while the task is trying to shutdown, and 
the task is trying to access the offsets from the OffsetStorageReader, then we 
see the following in the logs.

{code:java}
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task threw an 
uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader 
closed while attempting to read offsets. This is likely because the task was 
been scheduled to stop but has taken longer than the graceful shutdown period 
to do so.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
... 14 more
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task is being 
killed and will not recover until manually restarted 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

This is a bit misleading, because the task is already on its way to being shut 
down and doesn't actually need manual intervention to be restarted. We can 
see this later in the logs, where it throws another unrecoverable 
exception.

{code:java}
[2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=connector-18} Task threw an 
uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

If we know a task is on its way to shutting down, we should not throw a 
ConnectException but instead log a warning so that we don't log false negatives.


  was:
If the OffsetStorageReader is closed while the task is trying to shutdown, and 
the task is trying to access the offsets from the OffsetStorageReader, then we 
see the following in the logs.

{code:java}
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task threw 
an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader 
closed while attempting to read offsets. This is likely because the task was 
been scheduled to stop but has taken longer than the graceful shutdown period 
to do so.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
... 14 more
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task is 
being killed and will not recover until manually restarted 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

This is a bit misleading, because the task is already on its way of being 
shutdown, and doesn't actually need manual intervention to be restarted. We can 
see that as later on in the logs we see that it throws another unrecoverable 
exception.

{code:java}
[2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=replicator-18} Task threw 
an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

If 

[jira] [Updated] (KAFKA-9988) Connect incorrectly reports task has failed when task takes too long to shutdown

2020-05-13 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9988:
-
Summary: Connect incorrectly reports task has failed when task takes too 
long to shutdown  (was: Log incorrectly reports task has failed when task takes 
too long to shutdown)

> Connect incorrectly reports task has failed when task takes too long to 
> shutdown
> 
>
> Key: KAFKA-9988
> URL: https://issues.apache.org/jira/browse/KAFKA-9988
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Sanjana Kaundinya
>Priority: Major
>
> If the OffsetStorageReader is closed while the task is trying to shutdown, 
> and the task is trying to access the offsets from the OffsetStorageReader, 
> then we see the following in the logs.
> {code:java}
> [2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task threw 
> an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
> at 
> org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
> at 
> org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
> at 
> org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader 
> closed while attempting to read offsets. This is likely because the task was 
> been scheduled to stop but has taken longer than the graceful shutdown period 
> to do so.
> at 
> org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
> ... 14 more
> [2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=connector-18} Task is 
> being killed and will not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> This is a bit misleading, because the task is already on its way to being 
> shut down and doesn't actually need manual intervention to be restarted. We 
> can see this later in the logs, where it throws another 
> unrecoverable exception.
> {code:java}
> [2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=connector-18} Task threw 
> an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> If we know a task is on its way to shutting down, we should not throw a 
> ConnectException but instead log a warning so that we don't log false 
> negatives.
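A small illustrative sketch of the suggested behaviour (not the actual WorkerSourceTask change): when the task is already stopping, a failed offset read is logged as a warning instead of being rethrown as a task failure.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical shutdown-aware handling of an offset read failure: only a
// genuine failure should surface as "task failed, restart manually".
public class ShutdownAwareOffsetReadSketch {
    private static final Logger log = LoggerFactory.getLogger(ShutdownAwareOffsetReadSketch.class);

    private volatile boolean stopping = false;

    public void stop() {
        stopping = true;
    }

    void handleOffsetReadFailure(RuntimeException e) {
        if (stopping) {
            // The offset reader was closed because we are shutting down; expected.
            log.warn("Failed to fetch offsets while the task is stopping; ignoring", e);
            return;
        }
        // Not shutting down: propagate so the task is marked as failed.
        throw e;
    }

    public static void main(String[] args) {
        ShutdownAwareOffsetReadSketch task = new ShutdownAwareOffsetReadSketch();
        task.stop();
        // With stopping=true this logs a warning instead of propagating the exception.
        task.handleOffsetReadFailure(new RuntimeException("Offset reader closed"));
    }
}
{code}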



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9988) Log incorrectly reports task has failed when task takes too long to shutdown

2020-05-13 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9988:


 Summary: Log incorrectly reports task has failed when task takes 
too long to shutdown
 Key: KAFKA-9988
 URL: https://issues.apache.org/jira/browse/KAFKA-9988
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Sanjana Kaundinya


If the OffsetStorageReader is closed while the task is trying to shutdown, and 
the task is trying to access the offsets from the OffsetStorageReader, then we 
see the following in the logs.

{code:java}
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task threw 
an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
org.apache.kafka.connect.errors.ConnectException: Failed to fetch offsets.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:114)
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offset(OffsetStorageReaderImpl.java:63)
at 
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:205)
at 
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kafka.connect.errors.ConnectException: Offset reader 
closed while attempting to read offsets. This is likely because the task was 
been scheduled to stop but has taken longer than the graceful shutdown period 
to do so.
at 
org.apache.kafka.connect.storage.OffsetStorageReaderImpl.offsets(OffsetStorageReaderImpl.java:103)
... 14 more
[2020-05-05 05:28:58,937] ERROR WorkerSourceTask{id=replicator-18} Task is 
being killed and will not recover until manually restarted 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

This is a bit misleading, because the task is already on its way to being shut 
down and doesn't actually need manual intervention to be restarted. We can 
see this later in the logs, where it throws another unrecoverable 
exception.

{code:java}
[2020-05-05 05:40:39,361] ERROR WorkerSourceTask{id=replicator-18} Task threw 
an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask)
{code}

If we know a task is on its way to shutting down, we should not throw a 
ConnectException but instead log a warning so that we don't log false negatives.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KAFKA-9625) Unable to Describe broker configurations that have been set via IncrementalAlterConfigs

2020-03-18 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya closed KAFKA-9625.


> Unable to Describe broker configurations that have been set via 
> IncrementalAlterConfigs
> ---
>
> Key: KAFKA-9625
> URL: https://issues.apache.org/jira/browse/KAFKA-9625
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Colin McCabe
>Assignee: Sanjana Kaundinya
>Priority: Critical
> Fix For: 2.6.0
>
>
> There seem to be at least two bugs in the broker configuration APIs and/or 
> logic:
> 1. Broker throttles are incorrectly marked as sensitive configurations.  This 
> includes leader.replication.throttled.rate, 
> follower.replication.throttled.rate, 
> replica.alter.log.dirs.io.max.bytes.per.second.  This means that their values 
> cannot be read back by DescribeConfigs after they are set.
> 2. When we clear the broker throttles via incrementalAlterConfigs, 
> DescribeConfigs continues returning the old throttles indefinitely.  In other 
> words, the clearing is not reflected in the Describe API.
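A sketch of how the symptom can be reproduced with the public Admin API; the broker id and throttle value below are placeholders, and with the bug the described value comes back hidden because the config is incorrectly treated as sensitive.

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

// Set a replication throttle with incrementalAlterConfigs and then try to read
// it back with describeConfigs.
public class ThrottleDescribeRepro {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1");
            AlterConfigOp setThrottle = new AlterConfigOp(
                new ConfigEntry("leader.replication.throttled.rate", "1048576"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(broker, Collections.singleton(setThrottle))).all().get();

            Config described = admin.describeConfigs(Collections.singleton(broker))
                .all().get().get(broker);
            ConfigEntry entry = described.get("leader.replication.throttled.rate");
            // With the bug, the value is not readable here even though it was just set.
            System.out.println("described throttle: " + (entry == null ? null : entry.value()));
        }
    }
}
{code}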



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (KAFKA-9047) AdminClient group operations may not respect backoff

2020-03-18 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya closed KAFKA-9047.


> AdminClient group operations may not respect backoff
> 
>
> Key: KAFKA-9047
> URL: https://issues.apache.org/jira/browse/KAFKA-9047
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Reporter: Jason Gustafson
>Assignee: Sanjana Kaundinya
>Priority: Major
> Fix For: 2.6.0
>
>
> The retry logic for consumer group operations in the admin client is 
> complicated by the need to find the coordinator. Instead of simple retry 
> loops which send the same request over and over, we can get more complex 
> retry loops like the following:
>  # Send FindCoordinator to B -> Coordinator is A
>  # Send DescribeGroup to A -> NOT_COORDINATOR
>  # Go back to 1
> Currently we construct a new Call object for each step in this loop, which 
> means we lose some of the retry bookkeeping, such as the last retry time and the 
> number of tries. This means it is possible to have tight retry loops which 
> bounce between steps 1 and 2 and do not respect the retry backoff. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9678) Introduce bounded exponential backoff in clients

2020-03-13 Thread Sanjana Kaundinya (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059128#comment-17059128
 ] 

Sanjana Kaundinya commented on KAFKA-9678:
--

I have a KIP out for this here: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients.
 Would appreciate any feedback on it.

> Introduce bounded exponential backoff in clients
> 
>
> Key: KAFKA-9678
> URL: https://issues.apache.org/jira/browse/KAFKA-9678
> Project: Kafka
>  Issue Type: Improvement
>  Components: admin, consumer, producer 
>Reporter: Guozhang Wang
>Assignee: Sanjana Kaundinya
>Priority: Major
>  Labels: needs-kip
>
> In all clients (consumer, producer, admin, and streams) we have retry 
> mechanisms with fixed backoff to handle transient connection issues with 
> brokers. However, with a small backoff (many default to 100 ms) we could send 
> tens of requests per second to the broker, and if the connection issue is 
> prolonged it means a huge overhead.
> We should consider introducing upper-bounded exponential backoff universally 
> in those clients to reduce the number of retry requests during the period of 
> connection partitioning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9621) AdminClient listOffsets operation does not respect retries and backoff

2020-02-27 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9621:
-
Description: 
Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
{{listOffsets}} operation doesn't respect the configured retries and backoff 
for a given call. 

For example, the code path could go like so:
 1) Make a metadata request and schedule subsequent list offsets calls
 2) Metadata error comes back with {{InvalidMetadataException}}

3) Go back to 1

The problem here is that the state is not preserved across calls. We lose the 
information about how many times the call has been tried and how far out we 
should schedule the call to try again. This could lead to a tight retry loop 
and put pressure on the brokers.

  was:
Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
listOffsets operation doesn't respect the configured retries and backoff for a 
given call. 

For example, the code path could go like so:
 1) Make a metadata request and schedule subsequent list offsets calls
 2) Metadata error comes back with `InvalidMetadataException`

3) Go back to 1

The problem here is that the state is not preserved across calls. We loose the 
information regarding how many tries the call has been tried and how far out we 
should schedule the call to try again. This could lead to a tight retry loop 
and put pressure on the brokers.


> AdminClient listOffsets operation does not respect retries and backoff
> --
>
> Key: KAFKA-9621
> URL: https://issues.apache.org/jira/browse/KAFKA-9621
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
> Fix For: 2.6.0
>
>
> Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
> {{listOffsets}} operation doesn't respect the configured retries and backoff 
> for a given call. 
> For example, the code path could go like so:
>  1) Make a metadata request and schedule subsequent list offsets calls
>  2) Metadata error comes back with {{InvalidMetadataException}}
> 3) Go back to 1
> The problem here is that the state is not preserved across calls. We lose 
> the information about how many times the call has been tried and how far 
> out we should schedule the call to try again. This could lead to a tight 
> retry loop and put pressure on the brokers.
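A toy sketch of the fix direction: keep one retry context that survives across the metadata call and the list-offsets calls it spawns, so the chain as a whole respects the retry count and backoff. Names are hypothetical, not KafkaAdminClient internals.

{code:java}
// Hypothetical retry state that is shared across the rescheduled calls instead
// of being recreated from scratch each time the loop goes back to step 1.
public class RetryContextSketch {
    int tries = 0;
    long nextAllowedTryMs = 0;

    void recordFailure(long nowMs, long retryBackoffMs) {
        tries++;
        nextAllowedTryMs = nowMs + retryBackoffMs;
    }

    boolean shouldRetry(long nowMs, int maxRetries) {
        return tries <= maxRetries && nowMs >= nextAllowedTryMs;
    }

    public static void main(String[] args) {
        RetryContextSketch context = new RetryContextSketch();
        long retryBackoffMs = 100;

        // Step 2 of the loop above: the metadata response comes back with an
        // InvalidMetadataException, so the failure is recorded once...
        context.recordFailure(0, retryBackoffMs);

        // ...and the *same* context is handed to the rescheduled metadata and
        // list-offsets calls, so an immediate retry is rejected:
        System.out.println("retry at t=0?   " + context.shouldRetry(0, 5));    // false
        System.out.println("retry at t=100? " + context.shouldRetry(100, 5));  // true
    }
}
{code}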



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9621) AdminClient listOffsets operation does not respect retries and backoff

2020-02-27 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9621:
-
Description: 
Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
listOffsets operation doesn't respect the configured retries and backoff for a 
given call. 

For example, the code path could go like so:
 1) Make a metadata request and schedule subsequent list offsets calls
 2) Metadata error comes back with `InvalidMetadataException`

3) Go back to 1

The problem here is that the state is not preserved across calls. We lose the 
information about how many times the call has been tried and how far out we 
should schedule the call to try again. This could lead to a tight retry loop 
and put pressure on the brokers.

  was:
Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
listOffsets operation doesn't respect the configured retries and backoff for a 
given call. 

For example, the code path could go like so:
1) Make a metadata request and schedule subsequent list offsets calls
2) Metadata error comes back with `InvalidMetadataException`

3) Go back to 1

The problem here is that the state is not preserved across calls. We loose the 
information regarding how many tries the call has been tried and how far out we 
should schedule the call to try again. This could lead to a tight retry loop 
and put pressure on the brokers.


> AdminClient listOffsets operation does not respect retries and backoff
> --
>
> Key: KAFKA-9621
> URL: https://issues.apache.org/jira/browse/KAFKA-9621
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
> Fix For: 2.6.0
>
>
> Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
> listOffsets operation doesn't respect the configured retries and backoff for 
> a given call. 
> For example, the code path could go like so:
>  1) Make a metadata request and schedule subsequent list offsets calls
>  2) Metadata error comes back with `InvalidMetadataException`
> 3) Go back to 1
> The problem here is that the state is not preserved across calls. We lose 
> the information about how many times the call has been tried and how far 
> out we should schedule the call to try again. This could lead to a tight 
> retry loop and put pressure on the brokers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9621) AdminClient listOffsets operation does not respect retries and backoff

2020-02-27 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9621:


 Summary: AdminClient listOffsets operation does not respect 
retries and backoff
 Key: KAFKA-9621
 URL: https://issues.apache.org/jira/browse/KAFKA-9621
 Project: Kafka
  Issue Type: Bug
  Components: admin
Reporter: Sanjana Kaundinya
Assignee: Cheng Tan
 Fix For: 2.6.0


Similar to https://issues.apache.org/jira/browse/KAFKA-9047, currently the 
listOffsets operation doesn't respect the configured retries and backoff for a 
given call. 

For example, the code path could go like so:
1) Make a metadata request and schedule subsequent list offsets calls
2) Metadata error comes back with `InvalidMetadataException`

3) Go back to 1

The problem here is that the state is not preserved across calls. We lose the 
information about how many times the call has been tried and how far out we 
should schedule the call to try again. This could lead to a tight retry loop 
and put pressure on the brokers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9047) AdminClient group operations may not respect backoff

2020-02-25 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9047:
-
Reviewer: Jason Gustafson

> AdminClient group operations may not respect backoff
> 
>
> Key: KAFKA-9047
> URL: https://issues.apache.org/jira/browse/KAFKA-9047
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Reporter: Jason Gustafson
>Assignee: Sanjana Kaundinya
>Priority: Major
>
> The retry logic for consumer group operations in the admin client is 
> complicated by the need to find the coordinator. Instead of simple retry 
> loops which send the same request over and over, we can get more complex 
> retry loops like the following:
>  # Send FindCoordinator to B -> Coordinator is A
>  # Send DescribeGroup to A -> NOT_COORDINATOR
>  # Go back to 1
> Currently we construct a new Call object for each step in this loop, which 
> means we lose some of the retry bookkeeping, such as the last retry time and the 
> number of tries. This means it is possible to have tight retry loops which 
> bounce between steps 1 and 2 and do not respect the retry backoff. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9580:
-
Description: Currently in the broker side logs, when an exception due to 
offset out of range is thrown, it's not informative on what offsets are exactly 
in range, and it makes it harder to debug when Consumer issues come up in this 
situation. It would be much more helpful if we logged a message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468] to make it easier to debug.  (was: Currently in the broker 
side logs, when an exception due to offset out of range is thrown, it's not 
informative on what offsets are exactly in range, and it makes it harder to 
debug when Consumer issues come up in this situation. It would be much more 
helpful if we logged a message similar to [Log::read|#L1468]] to make it easier 
to debug.)

> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in the broker-side logs, when an exception due to an offset being 
> out of range is thrown, the message does not say which offsets are actually 
> in range, which makes it harder to debug Consumer issues in this situation. 
> It would be much more helpful to log a message similar to 
> [Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468] to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9580:
-
Description: Currently in the broker-side logs, when an exception due to an 
offset being out of range is thrown, the message does not say which offsets 
are actually in range, which makes it harder to debug Consumer issues in this 
situation. It would be much more helpful to log a message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.  (was: Currently in 
[Fetcher::initializedCompletedFetch|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259],
 the message is not informative about which offsets are actually in range, 
which makes it harder to debug Consumer issues in this situation. It would be 
much more helpful to log a message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.)

> Log clearer error messages when there is an offset out of range
> ---
>
> Key: KAFKA-9580
> URL: https://issues.apache.org/jira/browse/KAFKA-9580
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Sanjana Kaundinya
>Assignee: Cheng Tan
>Priority: Major
>
> Currently in the broker-side logs, when an exception due to an offset being 
> out of range is thrown, the message does not say which offsets are actually 
> in range, which makes it harder to debug Consumer issues in this situation. 
> It would be much more helpful to log a message similar to 
> [Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468] to make it easier to debug.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9580) Log clearer error messages when there is an offset out of range

2020-02-20 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9580:


 Summary: Log clearer error messages when there is an offset out of 
range
 Key: KAFKA-9580
 URL: https://issues.apache.org/jira/browse/KAFKA-9580
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Sanjana Kaundinya


Currently in 
[Fetcher::initializedCompletedFetch|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java#L1259],
 the message is not informative about which offsets are actually in range, 
which makes it harder to debug Consumer issues in this situation. It would be 
much more helpful to log a message similar to 
[Log::read|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/Log.scala#L1468]
 to make it easier to debug.
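To illustrate the kind of message the ticket is asking for, here is a short 
Java-style sketch. The broker's Log class is Scala, so this is only an 
illustration of the information a clearer message could carry; the method and 
parameter names are assumptions.

{code:java}
// Illustrative only: include the requested offset and the valid range in the
// out-of-range message, similar to what Log::read reports on the broker.
static String offsetOutOfRangeMessage(String topicPartition,
                                      long requestedOffset,
                                      long logStartOffset,
                                      long logEndOffset) {
    return String.format(
        "Received request for offset %d for partition %s, but we only have " +
        "log segments in the range %d to %d.",
        requestedOffset, topicPartition, logStartOffset, logEndOffset);
}
{code}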



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9306) Kafka Consumer does not clean up all metrics after shutdown

2020-02-18 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya resolved KAFKA-9306.
--
Fix Version/s: 2.4.1
   Resolution: Fixed

> Kafka Consumer does not clean up all metrics after shutdown
> ---
>
> Key: KAFKA-9306
> URL: https://issues.apache.org/jira/browse/KAFKA-9306
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
> Fix For: 2.4.1
>
>
> The Kafka Consumer does not clean up all metrics after shutdown.  It seems 
> like this was a regression introduced in Kafka 2.4 when we added the 
> KafkaConsumerMetrics class.
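A minimal sketch of the cleanup pattern the description above is pointing at, 
assuming a hypothetical sensor name; the actual fix lives in the consumer's 
metrics code:

{code:java}
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;

// Sketch only: every sensor registered at construction time must also be
// removed on close, otherwise its metrics outlive the consumer.
final class ExampleConsumerMetrics implements AutoCloseable {
    private final Metrics metrics;
    private final Sensor pollSensor;

    ExampleConsumerMetrics(Metrics metrics) {
        this.metrics = metrics;
        this.pollSensor = metrics.sensor("example-poll-latency");
    }

    @Override
    public void close() {
        // Forgetting this removal is the kind of regression described above.
        metrics.removeSensor(pollSensor.name());
    }
}
{code}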



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9558) getListOffsetsCalls doesn't update node in case of leader change

2020-02-14 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9558:


 Summary: getListOffsetsCalls doesn't update node in case of leader 
change
 Key: KAFKA-9558
 URL: https://issues.apache.org/jira/browse/KAFKA-9558
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 2.5.0
Reporter: Sanjana Kaundinya
Assignee: Sanjana Kaundinya


As seen here:
[https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L3810]

In handling the response in the `listOffsets` call, if there are errors for a 
topic partition that require a metadata refresh, it simply passes the call 
object as `this`. This produces incorrect behavior if there was a leader 
change, because the call object never gets its leader node updated. The result 
is a tight loop of ListOffsets requests sent to the same stale leader, never 
returning offsets, even though the metadata was correctly updated.
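A minimal sketch of the behavior a fix would need, not the KafkaAdminClient 
API itself; the interface and class names here are assumptions:

{code:java}
// Illustrative only: a retried call should re-resolve its target broker from
// refreshed metadata instead of reusing the node captured when the call was
// first built.
interface PartitionLeaderCache {
    int currentLeaderFor(String topic, int partition); // refreshed with metadata
}

final class ListOffsetsCallSketch {
    private final String topic;
    private final int partition;
    private final PartitionLeaderCache leaders;

    ListOffsetsCallSketch(String topic, int partition, PartitionLeaderCache leaders) {
        this.topic = topic;
        this.partition = partition;
        this.leaders = leaders;
    }

    /** Re-read the leader on every (re)send rather than caching it once. */
    int targetBrokerId() {
        return leaders.currentLeaderFor(topic, partition);
    }
}
{code}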



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9047) AdminClient group operations may not respect backoff

2020-02-10 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya reassigned KAFKA-9047:


Assignee: Sanjana Kaundinya  (was: Vikas Singh)

> AdminClient group operations may not respect backoff
> 
>
> Key: KAFKA-9047
> URL: https://issues.apache.org/jira/browse/KAFKA-9047
> Project: Kafka
>  Issue Type: Bug
>  Components: admin
>Reporter: Jason Gustafson
>Assignee: Sanjana Kaundinya
>Priority: Major
>
> The retry logic for consumer group operations in the admin client is 
> complicated by the need to find the coordinator. Instead of simple retry 
> loops which send the same request over and over, we can get more complex 
> retry loops like the following:
>  # Send FindCoordinator to B -> Coordinator is A
>  # Send DescribeGroup to A -> NOT_COORDINATOR
>  # Go back to 1
> Currently we construct a new Call object for each step in this loop, which 
> means we lose some of the retry bookkeeping, such as the last retry time and 
> the number of tries. This means it is possible to have tight retry loops 
> which bounce between steps 1 and 2 and do not respect the retry backoff.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9520) Deprecate ZooKeeper access for kafka-reassign-partitions.sh

2020-02-06 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9520:


 Summary: Deprecate ZooKeeper access for 
kafka-reassign-partitions.sh
 Key: KAFKA-9520
 URL: https://issues.apache.org/jira/browse/KAFKA-9520
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 2.5.0
Reporter: Sanjana Kaundinya
 Fix For: 2.5.0


As part of KIP-555, ZooKeeper access must be deprecated for 
kafka-reassign-partitions.sh.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9519) Deprecate ZooKeeper access for kafka-configs.sh

2020-02-06 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9519:


 Summary: Deprecate ZooKeeper access for kafka-configs.sh   
 Key: KAFKA-9519
 URL: https://issues.apache.org/jira/browse/KAFKA-9519
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 2.5.0
Reporter: Sanjana Kaundinya
Assignee: Sanjana Kaundinya
 Fix For: 2.5.0


As part of KIP-555, ZooKeeper access must be deprecated for kafka-configs.sh.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9509) Fix flaky test MirrorConnectorsIntegrationTest.testReplication

2020-02-05 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9509:
-
Fix Version/s: 2.5.0

> Fix flaky test MirrorConnectorsIntegrationTest.testReplication
> --
>
> Key: KAFKA-9509
> URL: https://issues.apache.org/jira/browse/KAFKA-9509
> Project: Kafka
>  Issue Type: Test
>  Components: mirrormaker
>Affects Versions: 2.4.0, 2.4.1, 2.5.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Major
> Fix For: 2.5.0
>
>
> The test 
> org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication
>  is a flaky test for MirrorMaker 2.0. Its flakiness lies in the timing of 
> when the connectors and tasks are started up. The fix would be to wait, after 
> the connectors are started up, until the REST endpoint returns a positive 
> number of tasks, so that we can be confident it is safe to start testing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9509) Fix flaky test MirrorConnectorsIntegrationTest.testReplication

2020-02-05 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya updated KAFKA-9509:
-
Description: The test 
org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication 
is a flaky test for MirrorMaker 2.0. Its flakiness lies in the timing of when 
the connectors and tasks are started up. The fix would be to wait, after the 
connectors are started up, until the REST endpoint returns a positive number 
of tasks, so that we can be confident it is safe to start testing.  (was: The 
test `MirrorConnectorsIntegrationTest.testReplication` is a flaky test for 
MirrorMaker 2.0. Its flakiness lies in the timing of when the connectors and 
tasks are started up. The fix would be to wait, after the connectors are 
started up, until the REST endpoint returns a positive number of tasks, so 
that we can be confident it is safe to start testing.)

> Fix flaky test MirrorConnectorsIntegrationTest.testReplication
> --
>
> Key: KAFKA-9509
> URL: https://issues.apache.org/jira/browse/KAFKA-9509
> Project: Kafka
>  Issue Type: Test
>  Components: mirrormaker
>Affects Versions: 2.4.0, 2.4.1, 2.5.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Major
>
> The test 
> org.apache.kafka.connect.mirror.MirrorConnectorsIntegrationTest.testReplication
>  is a flaky test for MirrorMaker 2.0. Its flakiness lies in the timing of 
> when the connectors and tasks are started up. The fix would be to wait, after 
> the connectors are started up, until the REST endpoint returns a positive 
> number of tasks, so that we can be confident it is safe to start testing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9509) Fix flaky test MirrorConnectorsIntegrationTest.testReplication

2020-02-05 Thread Sanjana Kaundinya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Kaundinya reassigned KAFKA-9509:


Assignee: Sanjana Kaundinya

> Fix flaky test MirrorConnectorsIntegrationTest.testReplication
> --
>
> Key: KAFKA-9509
> URL: https://issues.apache.org/jira/browse/KAFKA-9509
> Project: Kafka
>  Issue Type: Test
>  Components: mirrormaker
>Affects Versions: 2.4.0, 2.4.1, 2.5.0
>Reporter: Sanjana Kaundinya
>Assignee: Sanjana Kaundinya
>Priority: Major
>
> The test `MirrorConnectorsIntegrationTest.testReplication` is a flaky test 
> for MirrorMaker 2.0. Its flakiness lies in the timing of when the connectors 
> and tasks are started up. The fix would be to wait, after the connectors are 
> started up, until the REST endpoint returns a positive number of tasks, so 
> that we can be confident it is safe to start testing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9509) Fix flaky test MirrorConnectorsIntegrationTest.testReplication

2020-02-05 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-9509:


 Summary: Fix flaky test 
MirrorConnectorsIntegrationTest.testReplication
 Key: KAFKA-9509
 URL: https://issues.apache.org/jira/browse/KAFKA-9509
 Project: Kafka
  Issue Type: Test
  Components: mirrormaker
Affects Versions: 2.4.0, 2.4.1, 2.5.0
Reporter: Sanjana Kaundinya


The test `MirrorConnectorsIntegrationTest.testReplication` is a flaky test for 
MirrorMaker 2.0. Its flakiness lies in the timing of when the connectors and 
tasks are started up. The fix would be to wait, after the connectors are 
started up, until the REST endpoint returns a positive number of tasks, so 
that we can be confident it is safe to start testing.
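A minimal sketch of that waiting pattern, with a hypothetical 
runningTaskCount supplier standing in for the Connect REST status query:

{code:java}
import java.time.Duration;
import java.util.function.IntSupplier;

// Sketch only: after starting a connector, poll its task count (for example
// via the Connect REST status endpoint) until it is positive, rather than
// assuming the tasks are already running.
final class ConnectorStartupWaiter {
    static void awaitTasksRunning(IntSupplier runningTaskCount,
                                  Duration timeout) throws InterruptedException {
        long deadlineMs = System.currentTimeMillis() + timeout.toMillis();
        while (runningTaskCount.getAsInt() <= 0) {
            if (System.currentTimeMillis() > deadlineMs) {
                throw new AssertionError("Connector tasks did not start in time");
            }
            Thread.sleep(500);
        }
    }
}
{code}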



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-8945) Incorrect null check in the constructor for ConnectorHealth and AbstractState

2019-09-25 Thread Sanjana Kaundinya (Jira)
Sanjana Kaundinya created KAFKA-8945:


 Summary: Incorrect null check in the constructor for 
ConnectorHealth and AbstractState
 Key: KAFKA-8945
 URL: https://issues.apache.org/jira/browse/KAFKA-8945
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.3.0
Reporter: Sanjana Kaundinya


This bug relates to KIP-285: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-285%3A+Connect+Rest+Extension+Plugin]

In the constructors of ConnectorHealth.java and AbstractState.java, the null 
check on the parameters is inverted. The current code only allows the class to 
be instantiated if the parameters passed in are null, whereas the expected 
behavior is the opposite: the class should only be instantiated if the 
parameters passed in are not null. While the fix itself is trivial, it would 
also be good to add tests covering the classes related to the REST extension.
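A short sketch of the bug pattern and its fix; the class and parameter names 
here are assumptions, not the exact Connect code:

{code:java}
import java.util.Objects;

// Illustrative reconstruction of an inverted null check.
final class AbstractStateExample {
    private final String state;

    // Buggy form (shown commented out): the guard is inverted, so the
    // constructor only succeeds when the argument is null.
    //     if (state != null) {
    //         throw new IllegalArgumentException("state can't be null");
    //     }

    // Corrected form: reject null, accept everything else.
    AbstractStateExample(String state) {
        this.state = Objects.requireNonNull(state, "state can't be null");
    }
}
{code}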



--
This message was sent by Atlassian Jira
(v8.3.4#803005)