[jira] [Commented] (KAFKA-13326) Add multi-cluster support to Kafka Streams

2021-09-27 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17421107#comment-17421107
 ] 

Boyang Chen commented on KAFKA-13326:
-

[~wangguangyuan] Thanks for your interest. There are some known challenges to 
supporting cross-cluster processing:
 # Exactly-once semantics will break because Kafka cannot perform 
cross-cluster transactions.
 # Topic tracking will need to be augmented with multi-cluster support in 
mind, which is a significant amount of work.
 # Failure scenarios become more complicated. Today a failure is confined to 
one broker cluster; with multiple clusters, any topic-to-cluster mapping 
could fail and leave the application in a state that is hard to recover.

Given the numerous replicators available on the market, supporting this 
natively in Kafka Streams is not a top priority. [~guozhang] [~mjsax] could 
give more insights here.

> Add multi-cluster support to Kafka Streams
> --
>
> Key: KAFKA-13326
> URL: https://issues.apache.org/jira/browse/KAFKA-13326
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Guangyuan Wang
>Priority: Major
>  Labels: needs-kip
>
> Dear Kafka Team,
> According to the documentation 
> (https://kafka.apache.org/28/documentation/streams/developer-guide/config-streams.html#bootstrap-servers), 
> Kafka Streams applications can only communicate with a single Kafka cluster 
> specified by this config value. Future versions of Kafka Streams will support 
> connecting to different Kafka clusters for reading input streams and writing 
> output streams.
> In which version of Kafka Streams will this feature be added? It would be a 
> very useful feature.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2021-06-11 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361932#comment-17361932
 ] 

Boyang Chen commented on KAFKA-10733:
-

Hey [~kkonstantine], I don't think anyone has cycles to push this into 3.0, so 
we should consider moving it to a future release. cc [~guozhang] [~mjsax]

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer, streams
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
> Fix For: 3.0.0
>
>
> In general, KafkaProducer can throw both fatal and non-fatal errors as 
> KafkaException, which makes it hard for callers to decide how to handle a 
> caught exception. Furthermore, as of 2.7 not every fatal exception (checked) 
> is declared on the method signatures yet.
> We need a general long-term strategy here: either declare all non-fatal 
> exceptions as wrapped KafkaException while extracting the fatal ones, or 
> just add a flag to KafkaException indicating whether it is fatal, which 
> would maintain binary compatibility.
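
The second option in the description (a fatal flag on the exception) could look roughly like the sketch below. `FlaggedKafkaException`, `isFatal()` and `handle()` are hypothetical names invented for illustration; they are not part of the Kafka clients library, and this is not necessarily the design the KIP settled on.

```java
final class FatalFlagSketch {
    /** Hypothetical stand-in for KafkaException that carries a fatal flag. */
    static class FlaggedKafkaException extends RuntimeException {
        private final boolean fatal;

        FlaggedKafkaException(String message, boolean fatal) {
            super(message);
            this.fatal = fatal;
        }

        boolean isFatal() {
            return fatal;
        }
    }

    /**
     * Decides how a caller reacts to an exception from a transactional call.
     * Anything not explicitly marked non-fatal is treated as fatal, which is
     * a conservative assumption made for this sketch.
     */
    static String handle(RuntimeException e) {
        if (e instanceof FlaggedKafkaException && !((FlaggedKafkaException) e).isFatal()) {
            return "abort-and-retry";
        }
        return "close-producer";
    }

    public static void main(String[] args) {
        System.out.println(handle(new FlaggedKafkaException("producer fenced", true)));
        System.out.println(handle(new FlaggedKafkaException("transient timeout", false)));
    }
}
```

With a flag like this, callers would need a single catch block and one branch, instead of an ever-growing list of exception types.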





[jira] [Commented] (KAFKA-10594) Enhance Raft exception handling

2021-05-13 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17344088#comment-17344088
 ] 

Boyang Chen commented on KAFKA-10594:
-

That's correct, thanks! [~jagsancio]

> Enhance Raft exception handling
> ---
>
> Key: KAFKA-10594
> URL: https://issues.apache.org/jira/browse/KAFKA-10594
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> The current exception handling in the Raft implementation is superficial. 
> For example, we don't treat file system exceptions and request handling 
> exceptions differently. We need to decide which kinds of exceptions should 
> be fatal, which should be returned to the client, and which can be retried.





[jira] [Resolved] (KAFKA-7728) Add JoinReason to the join group request for better rebalance handling

2021-04-13 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-7728.

Resolution: Won't Fix

This is no longer a critical fix for static membership, as a leader rejoin 
with no member info won't trigger a rebalance in the original implementation.

> Add JoinReason to the join group request for better rebalance handling
> --
>
> Key: KAFKA-7728
> URL: https://issues.apache.org/jira/browse/KAFKA-7728
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Mayuresh Gharat
>Priority: Major
>  Labels: consumer, mirror-maker, needs-kip
>
> Recently [~mgharat] and I discussed the current rebalance logic for 
> handling the leader's join group request. So far we blindly trigger a 
> rebalance when the leader rejoins. The caveat is that KIP-345 does not cover 
> this effort, and if a consumer group uses a non-sticky assignment strategy 
> such as round robin, the redundant rebalance could still shuffle topic 
> partitions among consumers (for example, in a mirror maker application).
> I checked on broker side and here is what we currently do:
>  
> {code:java}
> if (group.isLeader(memberId) || !member.matches(protocols))
> // force a rebalance if a member has changed metadata or if the leader sends
> // JoinGroup. The latter allows the leader to trigger rebalances for changes
> // affecting assignment which do not affect the member metadata (such as
> // topic metadata changes for the consumer)
> {code}
> Based on the broker logic, we only need to trigger rebalance for leader 
> rejoin when the topic metadata change has happened. I also looked up the 
> ConsumerCoordinator code on client side, and found out the metadata 
> monitoring logic here:
> {code:java}
> public boolean rejoinNeededOrPending() {
>   ...
>   // we need to rejoin if we performed the assignment and metadata has changed
>   if (assignmentSnapshot != null &&
>       !assignmentSnapshot.equals(metadataSnapshot))
>     return true;
> }
> {code}
> Instead of just returning true, we could introduce a new enum field called 
> JoinReason in the join group request to indicate the purpose of the rejoin, 
> so that the broker knows whether the leader is rejoining due to a topic 
> metadata change. If it is, we trigger a rebalance; if not, we shouldn't. 
> This avoids a full rebalance when the leader is merely going through a 
> rolling bounce.
>  
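
For readers skimming the archive, the proposal in the description amounts to roughly the following broker-side decision. This is only a sketch: the ticket was resolved as Won't Fix, so no `JoinReason` field exists in the JoinGroup protocol, and the enum constants, the `UNKNOWN` fallback and `shouldRebalance()` are all hypothetical.

```java
final class JoinReasonSketch {
    /** Hypothetical reasons a member could attach to a JoinGroup request. */
    enum JoinReason { TOPIC_METADATA_CHANGED, MEMBER_RESTART, UNKNOWN }

    /**
     * A member whose own metadata changed always forces a rebalance. A
     * rejoining leader forces one only when it reports a topic metadata
     * change. UNKNOWN (assumed here to be what an older client would send)
     * keeps the current "always rebalance on leader rejoin" behavior.
     */
    static boolean shouldRebalance(boolean isLeader,
                                   boolean memberMetadataChanged,
                                   JoinReason reason) {
        if (memberMetadataChanged) {
            return true;
        }
        if (!isLeader) {
            return false;
        }
        return reason == JoinReason.TOPIC_METADATA_CHANGED || reason == JoinReason.UNKNOWN;
    }

    public static void main(String[] args) {
        // Leader rejoining after a rolling bounce: no rebalance needed.
        System.out.println(shouldRebalance(true, false, JoinReason.MEMBER_RESTART));
        // Leader rejoining because topic metadata changed: rebalance.
        System.out.println(shouldRebalance(true, false, JoinReason.TOPIC_METADATA_CHANGED));
    }
}
```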





[jira] [Updated] (KAFKA-9910) Implement new transaction timed out error

2021-03-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9910:
---
Fix Version/s: (was: 2.8.0)
   3.0.0

> Implement new transaction timed out error
> -
>
> Key: KAFKA-9910
> URL: https://issues.apache.org/jira/browse/KAFKA-9910
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: HaiyuanZhao
>Priority: Major
> Fix For: 3.0.0
>
>






[jira] [Updated] (KAFKA-9705) Zookeeper mutation protocols should be redirected to Controller only

2021-03-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9705:
---
Fix Version/s: (was: 2.8.0)
   3.0.0

> Zookeeper mutation protocols should be redirected to Controller only
> 
>
> Key: KAFKA-9705
> URL: https://issues.apache.org/jira/browse/KAFKA-9705
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 3.0.0
>
>
> In the bridge release, we need to restrict direct ZK access to the 
> controller only. This means the existing AlterConfig path should be migrated.





[jira] [Commented] (KAFKA-12490) Forwarded requests should use timeout from request when possible

2021-03-21 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305880#comment-17305880
 ] 

Boyang Chen commented on KAFKA-12490:
-

From what I understand, the Create/DeleteRequest timeout is embedded in the 
message data, which will be wrapped and forwarded to the active controller. 
The corresponding timeout values are used for the ZK admin operation, not 
as the request timeout. Are you suggesting that we should guard against the 
case where the forwarding request timeout is less than the timeout given in 
the original request, which would cause the forwarding request to time out 
prematurely? [~hachikuji]

> Forwarded requests should use timeout from request when possible
> 
>
> Key: KAFKA-12490
> URL: https://issues.apache.org/jira/browse/KAFKA-12490
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>  Labels: kip-500
>
> Currently, forwarded requests time out according to the broker configuration 
> `request.timeout.ms`. However, some requests, such as CreateTopics and 
> DeleteTopics, have their own timeouts that are part of the request object. We 
> should try to use these timeouts when possible.





[jira] [Created] (KAFKA-12499) Adjust transaction timeout according to commit interval on Streams EOS

2021-03-18 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12499:
---

 Summary: Adjust transaction timeout according to commit interval 
on Streams EOS
 Key: KAFKA-12499
 URL: https://issues.apache.org/jira/browse/KAFKA-12499
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Boyang Chen
Assignee: Boyang Chen
 Fix For: 3.0.0


The transaction timeout is set to 1 minute by default on the Producer today, 
while the commit interval can be set to a very large value. In that case the 
stream always hits the transaction timeout and falls into a rebalance. We 
should increase the transaction timeout correspondingly when the commit 
interval is large.

On the other hand, the broker can enforce a limit on the maximum transaction 
timeout. If we scale the client transaction timeout beyond that limit, the 
stream will fail with INVALID_TRANSACTION_TIMEOUT. To alleviate this, users 
can define their own transaction timeout to stay under the limit, so we 
should still respect whatever the user configures as an override.

The new rule for configuring the transaction timeout should look like:
1. If the transaction timeout is set in the Streams config, use it.
2. If not, transaction_timeout = max(default_transaction_timeout, 10 * 
commit_interval).

Additionally, if INVALID_TRANSACTION_TIMEOUT is thrown in Streams when calling 
initTransactions(), we should wrap the exception to inform users that their 
commit interval setting may be too high and should be adjusted.
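
The two-step rule above can be sketched as plain Java. The class and method names are hypothetical and chosen for illustration only; the 60,000 ms default is the 1-minute producer default mentioned in the ticket, and the overflow guard is an addition of this sketch, not something the ticket specifies.

```java
final class TransactionTimeoutRule {
    // Producer default for the transaction timeout (1 minute), per the ticket.
    static final long DEFAULT_TRANSACTION_TIMEOUT_MS = 60_000L;

    /**
     * @param userTimeoutMs    explicit override from the Streams config, or null if unset
     * @param commitIntervalMs configured commit interval in milliseconds
     */
    static long resolve(Long userTimeoutMs, long commitIntervalMs) {
        if (userTimeoutMs != null) {
            return userTimeoutMs; // rule 1: always respect the user's override
        }
        // Saturate instead of overflowing when the commit interval is huge.
        long scaled = commitIntervalMs > Long.MAX_VALUE / 10
                ? Long.MAX_VALUE
                : 10L * commitIntervalMs;
        return Math.max(DEFAULT_TRANSACTION_TIMEOUT_MS, scaled); // rule 2
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, 100L));       // small interval: default wins
        System.out.println(resolve(null, 30_000L));    // large interval: 10x the interval
        System.out.println(resolve(45_000L, 30_000L)); // explicit override wins
    }
}
```

Note that a value produced by rule 2 can still exceed the broker-side maximum, which is why the ticket also asks for a clearer error when INVALID_TRANSACTION_TIMEOUT is returned.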





[jira] [Updated] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-12381:

Fix Version/s: 2.8.0

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>  Labels: kip-500
> Fix For: 2.8.0
>
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Resolved] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-12381.
-
Resolution: Fixed

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>  Labels: kip-500
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Commented] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-01 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293286#comment-17293286
 ] 

Boyang Chen commented on KAFKA-12381:
-

I checked the 2.7 code, and we only return INVALID_REPLICATION_FACTOR for 
internal topics:
{code:java}
if (isInternal(topic)) {
  val topicMetadata = createInternalTopic(topic)
  if (topicMetadata.errorCode == Errors.COORDINATOR_NOT_AVAILABLE.code)
    metadataResponseTopic(Errors.INVALID_REPLICATION_FACTOR, topic, true,
      util.Collections.emptyList())
  else
    topicMetadata
} else if (allowAutoTopicCreation && config.autoCreateTopicsEnable) {
  createTopic(topic, config.numPartitions, config.defaultReplicationFactor)
} else {
  metadataResponseTopic(Errors.UNKNOWN_TOPIC_OR_PARTITION, topic, false,
    util.Collections.emptyList())
}
{code}
This distinction seems to have been lost in the new auto topic creation 
module. I will try to fix it there.

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>  Labels: kip-500
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Commented] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-01 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293161#comment-17293161
 ] 

Boyang Chen commented on KAFKA-12381:
-

[~cmccabe] Could you provide more specific details here, such as which test 
case is failing and where you fixed the log grep?

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Updated] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-01 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-12381:

Component/s: core

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Updated] (KAFKA-12381) Incompatible change in verifiable_producer.log in 2.8

2021-03-01 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-12381:

Labels: kip-500  (was: )

> Incompatible change in verifiable_producer.log in 2.8
> -
>
> Key: KAFKA-12381
> URL: https://issues.apache.org/jira/browse/KAFKA-12381
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.8.0
>Reporter: Colin McCabe
>Assignee: Boyang Chen
>Priority: Blocker
>  Labels: kip-500
>
> In test_verifiable_producer.py, we used to see this error message in 
> verifiable_producer.log when a topic couldn't be created:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=LEADER_NOT_AVAILABLE} 
> (org.apache.kafka.clients.NetworkClient)
> The test does a grep LEADER_NOT_AVAILABLE on the log in this case, and it 
> used to pass.
> Now we are instead seeing this in the log file:
> WARN [Producer clientId=producer-1] Error while fetching metadata with 
> correlation id 1 : {test_topic=INVALID_REPLICATION_FACTOR} 
> (org.apache.kafka.clients.NetworkClient)
> And of course now the test fails.
> The INVALID_REPLICATION_FACTOR is coming from the new auto topic creation 
> manager.
> It is a simple matter to make the test pass -- I have confirmed that it 
> passes if we grep for INVALID_REPLICATION_FACTOR in the log file instead of 
> LEADER_NOT_AVAILABLE.
> I think we just need to decide if this change in behavior is acceptable or 
> not.





[jira] [Commented] (KAFKA-12169) Consumer can not know partitions change when client leader restart with static membership protocol

2021-02-19 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287357#comment-17287357
 ] 

Boyang Chen commented on KAFKA-12169:
-

[~guozhang] One question regarding T1, when we receive the assignment, do we 
validate its total partitions against our own metadata? If so, the leader 
should attempt a rejoin IMHO.

> Consumer can not know partitions change when client leader restart with static 
> membership protocol
> -
>
> Key: KAFKA-12169
> URL: https://issues.apache.org/jira/browse/KAFKA-12169
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 2.5.1, 2.6.1
>Reporter: zou shengfu
>Priority: Major
>  Labels: bug
>
> Background: 
>  Kafka consumer services run with static membership and the cooperative 
> rebalance protocol on Kubernetes, and the services often restart for 
> operational reasons. When we increased the topic's partitions from 1000 to 
> 2000 and the client leader restarted with an unknown member id at the same 
> time, we found that the consumers did not trigger a rebalance and still 
> consumed only 1000 partitions.
>  





[jira] [Updated] (KAFKA-10345) Add auto reloading for trust/key store paths

2021-02-08 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Fix Version/s: (was: 2.8.0)
   3.0.0

> Add auto reloading for trust/key store paths
> 
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 3.0.0
>
>
> With forwarding enabled, per-broker alter-config no longer goes to the 
> target broker, so we need a mechanism to propagate in-place file updates 
> through file watching and time-based reloading.





[jira] [Created] (KAFKA-12307) Wrap non-fatal exception as CommitFailedException when thrown from commitTxn

2021-02-06 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12307:
---

 Summary: Wrap non-fatal exception as CommitFailedException when 
thrown from commitTxn
 Key: KAFKA-12307
 URL: https://issues.apache.org/jira/browse/KAFKA-12307
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


As stated in the KIP, we need to ensure that non-fatal exceptions get wrapped 
inside CommitFailedException so they are properly handled by the new template.





[jira] [Created] (KAFKA-12304) Improve topic validation in auto topic creation

2021-02-05 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12304:
---

 Summary: Improve topic validation in auto topic creation
 Key: KAFKA-12304
 URL: https://issues.apache.org/jira/browse/KAFKA-12304
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


The topic validation path should have a higher priority than other follow-up 
processes. Basically we should move it right after the topic authorization is 
done in metadata request handling.





[jira] [Updated] (KAFKA-12294) Consider using the forwarding mechanism for metadata auto topic creation

2021-02-04 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-12294:

Component/s: core

> Consider using the forwarding mechanism for metadata auto topic creation
> 
>
> Key: KAFKA-12294
> URL: https://issues.apache.org/jira/browse/KAFKA-12294
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Boyang Chen
>Priority: Major
>
> Once [https://github.com/apache/kafka/pull/9579] is merged, there is a way to 
> improve the topic creation auditing by forwarding the CreateTopicsRequest 
> inside Envelope for the given client. Details in 
> [here|https://github.com/apache/kafka/pull/9579#issuecomment-772283780]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-12294) Consider using the forwarding mechanism for metadata auto topic creation

2021-02-04 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-12294:

Parent: (was: KAFKA-9705)
Issue Type: Improvement  (was: Sub-task)

> Consider using the forwarding mechanism for metadata auto topic creation
> 
>
> Key: KAFKA-12294
> URL: https://issues.apache.org/jira/browse/KAFKA-12294
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Boyang Chen
>Priority: Major
>
> Once [https://github.com/apache/kafka/pull/9579] is merged, there is a way to 
> improve the topic creation auditing by forwarding the CreateTopicsRequest 
> inside Envelope for the given client. Details in 
> [here|https://github.com/apache/kafka/pull/9579#issuecomment-772283780]





[jira] [Created] (KAFKA-12294) Consider using the forwarding mechanism for metadata auto topic creation

2021-02-04 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12294:
---

 Summary: Consider using the forwarding mechanism for metadata auto 
topic creation
 Key: KAFKA-12294
 URL: https://issues.apache.org/jira/browse/KAFKA-12294
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


Once [https://github.com/apache/kafka/pull/9579] is merged, there is a way to 
improve the topic creation auditing by forwarding the CreateTopicsRequest 
inside Envelope for the given client. Details in 
[here|https://github.com/apache/kafka/pull/9579#issuecomment-772283780]





[jira] [Updated] (KAFKA-10348) Consider consolidation of broker to controller communication

2021-02-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10348:

Description: 
Right now, forwarding and AlterISR use separate channels so that they do not 
block each other. However, the controller queue is single-threaded and does 
not prioritize among request types, so separating connections alone may not 
unblock AlterISR when a forwarded request takes an indefinite amount of time. 
In the long term, we want to see whether the two can be consolidated, with a 
systematic logical change on the controller side to ensure AlterISR always 
has higher priority.

 

Now that auto topic creation is done, the need for this consolidation has 
grown.

  was:Right now, forwarding and AlterISR use separate channels so that they do 
not block each other. However, the controller queue is single-threaded and 
does not prioritize among request types, so separating connections alone may 
not unblock AlterISR when a forwarded request takes an indefinite amount of 
time. In the long term, we want to see whether the two can be consolidated, 
with a systematic logical change on the controller side to ensure AlterISR 
always has higher priority.


> Consider consolidation of broker to controller communication
> 
>
> Key: KAFKA-10348
> URL: https://issues.apache.org/jira/browse/KAFKA-10348
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Right now, forwarding and AlterISR use separate channels so that they do not 
> block each other. However, the controller queue is single-threaded and does 
> not prioritize among request types, so separating connections alone may not 
> unblock AlterISR when a forwarded request takes an indefinite amount of 
> time. In the long term, we want to see whether the two can be consolidated, 
> with a systematic logical change on the controller side to ensure AlterISR 
> always has higher priority.
>  
> Now that auto topic creation is done, the need for this consolidation has 
> grown.
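
One way to picture the prioritization the description asks for is a single queue ordered by request kind, with arrival order as the tie-breaker. The sketch below is only an illustration built on `java.util.concurrent.PriorityBlockingQueue`; the class, the `Kind` enum and the event names are hypothetical and do not reflect the controller's actual event queue.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class PrioritizedControllerQueueSketch {
    // Declaration order doubles as priority: earlier constants are served first.
    enum Kind { ALTER_ISR, FORWARDED }

    static final class Event {
        final Kind kind;
        final String name;
        final long seq; // arrival order, keeps FIFO within one priority level

        Event(Kind kind, String name, long seq) {
            this.kind = kind;
            this.name = name;
            this.seq = seq;
        }
    }

    private final AtomicLong nextSeq = new AtomicLong();
    private final PriorityBlockingQueue<Event> queue = new PriorityBlockingQueue<>(
            16,
            Comparator.comparing((Event e) -> e.kind).thenComparingLong(e -> e.seq));

    void submit(Kind kind, String name) {
        queue.put(new Event(kind, name, nextSeq.getAndIncrement()));
    }

    /** Returns the next event for the single controller thread, or null if idle. */
    Event poll() {
        return queue.poll();
    }

    public static void main(String[] args) {
        PrioritizedControllerQueueSketch q = new PrioritizedControllerQueueSketch();
        q.submit(Kind.FORWARDED, "createTopics");
        q.submit(Kind.ALTER_ISR, "isr-1");
        q.submit(Kind.FORWARDED, "alterConfig");
        q.submit(Kind.ALTER_ISR, "isr-2");
        for (Event e = q.poll(); e != null; e = q.poll()) {
            System.out.println(e.kind + " " + e.name);
        }
    }
}
```

Reordering the queue only helps events that are still waiting; a forwarded request that is already being processed would continue to occupy the single thread, so this covers just part of the concern raised in the ticket.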





[jira] [Assigned] (KAFKA-12260) PartitionsFor should not return null value

2021-01-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-12260:
---

Assignee: Boyang Chen

> PartitionsFor should not return null value
> --
>
> Key: KAFKA-12260
> URL: https://issues.apache.org/jira/browse/KAFKA-12260
> Project: Kafka
>  Issue Type: Improvement
>  Components: consumer
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Minor
>
> consumer.partitionsFor() can return null when the topic is not found. 
> This is not properly documented and is error-prone given that the return 
> type is a list. We should fix the logic internally so that partitionsFor 
> never returns null.





[jira] [Created] (KAFKA-12260) PartitionsFor should not return null value

2021-01-30 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12260:
---

 Summary: PartitionsFor should not return null value
 Key: KAFKA-12260
 URL: https://issues.apache.org/jira/browse/KAFKA-12260
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Boyang Chen


consumer.partitionsFor() can return null when the topic is not found. This is 
not properly documented and is error-prone given that the return type is a 
list. We should fix the logic internally so that partitionsFor never returns 
null.
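
Until such a fix lands, callers can guard against the null themselves. The helper below is only a caller-side sketch, not the internal fix proposed here; `PartitionLookup.orEmpty` is a hypothetical name, and it is generic so that it compiles without the Kafka client on the classpath.

```java
import java.util.Collections;
import java.util.List;

final class PartitionLookup {
    // Normalizes a possibly-null lookup result into an immutable empty list.
    static <T> List<T> orEmpty(List<T> partitions) {
        return partitions == null ? Collections.<T>emptyList() : partitions;
    }
}
```

A caller would wrap the lookup, for example `PartitionLookup.orEmpty(consumer.partitionsFor(topic))`, and could then iterate without a null check.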





[jira] [Commented] (KAFKA-9689) Automatic broker version detection to initialize stream client

2021-01-27 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273379#comment-17273379
 ] 

Boyang Chen commented on KAFKA-9689:


I agree with A) since it is an internal struct.

> Automatic broker version detection to initialize stream client
> --
>
> Key: KAFKA-9689
> URL: https://issues.apache.org/jira/browse/KAFKA-9689
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Boyang Chen
>Assignee: feyman
>Priority: Major
>
> Eventually we shall deprecate the flag that suppresses the EOS thread 
> producer feature; instead, we will detect the broker version to decide which 
> semantics to use.





[jira] [Commented] (KAFKA-12169) Consumer can not know paritions chage when client leader restart with static membership protocol

2021-01-27 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273328#comment-17273328
 ] 

Boyang Chen commented on KAFKA-12169:
-

In general, the leader should be able to detect a discrepancy between its 
remembered topic metadata and the broker-side metadata. I don't think we have 
any test case that covers a topic partition change and a leader rejoin at the 
same time, so the reported behavior is possible and needs some verification. 

> Consumer can not know paritions chage when client leader restart with static 
> membership protocol
> 
>
> Key: KAFKA-12169
> URL: https://issues.apache.org/jira/browse/KAFKA-12169
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 2.5.1, 2.6.1
>Reporter: zou shengfu
>Priority: Major
>
> Background: 
>  Kafka consumer services run with static membership and the cooperative 
> rebalance protocol on Kubernetes, and the services often restart because of 
> operations. When we increased the topic's partitions from 1000 to 2000 and 
> the client leader restarted with an unknown member id at the same time, we 
> found that the consumers did not trigger a rebalance and still consumed only 
> 1000 partitions.
>  





[jira] [Created] (KAFKA-12215) Broker could cache its overlapping ApiVersions with active controller

2021-01-15 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-12215:
---

 Summary: Broker could cache its overlapping ApiVersions with 
active controller
 Key: KAFKA-12215
 URL: https://issues.apache.org/jira/browse/KAFKA-12215
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


Right now, every ApiVersionRequest needs to compute the overlapping API 
versions with the controller: 
https://issues.apache.org/jira/browse/KAFKA-10674

Unless the active controller changes, the computation result should always be 
the same. We may consider caching the result and updating it only when a 
controller change happens.
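
A minimal sketch of such a cache follows. It assumes that the active controller id is enough to detect a controller change, which may not hold if the same controller restarts with different supported versions; `ControllerScopedCache` is a hypothetical name, not an existing Kafka class.

```java
import java.util.function.IntFunction;

// Caches one computed value and recomputes it only when the controller id changes.
final class ControllerScopedCache<V> {
    private final IntFunction<V> compute;
    private int cachedControllerId;
    private boolean hasValue = false;
    private V cached;
    private int computations = 0;

    ControllerScopedCache(IntFunction<V> compute) {
        this.compute = compute;
    }

    synchronized V get(int activeControllerId) {
        if (!hasValue || cachedControllerId != activeControllerId) {
            cached = compute.apply(activeControllerId);
            cachedControllerId = activeControllerId;
            hasValue = true;
            computations++;
        }
        return cached;
    }

    // Exposed only so the sketch can show how often the computation ran.
    synchronized int computations() {
        return computations;
    }
}
```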





[jira] [Commented] (KAFKA-12161) Raft observers should not require an id to fetch

2021-01-07 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261072#comment-17261072
 ] 

Boyang Chen commented on KAFKA-12161:
-

I was wondering whether we could just generate a random UUID for the observer 
when we do the tooling?

> Raft observers should not require an id to fetch
> 
>
> Key: KAFKA-12161
> URL: https://issues.apache.org/jira/browse/KAFKA-12161
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Major
>
> It is useful to allow observers to replay the metadata log without requiring 
> a replica id. For example, this can be used by tools in order to inspect the 
> current metadata state. In order to support this, we should modify 
> `KafkaRaftClient` so that the broker id is not required.





[jira] [Updated] (KAFKA-10868) Avoid double wrapping KafkaException

2020-12-18 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10868:

Description: Today certain exceptions get double wraps of KafkaException. 
We should remove those cases

> Avoid double wrapping KafkaException
> 
>
> Key: KAFKA-10868
> URL: https://issues.apache.org/jira/browse/KAFKA-10868
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Today certain exceptions are wrapped in KafkaException twice. We should 
> remove those cases.
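
One common way to avoid this kind of double wrapping is to check the type before wrapping. The sketch below uses a stand-in class, `SketchKafkaException`, so that it compiles on its own; it is not a description of where the double wrapping occurs in Kafka, which this issue does not specify.

```java
// Stand-in for org.apache.kafka.common.KafkaException so the sketch is self-contained.
class SketchKafkaException extends RuntimeException {
    SketchKafkaException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class ExceptionWrapping {
    // Wraps the error only if it is not already of the wrapper type.
    static SketchKafkaException wrapOnce(String message, Throwable error) {
        if (error instanceof SketchKafkaException) {
            return (SketchKafkaException) error;
        }
        return new SketchKafkaException(message, error);
    }
}
```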





[jira] [Created] (KAFKA-10868) Avoid double wrapping KafkaException

2020-12-18 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10868:
---

 Summary: Avoid double wrapping KafkaException
 Key: KAFKA-10868
 URL: https://issues.apache.org/jira/browse/KAFKA-10868
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen








[jira] [Updated] (KAFKA-10345) Add auto reloading for trust/key store paths

2020-12-10 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Summary: Add auto reloading for trust/key store paths  (was: Add file-watch 
based update for trust/key store paths)

> Add auto reloading for trust/key store paths
> 
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> With forwarding enabled, per-broker alter-config no longer goes to the 
> target broker, so we need a mechanism to propagate in-place file updates 
> through file watch and time-based reloading.
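
The time-based half of this idea can be sketched as a periodic modification-time check. Everything below is an assumption for illustration: `StoreReloadChecker` is a hypothetical name, the timestamp source is injected so the logic can be exercised without touching the file system, and the scheduling and the actual store reload are left out.

```java
import java.nio.file.Path;
import java.util.function.LongSupplier;

// Detects whether a store file changed between two periodic checks.
final class StoreReloadChecker {
    private final LongSupplier lastModifiedMillis;
    private long lastSeen;

    StoreReloadChecker(LongSupplier lastModifiedMillis) {
        this.lastModifiedMillis = lastModifiedMillis;
        this.lastSeen = lastModifiedMillis.getAsLong();
    }

    // Convenience factory that reads the modification time of a real file.
    static StoreReloadChecker forFile(Path storePath) {
        return new StoreReloadChecker(() -> storePath.toFile().lastModified());
    }

    // Returns true once per observed change of the modification time.
    synchronized boolean changedSinceLastCheck() {
        long current = lastModifiedMillis.getAsLong();
        if (current == lastSeen) {
            return false;
        }
        lastSeen = current;
        return true;
    }
}
```

A scheduled task could call `changedSinceLastCheck()` at a fixed interval and reload the trust/key store only when it returns true.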





[jira] [Updated] (KAFKA-10345) Add file-watch based update for trust/key store paths

2020-12-10 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Description: With forwarding enabled, per-broker alter-config doesn't go to 
the target broker anymore, we need to have a mechanism to propagate in-place 
file update through file watch and time-based reloading.  (was: With forwarding 
enabled, per-broker alter-config doesn't go to the target broker anymore, we 
need to have a mechanism to propagate in-place file update through file watch.)

> Add file-watch based update for trust/key store paths
> -
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> With forwarding enabled, per-broker alter-config no longer goes to the 
> target broker, so we need a mechanism to propagate in-place file updates 
> through file watch and time-based reloading.





[jira] [Resolved] (KAFKA-9552) Stream should handle OutOfSequence exception thrown from Producer

2020-12-10 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-9552.

Resolution: Not A Problem

> Stream should handle OutOfSequence exception thrown from Producer
> -
>
> Key: KAFKA-9552
> URL: https://issues.apache.org/jira/browse/KAFKA-9552
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.5.0
>Reporter: Boyang Chen
>Priority: Major
>
> As of today the stream thread could die from OutOfSequence error:
> {code:java}
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12 
> 15:14:35,185] ERROR 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] 
> stream-thread 
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed 
> to commit stream task 3_2 due to the following error: 
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
>  [2020-02-12T07:14:35-08:00] 
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) 
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending 
> since an error caught with a previous record (timestamp 1581484094825) to 
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-49-changelog due 
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker 
> received an out of order sequence number.
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
>  at 
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
>  at 
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
>  Although this is a fatal exception for the Producer, Streams should treat 
> it as an opportunity to reinitialize through a rebalance instead of killing 
> the computation resource.





[jira] [Resolved] (KAFKA-10813) StreamsProducer should catch InvalidProducerEpoch and throw TaskMigrated in all cases

2020-12-10 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-10813.
-
Resolution: Fixed

> StreamsProducer should catch InvalidProducerEpoch and throw TaskMigrated in 
> all cases
> -
>
> Key: KAFKA-10813
> URL: https://issues.apache.org/jira/browse/KAFKA-10813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Blocker
> Fix For: 2.7.0
>
>
> We fixed the error code handling on producer in 
> https://issues.apache.org/jira/browse/KAFKA-10687, however the newly thrown 
> `InvalidProducerEpoch` exception was not properly handled on Streams side in 
> all cases. We should catch it and rethrow as TaskMigrated to trigger 
> exception, similar to ProducerFenced.





[jira] [Assigned] (KAFKA-10813) StreamsProducer should catch InvalidProducerEpoch and throw TaskMigrated in all cases

2020-12-04 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10813:
---

Assignee: Boyang Chen

> StreamsProducer should catch InvalidProducerEpoch and throw TaskMigrated in 
> all cases
> -
>
> Key: KAFKA-10813
> URL: https://issues.apache.org/jira/browse/KAFKA-10813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Blocker
> Fix For: 2.7.0
>
>
> We fixed the error code handling on producer in 
> https://issues.apache.org/jira/browse/KAFKA-10687, however the newly thrown 
> `InvalidProducerEpoch` exception was not properly handled on Streams side in 
> all cases. We should catch it and rethrow as TaskMigrated to trigger 
> exception, similar to ProducerFenced.





[jira] [Created] (KAFKA-10813) StreamsProducer should catch InvalidProducerEpoch and throw TaskMigrated in all cases

2020-12-04 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10813:
---

 Summary: StreamsProducer should catch InvalidProducerEpoch and 
throw TaskMigrated in all cases
 Key: KAFKA-10813
 URL: https://issues.apache.org/jira/browse/KAFKA-10813
 Project: Kafka
  Issue Type: Bug
Reporter: Boyang Chen
 Fix For: 2.7.0


We fixed the error code handling on producer in 
https://issues.apache.org/jira/browse/KAFKA-10687, however the newly thrown 
`InvalidProducerEpoch` exception was not properly handled on Streams side in 
all cases. We should catch it and rethrow as TaskMigrated to trigger exception, 
similar to ProducerFenced.
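
The catch-and-rethrow described above might look roughly like the following. The three exception classes are local stand-ins named after the Kafka types they imitate, and `FencingTranslator` is a hypothetical helper; this is not the actual StreamsProducer code.

```java
// Local stand-ins for the Kafka exception types mentioned in the description.
class SketchProducerFencedException extends RuntimeException {
    SketchProducerFencedException(String message) { super(message); }
}

class SketchInvalidProducerEpochException extends RuntimeException {
    SketchInvalidProducerEpochException(String message) { super(message); }
}

class SketchTaskMigratedException extends RuntimeException {
    SketchTaskMigratedException(String message, Throwable cause) { super(message, cause); }
}

final class FencingTranslator {
    // Runs a producer call and translates both fencing-style errors the same way.
    static void run(Runnable producerCall) {
        try {
            producerCall.run();
        } catch (SketchProducerFencedException | SketchInvalidProducerEpochException e) {
            throw new SketchTaskMigratedException("Producer got fenced; the task was likely migrated", e);
        }
    }
}
```

Any other exception propagates unchanged, so only the two fencing-style errors are treated as a task migration.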





[jira] [Created] (KAFKA-10807) AlterConfig should be validated by the target broker

2020-12-03 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10807:
---

 Summary: AlterConfig should be validated by the target broker
 Key: KAFKA-10807
 URL: https://issues.apache.org/jira/browse/KAFKA-10807
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


After forwarding is enabled, AlterConfigs will no longer be sent to the target 
broker. This behavior bypasses some important config change validations, such 
as path existence and static config conflicts. Even worse, when the target 
broker is offline, the propagated result does not reflect what was actually 
applied. We should gather those cases and decide whether to handle the 
AlterConfig request on the target broker first and then forward it, in a 
validate-forward-apply path.
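
The validate-forward-apply ordering can be sketched as three steps that short-circuit on failure. All names here (`AlterResult`, `ValidateForwardApply`) are hypothetical, and the three steps are passed in as plain functions because the issue does not define how each one would be implemented.

```java
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical outcome of one AlterConfig request.
enum AlterResult { REJECTED_BY_VALIDATION, REJECTED_BY_CONTROLLER, APPLIED }

final class ValidateForwardApply {
    // Validates on the target broker first, then forwards, then applies locally.
    static AlterResult handle(Map<String, String> change,
                              Predicate<Map<String, String>> validateOnTarget,
                              Predicate<Map<String, String>> forwardToController,
                              Consumer<Map<String, String>> applyOnTarget) {
        if (!validateOnTarget.test(change)) {
            return AlterResult.REJECTED_BY_VALIDATION; // never reaches the controller
        }
        if (!forwardToController.test(change)) {
            return AlterResult.REJECTED_BY_CONTROLLER;
        }
        applyOnTarget.accept(change);
        return AlterResult.APPLIED;
    }
}
```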





[jira] [Updated] (KAFKA-10345) Add file-watch based update for trust/key store paths

2020-12-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Description: With forwarding enabled, per-broker alter-config doesn't go to 
the target broker anymore, we need to have a mechanism to propagate in-place 
file update through file watch.  (was: With forwarding enabled, per-broker 
alter-config doesn't go to the target broker anymore, we need to have a 
mechanism to propagate that update through ZK without affecting other config 
changes.)

> Add file-watch based update for trust/key store paths
> -
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> With forwarding enabled, per-broker alter-config no longer goes to the 
> target broker, so we need a mechanism to propagate in-place file updates 
> through file watch.





[jira] [Updated] (KAFKA-10345) Add file-watch based update for trust/key store paths

2020-12-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Summary: Add file-watch based update for trust/key store paths  (was: Add 
ZK-notification based update for trust/key store paths)

> Add file-watch based update for trust/key store paths
> -
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> With forwarding enabled, per-broker alter-config no longer goes to the 
> target broker, so we need a mechanism to propagate that update through ZK 
> without affecting other config changes.





[jira] [Updated] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-23 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10733:

Component/s: streams
 producer 

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer , streams
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
>
> In general, KafkaProducer could throw both fatal and non-fatal errors as 
> KafkaException, which makes the exception catching hard. Furthermore, not 
> every single fatal exception (checked) is marked on the function signature 
> yet as of 2.7.
> We should have a general supporting strategy in long term for this matter, as 
> whether to declare all non-fatal exceptions as wrapped KafkaException while 
> extracting all fatal ones, or just add a flag to KafkaException indicating 
> fatal or not, to maintain binary compatibility.





[jira] [Commented] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-23 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237765#comment-17237765
 ] 

Boyang Chen commented on KAFKA-10733:
-

Also, we should understand why we have been seeing more InvalidPid errors 
since 2.5.

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
>
> In general, KafkaProducer could throw both fatal and non-fatal errors as 
> KafkaException, which makes the exception catching hard. Furthermore, not 
> every single fatal exception (checked) is marked on the function signature 
> yet as of 2.7.
> We should have a general supporting strategy in long term for this matter, as 
> whether to declare all non-fatal exceptions as wrapped KafkaException while 
> extracting all fatal ones, or just add a flag to KafkaException indicating 
> fatal or not, to maintain binary compatibility.





[jira] [Updated] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-23 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10733:

Affects Version/s: 2.7.0
   2.5.0
   2.6.0

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
>
> In general, KafkaProducer could throw both fatal and non-fatal errors as 
> KafkaException, which makes the exception catching hard. Furthermore, not 
> every single fatal exception (checked) is marked on the function signature 
> yet as of 2.7.
> We should have a general supporting strategy in long term for this matter, as 
> whether to declare all non-fatal exceptions as wrapped KafkaException while 
> extracting all fatal ones, or just add a flag to KafkaException indicating 
> fatal or not, to maintain binary compatibility.





[jira] [Commented] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-23 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237764#comment-17237764
 ] 

Boyang Chen commented on KAFKA-10733:
-

Several agreements were reached in an offline sync:

1. We should catch certain fatal exceptions, such as ProducerFenced, and throw 
task migrated.
2. We should have a new producer exception type that wraps all non-fatal 
exceptions, so that users can catch them and throw task corrupted.
3. We should still crash the stream thread for certain fatal exceptions, such 
as AuthorizationException.
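
The three agreements above amount to a mapping from error category to reaction, which might be sketched as follows. The exception classes are local stand-ins, `DemoNonFatalProducerError` stands for the new wrapper type that item 2 proposes and does not exist yet, and treating every unrecognized error as a thread crash is an assumption of this sketch.

```java
// Hypothetical reactions on the Streams side.
enum StreamsAction { MIGRATE_TASK, MARK_TASK_CORRUPTED, CRASH_THREAD }

// Local stand-ins for the error categories named in the list above.
class DemoProducerFenced extends RuntimeException { }

class DemoAuthorizationFailure extends RuntimeException { }

class DemoNonFatalProducerError extends RuntimeException {
    DemoNonFatalProducerError(Throwable cause) { super(cause); }
}

final class ProducerErrorPolicy {
    static StreamsAction classify(RuntimeException error) {
        if (error instanceof DemoProducerFenced) {
            return StreamsAction.MIGRATE_TASK;          // item 1
        }
        if (error instanceof DemoNonFatalProducerError) {
            return StreamsAction.MARK_TASK_CORRUPTED;   // item 2
        }
        return StreamsAction.CRASH_THREAD;              // item 3, and anything unknown
    }
}
```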

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
>
> In general, KafkaProducer could throw both fatal and non-fatal errors as 
> KafkaException, which makes the exception catching hard. Furthermore, not 
> every single fatal exception (checked) is marked on the function signature 
> yet as of 2.7.
> We should have a general supporting strategy in long term for this matter, as 
> whether to declare all non-fatal exceptions as wrapped KafkaException while 
> extracting all fatal ones, or just add a flag to KafkaException indicating 
> fatal or not, to maintain binary compatibility.





[jira] [Updated] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-23 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10733:

Labels: need-kip  (was: )

> Enforce exception thrown for KafkaProducer txn APIs
> ---
>
> Key: KAFKA-10733
> URL: https://issues.apache.org/jira/browse/KAFKA-10733
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Priority: Major
>  Labels: need-kip
>
> In general, KafkaProducer could throw both fatal and non-fatal errors as 
> KafkaException, which makes the exception catching hard. Furthermore, not 
> every single fatal exception (checked) is marked on the function signature 
> yet as of 2.7.
> We should have a general supporting strategy in long term for this matter, as 
> whether to declare all non-fatal exceptions as wrapped KafkaException while 
> extracting all fatal ones, or just add a flag to KafkaException indicating 
> fatal or not, to maintain binary compatibility.





[jira] [Created] (KAFKA-10733) Enforce exception thrown for KafkaProducer txn APIs

2020-11-17 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10733:
---

 Summary: Enforce exception thrown for KafkaProducer txn APIs
 Key: KAFKA-10733
 URL: https://issues.apache.org/jira/browse/KAFKA-10733
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


In general, KafkaProducer can throw both fatal and non-fatal errors as 
KafkaException, which makes exception handling hard. Furthermore, as of 2.7, 
not every fatal exception (checked) is declared on the function signature yet.

In the long term, we should have a general strategy for this matter: either 
declare all non-fatal exceptions as a wrapped KafkaException while extracting 
all fatal ones, or just add a flag to KafkaException indicating whether it is 
fatal, to maintain binary compatibility.
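
For the second option, a flag on the exception could look like the sketch below. `FlaggedKafkaException` and `isFatal()` are hypothetical names chosen for illustration; no such flag is claimed to exist on the real KafkaException.

```java
// Stand-in exception carrying a fatal/non-fatal flag, as the second option suggests.
class FlaggedKafkaException extends RuntimeException {
    private final boolean fatal;

    FlaggedKafkaException(String message, Throwable cause, boolean fatal) {
        super(message, cause);
        this.fatal = fatal;
    }

    // True when the producer cannot be used any further after this error.
    boolean isFatal() {
        return fatal;
    }
}
```

With a flag like this, callers keep catching a single exception type and branch on `isFatal()`, so existing catch blocks would not need to change.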





[jira] [Assigned] (KAFKA-10655) Raft leader should resign after write failures

2020-11-17 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10655:
---

Assignee: Boyang Chen  (was: Jason Gustafson)

> Raft leader should resign after write failures
> --
>
> Key: KAFKA-10655
> URL: https://issues.apache.org/jira/browse/KAFKA-10655
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Boyang Chen
>Priority: Major
>
> The controller's state machine relies on strong ordering guarantees. Each 
> write assumes that all previous writes are either committed or will 
> eventually become committed. In order to protect this assumption, the 
> controller must not accept additional writes in the same epoch if a preceding 
> write has failed. Instead, it should resign so that another leader can be 
> elected. There are basically three classes of failures that we consider:
> 1. Serialization/state errors. Any unexpected write errors should be treated 
> as fatal. The leader should gracefully resign and the process should shut 
> down.
> 2. Disk IO errors. Similarly, the leader should resign (gracefully if 
> possible) and the process should shut down. 
> 3. Commit failures. If the leader is unable to commit data after some time, 
> then it should gracefully resign, but the process should not exit.
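
The three failure classes above can be summarized as a small decision table. The sketch below uses hypothetical names (`WriteFailureKind`, `LeaderReaction`, `WriteFailurePolicy`) and only encodes the two outcomes stated in the description: whether the leader resigns and whether the process shuts down.

```java
// The three failure classes listed in the description.
enum WriteFailureKind { SERIALIZATION_OR_STATE, DISK_IO, COMMIT_TIMEOUT }

// What the leader does in response to a failure.
final class LeaderReaction {
    final boolean resign;
    final boolean shutDownProcess;

    LeaderReaction(boolean resign, boolean shutDownProcess) {
        this.resign = resign;
        this.shutDownProcess = shutDownProcess;
    }
}

final class WriteFailurePolicy {
    static LeaderReaction reactionTo(WriteFailureKind kind) {
        switch (kind) {
            case SERIALIZATION_OR_STATE:
            case DISK_IO:
                return new LeaderReaction(true, true);   // resign and shut down
            case COMMIT_TIMEOUT:
                return new LeaderReaction(true, false);  // resign, keep the process alive
            default:
                throw new IllegalArgumentException("Unknown failure kind: " + kind);
        }
    }
}
```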





[jira] [Assigned] (KAFKA-10674) Brokers should know the active controller ApiVersion after enabling KIP-590 forwarding

2020-11-13 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10674:
---

Assignee: Boyang Chen

> Brokers should know the active controller ApiVersion after enabling KIP-590 
> forwarding
> --
>
> Key: KAFKA-10674
> URL: https://issues.apache.org/jira/browse/KAFKA-10674
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> Admin clients send an ApiVersions request to the broker when the first 
> connection is established. The tricky part after forwarding is enabled is 
> that, for forwardable APIs, the admin client needs to know a commonly agreed 
> range of ApiVersions among the handling broker, the active controller, and 
> itself.
> Right now the inter-broker APIs are guaranteed by IBP constraints, but 
> forwardable APIs are not. A compromise solution would be to put all 
> forwardable APIs under IBP, which is brittle and makes consistency hard to 
> maintain.
> Instead, any broker connecting to the active controller should send an 
> ApiVersion request at the beginning, so that it is easy to compute that 
> information and send it back to the admin clients upon an ApiVersion request 
> from the admin.  Any rolling of the active controller will trigger a 
> reconnection between broker and controller, which guarantees refreshed 
> ApiVersions between the two. This approach avoids the tight coupling with 
> IBP, and the broker can simply close its connection with the admin client to 
> trigger the retry logic and a refresh of the ApiVersions. Since this failure 
> should be rare, the two round-trips and timeout delays are well compensated 
> by the reduced engineering work.
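
Computing the commonly agreed range for one API comes down to intersecting version ranges. The sketch below is an illustration under that assumption only; `VersionRange` is a hypothetical class and does not mirror the types Kafka uses for ApiVersions.

```java
import java.util.Optional;

// Inclusive range of supported versions for a single API.
final class VersionRange {
    final int min;
    final int max;

    VersionRange(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        this.min = min;
        this.max = max;
    }

    // Returns the overlap of both ranges, or empty when they share no version.
    Optional<VersionRange> intersect(VersionRange other) {
        int lo = Math.max(min, other.min);
        int hi = Math.min(max, other.max);
        return lo <= hi ? Optional.of(new VersionRange(lo, hi)) : Optional.<VersionRange>empty();
    }
}
```

The handling broker would intersect its own range with the active controller's range for each forwardable API and report the result to the admin client. An empty result means the API cannot be forwarded at any version.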





[jira] [Created] (KAFKA-10714) Save unnecessary end txn call when the transaction is confirmed to be done

2020-11-12 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10714:
---

 Summary: Save unnecessary end txn call when the transaction is 
confirmed to be done
 Key: KAFKA-10714
 URL: https://issues.apache.org/jira/browse/KAFKA-10714
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


In certain error cases after KIP-588, such as a transaction timeout, we may 
skip the `EndTxn` call to the txn coordinator because we know the transaction 
is already terminated on the broker side.





[jira] [Updated] (KAFKA-10347) Deprecate Metadata ControllerId

2020-11-11 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10347:

Description: When forwarding is enabled, we should not let client know the 
active controller location anymore, so the Metadata request shall be bumped and 
deprecate the controllerId field. For older client, we shall return a random 
broker id to substitute the controller id.  (was: In the bridge release broker, 
the Create/DeleteTopics should be redirected to the active controller instead 
of relying on admin client discovery.)

> Deprecate Metadata ControllerId
> ---
>
> Key: KAFKA-10347
> URL: https://issues.apache.org/jira/browse/KAFKA-10347
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> When forwarding is enabled, we should no longer let clients know the active 
> controller's location, so the Metadata request version shall be bumped and 
> the controllerId field deprecated. For older clients, we shall return a 
> random broker id in place of the controller id.
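
Picking the substitute id for older clients could be as simple as the sketch below. `ControllerIdSubstitute` is a hypothetical helper, and returning -1 when no broker is alive is an assumption made here to represent "no controller".

```java
import java.util.List;
import java.util.Random;

final class ControllerIdSubstitute {
    // Assumption of this sketch: -1 stands for "no controller known".
    static final int NO_CONTROLLER_ID = -1;

    // Picks a random alive broker id to report in place of the real controller id.
    static int pick(List<Integer> aliveBrokerIds, Random random) {
        if (aliveBrokerIds.isEmpty()) {
            return NO_CONTROLLER_ID;
        }
        return aliveBrokerIds.get(random.nextInt(aliveBrokerIds.size()));
    }
}
```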





[jira] [Updated] (KAFKA-10347) Deprecate Metadata ControllerId

2020-11-11 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10347:

Summary: Deprecate Metadata ControllerId  (was: Redirect 
Create/DeleteTopics to the controller)

> Deprecate Metadata ControllerId
> ---
>
> Key: KAFKA-10347
> URL: https://issues.apache.org/jira/browse/KAFKA-10347
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> In the bridge release broker, the Create/DeleteTopics should be redirected to 
> the active controller instead of relying on admin client discovery.





[jira] [Updated] (KAFKA-10350) Add forwarding request monitoring metrics

2020-11-10 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10350:

Summary: Add forwarding request monitoring metrics  (was: Add redirect 
request monitoring metrics)

> Add forwarding request monitoring metrics
> -
>
> Key: KAFKA-10350
> URL: https://issues.apache.org/jira/browse/KAFKA-10350
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We need to add the metric for monitoring redirection progress as stated in 
> the KIP:
> MBean:kafka.server:type=RequestMetrics,name=NumRequestsForwardingToControllerPerSec,clientId=([-.\w]+)
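
To make the naming pattern concrete, the sketch below matches an MBean name against it and extracts the clientId. The fixed prefix and the `([-.\w]+)` group are taken from the line above; the class `ForwardingMetricName` is a hypothetical helper, not part of Kafka.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ForwardingMetricName {
    // The fixed prefix is quoted literally; only the clientId part is a regex group.
    private static final Pattern NAME = Pattern.compile(
        Pattern.quote("kafka.server:type=RequestMetrics,"
            + "name=NumRequestsForwardingToControllerPerSec,clientId=")
        + "([-.\\w]+)");

    // Returns the clientId captured from the MBean name, or null if it does not match.
    static String clientIdOf(String mbeanName) {
        Matcher m = NAME.matcher(mbeanName);
        return m.matches() ? m.group(1) : null;
    }
}
```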





[jira] [Updated] (KAFKA-9751) Auto topic creation should go to controller

2020-11-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9751:
---
Summary: Auto topic creation should go to controller  (was: Internal topic 
creation should go to controller)

> Auto topic creation should go to controller
> ---
>
> Key: KAFKA-9751
> URL: https://issues.apache.org/jira/browse/KAFKA-9751
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> For use cases that create internal topics through a FindCoordinator or 
> Metadata request, the receiving broker should route the topic creation 
> request to the controller instead of handling it by itself.





[jira] [Updated] (KAFKA-10346) Propagate topic creation policy violation to the clients

2020-11-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10346:

Description: At the moment, the topic creation policy is not enforced on the 
auto topic creation path for Metadata/FindCoordinator. Once topic creation goes 
fully through adminManager instead of zkManager, policy violations may actually 
occur for auto-created topics, and we should inform the client side by bumping 
both RPC versions to include the POLICY_VIOLATION error.  
(was: In the bridge release broker, the CreatePartition should be redirected to 
the active controller instead of relying on admin client discovery.)
Summary: Propagate topic creation policy violation to the clients  
(was: Redirect CreatePartition to the controller)

> Propagate topic creation policy violation to the clients
> 
>
> Key: KAFKA-10346
> URL: https://issues.apache.org/jira/browse/KAFKA-10346
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> At the moment, the topic creation policy is not enforced on the auto topic 
> creation path for Metadata/FindCoordinator. Once topic creation goes fully 
> through adminManager instead of zkManager, policy violations may actually 
> occur for auto-created topics, and we should inform the client side by bumping 
> both RPC versions to include the POLICY_VIOLATION error.





[jira] [Assigned] (KAFKA-9751) Internal topic creation should go to controller

2020-11-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-9751:
--

Assignee: Boyang Chen

> Internal topic creation should go to controller
> ---
>
> Key: KAFKA-9751
> URL: https://issues.apache.org/jira/browse/KAFKA-9751
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> For use cases that create internal topics through a FindCoordinator or 
> Metadata request, the receiving broker should route the topic creation request 
> to the controller instead of handling it itself.





[jira] [Assigned] (KAFKA-10699) Add system test coverage for group coordinator emigration

2020-11-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10699:
---

Assignee: feyman

> Add system test coverage for group coordinator emigration
> -
>
> Key: KAFKA-10699
> URL: https://issues.apache.org/jira/browse/KAFKA-10699
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Boyang Chen
>Assignee: feyman
>Priority: Major
>
> After merging the fix https://issues.apache.org/jira/browse/KAFKA-10284, we 
> believe that it is important to add system test coverage for the group 
> coordinator migration to verify consumer behaviors are correct.





[jira] [Created] (KAFKA-10699) Add system test coverage for group coordinator emigration

2020-11-08 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10699:
---

 Summary: Add system test coverage for group coordinator emigration
 Key: KAFKA-10699
 URL: https://issues.apache.org/jira/browse/KAFKA-10699
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


After merging the fix https://issues.apache.org/jira/browse/KAFKA-10284, we 
believe that it is important to add system test coverage for the group 
coordinator migration to verify consumer behaviors are correct.





[jira] [Assigned] (KAFKA-10667) Add timeout for forwarding requests

2020-11-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10667:
---

Assignee: Boyang Chen

> Add timeout for forwarding requests
> ---
>
> Key: KAFKA-10667
> URL: https://issues.apache.org/jira/browse/KAFKA-10667
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> It makes sense to apply a timeout to forwarded requests coming from the 
> client instead of retrying indefinitely. We could use either the API timeout 
> or a customized timeout hook that can be defined per request type.
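The bounded-retry idea in this ticket could be sketched roughly as follows. This is not the broker's forwarding code: the class ForwardingTimeout, the attempt callback, and the injectable clock are hypothetical names introduced here so the behavior can be exercised without a real controller connection.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

final class ForwardingTimeout {
    /**
     * Retries the forwarding attempt until it succeeds or timeoutMs has
     * elapsed on the supplied clock. Returns true on success and false on
     * timeout, so the caller can fail the client request instead of
     * retrying forever. At least one attempt is always made.
     */
    static boolean forwardWithTimeout(BooleanSupplier attempt,
                                      LongSupplier clockMs,
                                      long timeoutMs) {
        long deadline = clockMs.getAsLong() + timeoutMs;
        do {
            if (attempt.getAsBoolean()) {
                return true;
            }
        } while (clockMs.getAsLong() < deadline);
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical attempt that never succeeds; gives up after about 50 ms.
        boolean ok = forwardWithTimeout(() -> false, System::currentTimeMillis, 50L);
        System.out.println("forwarded=" + ok);
    }
}
```

A per-request-type hook, as the ticket suggests, could be layered on top by looking up timeoutMs from the request type before calling this method. A real implementation would also back off between attempts rather than spin.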





[jira] [Updated] (KAFKA-10674) Brokers should know the active controller ApiVersion after enabling KIP-590 forwarding

2020-11-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10674:

Parent: KAFKA-9705
Issue Type: Sub-task  (was: Improvement)

> Brokers should know the active controller ApiVersion after enabling KIP-590 
> forwarding
> --
>
> Key: KAFKA-10674
> URL: https://issues.apache.org/jira/browse/KAFKA-10674
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core
>Reporter: Boyang Chen
>Priority: Major
>
> Admin clients send an ApiVersions request to the broker when the first 
> connection is established. The tricky part once forwarding is enabled is 
> that, for forwardable APIs, the admin client needs to know a commonly agreed 
> range of ApiVersions among the handling broker, the active controller, and 
> itself.
> Right now the inter-broker APIs are guaranteed by IBP constraints, but the 
> forwardable APIs are not. A compromise solution would be to put all 
> forwardable APIs under IBP, which is brittle and makes consistency hard to 
> maintain.
> Instead, any broker connecting to the active controller should send an 
> ApiVersions request from the beginning, so that it is easy to compute that 
> information and send it back to admin clients on their ApiVersions requests. 
> Any roll of the active controller will trigger a reconnection between broker 
> and controller, which guarantees refreshed ApiVersions between the two. This 
> approach avoids the tight coupling with IBP, and the broker could simply close 
> the connection to the admin client to trigger its retry logic and a refresh 
> of the ApiVersions. Since this failure should be rare, the two round trips and 
> timeout delays are well compensated by the reduced engineering work.
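The "commonly agreed range" described above amounts to intersecting the version ranges that each party supports for one API. A minimal sketch of that computation follows; VersionRange and its fields are hypothetical names for illustration and are not Kafka's own classes.

```java
import java.util.Optional;

final class VersionRange {
    final int min;
    final int max;

    VersionRange(int min, int max) {
        this.min = min;
        this.max = max;
    }

    // Intersects the supported ranges of all parties (for example handling
    // broker, active controller, and admin client) for a single API.
    // Returns empty when no version is supported by everyone.
    static Optional<VersionRange> common(VersionRange... ranges) {
        int lo = Integer.MIN_VALUE;
        int hi = Integer.MAX_VALUE;
        for (VersionRange r : ranges) {
            lo = Math.max(lo, r.min);
            hi = Math.min(hi, r.max);
        }
        return lo <= hi ? Optional.of(new VersionRange(lo, hi)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative ranges only; real values come from ApiVersions responses.
        Optional<VersionRange> agreed = common(
            new VersionRange(0, 5), new VersionRange(2, 7), new VersionRange(1, 4));
        agreed.ifPresent(r -> System.out.println(r.min + ".." + r.max));
    }
}
```

An empty result corresponds to the rare failure case in the description, where the broker closes the connection so the admin client retries and refreshes its view of the ApiVersions.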





[jira] [Updated] (KAFKA-10674) Brokers should know the active controller ApiVersion after enabling KIP-590 forwarding

2020-11-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10674:

Description: 
Admin clients send an ApiVersions request to the broker when the first 
connection is established. The tricky part once forwarding is enabled is that, 
for forwardable APIs, the admin client needs to know a commonly agreed range of 
ApiVersions among the handling broker, the active controller, and itself.

Right now the inter-broker APIs are guaranteed by IBP constraints, but the 
forwardable APIs are not. A compromise solution would be to put all forwardable 
APIs under IBP, which is brittle and makes consistency hard to maintain.

Instead, any broker connecting to the active controller should send an 
ApiVersions request from the beginning, so that it is easy to compute that 
information and send it back to admin clients on their ApiVersions requests. 
Any roll of the active controller will trigger a reconnection between broker 
and controller, which guarantees refreshed ApiVersions between the two. This 
approach avoids the tight coupling with IBP, and the broker could simply close 
the connection to the admin client to trigger its retry logic and a refresh of 
the ApiVersions. Since this failure should be rare, the two round trips and 
timeout delays are well compensated by the reduced engineering work.

  was:
Admin clients send ApiVersions to the broker upon the first connection 
establishes. The tricky thing after forwarding is enabled is that for 
forwardable APIs, admin client needs to know a commonly-agreed rang of 
ApiVersions among handling broker, active controller and itself.

Right now the inter-broker APIs are guaranteed by IBP constraints, but not for 
forwardable APIs. A compromised solution would be to put all forwardable APIs 
under IBP, which is brittle and hard to maintain consistency.

Instead, any broker connecting to the active controller should send an 
ApiVersion request from beginning, so it is easy to compute that information 
and send back to the admin clients upon ApiVersion request from admin.  Any 
rolling of the active controller will trigger reconnection between broker and 
controller, which guarantees a refreshed ApiVersions between the two. This 
approach avoids the tight bond with IBP and broker could just close the 
connection between admin client to trigger retry logic and refreshing of the 
ApiVersions. Since this failure should be rare, two round-trips and timeout 
delays are well compensated by the less engineering work.


> Brokers should know the active controller ApiVersion after enabling KIP-590 
> forwarding
> --
>
> Key: KAFKA-10674
> URL: https://issues.apache.org/jira/browse/KAFKA-10674
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, core
>Reporter: Boyang Chen
>Priority: Major
>
> Admin clients send an ApiVersions request to the broker when the first 
> connection is established. The tricky part once forwarding is enabled is 
> that, for forwardable APIs, the admin client needs to know a commonly agreed 
> range of ApiVersions among the handling broker, the active controller, and 
> itself.
> Right now the inter-broker APIs are guaranteed by IBP constraints, but the 
> forwardable APIs are not. A compromise solution would be to put all 
> forwardable APIs under IBP, which is brittle and makes consistency hard to 
> maintain.
> Instead, any broker connecting to the active controller should send an 
> ApiVersions request from the beginning, so that it is easy to compute that 
> information and send it back to admin clients on their ApiVersions requests. 
> Any roll of the active controller will trigger a reconnection between broker 
> and controller, which guarantees refreshed ApiVersions between the two. This 
> approach avoids the tight coupling with IBP, and the broker could simply close 
> the connection to the admin client to trigger its retry logic and a refresh 
> of the ApiVersions. Since this failure should be rare, the two round trips and 
> timeout delays are well compensated by the reduced engineering work.





[jira] [Updated] (KAFKA-10687) Produce request should be bumped for new error code PRODUCE_FENCED

2020-11-05 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10687:

Issue Type: Bug  (was: Improvement)

> Produce request should be bumped for new error code PRODUCE_FENCED
> --
>
> Key: KAFKA-10687
> URL: https://issues.apache.org/jira/browse/KAFKA-10687
> Project: Kafka
>  Issue Type: Bug
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Blocker
> Fix For: 2.7.0
>
>
> In https://issues.apache.org/jira/browse/KAFKA-9910, we missed a case where 
> the ProduceRequest needs to be bumped to return the new error code 
> PRODUCE_FENCED. This gap needs to be addressed as a blocker since it is 
> shipping in 2.7.





[jira] [Created] (KAFKA-10687) Produce request should be bumped for new error code PRODUCE_FENCED

2020-11-05 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10687:
---

 Summary: Produce request should be bumped for new error code 
PRODUCE_FENCED
 Key: KAFKA-10687
 URL: https://issues.apache.org/jira/browse/KAFKA-10687
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen
Assignee: Boyang Chen
 Fix For: 2.7.0


In https://issues.apache.org/jira/browse/KAFKA-9910, we missed a case where the 
ProduceRequest needs to be bumped to return the new error code PRODUCE_FENCED. 
This gap needs to be addressed as a blocker since it is shipping in 2.7.





[jira] [Resolved] (KAFKA-10181) Create Envelope RPC and redirection template for configuration change RPCs

2020-11-04 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-10181.
-
Resolution: Fixed

> Create Envelope RPC and redirection template for configuration change RPCs
> --
>
> Key: KAFKA-10181
> URL: https://issues.apache.org/jira/browse/KAFKA-10181
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> In the bridge release broker, 
> AlterConfig/IncrementalAlterConfig/CreateTopics/AlterClientQuota should be 
> redirected to the active controller. This ticket will ensure those RPCs get 
> redirected.





[jira] [Assigned] (KAFKA-10345) Add ZK-notification based update for trust/key store paths

2020-11-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10345:
---

Assignee: Boyang Chen

> Add ZK-notification based update for trust/key store paths
> --
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> With forwarding enabled, a per-broker alter-config request no longer goes to 
> the target broker, so we need a mechanism to propagate that update through 
> ZK without affecting other config changes.





[jira] [Updated] (KAFKA-10345) Add ZK-notification based update for trust/key store paths

2020-11-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Description: With forwarding enabled, a per-broker alter-config request no 
longer goes to the target broker, so we need a mechanism to propagate that update 
through ZK without affecting other config changes.  (was: In the bridge release 
broker, the AlterPartitionReassignment should be redirected to the active 
controller instead of relying on admin client discovery.)

> Add ZK-notification based update for trust/key store paths
> --
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> With forwarding enabled, a per-broker alter-config request no longer goes to 
> the target broker, so we need a mechanism to propagate that update through 
> ZK without affecting other config changes.





[jira] [Updated] (KAFKA-10345) Add ZK-notification based update for trust/key store paths

2020-11-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Fix Version/s: 2.8.0

> Add ZK-notification based update for trust/key store paths
> --
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> With forwarding enabled, a per-broker alter-config request no longer goes to 
> the target broker, so we need a mechanism to propagate that update through 
> ZK without affecting other config changes.





[jira] [Updated] (KAFKA-10345) Add ZK-notification based update for trust/key store paths

2020-11-03 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10345:

Summary: Add ZK-notification based update for trust/key store paths  (was: 
Redirect AlterPartitionReassignment to the controller )

> Add ZK-notification based update for trust/key store paths
> --
>
> Key: KAFKA-10345
> URL: https://issues.apache.org/jira/browse/KAFKA-10345
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> In the bridge release broker, the AlterPartitionReassignment should be 
> redirected to the active controller instead of relying on admin client 
> discovery.





[jira] [Updated] (KAFKA-10342) Redirect remaining RPCs to the controller

2020-11-02 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10342:

Description: This ticket tracks progress on migrating the remaining RPCs so 
that they reach the controller only through forwarding.  (was: In the bridge 
release broker,Create/DeleteAcls should be redirected to the active controller.)

> Redirect remaining RPCs to the controller
> -
>
> Key: KAFKA-10342
> URL: https://issues.apache.org/jira/browse/KAFKA-10342
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> This ticket tracks progress on migrating the remaining RPCs so that they 
> reach the controller only through forwarding.





[jira] [Assigned] (KAFKA-10342) Migrate remaining RPCs to forward to the controller

2020-11-02 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10342:
---

Assignee: Boyang Chen

> Migrate remaining RPCs to forward to the controller
> ---
>
> Key: KAFKA-10342
> URL: https://issues.apache.org/jira/browse/KAFKA-10342
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> This ticket tracks progress on migrating the remaining RPCs so that they 
> reach the controller only through forwarding.





[jira] [Updated] (KAFKA-10342) Migrate remaining RPCs to forward to the controller

2020-11-02 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10342:

Summary: Migrate remaining RPCs to forward to the controller  (was: 
Redirect remaining RPCs to the controller)

> Migrate remaining RPCs to forward to the controller
> ---
>
> Key: KAFKA-10342
> URL: https://issues.apache.org/jira/browse/KAFKA-10342
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> This ticket tracks progress on migrating the remaining RPCs so that they 
> reach the controller only through forwarding.





[jira] [Updated] (KAFKA-10342) Redirect remaining RPCs to the controller

2020-11-02 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10342:

Summary: Redirect remaining RPCs to the controller  (was: Redirect 
Create/DeleteAcls to the controller)

> Redirect remaining RPCs to the controller
> -
>
> Key: KAFKA-10342
> URL: https://issues.apache.org/jira/browse/KAFKA-10342
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> In the bridge release broker, Create/DeleteAcls should be redirected to the 
> active controller.





[jira] [Resolved] (KAFKA-10343) Add IBP based ApiVersion constraint tests

2020-11-02 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-10343.
-
Resolution: Won't Fix

We are planning to work on a more long-term fix for this issue, see 

> Add IBP based ApiVersion constraint tests
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> We need to add an IBP-based ApiVersion constraint test to remind future 
> developers to bump it when a new RPC version is developed.





[jira] [Comment Edited] (KAFKA-10343) Add IBP based ApiVersion constraint tests

2020-11-02 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225029#comment-17225029
 ] 

Boyang Chen edited comment on KAFKA-10343 at 11/3/20, 12:32 AM:


We are planning to work on a more long-term fix for this issue, see 
https://issues.apache.org/jira/browse/KAFKA-10674


was (Author: bchen225242):
We are planning to work on a more long-term fix for this issue, see 

> Add IBP based ApiVersion constraint tests
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> We need to add an IBP-based ApiVersion constraint test to remind future 
> developers to bump it when a new RPC version is developed.





[jira] [Created] (KAFKA-10674) Brokers should know the active controller ApiVersion after enabling KIP-590 forwarding

2020-11-02 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10674:
---

 Summary: Brokers should know the active controller ApiVersion 
after enabling KIP-590 forwarding
 Key: KAFKA-10674
 URL: https://issues.apache.org/jira/browse/KAFKA-10674
 Project: Kafka
  Issue Type: Improvement
  Components: clients, core
Reporter: Boyang Chen


Admin clients send an ApiVersions request to the broker when the first 
connection is established. The tricky part once forwarding is enabled is that, 
for forwardable APIs, the admin client needs to know a commonly agreed range of 
ApiVersions among the handling broker, the active controller, and itself.

Right now the inter-broker APIs are guaranteed by IBP constraints, but the 
forwardable APIs are not. A compromise solution would be to put all forwardable 
APIs under IBP, which is brittle and makes consistency hard to maintain.

Instead, any broker connecting to the active controller should send an 
ApiVersions request from the beginning, so that it is easy to compute that 
information and send it back to admin clients on their ApiVersions requests. 
Any roll of the active controller will trigger a reconnection between broker 
and controller, which guarantees refreshed ApiVersions between the two. This 
approach avoids the tight coupling with IBP, and the broker could simply close 
the connection to the admin client to trigger its retry logic and a refresh of 
the ApiVersions. Since this failure should be rare, the two round trips and 
timeout delays are well compensated by the reduced engineering work.





[jira] [Updated] (KAFKA-10344) Redirect Create/Renew/ExpireDelegationToken to the controller

2020-10-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10344:

Description: Original comment: 
https://github.com/apache/kafka/pull/9103#discussion_r515427912  (was: In the 
bridge release broker, Create/Renew/ExpireDelegationToken should be redirected 
to the active controller.)

> Redirect Create/Renew/ExpireDelegationToken to the controller
> -
>
> Key: KAFKA-10344
> URL: https://issues.apache.org/jira/browse/KAFKA-10344
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Original comment: 
> https://github.com/apache/kafka/pull/9103#discussion_r515427912





[jira] [Updated] (KAFKA-10344) Add active controller check to the controller level in KIP-500

2020-10-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10344:

Summary: Add active controller check to the controller level in KIP-500  
(was: Redirect Create/Renew/ExpireDelegationToken to the controller)

> Add active controller check to the controller level in KIP-500
> --
>
> Key: KAFKA-10344
> URL: https://issues.apache.org/jira/browse/KAFKA-10344
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Original comment: 
> https://github.com/apache/kafka/pull/9103#discussion_r515427912





[jira] [Updated] (KAFKA-10348) Consider consolidation of broker to controller communication

2020-10-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10348:

Description: Right now, forwarding and AlterISR use separate channels so that 
neither blocks the other. However, the controller queue is single-threaded with 
no prioritization across requests, so separating the connections alone may not 
unblock AlterISR when a forwarded request takes an indefinite amount of time. 
In the long term, we want to see whether the two can be consolidated, with a 
systematic logical change on the controller side to ensure AlterISR always has 
higher priority.  (was: In the 
bridge release broker, the UpdateFeatures should be redirected to the active 
controller instead of relying on admin client discovery.)

> Consider consolidation of broker to controller communication
> 
>
> Key: KAFKA-10348
> URL: https://issues.apache.org/jira/browse/KAFKA-10348
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Right now, forwarding and AlterISR use separate channels so that neither 
> blocks the other. However, the controller queue is single-threaded with no 
> prioritization across requests, so separating the connections alone may not 
> unblock AlterISR when a forwarded request takes an indefinite amount of time. 
> In the long term, we want to see whether the two can be consolidated, with a 
> systematic logical change on the controller side to ensure AlterISR always 
> has higher priority.
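One possible shape for the prioritization mentioned above is a single queue whose ordering puts AlterISR events ahead of forwarded requests. The sketch below is only an assumption about how that could look; PrioritizedControllerQueue, Kind, and Event are hypothetical names and do not describe the existing controller event queue.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class PrioritizedControllerQueue {
    // Declaration order doubles as priority: earlier constants are served first.
    enum Kind { ALTER_ISR, FORWARDED_REQUEST }

    private static final class Event implements Comparable<Event> {
        final Kind kind;
        final long seq;

        Event(Kind kind, long seq) {
            this.kind = kind;
            this.seq = seq;
        }

        @Override
        public int compareTo(Event other) {
            // Higher-priority kind first, then arrival order within the same kind.
            int byKind = kind.compareTo(other.kind);
            return byKind != 0 ? byKind : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Event> queue = new PriorityBlockingQueue<>();
    private final AtomicLong nextSeq = new AtomicLong();

    void enqueue(Kind kind) {
        queue.add(new Event(kind, nextSeq.getAndIncrement()));
    }

    // Returns the kind of the next event to process, or null if the queue is empty.
    Kind poll() {
        Event e = queue.poll();
        return e == null ? null : e.kind;
    }
}
```

With an ordering like this, a single channel and a single processing thread would still serve a queued AlterISR before any waiting forwarded request, although a forwarded request that is already being processed would not be preempted.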





[jira] [Updated] (KAFKA-10348) Consider consolidation of broker to controller communication

2020-10-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10348:

Summary: Consider consolidation of broker to controller communication  
(was: Redirect UpdateFeatures to the controller)

> Consider consolidation of broker to controller communication
> 
>
> Key: KAFKA-10348
> URL: https://issues.apache.org/jira/browse/KAFKA-10348
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> In the bridge release broker, the UpdateFeatures should be redirected to the 
> active controller instead of relying on admin client discovery.





[jira] [Updated] (KAFKA-10668) Avoid deserialization on second hop for request forwarding

2020-10-30 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10668:

Description: 
Right now the forwarding broker deserializes the response and serializes it 
again to respond to the client. It should be able to keep the response data 
sealed and send it back to the client as-is to save some CPU cost.

Original comment: 
https://github.com/apache/kafka/pull/9103#discussion_r515219726

  was:Right now on forwarding broker we would deserialize the response and 
serialize it again to respond to the client. It should be able to keep the 
response data sealed and send back to the client to save some CPU cost.


> Avoid deserialization on second hop for request forwarding
> --
>
> Key: KAFKA-10668
> URL: https://issues.apache.org/jira/browse/KAFKA-10668
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> Right now the forwarding broker deserializes the response and serializes it 
> again to respond to the client. It should be able to keep the response data 
> sealed and send it back to the client as-is to save some CPU cost.
> Original comment: 
> https://github.com/apache/kafka/pull/9103#discussion_r515219726
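To make the "keep the response data sealed" idea concrete, the sketch below hands the controller's response bytes on as a read-only view instead of parsing and re-encoding them. SealedResponseRelay and relay are hypothetical names; the actual broker code path and buffer types are not shown in this thread.

```java
import java.nio.ByteBuffer;

final class SealedResponseRelay {
    // Returns a read-only view over the same bytes the controller sent.
    // No deserialization or copy takes place; the view has its own
    // position and limit, so reading it does not disturb the original.
    static ByteBuffer relay(ByteBuffer controllerResponse) {
        return controllerResponse.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for an already-serialized response.
        ByteBuffer fromController = ByteBuffer.wrap(new byte[] {1, 2, 3});
        ByteBuffer toClient = relay(fromController);
        System.out.println(toClient.remaining() + " bytes relayed unchanged");
    }
}
```

This only works if the bytes are already in the form the client expects, which is an assumption of the sketch.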





[jira] [Created] (KAFKA-10668) Avoid deserialization on second hop for request forwarding

2020-10-30 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10668:
---

 Summary: Avoid deserialization on second hop for request forwarding
 Key: KAFKA-10668
 URL: https://issues.apache.org/jira/browse/KAFKA-10668
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


Right now the forwarding broker deserializes the response and serializes it 
again to respond to the client. It should be able to keep the response data 
sealed and send it back to the client as-is to save some CPU cost.





[jira] [Created] (KAFKA-10667) Add timeout for forwarding requests

2020-10-30 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10667:
---

 Summary: Add timeout for forwarding requests
 Key: KAFKA-10667
 URL: https://issues.apache.org/jira/browse/KAFKA-10667
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


It makes sense to apply a timeout to forwarded requests coming from the client 
instead of retrying indefinitely. We could use either the API timeout or a 
customized timeout hook that can be defined per request type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10663) Flakey test ConsumerBounceTest#testSeekAndCommitWithBrokerFailures

2020-10-29 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10663:
---

 Summary: Flakey test 
ConsumerBounceTest#testSeekAndCommitWithBrokerFailures
 Key: KAFKA-10663
 URL: https://issues.apache.org/jira/browse/KAFKA-10663
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.7.0
Reporter: Boyang Chen


org.apache.kafka.common.KafkaException: Socket server failed to bind to localhost:40823: Address already in use.
    at kafka.network.Acceptor.openServerSocket(SocketServer.scala:671)
    at kafka.network.Acceptor.<init>(SocketServer.scala:539)
    at kafka.network.SocketServer.createAcceptor(SocketServer.scala:280)
    at kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1(SocketServer.scala:253)
    at kafka.network.SocketServer.$anonfun$createDataPlaneAcceptorsAndProcessors$1$adapted(SocketServer.scala:251)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at kafka.network.SocketServer.createDataPlaneAcceptorsAndProcessors(SocketServer.scala:251)
    at kafka.network.SocketServer.startup(SocketServer.scala:125)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:303)
    at kafka.utils.TestUtils$.createServer(TestUtils.scala:160)
    at kafka.utils.TestUtils$.createServer(TestUtils.scala:151)
    at kafka.integration.KafkaServerTestHarness.$anonfun$setUp$1(KafkaServerTestHarness.scala:102)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scal

 

From AK 2.7



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10657) Incorporate Envelope into auto-generated JSON schema

2020-10-28 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10657:
---

 Summary: Incorporate Envelope into auto-generated JSON schema
 Key: KAFKA-10657
 URL: https://issues.apache.org/jira/browse/KAFKA-10657
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen


We need to add support to output JSON format for embed request inside Envelope 
to do better request loggin.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10657) Incorporate Envelope into auto-generated JSON schema

2020-10-28 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10657:

Description: 
We need to add support to output JSON format for embed request inside Envelope 
to do better request logging.

[https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema]

  was:
We need to add support to output JSON format for embed request inside Envelope 
to do better request loggin.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema


> Incorporate Envelope into auto-generated JSON schema
> 
>
> Key: KAFKA-10657
> URL: https://issues.apache.org/jira/browse/KAFKA-10657
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> We need to add support to output JSON format for embed request inside 
> Envelope to do better request logging.
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-9999) Internal topic creation failure should be non-fatal and trigger explicit rebalance

2020-10-22 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen resolved KAFKA-.

Resolution: Won't Fix

> Internal topic creation failure should be non-fatal and trigger explicit 
> rebalance 
> ---
>
> Key: KAFKA-
> URL: https://issues.apache.org/jira/browse/KAFKA-
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, streams
>Affects Versions: 2.4.0
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We spotted a case in system test failure where the topic already exists but 
> the admin client still attempts to recreate it:
>  
> {code:java}
> [2020-05-14 09:56:40,120] INFO stream-thread [main] Could not create topic 
> SmokeTest-KSTREAM-REDUCE-STATE-STORE-20-changelog. Topic is probably 
> marked for deletion (number of partitions is unknown).
> Will retry to create this topic in 100 ms (to let broker finish async delete 
> operation first).
> Error message was: org.apache.kafka.common.errors.TopicExistsException: Topic 
> 'SmokeTest-KSTREAM-REDUCE-STATE-STORE-20-changelog' already exists. 
> (org.apache.kafka.streams.processor.internals.InternalTopicManager)
> [2020-05-14 09:56:40,120] INFO stream-thread [main] Could not create topic 
> SmokeTest-uwin-cnt-changelog. Topic is probably marked for deletion (number 
> of partitions is unknown).
> Will retry to create this topic in 100 ms (to let broker finish async delete 
> operation first).
> Error message was: org.apache.kafka.common.errors.TopicExistsException: Topic 
> 'SmokeTest-uwin-cnt-changelog' already exists. 
> (org.apache.kafka.streams.processor.internals.InternalTopicManager) 
> [2020-05-14 09:56:40,120] INFO stream-thread [main] Could not create topic 
> SmokeTest-cntByCnt-changelog. Topic is probably marked for deletion (number 
> of partitions is unknown).
> Will retry to create this topic in 100 ms (to let broker finish async delete 
> operation first).
> Error message was: org.apache.kafka.common.errors.TopicExistsException: Topic 
> 'SmokeTest-cntByCnt-changelog' already exists. 
> (org.apache.kafka.streams.processor.internals.InternalTopicManager)
> [2020-05-14 09:56:40,120] INFO stream-thread [main] Topics 
> [SmokeTest-KSTREAM-REDUCE-STATE-STORE-20-changelog, 
> SmokeTest-uwin-cnt-changelog, SmokeTest-cntByCnt-changelog] can not be made 
> ready with 5 retries left 
> (org.apache.kafka.streams.processor.internals.InternalTopicManager)
> [2020-05-14 09:56:40,220] ERROR stream-thread [main] Could not create topics 
> after 5 retries. This can happen if the Kafka cluster is temporary not 
> available. You can increase admin client config `retries` to be resilient 
> against this error. 
> (org.apache.kafka.streams.processor.internals.InternalTopicManager)
> [2020-05-14 09:56:40,221] ERROR stream-thread 
> [SmokeTest-05374457-074b-4d33-bca0-8686465e8157-StreamThread-2] Encountered 
> the following unexpected Kafka exception during processing, this usually 
> indicate Streams internal errors: 
> (org.apache.kafka.streams.processor.internals.StreamThread)
> org.apache.kafka.streams.errors.StreamsException: Could not create topics 
> after 5 retries. This can happen if the Kafka cluster is temporary not 
> available. You can increase admin client config `retries` to be resilient 
> against this error.
>         at 
> org.apache.kafka.streams.processor.internals.InternalTopicManager.makeReady(InternalTopicManager.java:171)
>         at 
> org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.prepareTopic(StreamsPartitionAssignor.java:1229)
>         at 
> org.apache.kafka.streams.processor.internals.StreamsPartitionAssignor.assign(StreamsPartitionAssignor.java:588)
>  
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.performAssignment(ConsumerCoordinator.java:548)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onJoinLeader(AbstractCoordinator.java:650)
>  
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.access$1300(AbstractCoordinator.java:111)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:572)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$JoinGroupResponseHandler.handle(AbstractCoordinator.java:555)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1026)
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1006)
>         at 
> 

[jira] [Created] (KAFKA-10607) Ensure the error counts contains the NONE

2020-10-13 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10607:
---

 Summary: Ensure the error counts contains the NONE
 Key: KAFKA-10607
 URL: https://issues.apache.org/jira/browse/KAFKA-10607
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Boyang Chen


The RPC errorCounts() implementations behave inconsistently: for example, 
certain RPCs deviate from the default implementation by filtering out 
Errors.NONE while generating the map. We should make them consistent by 
including Errors.NONE in errorCounts() for all RPCs.
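
As a rough illustration of the consistent behavior proposed above, the sketch below counts every error code in a list of per-item results and keeps NONE in the map. The `Errors` enum and the `errorCounts` helper are simplified stand-ins written for this example; they are not the actual Kafka classes or signatures.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class ErrorCountsSketch {
    // Hypothetical stand-in for Kafka's error code enum; only three values are modeled.
    public enum Errors { NONE, UNKNOWN_SERVER_ERROR, NOT_CONTROLLER }

    // Counts every error code that occurs, including NONE, so that successful
    // items are reported the same way regardless of the RPC type.
    public static Map<Errors, Integer> errorCounts(List<Errors> errors) {
        Map<Errors, Integer> counts = new EnumMap<>(Errors.class);
        for (Errors error : errors) {
            counts.merge(error, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Errors> results = List.of(Errors.NONE, Errors.NONE, Errors.NOT_CONTROLLER);
        System.out.println(errorCounts(results));
    }
}
```

With this shape, a caller can read the number of successful items directly from the NONE entry instead of inferring it from what is missing.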



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10343) Add IBP based ApiVersion constraint tests

2020-10-13 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10343:

Summary: Add IBP based ApiVersion constraint tests  (was: Remove 2.7 IBP 
for redirection enablement)

> Add IBP based ApiVersion constraint tests
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> We need to add IBP-based ApiVersion constraint tests to remind future 
> developers to bump it when a new RPC version is developed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-10343) Remove 2.7 IBP for redirection enablement

2020-10-13 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10343:
---

Assignee: Boyang Chen

> Remove 2.7 IBP for redirection enablement
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> We need to add IBP-based ApiVersion constraint tests to remind future 
> developers to bump it when a new RPC version is developed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10343) Remove 2.7 IBP for redirection enablement

2020-10-13 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10343:

Description: We need to add IBP-based ApiVersion constraint tests to remind 
future developers to bump it when a new RPC version is developed.  (was: 
Redirection cannot be fully shipped in 2.7. We therefore need a PR that 
disables it once the release branch is cut, by removing the new IBP flag 
entirely.)

> Remove 2.7 IBP for redirection enablement
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> We need to add IBP-based ApiVersion constraint tests to remind future 
> developers to bump it when a new RPC version is developed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10594) Enhance Raft exception handling

2020-10-09 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10594:
---

 Summary: Enhance Raft exception handling
 Key: KAFKA-10594
 URL: https://issues.apache.org/jira/browse/KAFKA-10594
 Project: Kafka
  Issue Type: Sub-task
Reporter: Boyang Chen
Assignee: Boyang Chen


The current exception handling in the Raft implementation is superficial: for 
example, we don't treat file system exceptions and request handling exceptions 
differently. We need to decide which exceptions should be fatal, which should 
be returned to the client, and which can be retried.
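
One way to picture the three categories is a small classifier that maps an exception to an action. The sketch below is only illustrative: the `Action` enum, the `classify` method, and the particular mapping (I/O failures as fatal, timeouts as retriable, invalid arguments as client errors) are assumptions made for this example, not decisions taken in the ticket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeoutException;

public class RaftExceptionPolicySketch {
    // The three outcomes named in the ticket description.
    public enum Action { FATAL, RESPOND_TO_CLIENT, RETRY }

    // Hypothetical mapping from exception type to handling action.
    public static Action classify(Throwable t) {
        if (t instanceof IOException || t instanceof UncheckedIOException) {
            // Assumed: file system failures leave local state in doubt, so stop.
            return Action.FATAL;
        }
        if (t instanceof TimeoutException) {
            // Assumed: timeouts are transient and the operation can be retried.
            return Action.RETRY;
        }
        if (t instanceof IllegalArgumentException) {
            // Assumed: a malformed request is reported back to the client.
            return Action.RESPOND_TO_CLIENT;
        }
        // Unknown exceptions are treated conservatively as fatal.
        return Action.FATAL;
    }

    public static void main(String[] args) {
        System.out.println(classify(new IOException("disk failure")));
        System.out.println(classify(new TimeoutException("fetch timed out")));
        System.out.println(classify(new IllegalArgumentException("bad epoch")));
    }
}
```

Keeping the decision in one place makes the policy easy to review and test, whatever mapping is eventually chosen.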



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10181) Create Envelope RPC and redirection template for configuration change RPCs

2020-10-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10181:

Fix Version/s: (was: 2.7.0)
   2.8.0

> Create Envelope RPC and redirection template for configuration change RPCs
> --
>
> Key: KAFKA-10181
> URL: https://issues.apache.org/jira/browse/KAFKA-10181
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.8.0
>
>
> In the bridge release broker, 
> AlterConfig/IncrementalAlterConfig/CreateTopics/AlterClientQuota should be 
> redirected to the active controller. This ticket will ensure those RPCs get 
> redirected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10181) Create Envelope RPC and redirection template for configuration change RPCs

2020-10-09 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10181:

Summary: Create Envelope RPC and redirection template for configuration 
change RPCs  (was: Create redirection template for configuration change RPCs)

> Create Envelope RPC and redirection template for configuration change RPCs
> --
>
> Key: KAFKA-10181
> URL: https://issues.apache.org/jira/browse/KAFKA-10181
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.7.0
>
>
> In the bridge release broker, 
> AlterConfig/IncrementalAlterConfig/CreateTopics/AlterClientQuota should be 
> redirected to the active controller. This ticket will ensure those RPCs get 
> redirected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10513) Newly added topic or partitions are not assigned to running consumer groups using static membership

2020-09-28 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203342#comment-17203342
 ] 

Boyang Chen commented on KAFKA-10513:
-

My understanding is that the consumer is supposed to check the metadata for its 
topic partitions and rejoin as necessary when the partition count changes?

> Newly added topic or partitions are not assigned to running consumer groups 
> using static membership
> ---
>
> Key: KAFKA-10513
> URL: https://issues.apache.org/jira/browse/KAFKA-10513
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 2.6.0
>Reporter: Marlon Ou
>Priority: Major
>
> If consumers are polling messages from a certain topic with static membership 
> and we add new partitions to this topic while the consumers are running, no 
> partition reassignment is ever triggered (and hence messages published into 
> the new partitions are never consumed). 
> To reproduce, simply set group instance IDs on the consumers: 
> {code:java}
> props.setProperty("group.instance.id", instanceId);
> {code}
> And then while the static consumers are running, use Kafka's admin client to 
> add more partitions to the topic:
> {code:java}
> adminClient.createPartitions(...)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10504) It will not work to skip to InitProducerId as lastError is always null

2020-09-21 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199715#comment-17199715
 ] 

Boyang Chen commented on KAFKA-10504:
-

Is this optimization necessary?

> It will not work to skip to InitProducerId as lastError is always null
> --
>
> Key: KAFKA-10504
> URL: https://issues.apache.org/jira/browse/KAFKA-10504
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Reporter: HaiyuanZhao
>Assignee: HaiyuanZhao
>Priority: Major
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> KAFKA-8805 introduced an optimization for the txn abort process: if the last 
> error is an INVALID_PRODUCER_ID_MAPPING error, skip directly to InitProducerId.
> However, this optimization will not work because the var lastError is always 
> null: the txn state transitions from ABORTABLE_ERROR to ABORTING_TRANSACTION 
> when beginAbort is called, and lastError is reset to null.
> As a result, EndTxn is always called before InitProducerId.
>  
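
The behavior the reporter describes can be modeled with a small state machine. The sketch below is a simplified model, not the producer's actual transaction manager: the class, the `InvalidPidMapping` exception, and the string return values are all hypothetical names chosen for illustration. It assumes, as the report states, that moving into a non-error state clears the last error.

```java
public class TxnStateSketch {
    public enum State { IN_TRANSACTION, ABORTABLE_ERROR, ABORTING_TRANSACTION }

    // Hypothetical stand-in for an INVALID_PRODUCER_ID_MAPPING error.
    public static class InvalidPidMapping extends RuntimeException { }

    private State state = State.IN_TRANSACTION;
    private RuntimeException lastError;

    // Assumed rule: only error states keep an error; any other target clears it.
    private void transitionTo(State target, RuntimeException error) {
        lastError = (target == State.ABORTABLE_ERROR) ? error : null;
        state = target;
    }

    public void fail(RuntimeException error) {
        transitionTo(State.ABORTABLE_ERROR, error);
    }

    // Returns the name of the first request that would be sent on abort.
    public String beginAbort() {
        transitionTo(State.ABORTING_TRANSACTION, null);
        // The check runs after the transition, so lastError is already null here.
        return (lastError instanceof InvalidPidMapping) ? "InitProducerId" : "EndTxn";
    }

    public State state() {
        return state;
    }

    public static void main(String[] args) {
        TxnStateSketch txn = new TxnStateSketch();
        txn.fail(new InvalidPidMapping());
        System.out.println(txn.beginAbort());
    }
}
```

In this model the shortcut branch is unreachable; capturing the error before the transition would be one way to make it reachable, though whether that is worthwhile is the question raised in the comment above.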



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10475) Using same key reports different count of records for groupBy() and groupByKey() in Kafka Streaming Application

2020-09-21 Thread Boyang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199689#comment-17199689
 ] 

Boyang Chen commented on KAFKA-10475:
-

[~saad-rasool] [~guozhang] I don't think we have enough information to reproduce 
this issue. Could you give us a sample setup (the application code, the input 
data, and the expected output) so that we can verify there is indeed a problem in 
groupByKey?

> Using same key reports different count of records for groupBy() and 
> groupByKey() in Kafka Streaming Application
> ---
>
> Key: KAFKA-10475
> URL: https://issues.apache.org/jira/browse/KAFKA-10475
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.3.0
> Environment: Kafka Cluster:
> Kafka Version: kafka_2.12-2.6.0/
> openjdk version "1.8.0_265"
> Kafka Streams:
> Kafka Streams Version: 2.3.0
> openjdk version "11.0.8"
>Reporter: Saad Rasool
>Assignee: Divya Guduru
>Priority: Major
>
>  
> We are experiencing what amounts to “lost packets” in our stream processing 
> when we use custom groupByKey() values. We have a single processor node, with 
> a source topic from which we read packets, do a grouping and aggregation on 
> that group, and output based on a computation that requires access to a 
> statestore.
>  
> Let me give greater details of the problem and how we have tried to 
> understand it until now, below:
> *Overview* We are setting up a Kafka Streams application in which we have to 
> perform windowed operations. We are grouping devices based on a specific key. 
> Following are the sample columns we are using for GroupBy:
>  
> ||Field Name ||Field Value||
> |A|12|
> |B|abc|
> |C|x13|
>  
> Sample Key based on the above data: 12abcx13 where key = Field (A) + Field 
> (B) + Field (C)
> *Problem* Getting a different count of records in two scenarios against the 
> same key:
> * When specifying the key ourselves using groupBy()
> * Using groupByKey() to group the data on the ‘Input Kafka Topic’ partitioning key.
> *Description* We were first using the groupBy() function of Kafka streams to 
> group the devices using the key above. In this case, the streams application 
> dropped several records and produced fewer records than expected. 
> However, when we did not specify our own custom grouping using the groupBy() 
> function, and instead used groupByKey() to key the data on the original 
> incoming Kafka partition key, we got the exact number of records which were 
> expected.
> To check that we were using the exact same keys as the input topic for our 
> custom groupBy() function we compared both Keys within the code. The Input 
> topic key and the custom key were exactly the same.
> So now we have come to the conclusion that there is some internal 
> functionality of the groupBy function that we are not able to understand 
> because of which the groupBy function and the groupByKey function both report 
> different counts for the same key. We have searched multiple forums but are 
> unable to understand the reason for this phenomenon.
> *Code Snippet:*
> With groupByKey():
>   
> {code:java}
> KStream myStream = this.stream
>     .groupByKey()
>     .windowedBy(TimeWindows.of(Duration.ofMinutes(windowTime))
>         .advanceBy(Duration.ofMinutes(advanceByTime))
>         .grace(Duration.ofSeconds(gracePeriod)))
>     .reduce((value1, value2) -> value2)
>     .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
>     .toStream()
>     .transform(new myTransformer(this.store.name()), this.store.name());{code}
>  
>   
> With groupBy():
>   
> {code:java}
> KStream myStream = this.stream
>     .groupBy((key, value) -> value.A + value.B + value.C,
>         Grouped.with(Serdes.String(), SerdesCollection.getInputSerdes()))
>     .windowedBy(TimeWindows.of(Duration.ofMinutes(windowTime))
>         .advanceBy(Duration.ofMinutes(advanceByTime))
>         .grace(Duration.ofSeconds(gracePeriod)))
>     .reduce((value1, value2) -> value2)
>     .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
>     .toStream()
>     .transform(new myTransformer(this.store.name()), this.store.name());{code}
>  
>   
> ||*Kafka Cluster Setup*|| ||
> |Number of Nodes|       3|
> |CPU Cores|       2|
> |RAM|     8 Gb|
>  
> ||*Streaming Application Setup*||Version||
> |       {{Kafka Streams Version }}| {{2.3.0}}|
> |          openjdk version| 11.0.8|



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10343) Remove 2.7 IBP for redirection enablement

2020-09-21 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10343:

Fix Version/s: 2.7.0

> Remove 2.7 IBP for redirection enablement
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
> Fix For: 2.7.0
>
>
> In the bridge release broker, AlterClientQuotas should be redirected to the 
> active controller.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10343) Remove 2.7 IBP for redirection enablement

2020-09-21 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10343:

Description: Redirection cannot be fully shipped in 2.7. We therefore need a PR 
that disables it once the release branch is cut, by removing the new IBP flag 
entirely.  (was: In the bridge release 
broker, AlterClientQuotas should be redirected to the active controller.)

> Remove 2.7 IBP for redirection enablement
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
> Fix For: 2.7.0
>
>
> Redirection cannot be fully shipped in 2.7. We therefore need a PR that 
> disables it once the release branch is cut, by removing the new IBP flag 
> entirely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10343) Remove 2.7 IBP for redirection enablement

2020-09-21 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10343:

Summary: Remove 2.7 IBP for redirection enablement  (was: Redirect 
AlterClientQuotas to the controller)

> Remove 2.7 IBP for redirection enablement
> -
>
> Key: KAFKA-10343
> URL: https://issues.apache.org/jira/browse/KAFKA-10343
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Priority: Major
>
> In the bridge release broker, AlterClientQuotas should be redirected to the 
> active controller.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10508) Consider moving ForwardRequestHandler to a separate class

2020-09-21 Thread Boyang Chen (Jira)
Boyang Chen created KAFKA-10508:
---

 Summary: Consider moving ForwardRequestHandler to a separate class
 Key: KAFKA-10508
 URL: https://issues.apache.org/jira/browse/KAFKA-10508
 Project: Kafka
  Issue Type: Improvement
Reporter: Boyang Chen


With the new redirection template merged in 
[https://github.com/apache/kafka/pull/9103], the KafkaApis file has grown to 
3500+ lines, which is fairly large. We should consider moving the 
redirection template out into a separate file to reduce the main class size.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-10350) Add redirect request monitoring metrics

2020-09-21 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen reassigned KAFKA-10350:
---

Assignee: Boyang Chen

> Add redirect request monitoring metrics
> ---
>
> Key: KAFKA-10350
> URL: https://issues.apache.org/jira/browse/KAFKA-10350
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We need to add the metric for monitoring redirection progress as stated in 
> the KIP:
> MBean:kafka.server:type=RequestMetrics,name=NumRequestsForwardingToControllerPerSec,clientId=([-.\w]+)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10181) Create redirection template for configuration change RPCs

2020-09-21 Thread Boyang Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-10181:

Summary: Create redirection template for configuration change RPCs  (was: 
Redirect AlterConfig/IncrementalAlterConfig to the controller)

> Create redirection template for configuration change RPCs
> -
>
> Key: KAFKA-10181
> URL: https://issues.apache.org/jira/browse/KAFKA-10181
> Project: Kafka
>  Issue Type: Sub-task
>  Components: admin
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
> Fix For: 2.7.0
>
>
> In the bridge release broker, AlterConfig/IncrementalAlterConfig should be 
> redirected to the active controller.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

