[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-09 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-689921735 > Thanks a lot for staying on this tricky issue and finding a simpler solution! thanks for all suggestions. I benefit a lot from it.

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-08 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-689256048 ``` Build / JDK 15 / kafka.network.ConnectionQuotasTest.testNoConnectionLimitsByDefault Build / JDK 11 /

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-07 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-688481474 @junrao Thanks for all reviews again  > Do you plan to remove some of the unused methods in DelayedOperations in Partition? my bad. I forgot this request :(

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-685838009 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-685837334 ```kafka.network.DynamicConnectionQuotaTest.testDynamicListenerConnectionCreationRateQuota``` pass on my local. retest this please

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-09-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-685833800 > 00:31:51 [Error]

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-31 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-683863356 > It seems that the average across multiple runs doesn't change much? yep. I didn't observe obvious regression caused by this patch. > Also, 1 failure in the latest

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-31 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-683602732 @junrao the result of ```benchmark_test.py``` is attached (see description) The main regression ({"records_per_sec": 3653635.3672, "mb_per_sec": 348.4378} ->

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-30 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-683434648 > Could you do some perf tests so that we know the performance of fetch requests doesn't change noticeably? the result of ```Benchmark.test_producer_and_consumer``` is

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-30 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-683400677 > Could you do some perf tests so that we know the performance of fetch requests doesn't change noticeably? @junrao Are there any suggested official benchmark tools?

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-29 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-683284825 > If it's meant to be called by every request, then maybe we should have the delayed actions in a separate class instead of ReplicaManager. Other classes could, in theory, add

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-28 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-682879578 @junrao Thanks for all suggestions! Is Jenkins on vacation? Could you trigger a system test? This is an

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-27 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-681708424 @junrao I have submitted a draft patch. As this new approach is totally different from previous code, I delete previous commits in order to clean git history.

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-26 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-681375395 > This issue will be resolved if we complete those delayed operations due to groupManager.storeGroup() elsewhere without holding any locks. So ```DelayedJoin``` DOES NOT

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-26 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-681318790 > We keep DelayedJoin as it is, still passing in the group lock to DelayedOperation to avoid deadlocks due to two levels of locking. Just double check. We use a separate

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-25 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-679855792 > The second issue is that we hold a group lock while calling joinPurgatory.tryCompleteElseWatch. In this call, it's possible that DelayedJoin.onComplete() will be called. In

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-11 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-671749755 > Both operations are done under the same group lock. If we relax the lock, it seems that the condition "all members have joined" may no longer be true when we get to

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-06 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-670293057 > Could you provide a bit more detail? The new deadlock you mentioned is caused by PR. This PR introduces extra lock (```ReentrantLock```) to ```DelayedJoin``` (By

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-08-05 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-669647781 @junrao I also don't want to include more mechanisms to complicate this story. It seems to me we can do a litter factor for ```DelayedOperation``` to resolve this issue.

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-31 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-667442912 > So, it's just a rebase and there is no change in your PR? The last change to this PR is to rename a class (according to @ijuma’s comment)

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-30 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-666882466 > Were there significant changes since the last review? I rebased code base to include https://github.com/apache/kafka/commit/96e0719e421a91b5fa289dc0d23276977655633e

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-30 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-666398632 > 2 system test failures in the latest PR those 2 failed tests are flaky on my local and there are issue/PR related to them. @junrao @ijuma @hachikuji

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-29 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-665149860 ```streams_standby_replica_test``` -> https://issues.apache.org/jira/browse/KAFKA-10287 I will take a look at ```streams_broker_bounce_test```

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-25 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-663818439 @junrao I have rebased this PR to include fix of ```group_mode_transactions_test```. Could you run system tests again? Except for ```streams_eos_test``` and transaction

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-24 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-663618045 @junrao thanks for the reports. Unfortunately, the failed tests are totally different in both results :( This PR has been rebased so the fix for

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-22 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-662810186 > Could you rebase again? I will run the system tests after that. Thanks. done. the known flaky ```group_mode_transactions_test.py``` is traced by #9095

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-21 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-661862318 @junrao Could you run system tests on both trunk and this PR? the tests which fail frequently on my local are shown below. 1. transactions_test.py #9026 1.

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-14 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-658294532 > The number of failures went up to 17. ok. Let me dig in them again :) there are 3 failed tests related to ```security_rolling_upgrade_test``` and I have filed

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-14 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-658027392 ``` org.apache.kafka.streams.integration.EosIntegrationTest.shouldNotViolateEosIfOneTaskFailsWithState[exactly_once] ``` It is traced by

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-13 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-657739902 @junrao could you trigger system tests again? This is an automated message from the Apache Git Service. To

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-09 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-656382338 > Do you want to rebase again so that I can run system tests one more time? sure! This is an automated

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-05 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-653956055 > Are the unit test failures also due to flaky tests? I think so. However, I’m trying to resolve all of them so as to make this PR more safe :) #8981 #8974 #8913

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-653232232 > Overall, do you think there is no new system test failure introduced by this PR? I did not observe obvious relation between flaky and this PR. I have filed 4 PRs to fix

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-07-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-653224351 > core/transactions_test core/downgrade_test core/upgrade_test those are flaky on my local as well. @junrao Should I fix all flaky before merging this PR?

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-28 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-650923843 connect_rest_test.py is traced by #8944 This is an automated message from the Apache Git Service. To respond to

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-27 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-650586275 > Perhaps you can rebase again. done This is an automated message from the Apache Git Service. To respond

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-27 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-650503360 > For comparison, 29 test failures in trunk http://testing.confluent.io/confluent-kafka-system-test-results/?prefix=2020-06-25--001.1593132363--apache--trunk--3348fc49d/

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-25 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-649635872 > I will trigger a system tests for trunk. @junrao big thanks ! This is an automated message from the

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-23 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-648254987 > Could you rebase your PR? done This is an automated message from the Apache Git Service. To respond to

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-22 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-647352301 I have filed https://issues.apache.org/jira/browse/KAFKA-10191 to trace ```StreamsOptimizedTest``` This is an

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-21 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-647233460 > This is a second run of the system tests.

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-19 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-646718970 > Could you take a look and see if it's related to this PR? I will trigger another run just to see if any of those failures were transient. I compare the failed tests to

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-19 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-646487562 @junrao thanks for all your reviews. > I triggered a system test on this PR. https://jenkins.confluent.io/job/system-test-kafka-branch-builder/3970/ How to see the

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-15 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-644521032 > It will be helpful if you could preserve the commit history in future updates to the PR since that makes it easier to identify the delta changes. my bad :( I'll

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-14 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-643890644 > Another way that doesn't require checking lock.isHeldByCurrentThread is the following. But your approach seems simpler. So... could we keep it simpler?

[GitHub] [kafka] chia7712 commented on pull request #8657: KAFKA-8334 Make sure the thread which tries to complete delayed reque…

2020-06-02 Thread GitBox
chia7712 commented on pull request #8657: URL: https://github.com/apache/kafka/pull/8657#issuecomment-637668386 @junrao Could you take a look? This is an automated message from the Apache Git Service. To respond to the