[kafka] branch 2.6 updated: KAFKA-10274; Consistent timeouts in transactions_test (#9026)

junrao Wed, 22 Jul 2020 12:15:14 -0700

This is an automated email from the ASF dual-hosted git repository.

junrao pushed a commit to branch 2.6
in repository https://gitbox.apache.org/repos/asf/kafka.git



The following commit(s) were added to refs/heads/2.6 by this push:
     new 49ce932  KAFKA-10274; Consistent timeouts in transactions_test (#9026)
49ce932 is described below

commit 49ce932187b98fa989d3e6940f4bbcb7e7d93c9a
Author: Jason Gustafson <[email protected]>
AuthorDate: Wed Jul 22 12:06:47 2020 -0700

    KAFKA-10274; Consistent timeouts in transactions_test (#9026)
    
    KAFKA-10235 fixed a consistency issue with the transaction timeout and the
    progress timeout. Since the test case relies on transaction timeouts, we need
    to wait at least as long as the timeout in order to ensure progress. However,
    having a low transaction timeout makes the test prone to the issue identified
    in KAFKA-9802, in which the coordinator timed out the transaction while the
    producer was awaiting a Produce response.
    
    Reviewers: Chia-Ping Tsai <[email protected]>, Boyang Chen <[email protected]>, Jun Rao <[email protected]>
---
 tests/kafkatest/tests/core/transactions_test.py | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/tests/kafkatest/tests/core/transactions_test.py b/tests/kafkatest/tests/core/transactions_test.py
index 8cd1892..9aeff23 100644
--- a/tests/kafkatest/tests/core/transactions_test.py
+++ b/tests/kafkatest/tests/core/transactions_test.py
@@ -47,11 +47,15 @@ class TransactionsTest(Test):
         self.num_output_partitions = 3
         self.num_seed_messages = 100000
         self.transaction_size = 750
-        # The timeout of transaction should be lower than the timeout of verification. The transactional message sent by
-        # client may be not correctly completed in hard_bounce mode. The pending transaction (unstable offset) stored by
-        # broker obstructs TransactionMessageCopier from getting offset of partition which is used to calculate
-        # remaining messages after restarting.
-        self.transaction_timeout = 5000
+
+        # The transaction timeout should be lower than the progress timeout, but at
+        # least as high as the request timeout (which is 30s by default). When the
+        # client is hard-bounced, progress may depend on the previous transaction
+        # being aborted. When the broker is hard-bounced, we may have to wait as
+        # long as the request timeout to get a `Produce` response and we do not
+        # want the coordinator timing out the transaction.
+        self.transaction_timeout = 40000
+        self.progress_timeout_sec = 60
         self.consumer_group = "transactions-test-consumer-group"
 
         self.zk = ZookeeperService(test_context, num_nodes=1)
@@ -119,9 +123,9 @@ class TransactionsTest(Test):
         for _ in range(3):
             for copier in copiers:
                 wait_until(lambda: copier.progress_percent() >= 20.0,
-                           timeout_sec=30,
-                           err_msg="%s : Message copier didn't make enough progress in 30s. Current progress: %s" \
-                           % (copier.transactional_id, str(copier.progress_percent())))
+                           timeout_sec=self.progress_timeout_sec,
+                           err_msg="%s : Message copier didn't make enough progress in %ds. Current progress: %s" \
+                           % (copier.transactional_id, self.progress_timeout_sec, str(copier.progress_percent())))
                 self.logger.info("%s - progress: %s" % (copier.transactional_id,
                                                         str(copier.progress_percent())))
                 copier.restart(clean_shutdown)
                 copier.restart(clean_shutdown)
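
The ordering that the new comment in the diff describes (request timeout <= transaction timeout < progress timeout) can be sketched as a small standalone check. This is only an illustration and not part of the commit: `check_timeouts` and `REQUEST_TIMEOUT_MS` are hypothetical names, and the 30s request timeout is taken from the comment in the diff rather than read from any client configuration.

```python
# Hypothetical helper, not part of transactions_test.py. It encodes the
# ordering described in the commit's code comment:
#   request timeout <= transaction timeout < progress timeout
REQUEST_TIMEOUT_MS = 30000  # assumption: the 30s default cited in the diff comment


def check_timeouts(transaction_timeout_ms, progress_timeout_sec,
                   request_timeout_ms=REQUEST_TIMEOUT_MS):
    """Return the list of violated constraints (empty if the values are consistent)."""
    problems = []
    # The coordinator should not time out a transaction while the producer
    # may still be waiting on a Produce response.
    if transaction_timeout_ms < request_timeout_ms:
        problems.append("transaction timeout is below the request timeout")
    # Progress after a hard bounce may depend on the previous transaction
    # being aborted, so the test must wait longer than the transaction timeout.
    if transaction_timeout_ms >= progress_timeout_sec * 1000:
        problems.append("transaction timeout is not below the progress timeout")
    return problems


# Values introduced by this commit (40000 ms, 60 s).
print(check_timeouts(40000, 60))
# Values before this commit (5000 ms, 30 s).
print(check_timeouts(5000, 30))
```

Under these assumptions the new values pass both checks, while the old 5000 ms transaction timeout falls below the request timeout, which is the situation the commit message associates with KAFKA-9802.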
