[
https://issues.apache.org/jira/browse/FLINK-18071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128338#comment-17128338
]
Yun Gao commented on FLINK-18071:
---------------------------------
Hi [~sewen], I also debugged the case today, it seems come from the concurrency
between the timer thread to send event and the thread to reset the buffered
events. The timer thread may scheduling the event sending, but before the
sending finished, failover happens and another thread called resetForTask and
cleared the buffered events. However, the first thread will be awaked later and
continue to send the outdated event, which will be inserted to the valve
buffers and cause repeat.
> CoordinatorEventsExactlyOnceITCase.checkListContainsSequence fails on CI
> ------------------------------------------------------------------------
>
> Key: FLINK-18071
> URL: https://issues.apache.org/jira/browse/FLINK-18071
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination, Tests
> Affects Versions: 1.11.0, 1.12.0
> Reporter: Robert Metzger
> Assignee: Yun Gao
> Priority: Critical
> Labels: test-stability
> Fix For: 1.11.0, 1.12.0
>
>
> CI:
> https://dev.azure.com/georgeryan1322/Flink/_build/results?buildId=330&view=logs&j=6e58d712-c5cc-52fb-0895-6ff7bd56c46b&t=f30a8e80-b2cf-535c-9952-7f521a4ae374
> {code}
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.795
> s <<< FAILURE! - in
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase
> [ERROR]
> test(org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase)
> Time elapsed: 4.647 s <<< FAILURE!
> java.lang.AssertionError: List did not contain expected sequence of 200
> elements, but was: [152, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
> 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
> 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
> 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
> 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
> 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107,
> 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122,
> 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137,
> 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
> 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167,
> 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182,
> 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197,
> 198, 199]
> at org.junit.Assert.fail(Assert.java:88)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.failList(CoordinatorEventsExactlyOnceITCase.java:160)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.checkListContainsSequence(CoordinatorEventsExactlyOnceITCase.java:148)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.test(CoordinatorEventsExactlyOnceITCase.java:143)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)