[ 
https://issues.apache.org/jira/browse/FLINK-22923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359326#comment-17359326
 ] 

Robert Metzger commented on FLINK-22923:
----------------------------------------

The test has a simple pipeline that counts the number of unique keys and 
outputs that count on snapshot. Once 10 checkpoints have been created, the 
taskmanager gets killed and the number of unique keys is stored in a variable 
(in the failure case, it was 42).
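
As a rough illustration of the counting logic described above, here is a 
minimal Python sketch. It is not the Flink job used by the test: the class 
name UniqueKeyCounter and its methods are hypothetical, and a plain dict 
stands in for the keyed map state.

```python
class UniqueKeyCounter:
    """Toy stand-in for the job's map state (hypothetical, not Flink API)."""

    def __init__(self):
        # key -> last value seen; only the number of keys matters here
        self.state = {}

    def process(self, key, value):
        # Seeing a key again overwrites its value and does not change the count.
        self.state[key] = value

    def snapshot(self):
        # On snapshot, report how many distinct keys have been seen so far.
        return len(self.state)


counter = UniqueKeyCounter()
for key in ["a", "b", "a", "c"]:
    counter.process(key, 1)
print(counter.snapshot())
```

Because the count only ever grows while the state is intact, any later 
snapshot should report at least as many keys as an earlier one.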

A TM is then added again, and we wait for 5 more checkpoints to complete. We 
then query (using queryable state) the number of unique keys in the map and 
compare it to the stored value (we expect it to be higher, because more keys 
should have been seen by now). In the failure case, the number of keys is 20, 
which is lower than the 42 recorded before the kill.
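
The comparison can be sketched as follows. This is a hedged Python 
illustration, not the actual test script: the function name 
state_survived_restart is hypothetical, and treating an equal count as 
acceptable is an assumption made here for simplicity.

```python
def state_survived_restart(count_before_kill: int, count_after_restart: int) -> bool:
    """Return True if the key count did not shrink across the TM restart.

    Assumption: an equal count is accepted; only a lower count is treated
    as a sign that state was lost.
    """
    return count_after_restart >= count_before_kill


# Values from the failing run: 42 keys before the kill, 20 after the restart.
print(state_survived_restart(42, 20))
```

With the numbers from the failing run the check returns False, which 
corresponds to the "after: 20" line followed by the error in the log.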

> Queryable state (rocksdb) with TM restart end-to-end test unstable
> ------------------------------------------------------------------
>
>                 Key: FLINK-22923
>                 URL: https://issues.apache.org/jira/browse/FLINK-22923
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Queryable State
>    Affects Versions: 1.14.0
>            Reporter: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.14.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=9119&view=logs&j=9401bf33-03c4-5a24-83fe-e51d75db73ef&t=72901ab2-7cd0-57be-82b1-bca51de20fba
> (This failure happened on my personal CI after upgrading our base image to 
> ubuntu 20.04, this change is now merged to master)
> {code}
> Jun 04 19:39:12 16/17 completed checkpoints
> Jun 04 19:39:14 16/17 completed checkpoints
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
> details.
> Jun 04 19:39:17 after: 20
> Jun 04 19:39:17 An error occurred
> Jun 04 19:39:17 [FAIL] Test script contains errors.
> Jun 04 19:39:17 Checking of logs skipped.
> Jun 04 19:39:17 
> Jun 04 19:39:17 [FAIL] 'Queryable state (rocksdb) with TM restart end-to-end 
> test' failed after 0 minutes and 48 seconds! Test exited with exit code 1
> Jun 04 19:39:17 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
