[ https://issues.apache.org/jira/browse/FLINK-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16415279#comment-16415279 ]
ASF GitHub Bot commented on FLINK-8973:
---------------------------------------
Github user twalthr commented on a diff in the pull request:
https://github.com/apache/flink/pull/5750#discussion_r177341094
--- Diff: flink-end-to-end-tests/test-scripts/common.sh ---
@@ -59,9 +162,42 @@ function start_cluster {
done
}
+function jm_watchdog() {
+    expectedJms=$1
+    ipPort=$2
+
+    while true; do
+        runningJms=`jps | grep -o 'StandaloneSessionClusterEntrypoint' | wc -l`;
+        missingJms=$((expectedJms-runningJms))
+        for (( c=0; c<missingJms; c++ )); do
+            "$FLINK_DIR"/bin/jobmanager.sh start "localhost" ${ipPort}
--- End diff ---
Does it make sense to start multiple job managers with the same `ipPort`?
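
For illustration, one way the watchdog could avoid that, sketched in the same shell style as the script above: derive a distinct port for each restarted instance from a base port. This is only a sketch of the reviewer's point, not code from the PR; the per-instance port offset and the reuse of the `jm_watchdog` signature are assumptions.

    function jm_watchdog() {
        local expectedJms=$1
        local basePort=$2

        while true; do
            runningJms=`jps | grep -o 'StandaloneSessionClusterEntrypoint' | wc -l`
            missingJms=$((expectedJms-runningJms))
            for (( c=0; c<missingJms; c++ )); do
                # Assumption: offset the port per missing instance so restarted
                # job managers do not collide on localhost. A real script would
                # also have to track which ports are actually free.
                "$FLINK_DIR"/bin/jobmanager.sh start "localhost" $((basePort+c))
            done
            sleep 5
        done
    }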
> End-to-end test: Run general purpose job with failures in standalone mode
> -------------------------------------------------------------------------
>
> Key: FLINK-8973
> URL: https://issues.apache.org/jira/browse/FLINK-8973
> Project: Flink
> Issue Type: Sub-task
> Components: Tests
> Affects Versions: 1.5.0
> Reporter: Till Rohrmann
> Assignee: Kostas Kloudas
> Priority: Blocker
> Fix For: 1.5.0
>
>
> We should set up an end-to-end test which runs the general purpose job
> (FLINK-8971) in a standalone setting with HA enabled (ZooKeeper). When
> running the job, the job's artificial failures should be activated.
> Additionally, we should randomly kill Flink processes (cluster entrypoint and
> TaskExecutors). When killing them, we should also spawn new processes to make
> up for the loss.
> This end-to-end test case should run with all the different state backend
> settings: {{RocksDB}} (full/incremental, async/sync) and {{FsStateBackend}}
> (sync/async).
> We should then verify that the general purpose job is successfully recovered
> without data loss or other failures.
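
As an illustration of the kill-and-respawn step described above, a minimal sketch in the style of the test scripts. The helper name `kill_random_taskmanager`, the class name matched via `jps`, and the use of `taskmanager.sh` to respawn are assumptions, not code from the PR.

    function kill_random_taskmanager() {
        # Assumption: in a flip-6 standalone cluster the TaskExecutor JVMs
        # show up in jps as TaskManagerRunner.
        local pids=(`jps | grep TaskManagerRunner | awk '{print $1}'`)
        if [ ${#pids[@]} -eq 0 ]; then
            return
        fi

        # Kill one TaskExecutor at random, hard, to simulate a process loss...
        local victim=${pids[$((RANDOM % ${#pids[@]}))]}
        kill -9 ${victim}

        # ...and immediately spawn a replacement to make up for the loss.
        "$FLINK_DIR"/bin/taskmanager.sh start
    }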
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)