[jira] [Updated] (FLINK-9004) Cluster test: Run general purpose job with failures with Yarn session

Gary Yao (JIRA) Mon, 16 Jul 2018 02:29:21 -0700


     [ https://issues.apache.org/jira/browse/FLINK-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Gary Yao updated FLINK-9004:
----------------------------
    Description: 
Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on a 
Yarn session cluster and simulate failures.

The job jar should be ill-packaged, meaning that we include too many 
dependencies in the user jar. We should include the Scala library, Hadoop and 
Flink itself to verify that there are no class loading issues.

The general purpose job should run with misbehavior activated. Additionally, we 
should simulate at least the following failure scenarios:
* Kill Flink processes
* Kill connection to storage system for checkpoints and jobs
* Simulate network partition

We should run the test at least with the following state backend: RocksDB 
incremental async and checkpointing to HDFS.

  was:
Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on a 
Yarn session cluster and simulate failures.

The job jar should be ill-packaged, meaning that we include too many 
dependencies in the user jar. We should include the Scala library, Hadoop and 
Flink itself to verify that there are no class loading issues.

The general purpose job should run with misbehavior activated. Additionally, we 
should simulate at least the following failure scenarios:
* Kill Flink processes
* Kill connection to storage system for checkpoints and jobs
* Simulate network partition

We should run the test at least with the following state backend: RocksDB 
incremental async and checkpointing to S3.


> Cluster test: Run general purpose job with failures with Yarn session
> ---------------------------------------------------------------------
>
>                 Key: FLINK-9004
>                 URL: https://issues.apache.org/jira/browse/FLINK-9004
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Tests
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Gary Yao
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.6.0
>
>
> Similar to FLINK-8973, we should run the general purpose job (FLINK-8971) on 
> a Yarn session cluster and simulate failures.
> The job jar should be ill-packaged, meaning that we include too many 
> dependencies in the user jar. We should include the Scala library, Hadoop and 
> Flink itself to verify that there are no class loading issues.
> The general purpose job should run with misbehavior activated. Additionally, 
> we should simulate at least the following failure scenarios:
> * Kill Flink processes
> * Kill connection to storage system for checkpoints and jobs
> * Simulate network partition
> We should run the test at least with the following state backend: RocksDB 
> incremental async and checkpointing to HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
