[GitHub] flink pull request #6240: [FLINK-9004][tests] Implement Jepsen tests to test...

2018-07-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/6240


---


[GitHub] flink pull request #6240: [FLINK-9004][tests] Implement Jepsen tests to test...

2018-07-10 Thread GJL
Github user GJL commented on a diff in the pull request:

https://github.com/apache/flink/pull/6240#discussion_r201312394
  
--- Diff: flink-jepsen/src/jepsen/flink/db.clj ---
@@ -175,7 +175,7 @@
   (c/su
 (c/exec (c/lit (str "HADOOP_CLASSPATH=`" hadoop/install-dir "/bin/hadoop classpath` "
 "HADOOP_CONF_DIR=" hadoop/hadoop-conf-dir
-" " install-dir "/bin/yarn-session.sh -d -jm 2048 -tm 2048")))
+" " install-dir "/bin/yarn-session.sh -d -jm 2048m -tm 2048m")))
--- End diff --

See https://issues.apache.org/jira/browse/FLINK-9777
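
For readers following the diff, here is a minimal shell sketch of the command string that the Clojure snippet assembles after the change, with an explicit `m` suffix on both memory arguments. The three directory values are hypothetical placeholders (the PR takes them from `hadoop/install-dir`, `hadoop/hadoop-conf-dir`, and `install-dir`), and `build_yarn_session_cmd` is an illustrative helper, not part of the PR. The script only prints the command; it does not run Hadoop or Flink.

```shell
#!/bin/sh
# Hypothetical placeholder paths; the PR derives these from Clojure vars.
HADOOP_INSTALL_DIR="/opt/hadoop"
HADOOP_CONF_DIR_VALUE="/opt/hadoop/etc/hadoop"
FLINK_INSTALL_DIR="/opt/flink"

# Print the yarn-session.sh invocation with explicit megabyte suffixes.
# $1 = JobManager memory in MB, $2 = TaskManager memory in MB
build_yarn_session_cmd() {
  printf '%s %s %s -d -jm %sm -tm %sm\n' \
    "HADOOP_CLASSPATH=\`${HADOOP_INSTALL_DIR}/bin/hadoop classpath\`" \
    "HADOOP_CONF_DIR=${HADOOP_CONF_DIR_VALUE}" \
    "${FLINK_INSTALL_DIR}/bin/yarn-session.sh" \
    "$1" "$2"
}

build_yarn_session_cmd 2048 2048
```

Keeping the unit next to the number in one place makes it harder to reintroduce a bare `2048` by accident.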


---


[GitHub] flink pull request #6240: [FLINK-9004][tests] Implement Jepsen tests to test...

2018-07-06 Thread GJL
Github user GJL commented on a diff in the pull request:

https://github.com/apache/flink/pull/6240#discussion_r200667325
  
--- Diff: jepsen-flink/.gitignore ---
@@ -0,0 +1,17 @@
+*.class
+*.iml
+*.jar
+*.retry
+.DS_Store
+.hg/
+.hgignore
+.idea/
+/.lein-*
+/.nrepl-port
+/checkouts
+/classes
+/target
+pom.xml
+pom.xml.asc
+store
+bin/DataStreamAllroundTestProgram.jar
--- End diff --

Good point. I fixed it.


---


[GitHub] flink pull request #6240: [FLINK-9004][tests] Implement Jepsen tests to test...

2018-07-04 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/6240#discussion_r200127094
  
--- Diff: jepsen-flink/.gitignore ---
@@ -0,0 +1,17 @@
+*.class
+*.iml
+*.jar
+*.retry
+.DS_Store
+.hg/
+.hgignore
+.idea/
+/.lein-*
+/.nrepl-port
+/checkouts
+/classes
+/target
+pom.xml
+pom.xml.asc
+store
+bin/DataStreamAllroundTestProgram.jar
--- End diff --

Maybe we could ignore the complete `bin/` folder. I put, for example, my 
`flink-dist.tgz` there and git reports it now as an untracked file.
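
The suggestion can be checked locally with a small sketch like the one below. It assumes `git` is installed and works in a throwaway repository created with `mktemp`; the `bin/` pattern is the one proposed in this comment, not necessarily the exact line that ended up in the PR.

```shell
#!/bin/sh
set -e
# Work in a throwaway repository so no real checkout is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Ignore the whole bin/ directory instead of a single jar.
printf 'bin/\n' > .gitignore

# Simulate the two artifacts mentioned in the thread.
mkdir bin
touch bin/flink-dist.tgz bin/DataStreamAllroundTestProgram.jar

# Count untracked entries under bin/ that git would still report.
untracked_in_bin="$(git status --porcelain --untracked-files=all | grep -c 'bin/' || true)"
echo "untracked entries under bin/: ${untracked_in_bin}"
```

With the directory pattern in place, anything dropped into `bin/` stays out of `git status` without further `.gitignore` edits.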


---


[GitHub] flink pull request #6240: [FLINK-9004][tests] Implement Jepsen tests to test...

2018-07-03 Thread GJL
GitHub user GJL opened a pull request:

https://github.com/apache/flink/pull/6240

[FLINK-9004][tests] Implement Jepsen tests to test job availability.

## What is the purpose of the change

*Use the Jepsen framework (https://github.com/jepsen-io/jepsen) to implement
tests that verify Flink's HA capabilities under real-world faults, such as
sudden TaskManager/JobManager termination, HDFS NameNode unavailability, network
partitions, etc. The Flink cluster under test is automatically deployed on YARN
(session & job mode) and Mesos.*

The previous PR was closed accidentally: https://github.com/apache/flink/pull/6239

## Brief change log

  - *Implement Jepsen tests.*


## Verifying this change

This change added tests and can be verified as follows:

  - *The changes themselves are tests.*
  - *Run Jepsen tests in docker containers.*
  - *Run unit tests with `lein test`*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no** (at least not to Flink))
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** (but it will as soon as test failures appear) / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)

cc: @tillrohrmann @cewood @zentol @aljoscha 




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/GJL/flink FLINK-9004

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6240


commit 063e4621a5982b55ee7f7b0935290bbc717a5a45
Author: gyao 
Date:   2018-03-05T21:23:33Z

[FLINK-9004][tests] Implement Jepsen tests to test job availability.

Use the Jepsen framework (https://github.com/jepsen-io/jepsen) to implement
tests that verify Flink's HA capabilities under real-world faults, such as
sudden TaskManager/JobManager termination, HDFS NameNode unavailability, network
partitions, etc. The Flink cluster under test is automatically deployed on YARN
(session & job mode) and Mesos.

Provide Dockerfiles for local test development.

commit 46f0ea7b14c9c59d6cc40903486978f4fd8354d3
Author: gyao 
Date:   2018-07-02T12:21:18Z

fixup! [FLINK-9004][tests] Implement Jepsen tests to test job availability.




---