[
https://issues.apache.org/jira/browse/FLINK-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622537#comment-16622537
]
ASF GitHub Bot commented on FLINK-10311:
----------------------------------------
GJL commented on a change in pull request #6712: [FLINK-10311][tests] HA end-to-end/Jepsen tests for standby Dispatchers
URL: https://github.com/apache/flink/pull/6712#discussion_r219282159
##########
File path: flink-jepsen/scripts/run-tests.sh
##########
@@ -36,8 +36,15 @@ do
lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-task-managers --deployment-mode yarn-session
lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-job-managers --deployment-mode yarn-session
lein run test "${common_jepsen_args[@]}" --nemesis-gen fail-name-node-during-recovery --deployment-mode yarn-session
+
lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-task-managers --deployment-mode yarn-job
lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-job-managers --deployment-mode yarn-job
lein run test "${common_jepsen_args[@]}" --nemesis-gen fail-name-node-during-recovery --deployment-mode yarn-job
+
+ lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-task-managers --deployment-mode mesos-session
+ lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-job-managers --deployment-mode mesos-session
+
+ lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-job-managers --deployment-mode standalone-session
+ lein run test "${common_jepsen_args[@]}" --nemesis-gen kill-job-managers --client-gen cancel-job --deployment-mode standalone-session
Review comment:
It is asserted that the job must not be running after cancellation. Here is
the unit test: a74d25cb5243f77d98b7e2720183ae50R76
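
For readers without access to the linked test, the property can be illustrated with a minimal shell sketch. This is not the actual checker from the pull request; the function name `job_not_running_after_cancel` and its argument convention are hypothetical. It assumes the job statuses observed after the cancellation request (for example CANCELLING, CANCELED) are passed as arguments, and it fails if any of them is RUNNING.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of flink-jepsen.
# Arguments: the job statuses observed after cancellation was requested.
# Returns 0 if none of them is RUNNING, 1 otherwise.
job_not_running_after_cancel() {
  local status
  for status in "$@"; do
    if [[ "$status" == "RUNNING" ]]; then
      echo "invalid: job still RUNNING after cancellation" >&2
      return 1
    fi
  done
  return 0
}
```

For example, `job_not_running_after_cancel CANCELLING CANCELED` succeeds, while `job_not_running_after_cancel CANCELED RUNNING` fails and reports the violation on stderr.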
> HA end-to-end/Jepsen tests for standby Dispatchers
> --------------------------------------------------
>
> Key: FLINK-10311
> URL: https://issues.apache.org/jira/browse/FLINK-10311
> Project: Flink
> Issue Type: Improvement
> Components: Tests
> Affects Versions: 1.7.0
> Reporter: Till Rohrmann
> Assignee: Gary Yao
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.7.0, 1.6.2, 1.5.5
>
>
> We should add end-to-end or Jepsen tests to verify the HA behaviour when
> using multiple standby Dispatchers. In particular, we should verify that jobs
> are properly cleaned up after they have finished successfully (see FLINK-10255
> and FLINK-10011):
> 1. Test that a standby Dispatcher does not affect job execution and resource
> cleanup
> 2. Test that a standby Dispatcher recovers all submitted jobs after the
> leader loses leadership