[ https://issues.apache.org/jira/browse/FLINK-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168098#comment-15168098 ]
ASF GitHub Bot commented on FLINK-3517:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/1716

    [FLINK-3517] [dist] Only count active PIDs in start script

```bash
$ bin/start-cluster.sh
Starting cluster.
Starting jobmanager daemon on host pablo.
Starting taskmanager daemon on host pablo.

$ bin/taskmanager.sh start
[INFO] 1 instance(s) of taskmanager are already running on pablo.
Starting taskmanager daemon on host pablo.

$ bin/taskmanager.sh start
[INFO] 2 instance(s) of taskmanager are already running on pablo.
Starting taskmanager daemon on host pablo.

$ bin/taskmanager.sh start
[INFO] 3 instance(s) of taskmanager are already running on pablo.
Starting taskmanager daemon on host pablo.

$ jps
27328 TaskManager
27140 TaskManager
26949 TaskManager
26523 JobManager
26716 TaskManager

$ kill -9 27140

$ bin/taskmanager.sh start
>>> [INFO] 3 instance(s) of taskmanager are already running on pablo <<< Correct now
Starting taskmanager daemon on host pablo.

$ bin/stop-cluster.sh
Stopping taskmanager daemon (pid: 27545) on host pablo.
Stopping jobmanager daemon (pid: 26523) on host pablo.

$ bin/taskmanager.sh stop
Stopping taskmanager daemon (pid: 27328) on host pablo.

$ bin/taskmanager.sh stop
No taskmanager daemon (pid: 27140) is running anymore on pablo.

$ bin/taskmanager.sh stop
Stopping taskmanager daemon (pid: 26949) on host pablo.

$ bin/taskmanager.sh stop
Stopping taskmanager daemon (pid: 26716) on host pablo.

$ bin/taskmanager.sh stop
No taskmanager daemon to stop on host pablo.
```

We can further improve the stop part by repeatedly trying the PIDs in the pid file when a value does not match an active PID.
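
For readers who want to see the idea in isolation, here is a minimal sketch of both behaviours: counting only the PIDs that are still alive, and skipping stale entries on stop. It is not the code from the pull request. The function names `count_active_pids` and `pop_active_pid` are hypothetical, and it assumes a pid file with one PID per line that is consumed from the end, which is what the stop order in the transcript suggests.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print how many PIDs in the given pid file
# (one PID per line) still belong to a running process.
count_active_pids() {
    local pidfile="$1" count=0 pid
    [ -f "$pidfile" ] || { echo 0; return 0; }
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r pid || [ -n "$pid" ]; do
        # Ignore blank or non-numeric lines.
        case "$pid" in ''|*[!0-9]*) continue ;; esac
        # kill -0 delivers no signal; it only checks that the process
        # exists and that we are allowed to signal it.
        if kill -0 "$pid" 2>/dev/null; then
            count=$((count + 1))
        fi
    done < "$pidfile"
    echo "$count"
}

# Hypothetical helper for the stop side: remove entries from the end of
# the pid file until one matches a running process, then print that PID.
# Returns 1 if the file runs out of entries first.
pop_active_pid() {
    local pidfile="$1" pid
    while [ -s "$pidfile" ]; do
        pid=$(tail -n 1 "$pidfile")
        # Drop the last line, whether or not it turns out to be stale.
        sed '$d' "$pidfile" > "$pidfile.tmp" && mv "$pidfile.tmp" "$pidfile"
        case "$pid" in ''|*[!0-9]*) continue ;; esac
        if kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
            return 0
        fi
    done
    return 1
}
```

With helpers like these, the start script would report `count_active_pids` instead of the raw line count, and the stop script would signal the PID printed by `pop_active_pid` rather than stopping at the first stale entry. Note that `kill -0` also fails for processes owned by another user, so this sketch assumes the daemons run under the same account as the script.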

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 3517-scripts

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1716.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1716

----
commit e037c89404704b8f8bd02911e65dc1dd24b1e836
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-02-25T23:11:48Z

    [FLINK-3517] [dist] Only count active PIDs in start script

----

> Number of job and task managers not checked in scripts
> ------------------------------------------------------
>
>                 Key: FLINK-3517
>                 URL: https://issues.apache.org/jira/browse/FLINK-3517
>             Project: Flink
>          Issue Type: Test
>          Components: Start-Stop Scripts
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>            Priority: Minor
>
> The start up scripts determine whether a job or task manager is running via a
> pids file. If a process, which is part of the pid file, is destroyed (for
> example on failure) outside of the scripts, a warning for multiple job
> managers are printed even though they are not running.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)