[jira] [Comment Edited] (TS-3104) traffic_cop can't restart traffic_manager properly

Victor (JIRA) Wed, 01 Oct 2014 05:23:17 -0700

    [ 
https://issues.apache.org/jira/browse/TS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154736#comment-14154736
 ]


Victor edited comment on TS-3104 at 10/1/14 12:21 PM:
------------------------------------------------------

When the issue was reproduced, syslog (journalctl) showed numerous 
"unable to retrieve manager_binary" messages. After applying the attached 
patches the issue was gone and traffic_cop restarted the processes correctly. 
The following tests were run:


* kill `pgrep traffic_manager`
* kill -9 `pgrep traffic_manager`
* kill `pgrep traffic_server`
* kill -9 `pgrep traffic_server`
* kill `pgrep traffic_manager`; kill `pgrep traffic_server`
* kill -9 `pgrep traffic_manager`; kill -9 `pgrep traffic_server`


In all cases both traffic_manager and traffic_server were restarted correctly, 
and traffic_cop never got stuck in an endless loop trying to restart 
traffic_manager.
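
The manual checks above could be scripted roughly as follows. This is only a 
sketch, not part of the attached patches: the helper names wait_for_restart 
and run_case, the 60-second default timeout, and the choice of the oldest 
matching PID (pgrep -o -x) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: helper names, timeout and PID selection are assumptions.

# wait_for_restart NAME OLD_PID [TIMEOUT_SECONDS]
# Poll until a process called NAME exists under a PID other than OLD_PID.
wait_for_restart() {
    name=$1
    old_pid=$2
    timeout=${3:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # -o: oldest matching process, -x: exact name match
        new_pid=$(pgrep -o -x "$name")
        if [ -n "$new_pid" ] && [ "$new_pid" != "$old_pid" ]; then
            echo "$name restarted: $old_pid -> $new_pid"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "$name was not restarted within ${timeout}s" >&2
    return 1
}

# run_case NAME [SIGNAL]
# Send SIGNAL (default TERM) to NAME, then wait for it to come back.
run_case() {
    name=$1
    sig=${2:-TERM}
    pid=$(pgrep -o -x "$name") || { echo "$name is not running" >&2; return 1; }
    kill -s "$sig" "$pid" || return 1
    wait_for_restart "$name" "$pid"
}
```

With these helpers, "run_case traffic_manager" corresponds to the first test 
and "run_case traffic_server KILL" to the fourth. A non-zero exit status 
means the process did not come back within the timeout.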



> traffic_cop can't restart traffic_manager properly
> --------------------------------------------------
>
>                 Key: TS-3104
>                 URL: https://issues.apache.org/jira/browse/TS-3104
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cop
>            Reporter: Victor
>         Attachments: ts-0022-fix-lockfile-killgroup.patch, 
> ts-0023-cop-reinit-mgr-api-on-failure.patch
>
>
> In some cases traffic_cop can't restart traffic_manager properly. We ran 
> into this at "Ashmanov and partners" (http://en.ashmanov.com/). There are 
> two places in the code which, in my opinion, need correcting:
> 1) The logic which decides whether to kill a single process or the whole 
> process group.
> 2) The main traffic_cop loop: it doesn't reinitialize the manager API after 
> a failure, which leads to constant attempts to connect to the manager using 
> socket id == -1. 
> I have prepared patches for both issues. Please kindly take a look at them 
> and let me know your thoughts.
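
The second point can be illustrated with a toy model. It is not the code 
from the attached patch and does not use the real traffic_cop API: the names 
conn, manager_up, mgmt_connect, mgmt_ping and heartbeat_once are hypothetical 
stand-ins. The idea is that a failed pass drops the stale handle, and the 
next pass reconnects before doing anything else, so the loop never keeps 
using -1.

```shell
#!/bin/sh
# Toy model only: every name here is hypothetical, not a traffic_cop symbol.

conn=-1        # -1 models the "socket id == -1" state: no usable connection
manager_up=0   # set by the caller to simulate the manager being down (0) or up (1)
next_id=3      # next fake descriptor number to hand out

# Try to (re)initialize the connection; sets conn on success.
mgmt_connect() {
    [ "$manager_up" -eq 1 ] || return 1
    conn=$next_id
    next_id=$((next_id + 1))
}

# Succeeds only when there is a connection and the manager is alive.
mgmt_ping() {
    [ "$conn" -ne -1 ] && [ "$manager_up" -eq 1 ]
}

# One pass of the supervisor loop.
heartbeat_once() {
    # Re-initialize first if an earlier pass left us without a connection.
    if [ "$conn" -eq -1 ] && ! mgmt_connect; then
        return 1
    fi
    if ! mgmt_ping; then
        conn=-1    # drop the stale handle so the next pass reconnects
        return 1
    fi
    return 0
}
```

Without the reconnect step at the top of heartbeat_once, conn would stay at 
-1 after the first failure and every later pass would fail, which matches 
the symptom described in the issue.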



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
