[
https://issues.apache.org/jira/browse/TS-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870553#action_12870553
]
Mladen Turk commented on TS-348:
--------------------------------
We need a way to tell both traffic_manager and traffic_cop that
the application each one manages (traffic_manager->traffic_server and
traffic_cop->traffic_manager) is in an unrecoverable state.
Currently the child process is automatically restarted (with or without a delay)
without considering the cause of its death.
As a result, a misconfigured child application is restarted endlessly, with no
chance to propagate the cause to the monitoring application, which should exit
in such cases as well.
I presume that the restart mechanism is there to cover transient memory errors
and reconfiguration restarts.
The easiest solution would be for the parent to check the child's exit value and
stop respawning its child on certain predefined values.
This would, however, require carefully selecting the fatal errors and process exit
values, which are currently fairly heuristic.
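
To illustrate the idea, here is a minimal sketch of a supervisor loop that
checks the child's exit value before respawning. It is not the
traffic_manager or traffic_cop code: the names supervise and
FATAL_EXIT_CODES, the restart budget, and the choice of exit code 78 (which
mirrors EX_CONFIG from sysexits.h) are all assumptions made for the example.

```python
import subprocess
import sys
import time

# Hypothetical convention: exit codes a child uses to report an
# unrecoverable condition such as a misconfiguration. These are
# illustrative values, not ones Traffic Server defines.
FATAL_EXIT_CODES = {78}


def supervise(cmd, max_restarts=5, delay=0.0):
    """Run cmd and respawn it until it exits cleanly, exits with a
    fatal code, or the restart budget is used up.

    Returns a tuple (reason, returncode, spawn_count).
    """
    spawns = 0
    while True:
        result = subprocess.run(cmd)
        spawns += 1
        code = result.returncode
        # On POSIX a negative returncode means the child was killed by a
        # signal (for example -6 for SIGABRT). This sketch treats that as
        # transient, so it is only bounded by max_restarts.
        if code == 0:
            return ("clean", code, spawns)
        if code in FATAL_EXIT_CODES:
            # Unrecoverable: stop respawning and let the caller exit too.
            return ("fatal", code, spawns)
        if spawns > max_restarts:
            return ("gave_up", code, spawns)
        time.sleep(delay)


if __name__ == "__main__":
    # Stand-in child that always fails with the hypothetical fatal code.
    child = [sys.executable, "-c", "import sys; sys.exit(78)"]
    reason, code, spawns = supervise(child)
    print(reason, code, spawns)
```

A parent that gets "fatal" back can itself exit with the same code, so the
cause propagates up the chain. Note that a child which aborts, as in the
trace quoted below, reports a signal rather than an exit value, so it would
first have to exit with a defined code for this scheme to apply.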
> Infinite core file creation
> ---------------------------
>
> Key: TS-348
> URL: https://issues.apache.org/jira/browse/TS-348
> Project: Traffic Server
> Issue Type: Bug
> Affects Versions: 2.1.0
> Reporter: Mladen Turk
> Priority: Critical
>
> If Traffic Server is started under a non-root user account, the launch script
> loops endlessly in start attempts,
> generating a core.PID file on each iteration.
> This creates an 80+ MB core file about every second until the disk is full.
> The following log entry is added on each iteration:
> [E. Mgmt] log ==> [TrafficManager] using root directory
> '/home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local'
> [May 13 12:50:18.299] {3086546656} STATUS: opened
> var/log/trafficserver/manager.log
> [TrafficServer] using root directory
> '/home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local'
> [May 13 12:50:20.830] {1074246896} STATUS: opened
> var/log/trafficserver/diags.log
> FATAL: Can't change group to user: nobody, gid: 99
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server
> - STACK TRACE:
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(ink_fatal_va+0x8f)[0x83451c7]
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(ink_fatal_die+0x1d)[0x83451f7]
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(_Z14change_uid_gidPKc+0xd8)[0x8152a52]
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server(main+0x1296)[0x8153e68]
> /lib/libc.so.6(__libc_start_main+0xdc)[0x7bee9c]
> /home/mturk/tmp/trafficserver-trunk/trunk-svn/release1/usr/local/bin/traffic_server[0x80f3b31]
> [May 13 12:50:21.176] Manager {3086546656} ERROR:
> [LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 6:
> Aborted
> [May 13 12:50:21.176] Manager {3086546656} ERROR: (last system error 2: No
> such file or directory)
> [May 13 12:50:21.176] Manager {3086546656} ERROR: [Alarms::signalAlarm]
> Server Process was reset
> [May 13 12:50:21.176] Manager {3086546656} ERROR: (last system error 2: No
> such file or directory)