Sounds like you have very bad code running... Do you have a reverse proxy in front of hypnotoad? If you have nginx, I would count how many 499, 500, 502 and 504 responses show up in the access log, compared to 200. I suspect the number is very high.
About "no heartbeat": https://metacpan.org/pod/Mojo::Server::Prefork#heartbeat_interval

So, if your workers are doing blocking work for more than 20 seconds (the default; strictly speaking this limit is the heartbeat_timeout setting), the manager (the parent hypnotoad process) will kill the child process. And "restarting" means killing a child and starting a new one to do the same job.

On Friday, April 3, 2015 at 5:56:02 PM UTC+2, Nathan Waddell wrote:
>
> I've got an app that is served up by Hypnotoad, with no reverse proxy. It
> has 15 workers, with 2 clients allowed apiece. The app is launched via
> hypnotoad in foreground mode, as it is being run underneath a superdaemon
> service manager.
>
> I am seeing the following in the log/production.log:
>
> [Wed Apr 1 16:28:12 2015] [error] Worker 119914 has no heartbeat, restarting.
> [Wed Apr 1 16:28:21 2015] [error] Worker 119910 has no heartbeat, restarting.
> [Wed Apr 1 16:28:21 2015] [error] Worker 119913 has no heartbeat, restarting.
> [Wed Apr 1 16:28:22 2015] [error] Worker 119917 has no heartbeat, restarting.
> [Wed Apr 1 16:28:22 2015] [error] Worker 119909 has no heartbeat, restarting.
> [Wed Apr 1 16:28:27 2015] [error] Worker 119907 has no heartbeat, restarting.
> [Wed Apr 1 16:28:34 2015] [error] Worker 119905 has no heartbeat, restarting.
> [Wed Apr 1 16:28:42 2015] [error] Worker 119904 has no heartbeat, restarting.
> [Wed Apr 1 16:30:12 2015] [error] Worker 119912 has no heartbeat, restarting.
> [Wed Apr 1 16:31:23 2015] [error] Worker 119918 has no heartbeat, restarting.
> [Wed Apr 1 16:32:18 2015] [error] Worker 119911 has no heartbeat, restarting.
> [Wed Apr 1 16:32:22 2015] [error] Worker 119916 has no heartbeat, restarting.
>
> The workers are killed; however, they are never restarted.
>
> When I run an strace, the manager process appears to be valiantly trying
> to kill the (now expired) workers:
>
> Process 119878 attached - interrupt to quit
> restart_syscall(<... resuming interrupted call ...>) = 0
> kill(119906, SIGKILL) = 0
> kill(119917, SIGKILL) = 0
> kill(119905, SIGKILL) = 0
> kill(119910, SIGKILL) = 0
> kill(119904, SIGKILL) = 0
> kill(119914, SIGKILL) = 0
> kill(119916, SIGKILL) = 0
> kill(119908, SIGKILL) = 0
> kill(119913, SIGKILL) = 0
> kill(119915, SIGKILL) = 0
> kill(119918, SIGKILL) = 0
> kill(119912, SIGKILL) = 0
> kill(119909, SIGKILL) = 0
> kill(119911, SIGKILL) = 0
> kill(119907, SIGKILL) = 0
> stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
> poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)
> kill(119906, SIGKILL) = 0
> kill(119917, SIGKILL) = 0
> kill(119905, SIGKILL) = 0
> kill(119910, SIGKILL) = 0
> kill(119904, SIGKILL) = 0
> kill(119914, SIGKILL) = 0
> kill(119916, SIGKILL) = 0
> kill(119908, SIGKILL) = 0
> kill(119913, SIGKILL) = 0
> kill(119915, SIGKILL) = 0
> kill(119918, SIGKILL) = 0
> kill(119912, SIGKILL) = 0
> kill(119909, SIGKILL) = 0
> kill(119911, SIGKILL) = 0
> kill(119907, SIGKILL) = 0
> stat("/xxx/xxx/xxx/hypnotoad.pid", {st_mode=S_IFREG|0644, st_size=6, ...}) = 0
> poll([{fd=4, events=POLLIN|POLLPRI}], 1, 1000^C <unfinished ...>
> Process 119878 detached
>
> How can I troubleshoot this further to determine:
>
> 1. Why does Hypnotoad think it still needs to kill non-existent
>    processes?
> 2. Why isn't it starting new ones?
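
If some requests really do need to block longer than the heartbeat limit mentioned above, one stopgap is to raise that limit in the hypnotoad section of the app config. This is only a sketch: the file name myapp.conf and the value 60 are made up for illustration, it assumes the app loads a Perl-style config file, and workers/clients just mirror the numbers from the quoted message.

```perl
# myapp.conf (hypothetical file name; assumes the app loads a Perl-style config)
{
  hypnotoad => {
    workers           => 15,  # as described in the original message
    clients           => 2,   # as described in the original message
    heartbeat_timeout => 60   # illustrative value only; seconds a worker may go
                              # without a heartbeat before the manager stops it
  }
};
```

This only buys time. If the workers keep blocking for longer than whatever value is set, the manager will still kill them, so finding the blocking code is probably the better use of effort.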
