On 04.03.2016 at 20:35, Max Lynch wrote:
Hi there,

We have a very heavily used implementation of modjk 1.2.35 running in
Apache 2.2.15 i686 on CentOS 6.7 x86_64. After Apache startup, our system
will perform optimally with no errors for about 24 hours, after which we
begin to see this message:

connecting to backend failed. Tomcat is probably not started or is
listening on the wrong port (errno=115)

Errno 115 on RHEL is EINPROGRESS. That means the connect call had not finished yet, but it could be retried. This indicates we might be able to improve the code, but it is also possible that, for example, a configured socket_connect_timeout was reached. To check, we would need the full mod_jk log lines (and, if several different log lines show up for one event, all of them), including the columns with the source file name, line number etc.
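
To illustrate how a connect timeout can end up reported as errno 115, here is a minimal Python sketch of a non-blocking connect with a timeout. It is not the mod_jk code; the function name connect_with_timeout and its parameters are made up for the illustration, and it assumes a Linux system where EINPROGRESS is 115.

```python
import errno
import select
import socket


def connect_with_timeout(host, port, timeout_s):
    """Hypothetical helper: returns (socket, 0) on success or (None, errno)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    # On a non-blocking socket, connect usually returns EINPROGRESS at once.
    rc = sock.connect_ex((host, port))
    if rc == 0:
        return sock, 0
    if rc != errno.EINPROGRESS:
        sock.close()
        return None, rc
    # Wait until the socket becomes writable or the timeout expires.
    _, writable, _ = select.select([], [sock], [], timeout_s)
    if not writable:
        # Timed out: the connect is still "in progress", so that is
        # the only errno available to report.
        sock.close()
        return None, errno.EINPROGRESS
    # The connect finished; SO_ERROR says whether it succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        return None, err
    return sock, 0
```

In this sketch a backend that refuses the connection yields ECONNREFUSED, while a backend that simply does not answer within timeout_s yields EINPROGRESS. That is why the log lines and the timeout settings are both needed to tell the cases apart.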

It would also be very useful to see your configuration. You can remove IP addresses, ports, secrets etc. and rename your workers, but we should see the timeout settings, cping settings and so on.

Once we start seeing this error for one backend/worker, we begin seeing the
same errors for eventually all workers. This problem doesn't go away until
we restart Apache. Our setup consists of 720 workers per apache, with
multiple apache servers, and each apache server also has several other
sites configured with modjk serving other tomcat backends. It should be
noted that we do not see the same error with other sites, nor do we have so
many workers defined for any other site.

We've searched through past mailings to try and find the same issue. The
couple of times we saw it brought up, error code 110 was also mentioned. It
should be noted we do not see the same pattern. errno 110 does show but
outside of the window when the problem begins and is at its worst. We
believe this issue is not configuration related.

We're posting this here to try and gather more data. Our process prohibits
an upgrade of any kind without plenty of evidence supporting our position.
Hoping that some individuals might have seen this particular issue, or if
there is any data on whether this could be a bug. Hopefully we're correct
in thinking this issue is not a configuration problem, but we'll help to
rule out.

It could also be interesting to capture the output of "netstat -an" during the time that the problem happens. And finally the same on the Tomcat side as well as a thread dump of the Tomcat JVM.
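
If the netstat output gets large with that many workers, a small script can summarize it. The following Python sketch counts TCP connection states; it assumes the usual Linux "netstat -an" column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the function name count_tcp_states and the remote_port filter are made up for the illustration.

```python
from collections import Counter


def count_tcp_states(netstat_text, remote_port=None):
    """Count TCP states in `netstat -an` output (Linux layout assumed)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # Optionally keep only connections to one remote port,
        # e.g. the AJP port of the Tomcat backends.
        if remote_port is not None:
            if foreign.rsplit(":", 1)[-1] != str(remote_port):
                continue
        counts[state] += 1
    return counts
```

Calling it on the captured output with the backend's AJP port as remote_port (8009 is the Tomcat default, yours may differ) gives a quick view of how many connections are in ESTABLISHED, SYN_SENT, TIME_WAIT and so on while the problem is happening.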

Once you provide the full log line, I can check whether the 1.2.41 code actually has improvements related to errno 115 or not. Knowing the place in the code where the error occurs might also give us an idea of what might have happened and how to check further (e.g. Tomcat not accepting connections, a firewall dropping idle connections between mod_jk and Tomcat etc.).

Regards,

Rainer

