> On 1 Sep 2020, at 7:06 am, David White <dswhit...@gmail.com> wrote:
> 
> Unfortunately the main Apache log shows nothing except normal 
> startup/shutdown messages.  If the "sgm-prod" threads are being terminated 
> and restarted, they are leaving no indication of that in the main server 
> logs.  
> 
> The only clue appears to be that the crashing occurs under heavier loads. 
> The application does not often strain the CPU/RAM of the underlying host, 
> but the Apache vhost logs show the crashes seem to occur when multiple 
> requests are being handled nearly simultaneously.  
> 
> [Mon Aug 31 11:11:07.504787 2020] [wsgi:error] [pid 35980:tid 
> 140546546816768] [client 10.83.210.200:47266] 
> Truncated or oversized response headers received from daemon process 
> 'sgm-prod': /apps/www/sgm/wsgi/SGM/run.py
> [Mon Aug 31 11:11:07.504806 2020] [wsgi:error] [pid 35980:tid 
> 140546571994880] [client 10.31.82.142:35810] 
> Truncated or oversized response headers received from daemon process 
> 'sgm-prod': /apps/www/sgm/wsgi/SGM/run.py 
> 
> Am I likely to see any benefit to either:
> Reducing the process or thread count passed to WSGIDaemonProcess, or
> Putting Apache into "prefork" MPM?  

Never use prefork MPM if you can avoid it.

Also ensure that outside of VirtualHost you have:

   WSGIRestrictEmbedded On

to ensure you are never accidentally running in the main Apache child worker 
processes.

If there is no evidence of a crash, then another cause can be that you are 
using an external system to trigger log file rotation, rather than the 
recommended method of using Apache's own log file rotation.

An external log file rotation system usually only fires once per day, but maybe 
yours is set up differently, for example based on the size of the logs.

The problem with an external log file rotation system is that it signals Apache 
to restart. Although the main Apache child worker processes are given a grace 
period to finish requests, the Apache libraries that manage third party 
processes, such as the mod_wsgi daemon processes, will kill them after 5 
seconds if they don't shut down quickly enough. Requests being proxied by the 
Apache child worker processes then get chopped off, and you see that message.

If it is this though, you should see clear messages at info level in the logs 
from mod_wsgi that the daemon processes were being shut down and restarted. 
These will appear some time before you see that message.

The only other thing I can think of is that you have requests that are getting 
stuck and eventually hit the socket timeout, although you have that set very 
large, so a request would have to be stuck for 6000 seconds before you saw it.

The only way to tell whether that may be the cause is to use:

https://modwsgi.readthedocs.io/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces

so you can signal mod_wsgi to dump Python stack traces periodically and see if 
a specific request is stuck somewhere.
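
As a rough illustration of the idea only (the linked guide has its own 
version, which should be preferred), a minimal sketch might look like the 
following. The names format_thread_stacks and start_periodic_dump are made up 
for this example and are not part of mod_wsgi, and the 60 second interval is 
an arbitrary assumption.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text dump of the current stack of every Python thread."""
    # Map thread ids to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("# Thread %s (%s)" % (thread_id, names.get(thread_id, "unknown")))
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip("\n"))
    return "\n".join(lines)


def start_periodic_dump(interval=60.0, stream=None):
    """Start a daemon thread that writes a stack dump every `interval` seconds.

    Returns a threading.Event; call .set() on it to stop the dumps.
    """
    stop = threading.Event()

    def _run():
        # Event.wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            out = stream if stream is not None else sys.stderr
            print(format_thread_stacks(), file=out, flush=True)

    thread = threading.Thread(target=_run, name="stack-dumper", daemon=True)
    thread.start()
    return stop
```

If something like this were called once from the WSGI script file, the dumps 
would go to stderr, which mod_wsgi writes to the Apache error log. A request 
thread that shows the same stack in several consecutive dumps is the one to 
look at.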

Graham

> Thanks, Graham.
> 
> 
> On Monday, August 31, 2020 at 4:56:46 PM UTC-4 Graham Dumpleton wrote:
> Look in the main Apache error log, not the virtual host, and you will 
> probably find a message about a segmentation fault or other error.
> 
> Where it is quite random like this, unless you can enable capture of the core 
> dump file and run a debugger (gdb) on that to get a stack trace, it is going 
> to be hard to track down.
> 
> The only other thing I can suggest is watching the process size to see if it 
> is getting so large that you are running out of memory and a failure to 
> allocate memory causes something to crash.
> 
> Graham
> 
> 
>> On 1 Sep 2020, at 3:28 am, David White <dswh...@gmail.com> wrote:
>> 
>> Hello.  I am running Apache 2.4.43 with mod_wsgi 4.71 compiled against 
>> Python 3.8.3 (all manually compiled, not part of the RHEL 8.1 distro).  
>> Apache MPM is "event".
>> 
>> The application running in one of my virtual hosts will occasionally crash, 
>> but randomly.  Repeating the same request immediately will usually succeed.  
>> The application may continue working for hours before randomly crashing 
>> again.
>> 
>> This started happening recently with an upgrade of the Flask and SQLAlchemy 
>> modules.  (again, all manually installed via pip)
>> 
>> A crash is reported this way in the vhost's error_log:
>> 
>> [Mon Aug 31 11:11:07.504787 2020] [wsgi:error] [pid 35980:tid 
>> 140546546816768] [client 10.83.210.200:47266] 
>> Truncated or oversized response headers received from daemon process 
>> 'sgm-prod': /apps/www/sgm/wsgi/SGM/run.py
>> [Mon Aug 31 11:11:07.504806 2020] [wsgi:error] [pid 35980:tid 
>> 140546571994880] [client 10.31.82.142:35810] 
>> Truncated or oversized response headers received from daemon process 
>> 'sgm-prod': /apps/www/sgm/wsgi/SGM/run.py
>> [Mon Aug 31 11:11:08.512262 2020] [wsgi:info] [pid 40601:tid 
>> 140547110185856] mod_wsgi (pid=40601): Attach interpreter ''.
>> [Mon Aug 31 11:11:08.514415 2020] [wsgi:info] [pid 40601:tid 
>> 140547110185856] mod_wsgi (pid=40601): Adding '/apps/www/sgm/wsgi/SGM' to 
>> path.
>> [Mon Aug 31 11:11:08.514526 2020] [wsgi:info] [pid 40601:tid 
>> 140547110185856] mod_wsgi (pid=40601): Adding 
>> '/apps/vmscan/.virtualenvs/sgm/lib/python3.8/site-packages' to path.
>> [Mon Aug 31 11:11:08.515520 2020] [wsgi:debug] [pid 40601:tid 
>> 140546715096832] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 0 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515575 2020] [wsgi:debug] [pid 40601:tid 
>> 140546706704128] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 1 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515629 2020] [wsgi:debug] [pid 40601:tid 
>> 140546698311424] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 2 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515702 2020] [wsgi:debug] [pid 40601:tid 
>> 140546689918720] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 3 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515737 2020] [wsgi:debug] [pid 40601:tid 
>> 140546681526016] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 4 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515766 2020] [wsgi:debug] [pid 40601:tid 
>> 140546673133312] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 5 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515795 2020] [wsgi:debug] [pid 40601:tid 
>> 140546664740608] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 6 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515821 2020] [wsgi:debug] [pid 40601:tid 
>> 140546656347904] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 7 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515853 2020] [wsgi:debug] [pid 40601:tid 
>> 140546647955200] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 8 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515921 2020] [wsgi:debug] [pid 40601:tid 
>> 140546639562496] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 9 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515948 2020] [wsgi:debug] [pid 40601:tid 
>> 140546631169792] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 10 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.515972 2020] [wsgi:debug] [pid 40601:tid 
>> 140546622777088] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 11 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.516002 2020] [wsgi:debug] [pid 40601:tid 
>> 140546614384384] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 12 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.516036 2020] [wsgi:debug] [pid 40601:tid 
>> 140546605991680] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 13 in daemon process 'sgm-prod'.
>> [Mon Aug 31 11:11:08.516061 2020] [wsgi:debug] [pid 40601:tid 
>> 140546597598976] src/server/mod_wsgi.c(9118): mod_wsgi (pid=40601): Started 
>> thread 14 in daemon process 'sgm-prod'.
>> 
>> The WSGI portion of the configuration for the vhost in Apache looks like 
>> this:
>> 
>> WSGIPassAuthorization On
>> LogLevel info wsgi:trace6
>> SetEnv SGM_PRODUCTION  1
>> SetEnv SGM_USE_ORACLE  1
>> WSGIDaemonProcess sgm-prod user=sgmuser group=sgmgroup threads=15 \
>>     python-home=/apps/sgmuser/.virtualenvs/sgm \
>>     python-path=/apps/www/sgm/wsgi/SGM:/apps/sgmuser/.virtualenvs/sgm/lib/python3.8/site-packages \
>>     socket-timeout=6000
>> WSGIScriptAlias / /apps/www/sgm/wsgi/SGM/run.py
>> <Directory /apps/www/sgm/wsgi/SGM>
>>     Require all granted
>>     AllowOverride AuthConfig
>>     WSGIProcessGroup sgm-prod
>>     WSGIApplicationGroup %{GLOBAL}
>> </Directory>
>> 
>> The main Apache error log shows nothing relevant, and the 
>> application-specific logs (with Python logging messages) show only that the 
>> request was made.  There is no exception traceback data being logged (I'm 
>> guessing the app is crashing before that can happen.)
>> 
>> Given how transitory this error is, I'm not sure how else to configure 
>> Apache or the SGM app itself to get more details about why the "sgm-prod" 
>> process seems to be crashing.
>> 
>> Any help would be appreciated!  Thanks.
>> 
> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "modwsgi" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to modwsgi+u...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/modwsgi/8df00cf3-abeb-43a8-b14a-6666af322a9an%40googlegroups.com.
> 
> 
