I was grepping all the log files for any messages within a minute or two 
before the segfaults so if a request was logged anywhere I should have seen 
it.  
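
For anyone wanting to repeat that check without grepping by hand, here is a rough Python sketch of the same correlation. It is only an illustration under some assumptions: the access log is assumed to use the common log format timestamp, both logs are assumed to be in the same local time zone (the UTC offset is ignored), and the names `segfault_times`, `requests_near` and `window_seconds` are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Error log timestamp, e.g. "[Tue May 31 01:29:06.434000 2022]".
ERROR_TS = re.compile(r"^\[(\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}\.\d{6} \d{4})\]")
# Access log timestamp (assumed common log format),
# e.g. "[31/May/2022:01:29:05 -0600]". The UTC offset is ignored.
ACCESS_TS = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def segfault_times(error_lines):
    """Return the timestamps of all 'Segmentation fault' error log entries."""
    times = []
    for line in error_lines:
        match = ERROR_TS.match(line)
        if match and "Segmentation fault" in line:
            times.append(
                datetime.strptime(match.group(1), "%a %b %d %H:%M:%S.%f %Y"))
    return times


def requests_near(error_lines, access_lines, window_seconds=120):
    """Map each segfault time to the access log lines in the window before it."""
    window = timedelta(seconds=window_seconds)
    accesses = []
    for line in access_lines:
        match = ACCESS_TS.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
            accesses.append((when, line.rstrip("\n")))
    result = {}
    for crash in segfault_times(error_lines):
        result[crash] = [line for when, line in accesses
                         if crash - window <= when <= crash]
    return result
```

Passing in the lines of the error log and of an access log gives a dict keyed by segfault time; an empty list for a given segfault means no request was logged in the window before it.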

I'll mention the LogLevel Debug setting to them, but there were complaints 
before that LogLevel Info was too noisy so I'm not sure that will fly.

I'll look into the possibility of errant app threads and post back if 
anything turns up.  Thanks very much for your help with this.
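
A minimal sketch of what such a check for errant app threads could look like, using only the standard library. The function name `dump_threads` is made up for this example, and how it gets triggered inside the app (a timer, a touched file, a debug URL) is left open.

```python
import sys
import threading
import traceback


def dump_threads():
    """Return a text report of every Python thread and its current stack."""
    # Map thread identifiers to names for threads the threading module knows.
    names = {t.ident: t.name for t in threading.enumerate()}
    report = []
    for ident, frame in sys._current_frames().items():
        # Threads the threading module does not know about show as "unknown".
        report.append("Thread %s (%s)" % (ident, names.get(ident, "unknown")))
        report.extend(
            line.rstrip("\n") for line in traceback.format_stack(frame))
    return "\n".join(report)
```

Writing the returned report to `sys.stderr` from inside the wsgi app should land it in the Apache error log, where any thread that is not one of the expected request handler threads would stand out.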
On Tuesday, October 25, 2022 at 11:05:28 PM UTC-6 Graham Dumpleton wrote:

> The only way I can think of that you may get a request which wasn't 
> logged, is if an internal Apache request was triggered via an internal 
> redirect from another Apache module. There still has to be an original 
> request, but it would be logged as different request URL to where it got 
> internally redirected.
>
> Since you are using mod_wsgi daemon mode you can likely see better 
> evidence of all requests being handled if you turn on verbose debugging 
> mode, but it would be quite noisy.
>
>     LogLevel debug
>     WSGIVerboseDebugging On
>
> Graham
>
> On 26 Oct 2022, at 3:33 pm, stuart mcgraw <smcg...@gmail.com> wrote:
>
> I didn't compile mod_wsgi myself so I can't say 100% but the person who 
> did said so and there is a source directory on the machine named 
> mod_wsgi-4.9.4 with a mod_wsgi.so file whose sha1 checksum matches that of 
> the file in the apache modules/ directory, so I'd say I'm 99.9% sure.
>
> But the chances of >1Gi requests being made seem pretty small.  The urls 
> haven't been publicized, there are only a handful of known users accessing 
> the urls infrequently as testers and nothing in the application would 
> generate requests of that magnitude.
>
> This may be out of scope for you, but are you aware of any (reasonably 
> normal) circumstances under which a mod_wsgi process could receive a 
> request that wasn't logged by Apache?  Or perhaps I could modify the 
> mod_wsgi source code to print a message to a file when a request was 
> received (which I could then correlate with the Apache logs to answer the 
> question.)  Because usage is very light and this is only for short term 
> debugging, I don't think locking or anything fancy would be needed?
>
> And I am still wondering about library mismatches or conflicts since 
> Apache, Python, mod_wsgi and C-based Python modules (eg psycopg2) used by 
> the app were all built from source.  Is it possible that some version 
> mismatch there causes some memory corruption that later manifests itself 
> when one of the mod_wsgi housekeeping threads runs?  I would like if 
> possible to rule this out or at least put it at the bottom of the list.  
> On Tuesday, October 25, 2022 at 8:35:07 PM UTC-6 Graham Dumpleton wrote:
>
>> I know you said you were using mod_wsgi/4.9.4, but are you absolutely 
>> sure?  Apache/2.4.54 made a breaking change by changing the default for 
>> LimitRequestBody directive, which would cause mod_wsgi daemon processes to 
>> crash when they were sent large request bodies over 1Gi. This was fixed in 
>> version 4.9.4, but am wondering whether your production system has an older 
>> version than your development systems use and you just aren't aware of that.
>>
>>
>> https://modwsgi.readthedocs.io/en/master/release-notes/version-4.9.4.html#bugs-fixed
>>
>> As to background threads, mod_wsgi has a couple of background threads 
>> which check for idle activity, deadlocks and things, but they touch so 
>> little they have never caused issues in the past. Beyond that, the request 
>> handler threads themselves should be stuck on a select loop if no requests 
>> are happening.
>>
>> On 26 Oct 2022, at 1:12 pm, stuart mcgraw <smcg...@gmail.com> wrote:
>>
>> Again, thanks for those suggestions.
>>
>> The OOM killer seems not to be an issue.  I've been told there are no 
>> signs of it in the system logs and no signs of memory problems via 
>> monitoring during normal operations.
>>
>> Nor did "WSGIDestroyInterpreter Off" have any effect; the segfaults are 
>> still occurring after that was added and Apache restarted.
>>
>> My understanding of how mod_wsgi works is pretty sketchy.  IIUC you are 
>> saying that the mod_wsgi processes are sitting there, waiting on a select() 
>> call or the like, to receive a request from the mod_wsgi code within 
>> Apache; and in that state they cannot simply spontaneously crash -- it must 
>> be that either the process received a request from Apache (via the 
>> mod_wsgi module) or there is some independent thread running in the Python 
>> part of the mod_wsgi process (which is running my wsgi app) that is causing 
>> the crash?
>>
>> I based my claim that there were no requests coincident with the 
>> segfaults on the lack of log messages within a second or two of some 
>> of the segfaults.  (It's a moderately busy server so of course there were 
>> also some close in time but for seemingly unrelated pages: eg, python, php 
>> or c cgi, or html.)  Is it possible that the mod_wsgi processes are getting 
>> woken up by something that does not produce an apache access log entry?
>>
>> I'm still working on the python thread hypothesis (this is a production 
>> server so changes aren't easy.)
>> On Sunday, October 23, 2022 at 2:12:02 PM UTC-6 Graham Dumpleton wrote:
>>
>>> How much memory do the processes use? Maybe the system OOM process 
>>> killer is killing the processes as they consume lots of memory and the 
>>> system thinks it is running low. There were some potential problems 
>>> introduced with Python 3.9 with how processes are shut down, which can 
>>> cause embedded systems to fail on shutdown.
>>>
>>> See:
>>>
>>>
>>> https://modwsgi.readthedocs.io/en/master/release-notes/version-4.9.1.html#features-changed
>>>
>>> You can try setting:
>>>
>>> WSGIDestroyInterpreter Off
>>>
>>> as mentioned in those change notes and see if it goes away.
>>>
>>> Other than that, if you are confident that no new requests are arriving, 
>>> can only suggest you work out if there are background threads running in 
>>> Python.
>>>
>>> You can do that by adding code as described in:
>>>
>>>
>>> https://modwsgi.readthedocs.io/en/master/user-guides/debugging-techniques.html#extracting-python-stack-traces
>>>
>>> and triggering a dump of running threads by touching a file in the file 
>>> system.
>>>
>>> It might also be helpful if you can work out how to have the system 
>>> preserve core dumps from Apache so they can be used to extract a true 
>>> process stack trace as that may give a clue.
>>>
>>> Graham
>>>
>>> On 24 Oct 2022, at 3:51 am, stuart mcgraw <smcg...@gmail.com> wrote:
>>>
>>> Thanks for that suggestion.  I passed it on to the site admin and 
>>> he made the "application-group=%{GLOBAL}" change, but unfortunately it made 
>>> no difference; the segfaults are still occurring as before.  Is there 
>>> anything else I can look at?  The current configuration is:
>>>
>>> WSGIDaemonProcess jmwsgi processes=2 threads=10 \
>>>     display-name=apache2-jmwsgi locale=en_US.UTF-8 lang=en_US.UTF-8
>>> WSGIScriptAlias /jmwsgi /usr/local/apache2/jmdictdb/wsgifiles/jmdictdb.wsgi \
>>>     process-group=jmwsgi application-group=%{GLOBAL}
>>>
>>> Would changing to "processes=N threads=1" or "processes=1 threads=N" 
>>> provide any useful info?  Apache, mod_wsgi and the other web server 
>>> components were all built there (ie, they are not from distro-supplied 
>>> packages.)  Are the symptoms consistent with a mismatched library or some 
>>> other build configuration issue?  Or conversely, maybe they make that 
>>> unlikely? 
>>> On Friday, October 21, 2022 at 11:48:51 PM UTC-6 Graham Dumpleton wrote:
>>>
>>>> Try changing it to:
>>>>
>>>> WSGIScriptAlias /jmwsgi /usr/local/apache2/jmdictdb/wsgifiles/jmdictdb.wsgi \
>>>>       process-group=jmwsgi application-group=%{GLOBAL}
>>>>
>>>> You are possibly using a third party Python module which isn't designed 
>>>> to work in Python sub interpreters. That application group value forces 
>>>> the main Python interpreter context to be used, which can avoid problems 
>>>> with crashes, or thread deadlocks when such broken modules are used.
>>>>
>>>>
>>>> https://modwsgi.readthedocs.io/en/master/user-guides/application-issues.html#python-simplified-gil-state-api
>>>>
>>>> That option on WSGIScriptAlias has the same effect as 
>>>> WSGIApplicationGroup but is more specific. For the same reason, your use 
>>>> of WSGIProcessGroup is redundant as the process group setting on 
>>>> WSGIScriptAlias takes precedence.
>>>>
>>>> Graham
>>>>
>>>> On 22 Oct 2022, at 2:35 pm, stuart mcgraw <smcg...@gmail.com> wrote:
>>>>
>>>> My apologies for the delayed response, I thought I had my google email 
>>>> forwarded to my main email account but... :-(
>>>>
>>>> My intent was that the processes run in daemon mode.  I had missed the 
>>>> info about the WSGIRestrictEmbedded directive when I went through the doc, 
>>>> I'll ask the admin there to add that.  The full configuration for wsgi is:
>>>>
>>>>   WSGIDaemonProcess jmwsgi processes=2 threads=10 \
>>>>       display-name=apache2-jmwsgi locale=en_US.UTF-8 lang=en_US.UTF-8
>>>>   WSGIProcessGroup jmwsgi
>>>>   WSGIScriptAlias /jmwsgi /usr/local/apache2/jmdictdb/wsgifiles/jmdictdb.wsgi \
>>>>       process-group=jmwsgi
>>>>   # Serve static files directly without using the app.
>>>>   Alias /jmwsgi/web/ /usr/local/apache2/jmdictdb/
>>>>   <Directory /usr/local/apache2/jmdictdb>
>>>>       DirectoryIndex disabled
>>>>       Require all granted
>>>>   </Directory>
>>>>
>>>> The server has a number of virtual hosts and there were a few mod_wsgi 
>>>> "Loading Python" messages in the error log for one of them (for ssl) but 
>>>> nothing looking errorish and only a few, nowhere near the number of 
>>>> segfault messages:
>>>>
>>>>   [Sat Oct 01 07:50:12.090697 2022] [wsgi:info] [pid 731154:tid 140442461062912] [remote *.*.*.*:40566] mod_wsgi (pid=731154, process='jmwsgi', application='www.edrdg.org|/jmwsgi'): Loading Python script file '/usr/local/apache2/jmdictdb/wsgifiles/jmdictdb.wsgi'.
>>>>
>>>> But the wsgi configuration stuff is outside all the virtual hosts.
>>>>
>>>> When the server starts, there are a couple messages in the main error 
>>>> log file like:
>>>>
>>>>   [Sat Oct 01 06:42:26.499086 2022] [wsgi:info] [pid 731041:tid 140442622753728] mod_wsgi (pid=731041): Starting process 'jmwsgi' with uid=33, gid=33 and threads=10.
>>>>   [Sat Oct 01 06:42:26.499518 2022] [wsgi:info] [pid 731039:tid 140442622753728] mod_wsgi (pid=731039): Starting process 'jmwsgi' with uid=33, gid=33 and threads=10.
>>>>
>>>> and these are followed/interleaved with the "Initializing Python" and 
>>>> "Attach interpreter" messages but after server startup the messages are 
>>>> limited to the sets of three I showed: "Initializing Python" and "Attach 
>>>> interpreter" followed sometime later by the Segmentation fault.
>>>>
>>>> Does any of that help?
>>>> On Sunday, October 16, 2022 at 4:16:09 PM UTC-6 Graham Dumpleton wrote:
>>>>
>>>>> What other mod_wsgi configuration is there besides the 
>>>>> WSGIDaemonProcess directive? That alone only creates a mod_wsgi daemon 
>>>>> process group, but does not tell mod_wsgi to use it. Thus cannot tell 
>>>>> whether you are using embedded mode or daemon mode. The logs are also odd 
>>>>> in that would expect to see other messages in there around when processes 
>>>>> are created if using daemon mode, plus an indication of whether a message 
>>>>> is being generated from an Apache child process or mod_wsgi daemon 
>>>>> process.
>>>>>
>>>>> So can you supply the other parts of the mod_wsgi configuration so can 
>>>>> see if properly using daemon mode or not. Also look for logs from mod_wsgi 
>>>>> in any per virtual host specific error log file and not just main Apache 
>>>>> error log if you separate them. Finally, if you are only intending to use 
>>>>> mod_wsgi daemon mode, ensure you add the directive:
>>>>>
>>>>>     WSGIRestrictEmbedded On
>>>>>
>>>>> outside of all VirtualHost definitions so that any attempt to 
>>>>> initialise/use Python in main Apache child processes is disabled.
>>>>>
>>>>> Graham
>>>>>
>>>>> On 17 Oct 2022, at 8:07 am, stuart mcgraw <smcg...@gmail.com> wrote:
>>>>>
>>>>> I am the author of a Flask application running under Linux/Apache mod_wsgi 
>>>>> that is experiencing intermittent, random segmentation faults.  
>>>>>
>>>>> What is unusual is that the mod_wsgi process segfaults are occurring 
>>>>> not at startup when mod_wsgi is loaded, or when an incoming request 
>>>>> accesses the app, but when the wsgi processes are just sitting there, 
>>>>> quiescent.
>>>>>
>>>>> From a user's point of view, everything looks fine: the mod_wsgi 
>>>>> processes and the app respond with the right results with no sign of 
>>>>> trouble at the client's browser.  But looking at the Apache logs shows 
>>>>> the wsgi processes periodically segfaulting and getting restarted with no 
>>>>> correlated incoming requests.  They die sometimes after running for a few 
>>>>> minutes, sometimes after a few hours.  There are no incoming requests to 
>>>>> the wsgi app logged near the time of these crashes.
>>>>>
>>>>> For example:
>>>>> [Mon May 30 22:35:43.040387 2022] [wsgi:info] [pid 2575903:tid 139929303559104] mod_wsgi (pid=2575903): Initializing Python.
>>>>> [Mon May 30 22:35:43.099053 2022] [wsgi:info] [pid 2575903:tid 139929303559104] mod_wsgi (pid=2575903): Attach interpreter ''.
>>>>> [Tue May 31 01:29:06.434000 2022] [core:notice] [pid 2876203:tid 139929303559104] AH00052: child pid 2511562 exit signal Segmentation fault (11)
>>>>> [Tue May 31 01:29:07.466268 2022] [wsgi:info] [pid 2605661:tid 139929303559104] mod_wsgi (pid=2605661): Initializing Python.
>>>>> [Tue May 31 01:29:07.517413 2022] [wsgi:info] [pid 2605661:tid 139929303559104] mod_wsgi (pid=2605661): Attach interpreter ''.
>>>>> [Tue May 31 04:14:59.405491 2022] [core:notice] [pid 2876203:tid 139929303559104] AH00052: child pid 2575903 exit signal Segmentation fault (11)
>>>>>
>>>>> My wsgi app is still being tested so other than infrequent requests 
>>>>> generated by me and a few other people there is very little traffic to 
>>>>> it.  However the web server itself is handling some continuous moderate 
>>>>> volume of traffic to other apps including to C, Python and PHP CGI apps.
>>>>>
>>>>> What I know about the environment (if any other info would be useful 
>>>>> I'll try and dig it up):
>>>>>
>>>>> $ cat /etc/*release
>>>>> PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
>>>>>
>>>>> Apache, mod_wsgi, python were all built from source by the site's 
>>>>> administrator.
>>>>>
>>>>> There are (at least) two Pythons on the system:
>>>>>  /usr/bin/python3 -- 3.9.2
>>>>>  /usr/local/bin/python3 -- 3.10.1
>>>>>
>>>>> Apache/mod_wsgi was supposedly built against python-3.10.  From 
>>>>> the http server header:
>>>>>   Apache/2.4.54 (Unix) OpenSSL/1.1.1n mod_wsgi/4.9.4 Python/3.10 PHP/7.4.23
>>>>>
>>>>> The Apache .conf file uses:
>>>>>   WSGIDaemonProcess myapp processes=2 threads=10 \
>>>>>     display-name=apache2-myapp locale=en_US.UTF-8 lang=en_US.UTF-8
>>>>>
>>>>> $ /usr/local/apache2/bin/httpd -V
>>>>> Server version: Apache/2.4.54 (Unix)
>>>>> Server built:   Oct 13 2022 00:07:38
>>>>> Server's Module Magic Number: 20120211:124
>>>>> Server loaded:  APR 1.6.5, APR-UTIL 1.6.1, PCRE 10.36 2020-12-04
>>>>> Compiled using: APR 1.6.5, APR-UTIL 1.6.1, PCRE 10.36 2020-12-04
>>>>> Architecture:   64-bit
>>>>> Server MPM:     event
>>>>>   threaded:     yes (fixed thread count)
>>>>>     forked:     yes (variable process count)
>>>>> Server compiled with....
>>>>>  -D APR_HAS_SENDFILE
>>>>>  -D APR_HAS_MMAP
>>>>>  -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>>>>>  -D APR_USE_SYSVSEM_SERIALIZE
>>>>>  -D APR_USE_PTHREAD_SERIALIZE
>>>>>  -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>>>>>  -D APR_HAS_OTHER_CHILD
>>>>>  -D AP_HAVE_RELIABLE_PIPED_LOGS
>>>>>  -D DYNAMIC_MODULE_LIMIT=256
>>>>>  -D HTTPD_ROOT="/usr/local/apache2"
>>>>>  -D SUEXEC_BIN="/usr/local/apache2/bin/suexec"
>>>>>  -D DEFAULT_PIDLOG="logs/httpd.pid"
>>>>>  -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>>>>>  -D DEFAULT_ERRORLOG="logs/error_log"
>>>>>  -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>>>>>  -D SERVER_CONFIG_FILE="conf/httpd.conf"
>>>>>
>>>>> $ bin/httpd -M
>>>>> Loaded Modules:
>>>>>  core_module (static)
>>>>>  so_module (static)
>>>>>  http_module (static)
>>>>>  mpm_event_module (static)
>>>>>  authz_core_module (shared)
>>>>>  authz_host_module (shared)
>>>>>  unixd_module (shared)
>>>>>  dir_module (shared)
>>>>>  access_compat_module (shared)
>>>>>  env_module (shared)
>>>>>  alias_module (shared)
>>>>>  log_config_module (shared)
>>>>>  ssl_module (shared)
>>>>>  mime_module (shared)
>>>>>  socache_shmcb_module (shared)
>>>>>  setenvif_module (shared)
>>>>>  cgid_module (shared)
>>>>>  userdir_module (shared)
>>>>>  headers_module (shared)
>>>>>  rewrite_module (shared)
>>>>>  autoindex_module (shared)
>>>>>  negotiation_module (shared)
>>>>>  dav_module (shared)
>>>>>  deflate_module (shared)
>>>>>  info_module (shared)
>>>>>  status_module (shared)
>>>>>  wsgi_module (shared)
>>>>>  evasive24_module (shared)
>>>>>  php7_module (shared)
>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "modwsgi" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to modwsgi+u...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/modwsgi/e06f789f-6023-417e-8b10-1f570adc069cn%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/modwsgi/e06f789f-6023-417e-8b10-1f570adc069cn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
>>>>>
>>>>>

