Graham,
It is people like you that make the open-source world go round. Thank you
for pointing out my errors and solving them at the same time.
I implemented the new config and the app is running as expected in terms of
functionality, which is great.
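For reference, one way to confirm that the new config took effect is to check the "process" field in the mod_wsgi "Loading Python script file" log lines, as Graham describes below: an empty string means embedded mode, a process group name means daemon mode. A minimal sketch, assuming the log line format quoted further down; the regex and the function name wsgi_mode are illustrative and not part of mod_wsgi:

```python
import re

# Matches the "mod_wsgi (pid=..., process='...', application='...')" part of
# an info-level log line. The pattern is an assumption based on the sample
# line quoted in this thread, not an official format specification.
LOAD_RE = re.compile(
    r"mod_wsgi \(pid=(?P<pid>\d+),\s*"
    r"process='(?P<process>[^']*)',\s*"
    r"application='(?P<application>[^']*)'\)"
)


def wsgi_mode(log_line):
    """Return (mode, process_group, application_group), or None if no match."""
    match = LOAD_RE.search(log_line)
    if match is None:
        return None
    process = match.group("process")
    # An empty process group name indicates embedded mode.
    mode = "daemon" if process else "embedded"
    return mode, process, match.group("application")


sample = (
    "[Sun Dec 13 03:29:39.291197 2020] [wsgi:info] [pid 16283:tid "
    "139637154567936] [client 10.12.17.31:41458] mod_wsgi (pid=16283, "
    "process='', application='phrasea_upload|'): Loading Python script file "
    "'/var/www/upload-flask/phrasea_upload_flask.wsgi'."
)
print(wsgi_mode(sample))
```

With the corrected config, the same check should report daemon mode with the process group phrasea_uploadapp and an empty application group.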
I still have a bit of an issue with speed, but nowhere near as bad as it
was before. I have at times seen performance in the sub-0.5-second per file
range. Same hardware, same app, same code. And it would run for hours at
that speed, but gradually slow down. I am now seeing performance in the 0.8
to 1 second per file range. I do understand the issues with Python
multi-threading and have successfully implemented multi-processing in other
(albeit simpler) standalone applications. But it seems odd that I would
have had better performance under the same conditions and see a gradual
slow down. Rebooting the server made no difference. If it had been a memory
leak or something of that nature one would think rebooting would make a
difference.
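To pin down a gradual slowdown like this, it can help to record per-file timings and compare the most recent ones with those from the start of the run. A minimal sketch; the class name DriftTracker, the helper timed and the window size are hypothetical, and the function passed to timed stands in for whatever the app does per file:

```python
import time
from collections import deque


class DriftTracker:
    """Compare the mean of the first N timings with the mean of the latest N."""

    def __init__(self, window=100):
        self.window = window
        self.baseline = []                  # first `window` durations
        self.recent = deque(maxlen=window)  # latest `window` durations

    def record(self, seconds):
        if len(self.baseline) < self.window:
            self.baseline.append(seconds)
        self.recent.append(seconds)

    def slowdown(self):
        """Ratio of recent mean to baseline mean; None until data exists."""
        if not self.baseline:
            return None
        base = sum(self.baseline) / len(self.baseline)
        now = sum(self.recent) / len(self.recent)
        return now / base if base > 0 else None


def timed(tracker, func, *args):
    """Run func(*args), record its wall-clock duration, return its result."""
    start = time.perf_counter()
    try:
        return func(*args)
    finally:
        tracker.record(time.perf_counter() - start)
```

A ratio that creeps from about 1.0 toward 2.0 would correspond to going from 0.5 to 1 second per file, and logging it with a timestamp shows when the drift starts.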
If it is a simple matter to implement celery in my app then I may go that
route. Unfortunately I'm not familiar enough with it yet to make an
accurate assessment. It may not be worth the effort. My app is working as
it is and the performance may be adequate. If I could achieve consistent
0.5 seconds per file performance it would be nice and would utilize my
hardware, which is basically loafing along right now. CPU never gets above
about 30% and there are other services running on this machine! So there's
plenty of performance to be gained... And there is a phenomenon where it
seems to slow down over time.
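Short of adopting celery, the standard library's concurrent.futures.ProcessPoolExecutor is one way to spread per-file work over several cores. A sketch under assumptions: process_file is a placeholder for the real per-image work, and the approach only helps if that work is CPU-bound and independent for each file. It is written as a standalone script; whether it fits inside the web app is a separate question.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def process_file(path):
    # Placeholder for the real per-image work (hypothetical).
    return path, len(path)


def process_all(paths, workers=None):
    """Process files in parallel worker processes; results keep input order."""
    # Default: leave one core free for the other services on the machine.
    workers = workers or max(1, (os.cpu_count() or 2) - 1)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_file, paths, chunksize=8))


if __name__ == "__main__":
    print(process_all(["a.jpg", "bb.jpg"], workers=2))
```

The `if __name__ == "__main__":` guard matters because worker processes may re-import the script when they start.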
But I'm now into straight Python and off topic for this mailing list.
If you have any ideas I welcome them. But once again thanks for your help.
Really appreciated.
Best,
Gary
On Wednesday, December 16, 2020 at 4:45:35 PM UTC-8 Graham Dumpleton wrote:
> You have an incorrect path in the configuration meaning you are actually
> running embedded mode and not daemon mode. This means you are subject to
> Apache dynamic process management and thus you can end up with more than
> one process.
>
> Your config is:
>
> Listen 82
>
> #/etc/apache2/sites-available/phrasea_upload_flask.conf
> <VirtualHost *:82>
> ServerName phrasea_upload
> ServerAlias phrasea_upload.avlib.net
>
> WSGIDaemonProcess phrasea_uploadapp user=apache group=apache threads=15 home=/var/www/upload-flask
> WSGIScriptAlias / /var/www/upload-flask/phrasea_upload_flask.wsgi
>
> Alias /static /var/www/upload-flask/static
> Alias /templates /var/www/upload-flask/templates
>
> <Directory "/var/www/phrasea_upload_flask">
> WSGIProcessGroup phrasea_uploadapp
> Require all granted
> WSGIScriptReloading On
> </Directory>
>
> LogLevel info
> ErrorLog /etc/httpd/logs/phrasea_upload_flask.log
> </VirtualHost>
>
>
> The path which is wrong is:
>
> /var/www/phrasea_upload_flask
>
> given in the Directory directive. It doesn't actually match the path for
> the WSGI script file, meaning that the WSGIProcessGroup directive was
> ignored.
>
> Change this to:
>
> Listen 82
>
> # Ensure that mod_wsgi embedded mode is disabled so you don't accidentally
> # run stuff in embedded mode.
> WSGIRestrictEmbedded On
>
> #/etc/apache2/sites-available/phrasea_upload_flask.conf
> <VirtualHost *:82>
> ServerName phrasea_upload
> ServerAlias phrasea_upload.avlib.net
>
> WSGIDaemonProcess phrasea_uploadapp user=apache group=apache threads=15 home=/var/www/upload-flask
> # Set the daemon mode process group and application interpreter context
> # here explicitly.
> WSGIScriptAlias / /var/www/upload-flask/phrasea_upload_flask.wsgi process-group=phrasea_uploadapp application-group=%{GLOBAL}
>
> Alias /static /var/www/upload-flask/static
> Alias /templates /var/www/upload-flask/templates
>
> <Directory "/var/www/upload-flask">
> Require all granted
> </Directory>
>
> LogLevel info
> ErrorLog /etc/httpd/logs/phrasea_upload_flask.log
> </VirtualHost>
>
>
> Rather than use WSGIProcessGroup, I have set process-group on
> WSGIScriptAlias instead. I also set application-group so that it uses the
> main interpreter context and not a sub interpreter. This can avoid problems
> with third party Python modules that don't work in sub interpreters properly.
>
> You also didn't need WSGIScriptReloading as that is the default for daemon
> mode.
>
> That the path didn't match meant that "Require all granted" wasn't being
> applied either. That it worked without that being applied means you are
> likely on one of the Linux distributions which somehow break Apache access
> controls and set access for the whole filesystem or URL namespace at higher
> scope somewhere.
>
> Since you have "info" for LogLevel, with daemon mode now being properly
> applied, you should clearly see the WSGI script file being loaded in the
> daemon mode process. Right now you have:
>
> [Sun Dec 13 03:29:39.291197 2020] [wsgi:info] [pid 16283:tid
> 139637154567936] [client 10.12.17.31:41458] mod_wsgi (pid=16283,
> process='', application='phrasea_upload|'): Loading Python script file
> '/var/www/upload-flask/phrasea_upload_flask.wsgi'.
>
> See how "process" is an empty string. This means that it was using
> embedded mode. With the fixed config, "process" should show
> "phrasea_uploadapp" and "application" should be an empty string, with the
> latter indicating the main interpreter context rather than a sub interpreter.
>
> Graham
>
> On 17 Dec 2020, at 10:53 am, Gary Conley <[email protected]>
> wrote:
>
> Hi Graham
>
> Thanks for the rapid reply.
>
> You will have to bear with me a bit as this is my first flask app... I've
> attached what I think you are asking for, plus a log file and my main
> app.py file. Probably a bit rough around the edges in places. The file
> attached is a zip file.
>
> This is the only app running under httpd on this server by the way. And in
> checking the logs I noticed I have 4 PIDs, tending to indicate I have 4
> processes running. You'll probably see from my app that it really can only
> have one process running as I'm really only trying to do one thing, process
> images for upload to a DAM, and there is only one user doing this, me.
>
> You will see the entries in the log file from yesterday morning showing an
> abort requested. I was the only user and had only one page open to run the
> app. You will see at 9:47 am the app reports there are 0 jobs running, and
> then 5 minutes later reports 1 job running and 3 in the queue. That is
> accessing two global variables in app.py, phrasea_upload_runnables and
> upload_queue. What makes no sense at all is that both calls, 5 minutes
> apart, should have been addressing the same variables with the same values.
> The PIDs on both those calls are the same.
>
> One other thing I should mention. When we first ran into the performance
> issues caused by the huge directories, we tried running the app under
> nginx, which we never got working. nginx is installed but not running on
> the server. I included my nginx ini file which specifies 4 processes.
>
> Thanks for taking the time to look this over.
>
> Best regards,
>
> Gary
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
>
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/modwsgi/0ad301ab-ff44-4b83-b261-4d9a2db0e33an%40googlegroups.com
>
> <phrasea_upload>
>
>
>