Thanks Graham,

I'll do as you suggest and see what we get.

I suspect I have some sort of issue with large directories. As a workaround 
I've been breaking the directories down into 4000 images at a time and the 
performance is acceptable. So, while image processing may not be a great 
idea, it is working well for me provided I don't have huge directories. I 
had one as large as 10,000 that also ran fine, but 30,000+ was a total bust 
with performance rapidly going from 2 images per second to 7 seconds per 
image. With 4000 images in a directory I get consistent performance of 1-2 
images per second.
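
The batching workaround could be scripted roughly as follows. This is only a sketch and not code from the app: the function name `batched_paths` and the choice of `os.scandir` are assumptions made for illustration, and it only looks at files directly inside the given directory.

```python
import os


def batched_paths(directory, batch_size=4000):
    """Yield lists of at most batch_size file paths found directly in directory."""
    batch = []
    # os.scandir streams entries instead of building the full listing up front.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                batch.append(entry.path)
                if len(batch) == batch_size:
                    yield batch
                    batch = []
    # Emit whatever is left over as a final, smaller batch.
    if batch:
        yield batch
```

Each yielded list could then be handed to the existing upload code as if it were its own directory.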

Not sure if that helps narrow down the investigation, but thought I'd 
mention it.

Gary

On Tuesday, December 8, 2020 at 4:02:37 PM UTC-8 Graham Dumpleton wrote:

> Performing a lot of image processing in the web application processes is 
> generally a bad idea. Usually you would do this using a backend task 
> queuing system like Celery.
>
> The main reason it is a bad idea is that Python does not perform very well 
> when you have multiple threads and they are heavily CPU bound. This is 
> because the Python global interpreter lock will result in Python 
> application code effectively being serialised even though you have multiple 
> threads. So for CPU bound work, multithreading is a convenience, but not a 
> performant solution.
>
> First thing I would do is confirm how many threads you actually have 
> running in the process and what they are doing, when the process slows 
> down. For this you can employ the code at:
>
>
> https://modwsgi.readthedocs.io/en/master/user-guides/debugging-techniques.html#extracting-python-stack-traces
>
> You will need to update the example code to Python 3, as it is still Python 2.
>
> With that code in place, you can trigger a dump of what all the threads 
> are up to.
>
> Look for threads being stuck in code which may not be performant in 
> handling large directory listings or general image processing.
>
> Also look for more threads than expected, perhaps because where you are 
> starting worker threads is getting executed more than once for some reason.
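
For reference, a thread dump of this kind can be sketched in Python 3 roughly as below. This is not the code from the mod_wsgi guide linked above, only a minimal illustration of the same idea; the function name `dump_thread_stacks` is hypothetical, and `sys._current_frames()` is a CPython-specific call.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every thread."""
    # Map thread ids to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s)" % (thread_id, names.get(thread_id, "unknown")))
        # format_stack returns one formatted string per stack frame.
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)
```

The returned text could be written to the Apache error log from a request handler or a monitoring thread when the slowdown is observed.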
>
> Graham
>
> On 9 Dec 2020, at 3:17 am, Gary Conley <ga...@goldeneraproductions.org> 
> wrote:
>
> Hi Graham,
>
> Thanks for the reply.
>
> All processing is done within the web application processes, if I 
> understand your question correctly.
>
> This is my first web app, I've done all my prior development as PySide 
> desktop applications and actually migrated this app from a desktop app.
>
> From the upload.html template the user sees a list of directories that 
> can be uploaded. These are all in a central "in box" directory which is 
> updated every 10 seconds. The user selects directories and clicks "Upload 
> Selected" which sends a request containing the selected paths to a 
> '_launch_upload' route in the Flask app.py. 
>
> In my Flask app.py the '_launch_upload' route adds the selected paths to a 
> Python queue. I then start a new python Thread targeting a _start_queue 
> method. _start_queue in turn takes each path and instantiates a "loader" 
> class which subclasses Thread and then calls the run() method on the loader 
> object, which performs the actual upload of that directory. 
>
> The loader object puts the path for each image into a queue (self.q) from 
> which they are processed in parallel up to 24 at a time (user 
> configurable). This is done using a queue/worker configuration as in:
>
> for _ in range(threadcount):
>     t = Thread(target=self.worker)
>     t.daemon = True
>     t.start()
> self.q.join()
>
> The worker method is where all the image processing and upload to the DAM 
> occurs.
>
> The workers take an image off the self.q queue and process them until the 
> queue is empty.
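
A self-contained version of that queue/worker arrangement might look like the sketch below. It is an illustration only, not the app's actual worker: `process_all` and `handle` are hypothetical names, and `handle` stands in for the image processing and upload step.

```python
import queue
import threading


def process_all(items, handle, threadcount=4):
    """Run handle(item) for every item using a pool of worker threads."""
    q = queue.Queue()
    for item in items:
        q.put(item)

    def worker():
        while True:
            try:
                item = q.get_nowait()
            except queue.Empty:
                return  # queue drained, let the thread exit
            try:
                handle(item)
            finally:
                # join() only returns once every get() is matched by task_done().
                q.task_done()

    for _ in range(threadcount):
        t = threading.Thread(target=worker, daemon=True)
        t.start()
    # Block until all queued items have been marked done.
    q.join()
```

Calling `task_done()` in a `finally` block matters here: if one item raises and is never marked done, `q.join()` would wait forever.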
>
> All image processing is done using subprocess.run calling the appropriate 
> app.
>
> When self.q.join() returns the first directory has been fully processed 
> and the run method of the loader object returns. This returns control to 
> the start_upload method in app.py which calls run on the next loader object 
> and so on until all directories are processed.
>
> I may have implied that the app works on a watch folder basis but this is 
> not the case. It is entirely based on the user selection, for various 
> reasons.
>
> I also realized I failed to mention that we are running in wsgi daemon 
> mode.
>
> As a final note, there is an abort_upload route which can access the 
> loader objects and call a stop_upload method on the object which changes 
> the value of a keep_loading flag to false, which stops all processing on 
> the loader object. It also empties the queue of any unprocessed 
> directories. For some reason this method also stopped working. If I call it 
> within the first few seconds of calling launch_upload it works fine, but if 
> I let it go for a bit it no longer has any effect. This had been working 
> well, but has since stopped and yet I didn't change anything in this part 
> of the code. It seems to be related to this other problem with the slow 
> uploads, but I can't be 100% certain.
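
One way to express the stop flag is with `threading.Event`, as in the sketch below. It does not explain why the abort stopped working; it only illustrates the mechanism described above. The class `StoppableLoader` and the attribute `stop_requested` are hypothetical stand-ins for the loader class and its keep_loading flag.

```python
import queue
import threading


class StoppableLoader:
    """Hypothetical stand-in for the loader class described above."""

    def __init__(self, items, handle, threadcount=4):
        self.q = queue.Queue()
        for item in items:
            self.q.put(item)
        self.handle = handle
        self.threadcount = threadcount
        # Plays the role of the keep_loading flag (set means "stop").
        self.stop_requested = threading.Event()

    def stop_upload(self):
        self.stop_requested.set()

    def worker(self):
        while True:
            try:
                item = self.q.get_nowait()
            except queue.Empty:
                return
            try:
                # Skip the work once a stop is requested, but still mark the
                # item done so that q.join() in run() can return.
                if not self.stop_requested.is_set():
                    self.handle(item)
            finally:
                self.q.task_done()

    def run(self):
        for _ in range(self.threadcount):
            threading.Thread(target=self.worker, daemon=True).start()
        self.q.join()
```

With this shape, a stop request lets items already being processed finish, and the remaining queue entries are drained without doing any work.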
>
> I hope that is all clear.
>
> Let me know if any logs or code would be helpful.
>
> I really appreciate your help. I'm somewhat new to web development as I 
> said, but learning fast!
>
> Best,
>
> Gary
> On Monday, December 7, 2020 at 10:48:03 PM UTC-8 Graham Dumpleton wrote:
>
>> Is the image processing being done within the web application processes, 
>> or in a separate set of processes which operate only based on seeing what 
>> is stuck in the upload directory used to queue up images? 
>>
>> Just trying to understand better how the work is broken up.
>>
>> On 8 Dec 2020, at 9:29 am, Gary Conley <ga...@goldeneraproductions.org> 
>> wrote:
>>
>> I have a Flask app running with mod_wsgi version 4.7.1 on CentOS 7 
>> (httpd). Python 3.6.
>>
>> The server is brand new, 64GB RAM, Dual Xeon 3Ghz, all SSD drives, 8gbs 
>> fiber to a Stornext SAN. All very fast hardware.
>>
>> I have a problem where the app tends to slow down dramatically under 
>> certain circumstances and have had no success to date in finding the cause.
>>
>> The app itself is fairly simple. It processes images from a directory and 
>> uploads them to a DAM (php based - under NGINX) running on another server. 
>> The processing consists of generating preview and thumbnail images from the 
>> original images (mostly raw file formats - NEF, CR2 etc) using imageMagick, 
>> ufraw and exiftool, extracting xmp data using exiftool and uploading these 
>> elements to the DAM through the DAM's API.
>>
>> The processing is multithreaded using python queues and workers, nothing 
>> fancy.
>>
>> The app gathers data about the images from a mysql database and persists  
>> transactional data to mysql for workflow management purposes.
>>
>> There is a simple html template with controls for selecting which 
>> directories to upload and provide the user with feedback on progress, 
>> errors and so on, using ajax calls. The user can also abort the upload 
>> process.
>>
>> Under normal circumstances the app will process 2 images per second. 
>> There have been instances however where the app slows way down, taking 7-10 
>> seconds to upload 1 image, a factor of 15 to 20 times slower.
>>
>> I have run metrics on the various steps of the upload procedure and it 
>> appears that every aspect of the app slows down. Generating a preview 
>> image, which normally takes less than a second, takes 40 seconds; extracting 
>> metadata with exiftool, typically less than half a second, takes 7-10 
>> seconds. Database response seems to remain constant. Upload to the DAM also 
>> takes much longer.
>>
>> When the slow down occurs requests from the browser to the app time out. 
>> Aborting the upload procedure becomes impossible and the only way to stop 
>> it is to stop Apache (httpd) directly on the server.
>>
>> We have checked CPU usage (less than 10%), memory usage (less than 8 GB 
>> on a 64GB machine) and also checked IO, confirming that we were able to read 
>> and write at up to 9.5gbs while the app was running at a snail's pace.
>>
>> The only thing we have been able to isolate as having any effect on the 
>> upload speed is the size of the upload queue. When a user selects a 
>> directory to be uploaded the files in that directory (and all 
>> sub-directories) are sorted by size and put into a python queue from which 
>> they are then uploaded with between 10 and 20 threads, which is user 
>> configurable. We have tested with queues up to 10,000 files with no issue 
>> at all. We had a slow down with a queue that was over 35,000 images.
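
The queue-building step described above could be sketched as follows. This is an illustration, not the app's code: `build_upload_queue` is a hypothetical name, and smallest-first ordering is an assumption, since the description does not say which direction the sort runs.

```python
import os
import queue


def build_upload_queue(root):
    """Return a queue of file paths under root, smallest files first."""
    sized = []
    # os.walk descends into all sub-directories of root.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sized.append((os.path.getsize(path), path))
    # Tuples sort by size first, then by path for equal sizes.
    sized.sort()
    q = queue.Queue()
    for _size, path in sized:
        q.put(path)
    return q
```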
>>
>> The content of the images makes little to no difference in speed. 
>> Processing huge image files (such as 5GB PSB files) does slow down the 
>> upload process, but only to 1.5 seconds per file. The last instance of slow 
>> down occurred on 5MB jpgs. The speed on 60MB NEF and 200KB jpgs is 
>> virtually the same under normal circumstances.
>>
>> We suspect a memory issue, but don't see any increase in memory usage 
>> using htop, top or glances.
>>
>> We restarted httpd in the hopes it would clear up any memory leak, with 
>> no improvement. We even rebooted the machine with similar hopes, again with 
>> no improvement.
>>
>> We tried changing the number of threads in our wsgi config (from 5 to 15) 
>> and also changed the number of processes to 2. We even tried setting 
>> threads to 1, which had disastrous effects. None of this made any 
>> improvement. We put our settings back to 1 process and 5 threads.
>>
>> Any clues on what could be causing this slow down or ideas on how to 
>> isolate what is causing it would be much appreciated. We've spent days 
>> trying to track it down with the only solution being to break up our jobs 
>> into smaller chunks, which is very non-optimum for our workflow as it has 
>> to be done manually due to the nature of the content.
>>
>> We can send files if needed, but are not sure what to send.
>>
>> Thank you.
>>
>> Gary
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "modwsgi" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to modwsgi+u...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/modwsgi/389a39bd-2ad1-4c65-a784-0762029d0825n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/modwsgi/389a39bd-2ad1-4c65-a784-0762029d0825n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>>
>>

