On 08/12/20, Gary Conley ([email protected]) wrote:
> I suspect I have some sort of issue with large directories. As a workaround
> I've been breaking the directories down into 4000 images at a time and the
> performance is acceptable. So, while image processing may not be a great
> idea, it is working well for me provided I don't have huge directories. I
> had one as large as 10,000 that also ran fine, but 30,000+ was a total bust
> with performance rapidly going from 2 images per second to 7 seconds per
> image. With 4000 images in a directory I get consistent performance of 1-2
> images per second.
Off topic, but I suggest keeping directories to no more than 1,000 files
if you can manage it, as running "ls" against a directory with more
files than that on cloud storage or other indifferent storage backends
will cause noticeable lag.
A common scheme is to work out how many images you might receive at
peak. If, for instance, you never receive more than 1,000 images in an
hour, it is worth considering an hourly date-based subdirectory
structure, for example:
./images
    2020120811/
        2020120811-01.jpg
        2020120811-02.jpg
    2020120812/
        2020120812-01.jpg
        2020120812-02.jpg
    2020120823/
        2020120823-01.jpg
    2020120907/
        2020120907-01.jpg
        2020120907-02.jpg
        2020120907-03.jpg
    etc.
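The hourly bucketing above is easy to compute from a timestamp. Here is
a minimal Python sketch (the function name and the two-digit sequence
number are illustrative assumptions, not anything from the original
post):

```python
from datetime import datetime
from pathlib import Path

def image_path(base: Path, when: datetime, seq: int) -> Path:
    """Place an image in an hourly subdirectory named YYYYMMDDHH."""
    bucket = when.strftime("%Y%m%d%H")  # e.g. "2020120811"
    return base / bucket / f"{bucket}-{seq:02d}.jpg"

# image_path(Path("images"), datetime(2020, 12, 8, 11), 1)
# -> images/2020120811/2020120811-01.jpg
```

With a scheme like this, no single directory ever holds more than an
hour's worth of files, whatever the total volume grows to.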
Other directory structures based on EXIF data, image type, natural
image naming conventions and so on can work too. Also, if the location
of each image is stored in a database you can avoid doing a directory
scan entirely, since you *know* where each image is. Even so, the
subdirectory approach remains a good idea for maintenance and backup
purposes.
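The database lookup idea can be sketched with SQLite in a few lines;
the `images` table schema and identifiers below are hypothetical, just
to show that a keyed lookup replaces any directory listing:

```python
import sqlite3

# Hypothetical schema: one row per image, storing its full path so a
# lookup never needs to scan a directory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id TEXT PRIMARY KEY, path TEXT NOT NULL)")
conn.execute(
    "INSERT INTO images VALUES (?, ?)",
    ("2020120811-01", "images/2020120811/2020120811-01.jpg"),
)
conn.commit()

row = conn.execute(
    "SELECT path FROM images WHERE id = ?", ("2020120811-01",)
).fetchone()
print(row[0])  # images/2020120811/2020120811-01.jpg
```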
Rory
--
You received this message because you are subscribed to the Google Groups
"modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/modwsgi/20201209073942.GA23162%40campbell-lange.net.