Hello,

thanks to everyone who replied. Here are some conclusions of mine:

Today's file-based cache code seems to suffer from the same problems it had
7 years ago. Every time you call .set(), it asks the OS for a list of the cache
files, just to count them (for the purpose of culling). This is slow. The
culling strategy is to delete a random sample of cache entries. So Russell's
comment seems valid today, at least with respect to culling. Of Django's
included cache backends, apparently only memcached is suitable for a large
cache in production. Redis could be a good idea for adding persistence, but it
is non-standard (not included with Django).
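
To make the culling behaviour described above concrete, here is a minimal
standalone sketch. It is not Django's actual code: the function names cull()
and set_entry() are hypothetical, and the one-file-per-key layout is a
simplification. The defaults of 300 and 3 are the documented defaults of the
MAX_ENTRIES and CULL_FREQUENCY cache options.

```python
import os
import random
import tempfile


def cull(directory, max_entries=300, cull_frequency=3):
    """Delete a random 1/cull_frequency of the entries once max_entries is reached.

    Returns the number of files deleted.
    """
    # The expensive step: a full directory listing, done only to count entries.
    entries = os.listdir(directory)
    if len(entries) < max_entries:
        return 0
    if cull_frequency == 0:
        # A frequency of 0 means the whole cache is dumped.
        victims = entries
    else:
        # Victims are chosen at random, regardless of age or usage.
        victims = random.sample(entries, int(len(entries) / cull_frequency))
    for name in victims:
        os.remove(os.path.join(directory, name))
    return len(victims)


def set_entry(directory, key, value, **cull_options):
    """Hypothetical stand-in for .set(): cull first, then write one file per key."""
    cull(directory, **cull_options)
    with open(os.path.join(directory, key), "w") as f:
        f.write(value)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    for i in range(10):
        set_entry(demo_dir, "key%d" % i, "value", max_entries=100)
    deleted = cull(demo_dir, max_entries=5, cull_frequency=3)
    print(deleted, "deleted,", len(os.listdir(demo_dir)), "left")
```

Under these assumptions, with the defaults, the first .set() after the cache
reaches 300 entries would delete about 100 of them at random, so a cache whose
items must all be present cannot safely need more than MAX_ENTRIES items.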

In any case, Redis is not appropriate for my use case: I don't need the speed,
so storing the information in RAM, which costs more than the filesystem, is
suboptimal.

The fact that a cache knows how to get the information if it doesn't have it is
an interesting observation that I hadn't thought about, but it appears to be
true for most uses of "cache" that I can think of (it doesn't apply to write
caches). Therefore I'm using the cache for a purpose other than the one for
which it was designed, which can create all sorts of problems (such as a new
administrator, or even an old one, not knowing or forgetting that they can't
just delete the cache). However, I will take the risk and continue using it for
a while: for these two small projects, implementing a more complicated
solution, or adding another component and thus raising the bar for other people
to replace me, isn't worth it.
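
For reference, the "cache knows how to get the information" pattern could be
sketched as below. This is only an illustration: DictCache is a hypothetical
in-memory stand-in that mimics the get()/set() calls of Django's cache API,
and get_result() and the recalculate callable are made-up names. Django's
cache API also offers get_or_set(), which covers the same pattern.

```python
class DictCache:
    """Hypothetical in-memory stand-in exposing get()/set() like Django's cache."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # The timeout is ignored in this stand-in; entries never expire.
        self._data[key] = value


# Sentinel so that a legitimately cached None is not mistaken for a miss.
_MISSING = object()


def get_result(cache, key, recalculate):
    """Return the cached value, recomputing and storing it on a miss."""
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        value = recalculate()  # slow path: only taken if the entry was lost
        cache.set(key, value, None)
    return value


if __name__ == "__main__":
    cache = DictCache()
    print(get_result(cache, "field-1", lambda: {"irrigate": True}))
```

With a fallback like this, deleting the cache would cost a slow first request
rather than break the site, at the price of running the calculation inside
the request.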

Antonis Christofides
http://djangodeployment.com


On 2017-05-27 12:25, Antonis Christofides wrote:
>
> Hello all,
>
> I have an application that calculates and tells you whether a specific crop at
> a specific piece of land needs to be irrigated, and how much. The calculation
> lasts for a few seconds, so I'm doing it offline with Celery. Every two hours
> new meteorological data comes in and all the pieces of land are recalculated.
>
> The question is where to store the results of the calculation. I thought that
> since they are re-creatable, the cache would be the appropriate place.
> However, there is a difference with the more common use of the cache: they are
> re-creatable, but they are also necessary. You can't just go and delete any
> item in the cache. This will cripple the website, which expects to find the
> calculation results in the cache. Viewing something on the site will never
> trigger a recalculation (and if I make it trigger, it will be a safety
> procedure for edge cases and not the normal way of doing things). The results
> must also survive reboots, so I chose the file-based cache.
>
> I didn't know about culling, so when the pieces of land grew to 100, and the
> items in the cache to 400 (4 items need to be stored for each piece of land),
> I spent a few hours trying to find out what the heck is going on. I solved the
> problem by tweaking the culling parameters. However all this has raised a few
> issues:
>
>  1. The filesystem cache can't grow too much because of issue 11260
>     <https://code.djangoproject.com/ticket/11260>, which is marked wontfix.
>     According to Russell Keith-Magee
>     <https://code.djangoproject.com/ticket/11260#comment:7>,
>
>         "the filesystem cache is intended as an easy way to test caching, not
>         as a serious caching strategy. The default cache size and the cull
>         strategy implemented by the file cache should make that obvious. If
>         you need a cache capable of holding 100000 items, I strongly recommend
>         you look at memcache. If you insist on using the filesystem as a
>         cache, it isn't hard to subclass and extend the existing cache."
>
>     If these comments are correct, then the documentation needs some fixing,
>     because not only does it not say that the filesystem cache is not for
>     serious use, but it implies the opposite:
>
>         "Without a really compelling reason, ... you should stick to the cache
>         backends included with Django. They’ve been well-tested and are easy
>         to use."
>
>     Is Russell not entirely correct perhaps, or is the documentation? Or am I
>     missing something?
>
>  2. In the end, is it a bad idea to use the cache for this particular case? I
>     also have a similar use case in an unrelated app: a page that needs about
>     a minute to render. Although I've implemented a quick-and-dirty solution
>     of increasing the web server's timeout and caching the page, I guess the
>     correct way would be to produce that page offline with Celery or so. Where
>     would I store such a page if not in the cache?
>
> -- 
> Antonis Christofides
> http://djangodeployment.com
> -- 
> You received this message because you are subscribed to the Google Groups
> "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected]
> <mailto:[email protected]>.
> To post to this group, send email to [email protected]
> <mailto:[email protected]>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/a5a8d1ab-f4e0-a6b5-b1da-acc9dc2dbf9d%40djangodeployment.com
> <https://groups.google.com/d/msgid/django-users/a5a8d1ab-f4e0-a6b5-b1da-acc9dc2dbf9d%40djangodeployment.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.
