On Mon, Mar 28, 2011 at 5:16 PM, -RAX- <michele.s...@gmail.com> wrote:
>> One query for each model
>> containing one or more FileFields is enough to build a list of the files
>> that ought to exist, and any file not in that list can presumably be
>> removed.
>
> How can I sleep at night knowing that there is a maintenance cron job
> deleting files which can "presumably be removed"?
>
> My point is that yes, it is great that you have removed a data loss bug
> from FileField, but you have moved all the complexity onto the developers,
> and they will end up with even bigger data loss.
> Assuming that the purpose of improving Django is to make it safer and
> easier to use, this "improvement" is a double-edged sword.
>
> Correct me if I am wrong:
> - Subclassing FileField will restore the previous issue.
> - A post delete signal will restore the previous issue.

This is correct -- but this is only a problem if you were affected by
the previous issue. That's not necessarily the case. This is a
situation where Django has to work for *every* case, but it's entirely
possible that one specific site may not be affected. We have to err on
the side of caution because we support everyone's usage, not just one
particular use pattern.

> A maintenance job that deletes unlisted files will itself require
> serious maintenance.
> If a developer adds a file field and forgets to update the
> maintenance script, all the files of that field will be
> deleted.
> Files which, through bad design, are in the same folder as the files
> pointed to by the file field will be removed.
> Thumbnails and other files not pointed to by a file field will
> be removed.
>
> And there will be more serious failures depending on the
> implementation.
> What I am trying to say is that removing orphaned files, even if with
> a cron job, should be done by django automatically and not assuming
> that the developers will take care of that.
>
> That said, I will start implementing such a maintenance job, and I am
> willing to share it so maybe we could include it in a future release
> of Django.

If you can propose such a cleanup task, it's certainly worth
considering for inclusion into Django. There's precedent for Django to
include such cleanup tools -- for example, we include a cron task for
cleaning up stale session entries.

However, the session cron task is a 100% reliable solution that can be
implemented efficiently, and there's only one use pattern (you write
new sessions to the table, you delete old sessions from the table, and
that's it).

The problem with a FileField cleanup task as an idea is that the
original bug with FileField exists because there are many different
ways that FileFields can be used, and we have to support *all* of
them. At the very least, a cleanup task included as part of Django
core would need to work for a significant and easy to identify subset
of uses, and be able to self-identify the cases where it will or won't
work (e.g., as a validation step). The one outcome we *won't* allow is
the case where someone adds a cleanup cron task "because the docs told
me to", and file data gets lost as a result.
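
To make the constraints above concrete, here is a rough sketch of what a
cautious, report-only cleanup pass might look like. It is not a Django API
and not a proposed implementation: the names find_orphans, media_root,
referenced and managed_dirs are hypothetical. In a real project the
referenced set would presumably be built with one query per model that has
a FileField; here it is passed in as plain strings so the sketch runs with
the standard library alone. It only looks inside directories it has been
told it owns, and it never deletes anything.

```python
import os
import tempfile


def find_orphans(media_root, referenced, managed_dirs):
    """List files under the managed directories that nothing references.

    Report-only: returns relative paths and never deletes anything.
    All parameter names are hypothetical, chosen for illustration.
    """
    known = {os.path.normpath(p) for p in referenced}
    orphans = []
    for managed in managed_dirs:
        top = os.path.join(media_root, managed)
        # Only walk directories explicitly declared as managed, so files
        # stored elsewhere (thumbnails, hand-placed files) are never listed.
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                full = os.path.join(dirpath, name)
                rel = os.path.normpath(os.path.relpath(full, media_root))
                if rel not in known:
                    orphans.append(rel)
    return sorted(orphans)


def demo():
    """Build a throwaway directory tree and report its orphans."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("uploads/a.txt", "uploads/b.txt", "thumbs/a_small.png"):
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as handle:
                handle.write("x")
        # Only uploads/a.txt is referenced; thumbs/ is not a managed directory.
        return find_orphans(root, {"uploads/a.txt"}, ["uploads"])
```

In this sketch the thumbnail is left alone because its directory was never
declared as managed, which is one way a task could limit itself to an
easy-to-identify subset of uses. Whether that subset can be identified
reliably for real projects is exactly the open question.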

Yours,
Russ Magee %-)
