Re: Complains about FileField not deleting files in 1.3.

2012-07-14 Thread -RAX-
It took me a while, but I have open-sourced a tool that cleans leftover
files out of the media folder. It's a cron job with a simple admin GUI, and
it also handles multiple upload folders.

We have been using something similar internally for more than a year,
and we are pretty happy with it.

https://github.com/PuzzleDev/django-uploadcleaner

Just out of curiosity, how do you normally handle file removal?

On Tuesday, March 29, 2011 3:43:13 PM UTC+2, Carl Meyer wrote:
>
> Hi Alex,
>
> On 03/29/2011 01:36 AM, Alex Kamedov wrote:
> > I think cron jobs are overkill in many simple cases where the old
> > behaviour was useful and simpler.
> > Why don't you want to include DeletingFileField[1] in Django?
> > 
> > [1] https://gist.github.com/889692
>
> Because, as mentioned above, it is known to cause data loss in certain
> situations (rolled-back transactions, overlapping upload-to
> directories), and we are not very fond of including things in Django
> that cause some Django users to lose their data. If you understand those
> risks and want to use DeletingFileField in your projects, it's not hard
> to do so.
>
> Carl
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-developers/-/wpPZcb_BKr0J.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: Complains about FileField not deleting files in 1.3.

2011-03-29 Thread Carl Meyer
Hi Alex,

On 03/29/2011 01:36 AM, Alex Kamedov wrote:
> I think cron jobs are overkill in many simple cases where the old
> behaviour was useful and simpler.
> Why don't you want to include DeletingFileField[1] in Django?
> 
> [1] https://gist.github.com/889692

Because, as mentioned above, it is known to cause data loss in certain
situations (rolled-back transactions, overlapping upload-to
directories), and we are not very fond of including things in Django
that cause some Django users to lose their data. If you understand those
risks and want to use DeletingFileField in your projects, it's not hard
to do so.
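
The overlapping-directory risk named above can be illustrated with a
minimal in-memory sketch. Everything here is hypothetical: `rows` stands
in for database records, `storage` for the media folder, and none of the
function names come from Django. It only models the case where two
records end up pointing at the same path; the rolled-back-transaction
case is a separate problem that this sketch does not cover.

```python
# Hypothetical in-memory model of "delete the file when the row is deleted".
# `rows` maps a row id to a file path; `storage` is the set of existing files.

def delete_row_and_file(rows, storage, row_id):
    """Naive behaviour: removing a row always removes its file."""
    path = rows.pop(row_id)
    storage.discard(path)


def delete_row_keep_shared(rows, storage, row_id):
    """Safer variant: remove the file only if no other row still uses it."""
    path = rows.pop(row_id)
    if path not in rows.values():
        storage.discard(path)


def dangling(rows, storage):
    """Return the ids of rows whose file is missing from storage."""
    return sorted(rid for rid, path in rows.items() if path not in storage)


# Two rows share one path, as can happen with overlapping upload directories.
rows = {1: "uploads/logo.png", 2: "uploads/logo.png"}
storage = {"uploads/logo.png"}
delete_row_and_file(rows, storage, 1)
print(dangling(rows, storage))  # [2]: row 2 now points at a missing file
```

The second function shows one possible guard, but it needs a lookup
across every record that might share the path, which is part of why a
general solution is harder than it first looks.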

Carl




Re: Complains about FileField not deleting files in 1.3.

2011-03-28 Thread Alex Kamedov
I think cron jobs are overkill in many simple cases where the old behaviour
was useful and simpler.
Why don't you want to include DeletingFileField[1] in Django?

[1] https://gist.github.com/889692

On Mon, Mar 28, 2011 at 9:07 PM, Jacob Kaplan-Moss wrote:

> On Mon, Mar 28, 2011 at 4:16 AM, -RAX-  wrote:
> > That said, I will start implementing such a maintenance job, and I am
> > willing to share it so that maybe we could include it in a future release
> > of Django.
>
> Sounds good -- I look forward to seeing your code!
>
> Jacob
>


-- 
Alex Kamedov
skype: kamedov
www: kamedov.ru




Re: Complains about FileField not deleting files in 1.3.

2011-03-28 Thread Jacob Kaplan-Moss
On Mon, Mar 28, 2011 at 4:16 AM, -RAX-  wrote:
> That said, I will start implementing such a maintenance job, and I am
> willing to share it so that maybe we could include it in a future release
> of Django.

Sounds good -- I look forward to seeing your code!

Jacob




Re: Complains about FileField not deleting files in 1.3.

2011-03-28 Thread Russell Keith-Magee
On Mon, Mar 28, 2011 at 5:16 PM, -RAX-  wrote:
>> One query for each model
>> containing one or more FileFields is enough to build a list of the files
>> that ought to exist, and any file not in that list can presumably be
>> removed.
>
> How can I sleep at night knowing that there is a maintenance cron job
> deleting files which can "presumably be removed"?
>
> My point is that yes, it is great that you have removed a source of data
> loss from FileField, but you have moved all the complexity onto developers,
> and they will end up with even bigger data loss.
> Assuming that the purpose of improving Django is to make it safer and
> easier to use, this "improvement" is a double-edged sword.
>
> Correct me if I am wrong:
> - Subclassing FileField will restore the previous issue.
> - A post delete signal will restore the previous issue.

This is correct -- but this is only a problem if you were affected by
the previous issue. That's not necessarily the case. This is a
situation where Django has to work for *every* case, but it's entirely
possible that one specific site may not be affected. We have to err on
the side of caution because we support everyone's usage, not just one
particular use pattern.

> A maintenance job that deletes unlisted files will itself require
> serious maintenance.
> If a developer adds a file field and forgets to update the
> maintenance script, all the files of that field will be
> deleted.
> Files that, through bad design, sit in the same folder as the files
> referenced by a file field will be removed.
> Thumbnails and any other files not referenced by a file field will
> be removed.
>
> And there will be more serious failures depending on the
> implementation.
> What I am trying to say is that removing orphaned files, even with
> a cron job, should be done by Django automatically, rather than assuming
> that developers will take care of it.
>
> That said, I will start implementing such a maintenance job, and I am
> willing to share it so that maybe we could include it in a future release
> of Django.

If you can propose such a cleanup task, it's certainly worth
considering for inclusion into Django. There's precedent for Django to
include such cleanup tools -- for example, we include a cron task for
cleaning up stale session entries.

However, the session cron task is a 100% reliable solution that can be
implemented efficiently, and there's only one use pattern (you write
new session to the table, you delete old sessions from the table, and
that's it).

The problem with a FileField cleanup task as an idea is that the
original bug with FileField exists because there are many different
ways that FileFields can be used, and we have to support *all* of
them. At the very least, a cleanup task included as part of Django
core would need to work for a significant and easy to identify subset
of uses, and be able to self-identify the cases where it will or won't
work (e.g., as a validation step). The one outcome we *won't* allow is
the case where someone adds a cleanup cron task "because the docs told
me to", and file data gets lost as a result.
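
As a rough illustration of what such a self-identifying validation step
might look like, here is a minimal sketch that flags file fields whose
upload directories overlap. The function name and the field labels are
hypothetical, it assumes every field uses a fixed directory relative to
the media root, and it does not handle directories computed at upload
time.

```python
import posixpath


def overlapping_upload_dirs(upload_dirs):
    """Return pairs of field labels whose upload directories overlap.

    `upload_dirs` maps a hypothetical field label (e.g. "Photo.image")
    to its upload directory relative to the media root. Two directories
    overlap when they are equal or when one contains the other.
    """
    def parts(path):
        normalized = posixpath.normpath(path)
        return [] if normalized == "." else normalized.split("/")

    items = sorted((label, parts(path)) for label, path in upload_dirs.items())
    conflicts = []
    for i, (label_a, a) in enumerate(items):
        for label_b, b in items[i + 1:]:
            # Compare only the leading components both paths have.
            shorter = min(len(a), len(b))
            if a[:shorter] == b[:shorter]:
                conflicts.append((label_a, label_b))
    return conflicts
```

A cleanup task could refuse to run, or at least warn, whenever this
returns a non-empty list, since in those cases it cannot tell which
field owns a given file.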

Yours,
Russ Magee %-)




Re: Complains about FileField not deleting files in 1.3.

2011-03-28 Thread -RAX-
> One query for each model
> containing one or more FileFields is enough to build a list of the files
> that ought to exist, and any file not in that list can presumably be
> removed.

How can I sleep at night knowing that there is a maintenance cron job
deleting files which can "presumably be removed"?

My point is that yes, it is great that you have removed a source of data
loss from FileField, but you have moved all the complexity onto developers,
and they will end up with even bigger data loss.
Assuming that the purpose of improving Django is to make it safer and
easier to use, this "improvement" is a double-edged sword.

Correct me if I am wrong:
- Subclassing FileField will restore the previous issue.
- A post delete signal will restore the previous issue.

A maintenance job that deletes unlisted files will itself require
serious maintenance.
If a developer adds a file field and forgets to update the
maintenance script, all the files of that field will be
deleted.
Files that, through bad design, sit in the same folder as the files
referenced by a file field will be removed.
Thumbnails and any other files not referenced by a file field will
be removed.

And there will be more serious failures depending on the
implementation.
What I am trying to say is that removing orphaned files, even with
a cron job, should be done by Django automatically, rather than assuming
that developers will take care of it.

That said, I will start implementing such a maintenance job, and I am
willing to share it so that maybe we could include it in a future release
of Django.




Re: Complains about FileField not deleting files in 1.3.

2011-03-27 Thread Jacob Kaplan-Moss
On Sun, Mar 27, 2011 at 5:42 AM, -RAX-  wrote:
> I am referring to this: 
> http://docs.djangoproject.com/en/dev/releases/1.3/#filefield-no-longer-deletes-files
> Instead of preventing the data loss from happening, a very useful
> feature has been removed.

I'm sorry this caused a problem for you. Hopefully it's not *too* big
a deal: a FileField subclass along the lines Carl suggested should be
able to provide the same behavior as you've seen before with minimal
effort. I'd do something like: https://gist.github.com/889692.

But just for the record, we're pretty much always going to make calls
like this. Data loss is one of the few places I'm happy to break
backwards compatibility. When it comes down to unexpected data
retention versus unexpected data loss we're always going to try to err
on the side of retention. Too much data's easy to deal with: delete
some stuff. But lost data means a trip to the backups if you're lucky,
and a really bad day (or week, or month, ...) if you're not.

Jacob




Re: Complains about FileField not deleting files in 1.3.

2011-03-27 Thread Carl Meyer
On 03/27/2011 06:42 AM, -RAX- wrote:
> I am referring to this: 
> http://docs.djangoproject.com/en/dev/releases/1.3/#filefield-no-longer-deletes-files
> Instead of preventing the data loss from happening, a very useful
> feature has been removed.

Well, it does also prevent the data loss from happening ;) This data
loss is not a hypothetical problem; we had bug reports from users
affected by it.

> Why not simply let the developer decide when to enable or disable
> it with a boolean constructor parameter?
> 
> My company sells multimedia web applications normally handling over
> 1 files over various models.
> I am sorry to say it, but to me the idea of running a cron job to
> remove orphaned files does not seem practical. Shall I make a
> query for each file?

I don't see why that would be necessary. One query for each model
containing one or more FileFields is enough to build a list of the files
that ought to exist, and any file not in that list can presumably be
removed.
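
A minimal sketch of that idea, using only the standard library: given
the set of paths the database refers to, walk the media folder and
report everything else. The function name and arguments are
hypothetical; in a Django project the `referenced` set would presumably
be collected with one query per model that has file fields. The sketch
only reports candidates and deletes nothing, because files that are not
tracked by any file field (generated thumbnails, for example) will also
show up in the result.

```python
import os


def find_orphans(media_root, referenced):
    """Return files under media_root that no record refers to.

    `referenced` is a set of paths relative to media_root, written with
    "/" separators. Nothing is deleted here; the caller decides what to
    do with the result.
    """
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(media_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, media_root).replace(os.sep, "/")
            if rel not in referenced:
                orphans.append(rel)
    return sorted(orphans)
```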

> The roll back data loss problem could have been solved by copying the
> file into a temporary file and by restoring it if necessary.

Emulating the transactional behavior of a relational database is not
that trivial. We considered this approach carefully and decided that if
we tried to go down that road, we'd be continually finding and fixing
edge-case bugs in it, and any bug in it would be likely to be a
data-loss bug. Deleting files when we can't be sure it's the right thing
to do is a very dangerous business to be in.
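
For illustration only, the temporary-copy idea quoted above could be
sketched as below. `DeferredDelete` is a hypothetical helper, not
anything Django provides, and it treats an exception inside the `with`
block as the "rollback". It does nothing for a process that crashes
between the move and the restore, for a transaction rolled back outside
the block, or for a path that has been reused in the meantime, which
are the kinds of edge cases described above.

```python
import os
import shutil
import tempfile


class DeferredDelete:
    """Move files aside on delete and restore them if the block fails.

    Hypothetical helper. It only covers failures raised inside the
    `with` block in the current process.
    """

    def __init__(self):
        self._trash = tempfile.mkdtemp(prefix="deferred-delete-")
        self._moved = []  # (original_path, trash_path) pairs

    def delete(self, path):
        # Park the file in the trash directory instead of removing it.
        trash_path = os.path.join(self._trash, str(len(self._moved)))
        shutil.move(path, trash_path)
        self._moved.append((path, trash_path))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # "Rollback": put every file back where it came from.
            for original, trash_path in reversed(self._moved):
                shutil.move(trash_path, original)
        # "Commit", or cleanup after restoring: drop the trash directory.
        shutil.rmtree(self._trash, ignore_errors=True)
        return False  # never swallow the exception
```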

> Am I the only one who would like to see the previous behaviour
> restored? Can we at least re-enable this feature from the file-field
> constructor?

If you want the previous behavior, it's not at all difficult to restore
it with a post-delete signal handler. You can make your own trivial
subclass of FileField that attaches this post-delete handler in the
contribute_to_class method: that's precisely what FileField used to do.

Carl
