On Sun, Jul 18, 2010 at 1:59 AM, Waldemar Kornewald
<wkornew...@gmail.com> wrote:
> Hi Russell,
> so, after our chat on IRC I've finally found the time to implement a
> real proposal including unit tests. I've attached the patch to this
> ticket:
> http://code.djangoproject.com/ticket/13960
>
> Now there is just one backend type with a single setting:
>
> FILE_BACKENDS = (
>    'path.to.Backend',
>    'path.to.Backend2',
> )

Bikeshedding, but FILE_BACKENDS is a little generic for my taste; it's
not specifying anything about *files* per-se, it's about the upload
and download of files.

> --------
>
> The end-user will normally use the API via ModelForm like this when uploading:
> form = MyModelForm()
> form.prepare_upload(request)

I accept the need for this, but this seems like a bit of a wart. This
method wouldn't be required at all if the Form took a request
argument. This isn't an unusual requirement, either -- perhaps we
should introduce a RequestForm/RequestModelForm that formalizes the
availability of the request object during form processing.

This would also have an analog with Context/RequestContext. In most
practical situations, you need to use RequestContext, but the raw
context is there if you need it.

> By default, prepare_upload() will prepare an upload to the current URL
> (and thus current view). Optionally, you can specify a different url
> via form.prepare_upload(request, url='/other/url').

In which case, this should be an argument or modifier to the RequestForm. e.g.,

form = MyRequestModelForm(request)
form.upload_location = '/other/url'

> The prepare_upload function is necessary because we need to support
> services which first send the file and form data to a different URL
> (e.g., to S3 or '/_ah/upload/...' on GAE). Once the upload is
> finished, the service or backend is responsible for sending a request
> to the actual target URL (some Django view).
>
> So, in the template you have to use the generated upload url:
> <form action="{{ form.upload_url }}" ...>
> <table>{{ form }}</table>

Sure. With the caveat that if we're doing a RequestForm, then this
attribute should *always* be available.

> FYI, behind the scenes prepare_upload() might also add hidden input
> fields to your form, depending on the service you're using.

I'm not wild about the idea that a function post-modifies the fields
on a form. This strikes me as something that should be predictable at
time of form construction.

> ------
>
> Downloads are split into two functions which both exist on the file
> object returned by FileField.
>
> With file.serve(request) which returns an HttpResponse you can handle
> the file download from a Django view. This also allows app developers
> to check permissions in the view before calling file.serve(request).
>
> Additionally, there's file.download_url() which deprecates file.url
> (because that uses the storage backend which is the wrong place for
> this feature). This function will always return an URL. If none of
> your backends provides an URL we use "MEDIA_URL + file.name" as a
> fallback.
>
> In some cases file.download_url() will point to a Django view which
> calls file.serve(request) (and which can do some permission checks if
> necessary). Authors of reusable Django apps are responsible for
> providing a default backend which generates the URL to their app's
> built-in download view. End-users can overrides this.
>
> In your templates you'd always use code like this:
> <a href="{{ entity.file.download_url }}">Download</a>

No problem with the general principles here; however, there is some
overlap with the WSGI-improvements GSoC work from last year, which
included much more complete handling of streaming file downloads.

> --------
>
> Backends
>
> Every backend derives from BaseFileBackend. Every backend instance has
> these members:
> * self.model: the model that own the FileFields to upload to or download from

Except that there's no guarantee that a Form will be backed by a
model. You can upload a file without a model being involved.

> * self.fields: the list of FileField instances to upload to or
> download from (in case of a download there is just one entry)
> For convenience, there is also self.field which makes sure that there
> is just one field in self.fields and returns that field.
>
> Every backend can override the functions mentioned above
> (prepare_upload, file.serve, file.download_url). You can take a look
> at the source for more details. FYI, the prepare_upload() function is
> pretty much the same as before, just without the private=True/False
> parameter.
>
> Additionally, backends can define get_storage_backend() which returns
> a storage backend for the given model/field combination. As a fallback
> DEFAULT_STORAGE_BACKEND is used.
>
> The API is also similar to DB routers. If any of those functions
> returns None the next backend is tried (as defined in
> settings.FILE_BACKENDS).

The backend API itself is starting to look reasonably good -- there
are a lot less moving parts, which is good. However, when I alluded to
Routers in our previous discussions, I didn't mean you should be quite
so literal.

There are three pieces to this puzzle.
 a) The mechanics for how to do a file transfer to S3, etc
 b) Configuring a specific instance of an S3 transfer backend to point
at a particular path
 c) Configuring which backend should be used for which form.

At the moment, it appears that your convolving all three roles into
one backend class. To my mind, (b) should be an instantiation of (a),
and (c) is a separate issue for determining which (b) instance is
required at any given time. That is, the routing behavior and the file
upload logic shouldn't be part of the same class as the backend
itself. (By analogy, the database routers determine which database
will be used, but contain no logic to actually write to a database).

What we need to set up is a way of clearly identifying a particular
form (which is to say, a file upload point) so that as part of system
configuration, we can associate a particular uploading behavior. In
the parlance of Aspect-oriented programming, we're specifying a join
point.

In the case of database routers, the model/instance is what we're
actually operating upon, so that is the object that is appropriate for
making a routing decision. A big part of the router selection comes
from the hint -- extra information associated with the activity that
requires a router decision. In most cases, this is the 'other' object
that is involved in the process.

It's not clear to me that the file backends provide a similarly
unambiguous mechanism for making a decision, or *any* analogous
mechanism for 'hinting'. You've attempted to do this with a
combination of model and field, but I'm not convinced that this is
sufficient -- and even if it is sufficient, it's more than a little
bit fragile. In the case of a file upload, the model is both optional,
and not the primary focus of the activity you're configuring.

It may be as simple as providing a simple tag (a-la named URLs) that
can be used as a namespace to describe the file uploads that are
necessary.

Once such a name is available, you may also find that the cascading
behavior isn't actually required. Cascading is necessary in the case
of database routers because you may need to consider several factors
before deciding which database needs to be used; however, in the case
of file uploads, the uploading behavior isn't really dynamic -- form X
will always be backed by upload mechanism Y. All you really need is a
way of performing that mapping.

> Please provide some feedback. Does this solve all issues you had with the API?

It's not perfect, but It's certainly looking a lot better.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Reply via email to