#26058: Custom storage backend's not entirely decoupled from FileField
-------------------------------------+-------------------------------------
     Reporter:  Korijn               |                    Owner:  nobody
         Type:  Bug                  |                   Status:  new
    Component:  File                 |                  Version:  1.9
  uploads/storage                    |
     Severity:  Normal               |               Resolution:
     Keywords:  custom storage       |             Triage Stage:
  filefield                          |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Korijn):

 * needs_better_patch:   => 0
 * needs_tests:   => 0
 * needs_docs:   => 0


Old description:

> Let me start by saying that I've implemented custom FileFields that can
> handle Numpy objects, to provide some convenience methods to access the
> typed objects.
>
> Later, when implementing a custom storage backend for Azure Storage, I
> encountered some issues when handling filenames. Cloud storage doesn't
> behave like local file system storage does. Most of Django's code is
> agnostic of this problem, except for a small section of FileField:
>
> https://github.com/django/django/blob/77b8d8cb6d6d6345f479c68c4892291c1492ba7e/django/db/models/fields/files.py#L306-L320
>
> In Azure, blobs are stored in containers, and are optionally stored in
> subfolders (known as prefixes in Azure-speak). To be able to use the
> upload_to property, I need to be able to pass the full path, e.g.:
>
> container/blob
> container/sub/blob
> container/sub/sub/blob
>
> Unfortunately, FileField strips the directory part and only passes the
> blob name to my custom storage backend's get_valid_name implementation,
> where I really need to validate the whole string. I ended up subclassing
> FileField:
>
> {{{
> class CloudStorageFileField(FileField):
>     """
>     Provides some overrides to enable cloud storage backends
>     """
>
>     def get_directory_name(self):
>         return force_text(
>             datetime.datetime.now().strftime(force_str(self.upload_to)))
>
>     def get_filename(self, filename):
>         return filename
>
>     def generate_filename(self, instance, filename):
>         if callable(self.upload_to):
>             filename = self.upload_to(instance, filename)
>             filename = self.storage.get_valid_name(filename)
>             return filename
>
>         return (self.get_directory_name() +
>                 self.storage.get_valid_name(filename))
> }}}
>
> As you can see, all I did was remove the local file system related logic.
> Unfortunately, my custom file fields need to inherit from this class to
> work with Azure storage, and when I switch to local storage in a
> different environment, they have to inherit from the regular FileField!
> This is obviously an undesirable situation, imposed by the otherwise-
> great tight coupling to local file system logic in the three methods of
> FileField.
>
> In order to truly decouple this, FielField would need to be a little bit
> more agnostic about the storage backend. This could be done by moving the
> generate_filename method to the backend entirely, for example.
>
> I classified this as a bug, as this case shows that you're not "entirely"
> able to do anything you want when implementing custom storage backends.
>
> You can see the full implementation of both the backend and the
> numpyfilefield here: https://gist.github.com/Korijn/e0bcbdcedb494509973a
>
> Your thoughts?

New description:

 For context: I've implemented custom FileFields that handle NumPy objects
 and provide some convenience methods for accessing the typed objects.

 Later, when implementing a custom storage backend for Azure Storage, I
 ran into issues with filename handling. Cloud storage doesn't behave like
 local file system storage does. Most of Django's code is agnostic to this
 difference, except for a small section of FileField:

 https://github.com/django/django/blob/77b8d8cb6d6d6345f479c68c4892291c1492ba7e/django/db/models/fields/files.py#L306-L320

 In Azure, blobs are stored in containers, and are optionally stored in
 subfolders (known as prefixes in Azure-speak). To use the upload_to
 option, I need to be able to pass the full path to the storage backend,
 e.g.:

 container/blob
 container/sub/blob
 container/sub/sub/blob
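
 One way to read such paths, assuming the first segment always names the
 container and everything between it and the last segment is the prefix.
 The helper `split_blob_path` is hypothetical and not part of Django or any
 Azure SDK; it is only a sketch of the layout described above:

```python
def split_blob_path(path):
    """Split 'container/prefix/blob' into (container, prefix, blob name).

    Hypothetical helper for illustration; the prefix is '' when the blob
    sits directly inside the container.
    """
    container, _, blob = path.strip("/").partition("/")
    if not container or not blob:
        raise ValueError("expected 'container/blob', got %r" % path)
    prefix, _, name = blob.rpartition("/")
    return container, prefix, name


for example in ("container/blob", "container/sub/blob", "container/sub/sub/blob"):
    print(split_blob_path(example))
```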

 Unfortunately, FileField strips the directory part and only passes the
 blob name to my custom storage backend's get_valid_name implementation,
 where I really need to validate the whole string. I ended up subclassing
 FileField:

 {{{
 import datetime

 from django.db.models import FileField
 from django.utils.encoding import force_str, force_text


 class CloudStorageFileField(FileField):
     """
     Provides some overrides to enable cloud storage backends
     """

     def get_directory_name(self):
         # Expand strftime placeholders in upload_to.
         return force_text(
             datetime.datetime.now().strftime(force_str(self.upload_to)))

     def get_filename(self, filename):
         # Pass the name through untouched.
         return filename

     def generate_filename(self, instance, filename):
         if callable(self.upload_to):
             filename = self.upload_to(instance, filename)
             # Validate the whole path, not only the last component.
             filename = self.storage.get_valid_name(filename)
             return filename

         return (self.get_directory_name() +
                 self.storage.get_valid_name(filename))
 }}}
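
 To make the difference concrete, the sketch below imitates both behaviours
 without importing Django. `RecordingStorage`, `upload_to`,
 `stock_generate_filename` and `cloud_generate_filename` are hypothetical
 names. The stock variant is a simplified stand-in that assumes the linked
 FileField lines split off the directory before validation; it is not a
 copy of them:

```python
import posixpath


class RecordingStorage:
    """Hypothetical storage stub that records what it is asked to validate."""

    def __init__(self):
        self.seen = []

    def get_valid_name(self, name):
        self.seen.append(name)
        return name


def upload_to(instance, filename):
    # Hypothetical callable: place every blob under container/sub/.
    return posixpath.join("container", "sub", filename)


def stock_generate_filename(storage, instance, filename):
    # Assumed stock behaviour: only the last path component is validated.
    directory, name = posixpath.split(upload_to(instance, filename))
    return posixpath.join(directory, storage.get_valid_name(name))


def cloud_generate_filename(storage, instance, filename):
    # Behaviour of the subclass above: the complete path is validated.
    return storage.get_valid_name(upload_to(instance, filename))


stock, cloud = RecordingStorage(), RecordingStorage()
stock_path = stock_generate_filename(stock, None, "blob")
cloud_path = cloud_generate_filename(cloud, None, "blob")
print(stock.seen)  # ['blob']
print(cloud.seen)  # ['container/sub/blob']
```

 Both variants return the same path here; what differs is how much of it
 the storage backend gets to inspect.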

 As you can see, all I did was remove the logic that is specific to the
 local file system. Unfortunately, my custom file fields need to inherit
 from this class to work with Azure storage, and when I switch to local
 storage in a different environment, they have to inherit from the regular
 FileField! This is obviously an undesirable situation, imposed by the
 tight coupling to local file system logic in three methods of FileField:
 get_directory_name, get_filename and generate_filename.

 In order to truly decouple this, FileField would need to be a little bit
 more agnostic about the storage backend. This could be done by moving the
 generate_filename method to the backend entirely, for example.
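
 A rough sketch of what such a delegation could look like, written without
 Django so it runs on its own. Every class name here is hypothetical, the
 `generate_filename` method on the storage classes is the proposed addition
 rather than an existing API in this version, and the validation rules are
 placeholders, not Azure's real naming rules:

```python
import datetime
import posixpath


class LocalStorageSketch:
    """Hypothetical local backend: validates the file name only."""

    def get_valid_name(self, name):
        return name.replace(" ", "_")

    def generate_filename(self, filename):
        directory, name = posixpath.split(filename)
        return posixpath.join(directory, self.get_valid_name(name))


class CloudStorageSketch:
    """Hypothetical cloud backend: validates the whole blob path."""

    def get_valid_name(self, name):
        # Placeholder rule for illustration only.
        return name.replace(" ", "-").lower()

    def generate_filename(self, filename):
        return self.get_valid_name(filename)


class FileFieldSketch:
    """Hypothetical field that leaves path policy to its storage."""

    def __init__(self, upload_to, storage):
        self.upload_to = upload_to
        self.storage = storage

    def generate_filename(self, instance, filename, now=None):
        if callable(self.upload_to):
            filename = self.upload_to(instance, filename)
        else:
            now = now or datetime.datetime.now()
            filename = posixpath.join(now.strftime(self.upload_to), filename)
        # The storage decides how the combined path is validated.
        return self.storage.generate_filename(filename)


when = datetime.datetime(2016, 1, 8)
local_field = FileFieldSketch("container/%Y", LocalStorageSketch())
cloud_field = FileFieldSketch("container/%Y", CloudStorageSketch())
local_name = local_field.generate_filename(None, "My Blob.npy", now=when)
cloud_name = cloud_field.generate_filename(None, "My Blob.npy", now=when)
print(local_name)  # container/2016/My_Blob.npy
print(cloud_name)  # container/2016/my-blob.npy
```

 With this split, the same field class works against either backend, and
 switching environments only means switching the storage.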

 I classified this as a bug, as this case shows that you're not
 ''entirely'' able to do anything you want when implementing custom storage
 backends.

 You can see the implementation of both the custom storage backend and the
 NumpyFileField here: https://gist.github.com/Korijn/e0bcbdcedb494509973a

 Your thoughts?

--
Ticket URL: <https://code.djangoproject.com/ticket/26058#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
