Hi all,

Can anyone confirm that the following is a bug in Django (and that I'm not
simply missing the correct way to do this)?

I'm using Django 1.7 on Python 3.4.

I'm new to Django but a longtime Python programmer.  I've been digging into
a ValueError I get when I supply a BytesIO instance as the `file` parameter
to an ImageFile instance, then supply that ImageFile instance as the
`content` parameter to `FieldFile.save`.

In summary:  The method `FieldFile.save` saves `content` to `self.storage`
(which causes `content` to be closed) but then attempts to access the size
of `content`.  This causes a ValueError("I/O operation on closed file.") to
be raised if the internal file inside the ImageFile instance is a BytesIO
instance.  Here is `FieldFile.save`:

https://github.com/django/django/blob/master/django/db/models/fields/files.py#L86

----

This appears to be an interaction between the following two changesets:
  - 8 months ago:
https://github.com/django/django/commit/30fc49a7ca0d030c7855f31ed44395903fa6abdd
  - 3 years ago:
https://github.com/django/django/commit/88ff6567310d809abddb97adebf04d5e9403ca8a

This changeset also touches relevant code, but I don't think it altered any
logic relevant to this problem:
  - 2 months ago:
https://github.com/django/django/commit/96fc3908ad25f40b4c6e2f76be2639f665223227

----

Here's the detailed line-by-line trace:

In "django/db/models/fields/files.py", in method `FieldFile.save(self,
name, content, save=True)`:

https://github.com/django/django/blob/master/django/db/models/fields/files.py#L86
this function invokes `self.storage.save(name, content)` [line 88],
then assigns `content.size` to `self._size` [line 92].

In "django/core/files/storage.py", in method `Storage.save(self, name,
content)`:

https://github.com/django/django/blob/master/django/core/files/storage.py#L37
this function invokes `self._save(name, content)` [line 51].

In "django/core/files/storage.py", in method `FileSystemStorage._save(self,
name, content)`:

https://github.com/django/django/blob/master/django/core/files/storage.py#L175
this function invokes `content.close()` [line 231]:

https://github.com/django/django/blob/master/django/core/files/storage.py#L231

At this point, `content` is now closed.

But then back to method `FieldFile.save(self, name, content, save=True)`,
line 92:

https://github.com/django/django/blob/master/django/db/models/fields/files.py#L92
we next access the `size` attribute of `content`.

The type of `content` is ImageFile, which inherits File:
  - class ImageFile:
https://github.com/django/django/blob/master/django/core/files/images.py#L11
  - class File:
https://github.com/django/django/blob/master/django/core/files/base.py#L13

Accessing the `size` attribute of `content` accesses the `size` property:
  https://github.com/django/django/blob/master/django/core/files/base.py#L64
which invokes the `File._get_size(self)` method:
  https://github.com/django/django/blob/master/django/core/files/base.py#L55

At this point, `content` (a recently-created ImageFile instance) doesn't
have a `_size` attribute, so `self._get_size_from_underlying_file()` is
invoked [line 58]:

In method `File._get_size_from_underlying_file()`, line 39:
  https://github.com/django/django/blob/master/django/core/files/base.py#L39
the function checks for various attributes of the underlying Python file in
a chained if-statement.

In my case, the underlying Python file is a BytesIO instance:
  https://docs.python.org/3/library/io.html#io.BytesIO
which inherits io.BufferedIOBase:
  https://docs.python.org/3/library/io.html#io.BufferedIOBase
which in turn inherits io.IOBase:
  https://docs.python.org/3/library/io.html#io.IOBase

The attributes checked, and the results when the file is a BytesIO instance:
  - `size`:  no
  - `name`:  no
  - `tell` and `seek`:  yes, in io.IOBase

Because the `tell` and `seek` attributes were found, the function attempts
to determine the file size using `self.file.tell()` [line 48]:
  https://github.com/django/django/blob/master/django/core/files/base.py#L48

However, at this point, because the BytesIO instance is closed, this will
result in a ValueError("I/O operation on closed file.") being raised.

This ValueError exception is raised in the Python 3.4 source, in the file
"Python-3.4.0/Modules/_io/bytesio.c", when the macro `CHECK_CLOSED(self)`
[line 22] is invoked by function `bytesio_tell(bytesio *self)` [line 254].

----

At this point, my work-around is to set the `_size` attribute of the
ImageFile manually (to the BytesIO size calculated using the same method as
in `File._get_size_from_underlying_file()`) before invoking
`FieldFile.save`.

        def get_bytesio_size(buf):
            """Calculate the size of BytesIO instance `buf`.
            Note: Ensure `buf` is not already closed!
            """
            # This approach was copied from "django/core/files/base.py"
            pos = buf.tell()
            buf.seek(0, os.SEEK_END)
            size = buf.tell()
            buf.seek(pos)
            return size

        img_file._size = get_bytesio_size(bytesio)

        # Then img_field.save(img_name, img_file), etc...


It works, but obviously it's hacky to do this.

Can anyone confirm that this problem is indeed a bug in Django (and that
I'm not simply missing the correct way to do this)?

Thanks,
jb

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/CABtX16GWKggatron%3Dm1R%2B2La4-m8Gu%2B%2BP7ZjZtWAyN205Z1ihA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to