On Sun, 19 Jun 2011, francescortiz wrote:

Faheem,

In your post you write:

"This works, but I traced through the code, and found that the code
was using streaming, when clearly the optimal thing to do in this case
would be to just copy the file from the temporary location to the
permanent location"

In java copying files is performed by redericting data from a file
input stream to a file ouput stream, so your sentence sounded to me
like "Instead of copying a file I want to just copy a file" :P

Though I am no python expert, I suspect that using streams in python
is fine as well.

You could do some performance tests of stream copy vs copyfile
function in python. If you do, please share results!

Hope I help,

Francesc

Hi Francesc,

Thanks for the helpful comments. As I mention in the post, the Django
code (in the `_save` function of the `FileSystemStorage` class,
located in the file 'django/core/files/storage.py') appears to be
looking for the `temporary_file_path` attribute. If it finds it, it
executes `file_move_safe`, from 'django/core/files/move.py'. I've
included move.py below for reference.

Following your comments, I looked more carefully at what
`file_move_safe` does. This code first checks origin and destination
are not the same file. If that is so, it attempts `os.rename`, which I
think you will agree, is optimal if both the origin and destination
are on the same filesystem. If not, it copies the data across in
chunks (apparently what is referred to as streaming). This is exactly
the same code used in `FileSystemStorage.save`. So basically the only
thing missing is `os.rename` is not called if `temporary_file_path` is
not seen. Also, if the origin and destination are the same path,
currently the streaming code has to deal with this, and I'm not sure
what happens then.

I thought unix `cp` or similar was used somewhere, but no.

Anyway, the upshot is - I think you are right that there isn't a big
issue here. Regardless, the point of my post is that it didn't look
like the `_save` function of the `FileSystemStorage` class would ever
encounter the `temporary_file_path` attribute, so the bit of the code
that looks for it and executes `file_move_safe` seems like dead code,
so should be either fixed or removed.

Also thank you for the timing information. My concerns were/are more
about excessive memory usage, since I'm dealing with large files.

                                                     Regards, Faheem

###############################################################################

"""
Move a file in the safest way possible::

    >>> from django.core.files.move import file_move_safe
    >>> file_move_safe("/tmp/old_file", "/tmp/new_file")
"""

import os
from django.core.files import locks

try:
    from shutil import copystat
except ImportError:
    import stat
    def copystat(src, dst):
        """Copy all stat info (mode bits, atime and mtime) from src to dst"""
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        if hasattr(os, 'utime'):
            os.utime(dst, (st.st_atime, st.st_mtime))
        if hasattr(os, 'chmod'):
            os.chmod(dst, mode)

__all__ = ['file_move_safe']

def _samefile(src, dst):
    # Macintosh, Unix.
    if hasattr(os.path,'samefile'):
        try:
            return os.path.samefile(src, dst)
        except OSError:
            return False

    # All other platforms: check for same pathname.
    return (os.path.normcase(os.path.abspath(src)) ==
            os.path.normcase(os.path.abspath(dst)))

def file_move_safe(old_file_name, new_file_name, chunk_size = 1024*64, 
allow_overwrite=False):
    """
    Moves a file from one location to another in the safest way possible.

    First, tries ``os.rename``, which is simple but will break across 
filesystems.
    If that fails, streams manually from one file to another in pure Python.

    If the destination file exists and ``allow_overwrite`` is ``False``, this
    function will throw an ``IOError``.
    """

    # There's no reason to move if we don't have to.
    if _samefile(old_file_name, new_file_name):
        return

    try:
        os.rename(old_file_name, new_file_name)
        return
    except OSError:
        # This will happen with os.rename if moving to another filesystem
        # or when moving opened files on certain operating systems
        pass

    # first open the old file, so that it won't go away
    old_file = open(old_file_name, 'rb')
    try:
        # now open the new file, not forgetting allow_overwrite
        fd = os.open(new_file_name, os.O_WRONLY | os.O_CREAT | getattr(os, 
'O_BINARY', 0) |
                                    (not allow_overwrite and os.O_EXCL or 0))
        try:
            locks.lock(fd, locks.LOCK_EX)
            current_chunk = None
            while current_chunk != '':
                current_chunk = old_file.read(chunk_size)
                os.write(fd, current_chunk)
        finally:
            locks.unlock(fd)
            os.close(fd)
    finally:
        old_file.close()
    copystat(old_file_name, new_file_name)

    try:
        os.remove(old_file_name)
    except OSError, e:
        # Certain operating systems (Cygwin and Windows)
        # fail when deleting opened files, ignore it.  (For the
        # systems where this happens, temporary files will be auto-deleted
        # on close anyway.)
        if getattr(e, 'winerror', 0) != 32 and getattr(e, 'errno', 0) != 13:
            raise

--
You received this message because you are subscribed to the Google Groups "Django 
users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Reply via email to