On Sun, 19 Jun 2011, francescortiz wrote:
Faheem, In your post you write:
"This works, but I traced through the code, and found that the code was using streaming, when clearly the optimal thing to do in this case would be to just copy the file from the temporary location to the permanent location"
In java copying files is performed by redericting data from a file input stream to a file ouput stream, so your sentence sounded to me like "Instead of copying a file I want to just copy a file" :P
Though I am no python expert, I suspect that using streams in python is fine as well.
You could do some performance tests of stream copy vs copyfile function in python. If you do, please share results!
Hope I help, Francesc
Hi Francesc, Thanks for the helpful comments. As I mention in the post, the Django code (in the `_save` function of the `FileSystemStorage` class, located in the file 'django/core/files/storage.py') appears to be looking for the `temporary_file_path` attribute. If it finds it, it executes `file_move_safe`, from 'django/core/files/move.py'. I've included move.py below for reference. Following your comments, I looked more carefully at what `file_move_safe` does. This code first checks origin and destination are not the same file. If that is so, it attempts `os.rename`, which I think you will agree, is optimal if both the origin and destination are on the same filesystem. If not, it copies the data across in chunks (apparently what is referred to as streaming). This is exactly the same code used in `FileSystemStorage.save`. So basically the only thing missing is `os.rename` is not called if `temporary_file_path` is not seen. Also, if the origin and destination are the same path, currently the streaming code has to deal with this, and I'm not sure what happens then. I thought unix `cp` or similar was used somewhere, but no. Anyway, the upshot is - I think you are right that there isn't a big issue here. Regardless, the point of my post is that it didn't look like the `_save` function of the `FileSystemStorage` class would ever encounter the `temporary_file_path` attribute, so the bit of the code that looks for it and executes `file_move_safe` seems like dead code, so should be either fixed or removed. Also thank you for the timing information. My concerns were/are more about excessive memory usage, since I'm dealing with large files. Regards, Faheem ############################################################################### """ Move a file in the safest way possible:: >>> from django.core.files.move import file_move_safe >>> file_move_safe("/tmp/old_file", "/tmp/new_file") """ import os from django.core.files import locks try: from shutil import copystat except ImportError: import stat def copystat(src, dst): """Copy all stat info (mode bits, atime and mtime) from src to dst""" st = os.stat(src) mode = stat.S_IMODE(st.st_mode) if hasattr(os, 'utime'): os.utime(dst, (st.st_atime, st.st_mtime)) if hasattr(os, 'chmod'): os.chmod(dst, mode) __all__ = ['file_move_safe'] def _samefile(src, dst): # Macintosh, Unix. if hasattr(os.path,'samefile'): try: return os.path.samefile(src, dst) except OSError: return False # All other platforms: check for same pathname. return (os.path.normcase(os.path.abspath(src)) == os.path.normcase(os.path.abspath(dst))) def file_move_safe(old_file_name, new_file_name, chunk_size = 1024*64, allow_overwrite=False): """ Moves a file from one location to another in the safest way possible. First, tries ``os.rename``, which is simple but will break across filesystems. If that fails, streams manually from one file to another in pure Python. If the destination file exists and ``allow_overwrite`` is ``False``, this function will throw an ``IOError``. """ # There's no reason to move if we don't have to. if _samefile(old_file_name, new_file_name): return try: os.rename(old_file_name, new_file_name) return except OSError: # This will happen with os.rename if moving to another filesystem # or when moving opened files on certain operating systems pass # first open the old file, so that it won't go away old_file = open(old_file_name, 'rb') try: # now open the new file, not forgetting allow_overwrite fd = os.open(new_file_name, os.O_WRONLY | os.O_CREAT | getattr(os, 'O_BINARY', 0) | (not allow_overwrite and os.O_EXCL or 0)) try: locks.lock(fd, locks.LOCK_EX) current_chunk = None while current_chunk != '': current_chunk = old_file.read(chunk_size) os.write(fd, current_chunk) finally: locks.unlock(fd) os.close(fd) finally: old_file.close() copystat(old_file_name, new_file_name) try: os.remove(old_file_name) except OSError, e: # Certain operating systems (Cygwin and Windows) # fail when deleting opened files, ignore it. (For the # systems where this happens, temporary files will be auto-deleted # on close anyway.) if getattr(e, 'winerror', 0) != 32 and getattr(e, 'errno', 0) != 13: raise -- You received this message because you are subscribed to the Google Groups "Django users" group. To post to this group, send email to django-users@googlegroups.com. To unsubscribe from this group, send email to django-users+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/django-users?hl=en.