[issue8604] Adding an atomic FS write API

2016-01-20 Thread Thomas Kluyver

Thomas Kluyver added the comment:

For reference, we added our own atomic_writing() context manager in 
Jupyter/IPython, after someone lost work when save failed on a full disk. We 
initially went for the same approach - write to a temporary file, fsync, rename 
over the original file - but it caused issues with network filesystems and 
folders synced by Dropbox or similar services.

As a result we switched to an approach that copies the old file to a temporary 
name, writes the new data directly to the old name, then removes the temporary 
copy on success, or renames it back to the old name on failure. This is 
certainly less rigorous, but it means that we're not replacing the file on 
every save, and it's less of a problem if copying file attributes with 
copystat() fails.

I'm sure there are use cases for atomic writing where these problems won't come 
up, but I hope this is a useful data point, at least for documenting any atomic 
writing solution.

--
nosy: +takluyver

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2016-01-20 Thread STINNER Victor

STINNER Victor added the comment:

It looks like each platform and each filesystem behave differently, with subtle 
differencies with different kernel versions. I'm not more sure that the Python 
stdlib is the best place to host an implementation of an "atomic write" API.

I now suggest to create a project on PyPI, play with it, try to handle as much 
platforms as possible, etc. Then when it is considered as stable and robust 
enough, propose to integrate it to the Python stdlib.

Python stdlib is not the best place for best effort algorithm, we prefer to 
only add reliable and portable code. And leave "other code" outside the stdlib.

--
resolution:  -> out of date
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2015-12-22 Thread Марк Коренберг

Марк Коренберг added the comment:

instead of realpath, os.lstat() may be used in order to detect if target file 
object is not plain file. Also, os.rename() cannot be used since it follows 
symlinks by default.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2015-12-22 Thread Марк Коренберг

Марк Коренберг added the comment:

Also, modern kernels allows to replace entire directory!

renameat2() + RENAME_EXCHANGE

So, there should be more atomic operations

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2015-12-22 Thread Марк Коренберг

Марк Коренберг added the comment:

You also forgot about two things:

1. set temporary file permissions before rename
2. fsync(open(os.dirname(temporaryfile)))
3. if original file name is symlink, replace will works wrong. os.realpath 
should be used.

So, here are real life function, that we use in production:


# TODO: save_mtime, save_selinux, extended attr, chattrs and so on...
# TODO: malicious user may replace directory with another while writing to file,
#   so we should use open directory first, and then use openat() and 
renameat()
#   with relative filename instead of specifying absolute path.
#   also NameTemporaryFile should allow to specify dir=
@contextmanager
def replace_file(path, save_perms=False, fsync=True, **kwargs):
realpath = os.path.realpath(path)
operating_dir = os.path.dirname(realpath)
uid = None
gid = None
mode = None
if save_perms:
try:
stinfo = os.lstat(realpath)
uid = stinfo.st_uid
gid = stinfo.st_gid
mode = stinfo.st_mode
except OSError as e:
if e.errno != errno.ENOENT:
raise

with NamedTemporaryFile(dir=operating_dir, **kwargs) as fff:
filedes = fff.fileno()
if None not in (uid, gid, mode):
os.fchown(filedes, uid, gid)
os.fchmod(filedes, mode & 0o)

yield fff

# survive application crash
fff.flush()
if fsync:
# survive power outage, is not required if that is temporary file
os.fsync(filedes)
os.rename(fff.name, realpath)

# see http://bugs.python.org/issue21579
fff._closer.delete = False

if fsync:
# Sync directory: 
http://stackoverflow.com/questions/3764822/how-to-durably-rename-a-file-in-posix
dirfd = os.open(operating_dir, os.O_RDONLY | os.O_CLOEXEC | 
os.O_DIRECTORY)
try:
os.fsync(dirfd)
finally:
os.close(dirfd)

--
nosy: +mmarkk

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-16 Thread Charles-François Natali

Charles-François Natali added the comment:

 (Note that the Beaker version would need to be enhanced with the extra API 
 parameters from Victor's version, as well as updated to use the exclusive 
 open and close-on-exec flags)

I think the API would be nicer if it was just a wrapper around the
underlying temporary file object, delegating all methods except
close() (sort of like the _TemporaryFileWrapper in tempfile).

Something like (untested, not even foolproof-read) :

class AtomicFile:

def __init__(self, path):
self.dest_path = path
dirname, basename = os.path.split(self.dest_path)
fd, temp_path = tempfile.mkstemp(prefix='.' + basename, dir=dirname)
try:
self.f = os.fdopen(fd, 'w')
except:
os.unlink(temp_path)
raise
self.temp_path = temp_path

def close(self):
self.f.close()
os.rename(self.temp_path, self.dest_path)

# copied from tempfile
def __getattr__(self, name):
# Attribute lookups are delegated to the underlying file
# and cached for non-numeric results
# (i.e. methods are cached, closed and friends are not)
file = self.__dict__['file']
a = getattr(file, name)
if not issubclass(type(a), type(0)):
setattr(self, name, a)
return a

def __enter__self():
self.file.__enter__()
return f

def __exit__(self, exc, value, tb):
   if exc is None:
   self.close()
   else:
   self.file.close()
   os.remove(self.temp_path)

This way, one just has to do:
f = AtomicFile(path)
and then use it as a normal file; it will automatically be committed
(or rollback) when the file is closed, either because you're leaving
the context manager, or because an explicit() close is called.
It also makes it easier to use with legacy code/libraries accepting an
open file (since no modification is needed), is less error-prone
(since you don't have to remember to call special methods like
destroy_temp/replace_dest), and leads to a more compact API (exactly
the same as regular files).

Otherwise, I think the calls to fsync() should be optional (maybe an
option passed to the constructor): most of the time, you want
atomicity but not durability (i.e. you don't really care if data is
committed to disk), and you don't want to pay for the performance hit
incurred by fsync().

Also, fsync() should also be done on the containing directory (which
is not the case in Victor's version).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-16 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:


--
nosy: +Arfrever

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-16 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
stage:  - patch review

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-16 Thread Barry A. Warsaw

Barry A. Warsaw added the comment:

Of course, I have my own atomic-rename thing, but I'm not going to post the 
code here.  It's fairly limited to my own use case and I have no interest in 
making it cross platform.

That being said, I personally found that a context manager with a signature 
identical to built-in open() was the most convenient API.  It looks natural:

with atomic(filename, 'w', encoding='utf-8') as fp:
   data = do_a_bunch_of_things_that_might_fail()
   fp.write(data)

If any of that fails, the temporary file is guaranteed to be cleaned up, 
otherwise, filename (which is the ultimate destination) will be guaranteed to 
exist.

Another reason why context managers are useful is with ExitStack(), where you 
might have a bunch of conditions that you want to guarantee get cleaned up 
properly if any of them fail.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread Barry A. Warsaw

Changes by Barry A. Warsaw ba...@python.org:


--
nosy: +barry

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread STINNER Victor

STINNER Victor added the comment:

Here is a patch based on the new os.replace() function. I only tested the patch 
on Linux.

My patch must be tested on Windows, I don't know what happens when the file 
object is closed when the NamedTemporaryFile.delete attribute is set to False. 
Does close() raise an exception?

I hesitate between prefix and dir parameters for NamedTemporaryFile. If prefix 
is equal to filename, is it possible for NamedTemporaryFile to create a file 
called filename? Or does it always add a suffix? If the program crashs before 
rename, I prefer to leave file.XXX temporary files (ex: file.tmp, 
file.tmp.2, ...) instead of random names like tmpXXX.

It is important to create the temporary file in the same directory to not 
create a file on a different file system. os.replace() fails on POSIX if the 
source and the destination are on two different file systems.

If importing tempfile in shutil is a problem, it may be done in atomic_write() 
(maybe using a global variable to ensure that the module is only imported once).

--

First, I tried to handle the temporary file manually (without the tempfile 
module) with a constant suffix. I only created a temporary file if the creation 
of the file (in exclusive mode, using x mode) failed.

But Antoine pointed me on IRC that the function is not atomic because the file 
may be invalid before the file content is completly written and flushed. If two 
threads try to create the same file, the second thread gets a FileExistsError. 
In this case, the caller has to handle FileExistsError. The caller may remove 
the temporary file of the first thread, which may make the situation even worse.

I prefer to use tempfile.NamedTemporaryFile because it is portable and well 
tested. It also uses a random suffix. On Windows, the O_TEMPORARY flag is 
passed to os.open(), so the file is removed by the OS when the file is closed, 
or when the program does crash.

--
keywords: +patch
Added file: http://bugs.python.org/file30273/atomic_write.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread STINNER Victor

STINNER Victor added the comment:

+   # Flush Python buffers to system buffers
+   fileobj.flush()
+
+   if hasattr(os, 'fsync'):
+   # Flush system buffers to disk
+   fd = fileobj.fileno()
+   os.fsync(fd)

A fsync=True paramater may be added if calling os.fsync() is an issue.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread STINNER Victor

STINNER Victor added the comment:

I prefer to use tempfile.NamedTemporaryFile because it is portable and well 
tested. It also uses a random suffix. On Windows, the O_TEMPORARY flag is 
passed to os.open(), ...

Oh, and it sets also the close-on-exec safe, which is also more secure.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread STINNER Victor

STINNER Victor added the comment:

atomic_write.patch calls os.replace(src, dst) whereas the src file is open. It 
works on Linux, but it sounds a little bit strange to me and may fail on other 
platforms.

Here is another patch (atomic_write_mkstemp.patch) using tempfile.mkstemp() 
instead of tempfile.NamedTemporaryFile(delete=True) to control when the file is 
closed and removed.

--
Added file: http://bugs.python.org/file30275/atomic_write_mkstemp.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread STINNER Victor

STINNER Victor added the comment:

 Here's the part of the code that does the open/close part:
 http://bazaar.launchpad.net/~exabyte/blackherd/async-refactor/view/61/blackherd/misc.py#L498

This code contains a bug: hasattr('os', 'fsync') is never True :-)

This part is interesting:

# fsync on Mac OS X doesn't work, it requires using the
# F_FULLSYNC fcntl
if hasattr(fcntl, 'F_FULLFSYNC'):
fcntl.fcntl(self._fd, fcntl.F_FULLFSYNC)

= see also #11877

 http://trac.edgewall.org/browser/trunk/trac/util/__init__.py?#L145

This class copies file owner, permissions, flags, etc. atomic_write() should 
probably also call copystat() on the temporary file.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread Nick Coghlan

Nick Coghlan added the comment:

We have one of these in Beaker (we didn't need to wait for os.replace, since 
Beaker only runs on Linux): 
http://git.beaker-project.org/cgit/beaker/tree/Common/bkr/common/helpers.py?h=develop#n90

It turns out to be beneficial to separate the three operations in order to 
cooperate more cleanly with other operations on the same file system (in our 
case, we needed to be able to write data uploaded over a network connection to 
the temporary file *before* acquiring an flock when we did the rename).

We also create the file adjacent to the destination, as creating it under the 
/tmp heirarchy and then renaming it is invalid under SELinux (since the 
security context ends up being incorrect).

While Beaker is GPLv2, I'd easily be able to get permission to extract this and 
contribute it to the PSF.

Incidentally, I'm thinking tempfile might actually be a more sensible home for 
this than shutil.

--
assignee: tarek - 

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2013-05-15 Thread Nick Coghlan

Nick Coghlan added the comment:

(Note that the Beaker version would need to be enhanced with the extra API 
parameters from Victor's version, as well as updated to use the exclusive open 
and close-on-exec flags)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2012-11-02 Thread Martin Panter

Changes by Martin Panter vadmium...@gmail.com:


--
nosy: +vadmium

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2012-10-29 Thread Nick Coghlan

Nick Coghlan added the comment:

Since it may not be clear from the rest of the thread, the os.replace() 
cross-platform atomic renaming building block was added in 3.3 (in #8828, which 
Victor linked above)

Another stdlib use case now also exists in importlib (see 
http://hg.python.org/cpython/file/3.3/Lib/importlib/_bootstrap.py#l121). 
However, due to bootstrapping issues, importlib would not be able to use a 
version that was provided in a module like shutil.

--
nosy: +ncoghlan
versions: +Python 3.4 -Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2012-02-15 Thread Martin v . Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

I'm opposed to adding atomic files to the standard library. This has nothing 
to do with the original issue; anybody contributing to the issue is making up 
user needs that have not been sufficiently demonstrated.

It would be best to release such code on PyPI first, and wait a few years to 
see whether there really is an end-user demand for the feature.

--
nosy: +loewis

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 I wish we had something like:
 io.file
 io.file.tempfile
 io.file.path
 io.file.atomicfile

I adhere to flat is better than nested. e.g. it always irks me to type 
urllib.request instead of urllib. A 3-level deep nesting in the stdlib 
would certainly shock many people :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

 I adhere to flat is better than nested. e.g. it always irks me to type
 urllib.request instead of urllib.

Well, that's what (selective) imports are for, no ?
from urllib import request
from file.path import exists
import xml.etree.cElementTree as ET

Anyway, my thoughts? wasn't referring to this namespace hierarchy
proposal (which really isn't one), but rather to the best module
which could host an AtomicFile class.
Still shutil?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Anyway, my thoughts? wasn't referring to this namespace hierarchy
 proposal (which really isn't one), but rather to the best module
 which could host an AtomicFile class.

Does it have to be a class? What would be the operations apart from
write()?

 Still shutil?

I think that's our best compromise.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

 Does it have to be a class? What would be the operations apart from
 write()?

Well, I thought that making it a file-like object could be useful:
that way, one could pass it to pickle.dump(), logging.StreamHandler or
any method expecting a file-like object, and would gain atomicity
(persistency) transparently, without refactoring.
My quick and dirty AtomicFile implementation reused the
_TemporaryFileWrapper attribute delagation trick:

def __getattr__(self, name):
# Attribute lookups are delegated to the underlying file
# and cached for non-numeric results
# (i.e. methods are cached, closed and friends are not)
file = self.__dict__['file']
a = getattr(file, name)
if not issubclass(type(a), type(0)):
setattr(self, name, a)
return a


 Still shutil?

 I think that's our best compromise.

OK.
I don't have a strong opinion about this, and you've got much more
experience than me, so I trust your judgement ;-)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  Does it have to be a class? What would be the operations apart from
  write()?
 
 Well, I thought that making it a file-like object could be useful:
 that way, one could pass it to pickle.dump(), logging.StreamHandler or
 any method expecting a file-like object, and would gain atomicity
 (persistency) transparently, without refactoring.

Mmmh... I might have misunderstood the proposal then. I thought this
atomic write API was about writing the file contents in one go and
immediately closing the file. If you want atomicity to apply to logging,
you must instead guarantee the durability of each write() call, meaning
calling fsync() on each logging call, which would be very expensive.

Am I missing something?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

 If you want atomicity to apply to logging,
 you must instead guarantee the durability of each write() call, meaning
 calling fsync() on each logging call

Why so?
There's no point in calling fsync() after each write, since data is
written to a temporary file which won't be visible before rename():
even if you call fsync() after each write, if the file is not properly
closed, it won't be committed (i.e.renamed).
You just need to call fsync() once before closing the file, and then
rename it, to ensure that the committed file version is consistent.
In the meantime, you could do whatever you want with the stream (the
hypothetical AtomicFile): seek(), read(), etc. I'm not sure many
methods besides write will be useful, but for example one could easily
imagine that the library expects the passed object to have a fileno()
method.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 There's no point in calling fsync() after each write, since data is
 written to a temporary file which won't be visible before rename():
 even if you call fsync() after each write, if the file is not properly
 closed, it won't be committed (i.e.renamed).

Ah, it's a temporary file indeed.
But does that mean you can't inspect your logs in real time? All log
files I have ever seen are available under their final name while they
are still being written to.

 In the meantime, you could do whatever you want with the stream (the
 hypothetical AtomicFile): seek(), read(), etc.

I see. Yes, then an AtomicFile class makes sense (not for logging
though, IMHO :-)).

And I understand your point about it being available in io or some io
submodule (not a 3-level deep one though :-)). Perhaps you want to ask
on python-dev?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-23 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

 Ah, it's a temporary file indeed.
 But does that mean you can't inspect your logs in real time? All log
 files I have ever seen are available under their final name while they
 are still being written to.


Yes, logging was a poor example :-)

 And I understand your point about it being available in io or some io
 submodule (not a 3-level deep one though :-)). Perhaps you want to ask
 on python-dev?

OK, I'll try that.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-22 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 I'm not sure about the best module to host this, though: os.path ?

os.path is mostly about path manipulation functions, although it also performs 
directory I/O (e.g. realpath()).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-22 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

 I prefer to write a best-effort function

I disagree. People who explicitely use an atomic file API want 
atomicity/persistency, otherwise they wouldn't use it. Exposing a function that 
may, or may not, be atomic is just plain wrong.
IMHO, the right thing to do on OSes that don't provide atomic rename (and I 
doubt there are many of them, see below) is to raise an exception, so that the 
user can fall back to whatever he thinks is best (bail out, rollback, etc).

 and so I consider that shutil is the best place for such function.

As noted by Jean-Paul, shutil stands for shell utils: that would be a rather 
poor choice: atomicfile fits in shutil as much as tempfile would :-)

 Some OS don't provide atomic rename.

Which one?
See Antoine's message:
http://bugs.python.org/issue8828#msg146274

Apparently, Windows = XP does have an atomic rename(), and every POSIX 
compliant OS rename(2) should be atomic.

 os.path is mostly about path manipulation functions

I agree.
I wish we had something like:
io.file
io.file.tempfile
io.file.path
io.file.atomicfile

Thoughts?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-21 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

I'd like to make this move forward.
Milko, I see you have an implementation which looks OK at a first glance. Would 
like to propose a patch against the default branch? The patch should include 
documentation and tests.
I'm not sure about the best module to host this, though: os.path ?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-12-21 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 I'm not sure about the best module to host this, though: os.path ?

Some OS don't provide atomic rename. If we only define a function when it is 
atomic (as we do in the posix module, only expose functions available on the 
OS), programs will have to write two versions of their function. I prefer to 
write a best-effort function, and so I consider that shutil is the best place 
for such function.

If this function requires a function specific to the OS, the specific function 
can be added to posix (or another module).

Example:

with shutil.write_file(filename) as f:
  f.write(I hope it will be atomic!)

We need an atomic rename file function to implement atomic write file: see 
#8828.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-05-13 Thread Charles-François Natali

Charles-François Natali neolo...@free.fr added the comment:

Something's missing in all the implementations presented:
to make sure that the new version of the file is available afer a crash, fsync 
must be called on the containing directory after the rename.

--
nosy: +neologix

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-05-13 Thread Milko Krachounov

Milko Krachounov pyt...@milko.3mhz.net added the comment:

 Something's missing in all the implementations presented:
 to make sure that the new version of the file is available afer 
 a crash, fsync must be called on the containing directory after 
 the rename.

I upgraded my proposed approach to include dir fsync, and to do a copy when 
backing up instead of rename (is that good?)

http://bazaar.launchpad.net/~exabyte/blackherd/async-refactor/view/66/blackherd/misc.py#L498

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-05-12 Thread Milko Krachounov

Milko Krachounov pyt...@milko.3mhz.net added the comment:

I have a class for overwriting files properly that I use in three of my 
projects. It aims to be atomic as possible, supports creating backups, but it 
doesn't have functions to set or retain permissions when requested (which might 
be desirable if such a function is added to stdlib). I'd give it here for 
reference and ideas.

- It's a context manager acting like a normal file object so you can use it 
with e.g. json.dump. In CM mode, if an error is caught, you end up with the old 
file automatically. If you use it as a file, the 'close' method has a 'cancel' 
argument to achieve the same.
- Normal overwrite on POSIX uses flush, fsync, rename as it should.
- Since fsync doesn't work on Mac OS X, it takes care of calling the Mac OS X 
specific F_FULLFSYNC fcntl.
- On POSIX, if a backup is requested, an attempt is made to do it with a 
hardlink, otherwise do two renames (killing the atomicity). Maybe a copy with 
shutil would be a better choice though.
- On Windows it uses two renames - the old file is backed up to a temporary 
name, and then the new file is renamed over it. If a backup wasn't requested, 
the temporary name is deleted.

I also have a simple unit test for it, but I've ran it on POSIX only.

Here's the part of the code that does the open/close part:
http://bazaar.launchpad.net/~exabyte/blackherd/async-refactor/view/61/blackherd/misc.py#L498

And the unit test:
http://bazaar.launchpad.net/~exabyte/blackherd/async-refactor/view/61/tests/test_misc.py#L473

I hope that's useful.

--
nosy: +milko.krachounov

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2011-05-12 Thread Milko Krachounov

Milko Krachounov pyt...@milko.3mhz.net added the comment:

Well, since I had a typo in the main method which called the Windows 
implementation on POSIX, the unit test works on the code for Windows when ran 
on POSIX. Heh, I'm sorry for the noise, but it seems that re-reading the code 
four times and running the unit tests is not enough, corrected revision:

http://bazaar.launchpad.net/~exabyte/blackherd/async-refactor/view/62/blackherd/misc.py#L498

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-06-09 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Trac has an AtomicFile class which copies file mode, owner, etc.
http://trac.edgewall.org/browser/trunk/trac/util/__init__.py?#L145

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 This sounds silly to me. You can write a file in two lines:

 with open(foo, wb) as f:
f.write(contents)

If the disk is full, write fails and the new file only contains a part of 
'contents'. If the file does already exist and the write fails, the original 
content is lost.

The correct pattern is something like:

@contextlib.contextmanager
def atomic_write(filename):
  tempname = filename + .tmp
  out = open(tempname, w)
  try:
 yield out
 if hasattr('os', 'fsync'):
os.fsync(out.fileno())
 out.close()
 if os.name in ('nt', 'ce'):
os.unlink(filename)
# ... hope that it doesn't fail here ...
 os.rename(tempname, filename)
  except:
 out.close()
 os.unlink(tempname)
 raise

Remarks:
 - call fsync() to ensure that the new file content is written on disk. it does 
nothing on Mac OS X
 - on Windows, it's not possible to rename a to b if b does already exist. New 
Windows versions has an atomic function: MoveFileTransacted(). Older versions 
have MoveFileEx(MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH) and 
ReplaceFile()

This context manager is *not* atomic on any OS. It's only atomic on some OS, 
and it may depends on the kernel version (see recent discussions about 
ext3/ext4, fsync and write barrier).

--
nosy: +haypo

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

An article about ext3 and fsync: Solving the ext3 latency problem
http://lwn.net/Articles/328363/

Another good article, written by the maintainer of ext4 (Theodore Ts'o): Don’t 
fear the fsync!
http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-26 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

We need an atomic rename function to implement atomic_write(): I opened a new 
issue, #8828.

--
dependencies: +Atomic function to rename a file

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-07 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
nosy: +merwok

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-03 Thread Tarek Ziadé

Changes by Tarek Ziadé ziade.ta...@gmail.com:


--
title: Alias for distutils.file_util.write_file in e.g. shutils - Adding an 
atomic FS write API

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue8604] Adding an atomic FS write API

2010-05-03 Thread Giampaolo Rodola'

Changes by Giampaolo Rodola' g.rod...@gmail.com:


--
nosy: +giampaolo.rodola

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8604
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com