New submission from Mingye Wang <[email protected]>:
Consider this interaction:
cmd> echo > 1.txt
cmd> python -c "__import__('os').truncate('1.txt', 1024 ** 3)"
cmd> fsutil sparse queryFlag 1.txt
Not only does the truncate take a long time, as is typical for a zero-write,
but fsutil also reports the file as non-sparse, as an actual write would
suggest. This is because internally, _chsize_s and friends enlarge files by
writing zeros in a loop.[1]
[1]: https://github.com/leelwh/clib/blob/master/c/chsize.c
On Unix systems, ftruncate for enlarging is described as "... as if the extra
space is zero-filled", but this is not to be taken literally. In practice,
sparse files are used whenever available (GNU dd expects that) and people do
expect the operation to be very fast, without a lot of real writes. There is
even a FreeBSD bug report about ftruncate being too slow on UFS.
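
To check on a Unix system whether an enlarging truncate really produced a
sparse file, one can compare the apparent size with the allocated blocks. A
minimal sketch, assuming st_blocks is reported in 512-byte units (as on
Linux); the helper name enlarge_and_measure is made up for illustration:

```python
import os


def enlarge_and_measure(path, size):
    """Resize `path` to `size` bytes and report how much space it occupies.

    Returns (apparent_size, allocated_bytes). allocated_bytes is None on
    platforms whose stat result has no st_blocks field (e.g. Windows).
    """
    os.truncate(path, size)
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as documented for Linux.
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated
```

On a filesystem with sparse-file support, allocated_bytes should stay far
below the apparent size after enlarging an empty file to 1024 ** 3 bytes.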
The aria2 downloader gives a good example of how to truncate into a sparse file
on Windows.[2] First an FSCTL_SET_SPARSE control code is issued, and then a
seek followed by SetEndOfFile finishes the job. Of course, an lseek to the end
is needed first to determine the current size of the file, so that we know
whether we are enlarging (make it sparse) or shrinking (don't).
[2]: https://github.com/aria2/aria2/blob/master/src/AbstractDiskWriter.cc#L507
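
The steps above can be sketched from Python with ctypes. This is only an
illustration under some assumptions, not a proposed patch: the function name
truncate_sparse is made up, a failing FSCTL_SET_SPARSE (for example on a
filesystem without sparse support) is silently ignored, and on non-Windows
platforms the sketch simply falls back to os.truncate.

```python
import ctypes
import os
import sys

FSCTL_SET_SPARSE = 0x000900C4  # control code value from winioctl.h
FILE_BEGIN = 0                 # move method for SetFilePointerEx


def truncate_sparse(path, size):
    """Resize `path` to `size`; on Windows, mark it sparse before enlarging."""
    if sys.platform != "win32":
        # Assumption: elsewhere the OS already extends files sparsely.
        os.truncate(path, size)
        return

    import msvcrt
    from ctypes import wintypes

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.DeviceIoControl.argtypes = [
        wintypes.HANDLE, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD,
        wintypes.LPVOID, wintypes.DWORD,
        ctypes.POINTER(wintypes.DWORD), wintypes.LPVOID,
    ]
    k32.DeviceIoControl.restype = wintypes.BOOL
    k32.SetFilePointerEx.argtypes = [
        wintypes.HANDLE, wintypes.LARGE_INTEGER,
        ctypes.POINTER(wintypes.LARGE_INTEGER), wintypes.DWORD,
    ]
    k32.SetFilePointerEx.restype = wintypes.BOOL
    k32.SetEndOfFile.argtypes = [wintypes.HANDLE]
    k32.SetEndOfFile.restype = wintypes.BOOL

    fd = os.open(path, os.O_RDWR | os.O_BINARY)
    try:
        handle = msvcrt.get_osfhandle(fd)
        # Seek to the end to learn the current size.
        current = os.lseek(fd, 0, os.SEEK_END)
        if size > current:
            # Enlarging: ask the filesystem to make the file sparse first.
            # Best effort only; the result is deliberately not checked.
            returned = wintypes.DWORD(0)
            k32.DeviceIoControl(handle, FSCTL_SET_SPARSE, None, 0,
                                None, 0, ctypes.byref(returned), None)
        # Seek to the target size, then move the end-of-file marker there.
        if not k32.SetFilePointerEx(handle, size, None, FILE_BEGIN):
            raise ctypes.WinError(ctypes.get_last_error())
        if not k32.SetEndOfFile(handle):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        os.close(fd)
```

With something like this in place, repeating the interaction at the top with
truncate_sparse('1.txt', 1024 ** 3) would be expected to return quickly and
leave the sparse flag set, assuming the volume is NTFS.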
----------
components: Library (Lib)
messages: 363717
nosy: Artoria2e5, steve.dower
priority: normal
severity: normal
status: open
title: os.ftruncate on Windows should be sparse
versions: Python 3.8, Python 3.9
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue39910>
_______________________________________