indygreg created this revision. Herald added a subscriber: mercurial-devel. Herald added a reviewer: hg-reviewers.
REVISION SUMMARY unlink() on Windows is complicated by the fact that Windows doesn't support removing an open file. Our unlink wrapper currently goes through a dance where it generates a random filename, renames the file, then attempts to unlink. This is a lot of extra work for what is likely the special case of attempting to unlink a file that is open or can't be simply unlinked. In this commit, we refactor the code to use fewer system calls in the common cases of 1) unlink just works 2) file doesn't exist. For the file doesn't exist case (before) 1. stat() to determine if path is directory 2. generate random filename 3. rename() after: 1. stat() to get path info For the file not open case (before) 1. stat() to determine if path is directory 2. generate random filename 3. rename() 4. unlink() after: 1. stat() 2. unlink() For the file open case (before) 1. stat() to determine if path is directory 2. generate random filename 3. rename() 4. unlink() after: 1. stat() 2. unlink() 3. generate random filename 4. rename() 5. unlink() There is also a scenario where the unlink fails due to the file being marked as read-only. In this case we also introduce an extra unlink() call. However, I reckon the common case is the file isn't marked as read-only and skipping the random generation and rename is worth it. I /think/ this change makes bulk file writing on Windows faster. This is because vfs.__call__ calls unlink() when a file is opened for writing. When writing a few hundred files files as part of a clone or working directory update, the extra system calls can matter. win32.unlink() did show up in performance profiles, which is what caused me to look at this code. But I/O performance is hard to measure and I'm not sure if the ~30s off of ~620s for a stream clone unbundle on the Firefox repo is indicative of real world performance. I do know the new code uses fewer system calls and shouldn't be slower. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D588 AFFECTED FILES mercurial/win32.py CHANGE DETAILS diff --git a/mercurial/win32.py b/mercurial/win32.py --- a/mercurial/win32.py +++ b/mercurial/win32.py @@ -12,6 +12,7 @@ import msvcrt import os import random +import stat import subprocess from . import ( @@ -522,11 +523,26 @@ def unlink(f): '''try to implement POSIX' unlink semantics on Windows''' - if os.path.isdir(f): - # use EPERM because it is POSIX prescribed value, even though - # unlink(2) on directories returns EISDIR on Linux - raise IOError(errno.EPERM, - "Unlinking directory not permitted: '%s'" % f) + # If the path doesn't exist, raise that exception. + # If it is a directory, emulate POSIX behavior. + try: + st = os.stat(f) + if stat.S_ISDIR(st.st_mode): + # use EPERM because it is POSIX prescribed value, even though + # unlink(2) on directories returns EISDIR on Linux + raise IOError(errno.EPERM, + "Unlinking directory not permitted: '%s'" % f) + except OSError as e: + if e.errno == errno.ENOENT: + raise + + # In the common case, a normal unlink will work. Try that first and fall + # back to more complexity if and only if we need to. + try: + os.unlink(f) + return + except (IOError, OSError) as e: + pass # POSIX allows to unlink and rename open files. Windows has serious # problems with doing that: To: indygreg, #hg-reviewers Cc: mercurial-devel _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel