mbthomas created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  As mentioned in https://phab.mercurial-scm.org/D1222, the recent
  pathconflicts change regresses update performance in large repositories
  when many files are being updated.
  
  To mitigate this, we introduce two caches of directories that have
  already been found to be either:
  
  - unknown directories that are not aliased by files, and so don't need to
    be checked again to see whether they are files; and
  - missing directories, which cannot cause path conflicts and cannot
    contain a file that causes a path conflict.
  
  When checking the paths of a file, testing against these caches means we
  can skip tests that involve touching the filesystem.
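  
  The effect of the two caches can be illustrated with a minimal standalone
  sketch. It is not the code in this patch: `finddirs`, `check_prefixes`,
  `tracked` and `stats` are hypothetical names, `os.path` stands in for
  `repo.wvfs`, a plain set stands in for the dirstate, and the path audit
  check is omitted. `finddirs` is assumed to behave like `util.finddirs`.

```python
import os


def finddirs(path):
    """Yield the ancestor directories of path, deepest first.

    Hypothetical stand-in, assumed to match util.finddirs.
    """
    pos = path.rfind('/')
    while pos != -1:
        yield path[:pos]
        pos = path.rfind('/', 0, pos)


def check_prefixes(root, f, tracked, unknowndircache, missingdircache, stats):
    """Return the shortest prefix of f that exists as an untracked file or
    link under root, or None if there is no such prefix.

    Both caches are updated in place; stats['fs_checks'] counts how many
    prefixes actually had to be looked up on the filesystem.
    """
    # Walk prefixes from shortest to longest, as the patched function does.
    for p in reversed(list(finddirs(f))):
        if p in missingdircache:
            # Nothing exists at or below p, so no deeper prefix can conflict.
            return None
        if p in unknowndircache:
            # Already checked and found not to be an untracked file.
            continue
        stats['fs_checks'] += 1
        full = os.path.join(root, p)
        if (os.path.isfile(full) or os.path.islink(full)) and p not in tracked:
            return p
        if not os.path.lexists(full):
            missingdircache.add(p)
            return None
        unknowndircache.add(p)
    return None
```

  With shared caches, checking a second file under the same directories
  does no further filesystem lookups for the prefixes already seen, which
  is where the saving for large updates comes from.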

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1224

AFFECTED FILES
  mercurial/merge.py

CHANGE DETAILS

diff --git a/mercurial/merge.py b/mercurial/merge.py
--- a/mercurial/merge.py
+++ b/mercurial/merge.py
@@ -653,23 +653,40 @@
         and repo.dirstate.normalize(f) not in repo.dirstate
         and mctx[f2].cmp(wctx[f]))
 
-def _checkunknowndirs(repo, f):
+def _checkunknowndirs(repo, f, unknowndircache, missingdircache):
     """
     Look for any unknown files or directories that may have a path conflict
     with a file.  If any path prefix of the file exists as a file or link,
     then it conflicts.  If the file itself is a directory that contains any
     file that is not tracked, then it conflicts.
 
+    `unknowndircache` is a set of paths known to be good.  This prevents
+    repeated checking of dirs.  It will be updated with any new dirs that
+    are checked and found to be safe.
+
+    `missingdircache` is a set of paths that are known to be absent.  This
+    prevents repeated checking of subdirectories that are known not to exist.
+    It will be updated with any new dirs that are checked and found to be
+    absent.
+
     Returns the shortest path at which a conflict occurs, or None if there is
     no conflict.
     """
 
     # Check for path prefixes that exist as unknown files.
     for p in reversed(list(util.finddirs(f))):
-        if (repo.wvfs.audit.check(p)
-                and repo.wvfs.isfileorlink(p)
-                and repo.dirstate.normalize(p) not in repo.dirstate):
-            return p
+        if p in missingdircache:
+            return
+        if p in unknowndircache:
+            continue
+        if repo.wvfs.audit.check(p):
+            if (repo.wvfs.isfileorlink(p)
+                    and repo.dirstate.normalize(p) not in repo.dirstate):
+                return p
+            if not repo.wvfs.lexists(p):
+                missingdircache.add(p)
+                return
+            unknowndircache.add(p)
 
     # Check if the file conflicts with a directory containing unknown files.
     if repo.wvfs.audit.check(f) and repo.wvfs.isdir(f):
@@ -700,12 +717,15 @@
             elif config == 'warn':
                 warnconflicts.update(conflicts)
 
+        unknowndircache = set()
+        missingdircache = set()
         for f, (m, args, msg) in actions.iteritems():
             if m in ('c', 'dc'):
                 if _checkunknownfile(repo, wctx, mctx, f):
                     fileconflicts.add(f)
                 elif f not in wctx:
-                    path = _checkunknowndirs(repo, f)
+                    path = _checkunknowndirs(repo, f,
+                                             unknowndircache, missingdircache)
                     if path is not None:
                         pathconflicts.add(path)
             elif m == 'dg':



To: mbthomas, #hg-reviewers
Cc: mercurial-devel
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
