On 09/16/2016 10:51 PM, FUJIWARA Katsunori wrote:
# HG changeset patch
# User FUJIWARA Katsunori <fo...@lares.dti.ne.jp>
# Date 1474057496 -32400
#      Sat Sep 17 05:24:56 2016 +0900
# Node ID 55bd90d63ad3a76fd4d27d5eabb3fe9a7fa0642b
# Parent  71b6b49f8a7ab6c894028b9153290f4bbf0f54f6
manifest: use specific opener to avoid file stat ambiguity around truncation

If steps below occurs at "the same time in sec", all of mtime, ctime
and size are same between (1) and (3).

  1. append data to 00manifest.i (and close transaction)
  2. discard appended data by truncation (strip or rollback)
  3. append same size but different data to 00manifest.i again

Therefore, cache validation doesn't work after (3) as expected.

To avoid such file stat ambiguity around truncation, this patch uses
specific opener, which forces checkambig=True at writing 00manifest.i
changes out.

Hooking vfs.__call__() is the only way to centralize changes into
manifest.py, because all logic to actually write in-memory changes out
is implemented in revlog layer.

In fact, reusing already wrapped self.opener for dirlogcache entries
like as below is a little redundant, because storecache-ed properties
of localrepository doesn't refer file stat of non-root 00manifest.i
files (= ambiguity of such files doesn't cause any issue).

        if dir not in self._dirlogcache:
            self._dirlogcache[dir] = manifestrevlog(self.opener, dir,

But wrapped opener never checks ambiguity of file stat for non-root
00manifest.i files, because "path == '00manifest.i'" condition in
wrapper.__call__() is always False for such files. Therefore, this
patch simply reuses wrapped self.opener for non-root 00manifest.i

Even after this patch, avoiding file stat ambiguity of 00manifest.i
around truncation isn't yet completed, because truncation side isn't
aware of this issue.

This is a part of ExactCacheValidationPlan.


diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -892,6 +892,28 @@ class treemanifest(object):
                 subp1, subp2 = subp2, subp1
             writesubtree(subm, subp1, subp2)

+def _checkambigopener(opener):
+    """build an opener to add checkambig=True at changing 00manifest.i
+    """
+    class wrapper(opener.__class__):
+        def __call__(self, path, mode="r", text=False, atomictemp=False,
+                     notindexed=False, backgroundclose=False, checkambig=False,
+                      **kwargs):
+            if path == '00manifest.i' and mode not in ('r', 'rb'):
+                # check file stat ambiguity at closing forcibly
+                checkambig = True
+            supercall = super(wrapper, self).__call__
+            return supercall(path, mode,
+                             text=text,
+                             atomictemp=atomictemp,
+                             notindexed=notindexed,
+                             backgroundclose=backgroundclose,
+                             checkambig=checkambig,
+                             **kwargs)
+    opener.__class__ = wrapper

This wrapper logic is a bit scary here. Could we had the necessary minimal hooks in revlog to allow manifest and changelog to hooks in ? That would seems more robust/clean than wrapping core from core.

+    return opener
 class manifestrevlog(revlog.revlog):
     '''A revlog that stores manifest texts. This is responsible for caching the
     full-text manifest contents.
@@ -927,6 +949,13 @@ class manifestrevlog(revlog.revlog):
             self._dirlogcache = {'': self}

+            # If "dir" isn't empty, this "opener" is reused one via
+            # "self.opener" of another manifestrevlog. Therefore,
+            # wrapping opener is needed only at once for root
+            # 00manifest.i file, to avoid multiple (= redundant)
+            # wrapping.
+            opener = _checkambigopener(opener)
         super(manifestrevlog, self).__init__(opener, indexfile)

Mercurial-devel mailing list

Pierre-Yves David
Mercurial-devel mailing list

Reply via email to