On 09/21/2016 08:57 AM, FUJIWARA Katsunori wrote:
At Tue, 20 Sep 2016 18:14:32 +0200,
Pierre-Yves David wrote:

On 09/16/2016 10:51 PM, FUJIWARA Katsunori wrote:
# HG changeset patch
# User FUJIWARA Katsunori <fo...@lares.dti.ne.jp>
# Date 1474057496 -32400
#      Sat Sep 17 05:24:56 2016 +0900
# Node ID 55bd90d63ad3a76fd4d27d5eabb3fe9a7fa0642b
# Parent  71b6b49f8a7ab6c894028b9153290f4bbf0f54f6
manifest: use specific opener to avoid file stat ambiguity around truncation

If steps below occurs at "the same time in sec", all of mtime, ctime
and size are same between (1) and (3).

  1. append data to 00manifest.i (and close transaction)
  2. discard appended data by truncation (strip or rollback)
  3. append same size but different data to 00manifest.i again

Therefore, cache validation doesn't work after (3) as expected.

To avoid such file stat ambiguity around truncation, this patch uses
specific opener, which forces checkambig=True at writing 00manifest.i
changes out.

Hooking vfs.__call__() is the only way to centralize changes into
manifest.py, because all logic to actually write in-memory changes out
is implemented in revlog layer.

In fact, reusing already wrapped self.opener for dirlogcache entries
like as below is a little redundant, because storecache-ed properties
of localrepository doesn't refer file stat of non-root 00manifest.i
files (= ambiguity of such files doesn't cause any issue).

        if dir not in self._dirlogcache:
            self._dirlogcache[dir] = manifestrevlog(self.opener, dir,
                                                    self._dirlogcache)

But wrapped opener never checks ambiguity of file stat for non-root
00manifest.i files, because "path == '00manifest.i'" condition in
wrapper.__call__() is always False for such files. Therefore, this
patch simply reuses wrapped self.opener for non-root 00manifest.i
files.

Even after this patch, avoiding file stat ambiguity of 00manifest.i
around truncation isn't yet completed, because truncation side isn't
aware of this issue.

This is a part of ExactCacheValidationPlan.

    https://www.mercurial-scm.org/wiki/ExactCacheValidationPlan

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -892,6 +892,28 @@ class treemanifest(object):
                 subp1, subp2 = subp2, subp1
             writesubtree(subm, subp1, subp2)

+def _checkambigopener(opener):
+    """build an opener to add checkambig=True at changing 00manifest.i
+    """
+    class wrapper(opener.__class__):
+        def __call__(self, path, mode="r", text=False, atomictemp=False,
+                     notindexed=False, backgroundclose=False, checkambig=False,
+                      **kwargs):
+            if path == '00manifest.i' and mode not in ('r', 'rb'):
+                # check file stat ambiguity at closing forcibly
+                checkambig = True
+            supercall = super(wrapper, self).__call__
+            return supercall(path, mode,
+                             text=text,
+                             atomictemp=atomictemp,
+                             notindexed=notindexed,
+                             backgroundclose=backgroundclose,
+                             checkambig=checkambig,
+                             **kwargs)
+    opener.__class__ = wrapper

This wrapper logic is a bit scary here. Could we  had the necessary
minimal hooks in revlog to allow manifest and changelog to hooks in ?
That would seems more robust/clean than wrapping core from core.

How about adding optional argument "checkambig" to revlog.__init__(),
of which default value is False ?\

Given that multiple subclass uses it it seems to make sense to go this way. It seems simple enough. The revlog class has other small 'extension point' for subclass so there is precedence.

Cheers,

--
Pierre-Yves David
_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel

Reply via email to