D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-11-08 Thread yuja (Yuya Nishihara)
yuja added a comment.


  Can you add the following tests?
  
  - encoded filename differs from the original name (e.g. uppercase letter)
  - fncache disabled, but encodedstore is used
  
  >      def datafiles(self, matcher=None):
  >          for a, b, size in super(encodedstore, self).datafiles():
  > +            if not _matchtrackedpath(a, matcher):
  > +                continue
  >              try:
  >                  a = decodefilename(a)
  
  I'm pretty sure it's wrong to pass in an encoded filename to matcher.
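
To illustrate the concern, here is a minimal sketch of how an encoded store name can differ from the tracked name. toy_encode() and toy_decode() are hypothetical stand-ins that model only the uppercase and underscore rules of Mercurial's filename encoding, and exact_matcher() is an assumed toy matcher, not Mercurial's matcher API.

```python
def toy_encode(path):
    """Hypothetical, simplified encoder: 'A' -> '_a' and '_' -> '__'."""
    out = []
    for ch in path:
        if ch == "_":
            out.append("__")
        elif ch.isupper():
            out.append("_" + ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def toy_decode(path):
    """Inverse of toy_encode (assumes well-formed input)."""
    out = []
    i = 0
    while i < len(path):
        ch = path[i]
        if ch == "_":
            nxt = path[i + 1]
            out.append("_" if nxt == "_" else nxt.upper())
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def exact_matcher(tracked):
    """Hypothetical matcher: true only for names in the tracked set."""
    wanted = set(tracked)
    return lambda name: name in wanted


def strip_store_path(path):
    """Drop the 'data/' prefix and the two-character revlog suffix."""
    return path[len("data/"):-len(".i")]


matcher = exact_matcher(["dir/README"])
on_disk = toy_encode("data/dir/README.i")

# Matching the encoded name misses the file; decoding first finds it.
matched_encoded = matcher(strip_store_path(on_disk))
matched_decoded = matcher(strip_store_path(toy_decode(on_disk)))
```

In this toy model the matcher only sees the tracked name once the entry has been decoded, which is why a test with an uppercase letter in the filename would exercise the problem.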

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

To: pulkit, durin42, #hg-reviewers
Cc: yuja, mjpieters, mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-11-05 Thread pulkit (Pulkit Goyal)
This revision was automatically updated to reflect the committed changes.
Closed by commit rHG9aeb9e2d28a7: store: introduce _matchtrackedpath() and use 
it to filter store files (authored by pulkit, committed by ).

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D5139?vs=12259=12399

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

AFFECTED FILES
  mercurial/store.py
  mercurial/streamclone.py
  tests/test-narrow-clone-stream.t

CHANGE DETAILS

diff --git a/tests/test-narrow-clone-stream.t b/tests/test-narrow-clone-stream.t
--- a/tests/test-narrow-clone-stream.t
+++ b/tests/test-narrow-clone-stream.t
@@ -1,7 +1,16 @@
+#testcases tree flat
+
 Tests narrow stream clones
 
   $ . "$TESTDIR/narrow-library.sh"
 
+#if tree
+  $ cat << EOF >> $HGRCPATH
+  > [experimental]
+  > treemanifest = 1
+  > EOF
+#endif
+
 Server setup
 
   $ hg init master
@@ -27,13 +36,51 @@
 
 Enable stream clone on the server
 
-  $ echo "[server]" >> master/.hg/hgrc
+  $ echo "[experimental.server]" >> master/.hg/hgrc
   $ echo "stream-narrow-clones=True" >> master/.hg/hgrc
 
 Cloning a specific file when stream clone is supported
 
   $ hg clone --narrow ssh://user@dummy/master narrow --noupdate --include "dir/src/f10" --stream
   streaming all changes
-  remote: abort: server does not support narrow stream clones
-  abort: pull failed on remote
-  [255]
+  * files to transfer, * KB of data (glob)
+  transferred * KB in * seconds (* */sec) (glob)
+
+  $ cd narrow
+  $ ls
+  $ hg tracked
+  I path:dir/src/f10
+
+Making sure we have the correct set of requirements
+
+  $ cat .hg/requires
+  dotencode
+  fncache
+  generaldelta
+  narrowhg-experimental
+  revlogv1
+  store
+  treemanifest (tree !)
+
+Making sure store has the required files
+
+  $ ls .hg/store/
+  00changelog.i
+  00manifest.i
+  data
+  fncache
+  meta (tree !)
+  narrowspec
+  undo
+  undo.backupfiles
+  undo.phaseroots
+
+Checking that repository has all the required data and not broken
+
+  $ hg verify
+  checking changesets
+  checking manifests
+  checking directory manifests (tree !)
+  crosschecking files in changesets and manifests
+  checking files
+  checked 40 changesets with 1 changes to 1 files
diff --git a/mercurial/streamclone.py b/mercurial/streamclone.py
--- a/mercurial/streamclone.py
+++ b/mercurial/streamclone.py
@@ -545,10 +545,6 @@
     Returns a 3-tuple of (file count, file size, data iterator).
     """
 
-    # temporarily raise error until we add storage level logic
-    if includes or excludes:
-        raise error.Abort(_("server does not support narrow stream clones"))
-
     with repo.lock():
 
         entries = []
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -24,6 +24,20 @@
 
 parsers = policy.importmod(r'parsers')
 
+def _matchtrackedpath(path, matcher):
+    """parses a fncache entry and returns whether the entry is tracking a path
+    matched by matcher or not.
+
+    If matcher is None, returns True"""
+
+    if matcher is None:
+        return True
+    path = decodedir(path)
+    if path.startswith('data/'):
+        return matcher(path[len('data/'):-len('.i')])
+    elif path.startswith('meta/'):
+        return matcher.visitdir(path[len('meta/'):-len('/00manifest.i')] or '.')
+
 # This avoids a collision between a file named foo and a dir named
 # foo.i or foo.d
 def _encodedir(path):
@@ -413,6 +427,8 @@
 
     def datafiles(self, matcher=None):
         for a, b, size in super(encodedstore, self).datafiles():
+            if not _matchtrackedpath(a, matcher):
+                continue
             try:
                 a = decodefilename(a)
             except KeyError:
@@ -542,6 +558,8 @@
 
     def datafiles(self, matcher=None):
         for f in sorted(self.fncache):
+            if not _matchtrackedpath(f, matcher):
+                continue
             ef = self.encode(f)
             try:
                 yield f, ef, self.getsize(ef)



To: pulkit, durin42, #hg-reviewers
Cc: mjpieters, mercurial-devel


D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-11-05 Thread durin42 (Augie Fackler)
durin42 added inline comments.

INLINE COMMENTS

> store.py:39
> +    elif path.startswith('meta/'):
> +        return matcher.visitdir(path[len('meta/'):-len('/00manifest.i')] or '.')
> +

Please follow up with a patch that raises ProgrammingError at the end of this 
function - right now you just magically return False if you don't recognize the 
path, which feels dangerous.
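
A possible shape for the requested follow-up, as a standalone sketch rather than the actual patch. As written, the function falls off the end and returns None (which is falsy) for a path that is neither 'data/' nor 'meta/'. In the sketch, ProgrammingError is a local stand-in for mercurial.error.ProgrammingError, PrefixMatcher is a hypothetical matcher, and the decodedir() step is left out for brevity.

```python
class ProgrammingError(Exception):
    """Local stand-in for mercurial.error.ProgrammingError."""


class PrefixMatcher:
    """Hypothetical matcher: matches files under one directory prefix."""

    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self, filename):
        return filename.startswith(self.prefix + "/")

    def visitdir(self, directory):
        # Visit the root, the prefix itself, its children and its parents.
        if directory == ".":
            return True
        return (
            directory == self.prefix
            or directory.startswith(self.prefix + "/")
            or self.prefix.startswith(directory + "/")
        )


def matchtrackedpath_strict(path, matcher):
    """Like _matchtrackedpath(), but refuses paths it does not recognize."""
    if matcher is None:
        return True
    if path.startswith("data/"):
        return matcher(path[len("data/"):-len(".i")])
    if path.startswith("meta/"):
        return matcher.visitdir(path[len("meta/"):-len("/00manifest.i")] or ".")
    # Fail loudly instead of silently treating the entry as unmatched.
    raise ProgrammingError("cannot decode path %s" % path)
```

Raising here means an unexpected store entry surfaces as a bug report instead of being silently dropped from the stream.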

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

To: pulkit, durin42, #hg-reviewers
Cc: mjpieters, mercurial-devel


D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-10-19 Thread pulkit (Pulkit Goyal)
pulkit updated this revision to Diff 12259.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D5139?vs=12233=12259

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

AFFECTED FILES
  mercurial/store.py
  mercurial/streamclone.py
  tests/test-narrow-clone-stream.t

CHANGE DETAILS

diff --git a/tests/test-narrow-clone-stream.t b/tests/test-narrow-clone-stream.t
--- a/tests/test-narrow-clone-stream.t
+++ b/tests/test-narrow-clone-stream.t
@@ -1,7 +1,16 @@
+#testcases tree flat
+
 Tests narrow stream clones
 
   $ . "$TESTDIR/narrow-library.sh"
 
+#if tree
+  $ cat << EOF >> $HGRCPATH
+  > [experimental]
+  > treemanifest = 1
+  > EOF
+#endif
+
 Server setup
 
   $ hg init master
@@ -27,13 +36,51 @@
 
 Enable stream clone on the server
 
-  $ echo "[server]" >> master/.hg/hgrc
+  $ echo "[experimental.server]" >> master/.hg/hgrc
   $ echo "stream-narrow-clones=True" >> master/.hg/hgrc
 
 Cloning a specific file when stream clone is supported
 
   $ hg clone --narrow ssh://user@dummy/master narrow --noupdate --include "dir/src/f10" --stream
   streaming all changes
-  remote: abort: server does not support narrow stream clones
-  abort: pull failed on remote
-  [255]
+  * files to transfer, * KB of data (glob)
+  transferred * KB in * seconds (* */sec) (glob)
+
+  $ cd narrow
+  $ ls
+  $ hg tracked
+  I path:dir/src/f10
+
+Making sure we have the correct set of requirements
+
+  $ cat .hg/requires
+  dotencode
+  fncache
+  generaldelta
+  narrowhg-experimental
+  revlogv1
+  store
+  treemanifest (tree !)
+
+Making sure store has the required files
+
+  $ ls .hg/store/
+  00changelog.i
+  00manifest.i
+  data
+  fncache
+  meta (tree !)
+  narrowspec
+  undo
+  undo.backupfiles
+  undo.phaseroots
+
+Checking that repository has all the required data and not broken
+
+  $ hg verify
+  checking changesets
+  checking manifests
+  checking directory manifests (tree !)
+  crosschecking files in changesets and manifests
+  checking files
+  checked 40 changesets with 1 changes to 1 files
diff --git a/mercurial/streamclone.py b/mercurial/streamclone.py
--- a/mercurial/streamclone.py
+++ b/mercurial/streamclone.py
@@ -545,10 +545,6 @@
     Returns a 3-tuple of (file count, file size, data iterator).
     """
 
-    # temporarily raise error until we add storage level logic
-    if includes or excludes:
-        raise error.Abort(_("server does not support narrow stream clones"))
-
     with repo.lock():
 
         entries = []
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -24,6 +24,20 @@
 
 parsers = policy.importmod(r'parsers')
 
+def _matchtrackedpath(path, matcher):
+    """parses a fncache entry and returns whether the entry is tracking a path
+    matched by matcher or not.
+
+    If matcher is None, returns True"""
+
+    if matcher is None:
+        return True
+    path = decodedir(path)
+    if path.startswith('data/'):
+        return matcher(path[len('data/'):-len('.i')])
+    elif path.startswith('meta/'):
+        return matcher.visitdir(path[len('meta/'):-len('/00manifest.i')] or '.')
+
 # This avoids a collision between a file named foo and a dir named
 # foo.i or foo.d
 def _encodedir(path):
@@ -413,6 +427,8 @@
 
     def datafiles(self, matcher=None):
         for a, b, size in super(encodedstore, self).datafiles():
+            if not _matchtrackedpath(a, matcher):
+                continue
             try:
                 a = decodefilename(a)
             except KeyError:
@@ -542,6 +558,8 @@
 
     def datafiles(self, matcher=None):
         for f in sorted(self.fncache):
+            if not _matchtrackedpath(f, matcher):
+                continue
             ef = self.encode(f)
             try:
                 yield f, ef, self.getsize(ef)



To: pulkit, durin42, #hg-reviewers
Cc: mjpieters, mercurial-devel


D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-10-18 Thread pulkit (Pulkit Goyal)
pulkit updated this revision to Diff 12233.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D5139?vs=12212=12233

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

AFFECTED FILES
  mercurial/store.py
  mercurial/streamclone.py
  tests/test-narrow-clone-stream.t

CHANGE DETAILS

diff --git a/tests/test-narrow-clone-stream.t b/tests/test-narrow-clone-stream.t
--- a/tests/test-narrow-clone-stream.t
+++ b/tests/test-narrow-clone-stream.t
@@ -1,7 +1,16 @@
+#testcases tree flat
+
 Tests narrow stream clones
 
   $ . "$TESTDIR/narrow-library.sh"
 
+#if tree
+  $ cat << EOF >> $HGRCPATH
+  > [experimental]
+  > treemanifest = 1
+  > EOF
+#endif
+
 Server setup
 
   $ hg init master
@@ -27,13 +36,51 @@
 
 Enable stream clone on the server
 
-  $ echo "[server]" >> master/.hg/hgrc
+  $ echo "[experimental.server]" >> master/.hg/hgrc
   $ echo "stream-narrow-clones=True" >> master/.hg/hgrc
 
 Cloning a specific file when stream clone is supported
 
   $ hg clone --narrow ssh://user@dummy/master narrow --noupdate --include "dir/src/f10" --stream
   streaming all changes
-  remote: abort: server does not support narrow stream clones
-  abort: pull failed on remote
-  [255]
+  * files to transfer, * KB of data (glob)
+  transferred * KB in * seconds (* MB/sec) (glob)
+
+  $ cd narrow
+  $ ls
+  $ hg tracked
+  I path:dir/src/f10
+
+Making sure we have the correct set of requirements
+
+  $ cat .hg/requires
+  dotencode
+  fncache
+  generaldelta
+  narrowhg-experimental
+  revlogv1
+  store
+  treemanifest (tree !)
+
+Making sure store has the required files
+
+  $ ls .hg/store/
+  00changelog.i
+  00manifest.i
+  data
+  fncache
+  meta (tree !)
+  narrowspec
+  undo
+  undo.backupfiles
+  undo.phaseroots
+
+Checking that repository has all the required data and not broken
+
+  $ hg verify
+  checking changesets
+  checking manifests
+  checking directory manifests (tree !)
+  crosschecking files in changesets and manifests
+  checking files
+  checked 40 changesets with 1 changes to 1 files
diff --git a/mercurial/streamclone.py b/mercurial/streamclone.py
--- a/mercurial/streamclone.py
+++ b/mercurial/streamclone.py
@@ -545,10 +545,6 @@
     Returns a 3-tuple of (file count, file size, data iterator).
     """
 
-    # temporarily raise error until we add storage level logic
-    if includes or excludes:
-        raise error.Abort(_("server does not support narrow stream clones"))
-
     with repo.lock():
 
         entries = []
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -24,6 +24,20 @@
 
 parsers = policy.importmod(r'parsers')
 
+def _matchtrackedpath(path, matcher):
+    """parses a fncache entry and returns whether the entry is tracking a path
+    matched by matcher or not.
+
+    If matcher is None, returns True"""
+
+    if matcher is None:
+        return True
+    path = decodefilename(path)
+    if path.startswith('data/'):
+        return matcher(path[len('data/'):-len('.i')])
+    elif path.startswith('meta/'):
+        return matcher.visitdir(path[len('meta/'):-len('/00manifest.i')] or '.')
+
 # This avoids a collision between a file named foo and a dir named
 # foo.i or foo.d
 def _encodedir(path):
@@ -413,6 +427,8 @@
 
     def datafiles(self, matcher=None):
         for a, b, size in super(encodedstore, self).datafiles():
+            if not _matchtrackedpath(a, matcher):
+                continue
             try:
                 a = decodefilename(a)
             except KeyError:
@@ -542,6 +558,8 @@
 
     def datafiles(self, matcher=None):
         for f in sorted(self.fncache):
+            if not _matchtrackedpath(f, matcher):
+                continue
             ef = self.encode(f)
             try:
                 yield f, ef, self.getsize(ef)



To: pulkit, durin42, #hg-reviewers
Cc: mjpieters, mercurial-devel


D5139: store: introduce _matchtrackedpath() and use it to filter store files

2018-10-17 Thread pulkit (Pulkit Goyal)
pulkit created this revision.
Herald added a reviewer: durin42.
Herald added subscribers: mercurial-devel, mjpieters.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This patch introduces a function to filter store files on the basis of the
  path which they are tracking.
  
  The function assumes that entries are of two types, 'meta/*' and 'data/*',
  which means it only works with revlog-based storage and not with other
  storage backends.
  
  For the 'data/*' entries, we strip the 'data/' prefix and the '.i' or '.d'
  suffix, then pass the result to the matcher.
  
  For the 'meta/*' entries, we strip the 'meta/' prefix and the
  '/00manifest.i' or '/00manifest.d' suffix, then call matcher.visitdir() with
  the result to make sure all the parent directories are also downloaded.
  
  Since the storage filtering for narrow stream clones is implemented with this
  patch, we remove the "not implemented" error message, add some more tests, and
  add the treemanifest case to the tests too.
  
  The tests demonstrate that it works correctly.
  
  After this patch, narrow stream clones work. Narrow stream clones are a very
  important feature for large repositories whose users have a good internet
  connection: such repositories use stream clones for cloning, and a normal
  narrow clone takes more time than a full stream clone. A narrow stream
  clone will also drastically speed up clone times.
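
The prefix and suffix stripping described above can be sketched in isolation. The helper names tracked_file() and tracked_dir() are hypothetical and exist only for this illustration; the slicing expressions are the ones used in the patch. Because '.i' and '.d' have the same length, a single negative slice covers both suffixes.

```python
def tracked_file(entry):
    """Return the tracked file name for a 'data/*' store entry."""
    # len('.i') == len('.d'), so the same slice strips either suffix.
    return entry[len("data/"):-len(".i")]


def tracked_dir(entry):
    """Return the tracked directory for a 'meta/*' store entry."""
    # '/00manifest.i' and '/00manifest.d' also share one length; an empty
    # result is mapped to '.' as in the patch.
    return entry[len("meta/"):-len("/00manifest.i")] or "."


file_from_index = tracked_file("data/dir/src/f10.i")
file_from_data = tracked_file("data/dir/src/f10.d")
dir_from_index = tracked_dir("meta/dir/src/00manifest.i")
dir_from_data = tracked_dir("meta/dir/src/00manifest.d")
```

Both the index and the data file of a revlog map to the same tracked name, so they are kept or dropped together.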

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D5139

AFFECTED FILES
  mercurial/store.py
  mercurial/streamclone.py
  tests/test-narrow-clone-stream.t

CHANGE DETAILS

diff --git a/tests/test-narrow-clone-stream.t b/tests/test-narrow-clone-stream.t
--- a/tests/test-narrow-clone-stream.t
+++ b/tests/test-narrow-clone-stream.t
@@ -1,7 +1,16 @@
+#testcases tree flat
+
 Tests narrow stream clones
 
   $ . "$TESTDIR/narrow-library.sh"
 
+#if tree
+  $ cat << EOF >> $HGRCPATH
+  > [experimental]
+  > treemanifest = 1
+  > EOF
+#endif
+
 Server setup
 
   $ hg init master
@@ -34,6 +43,44 @@
 
   $ hg clone --narrow ssh://user@dummy/master narrow --noupdate --include "dir/src/f10" --stream
   streaming all changes
-  remote: abort: support for narrow stream clones is missing
-  abort: pull failed on remote
-  [255]
+  * files to transfer, * KB of data (glob)
+  transferred * KB in * seconds (* MB/sec) (glob)
+
+  $ cd narrow
+  $ ls
+  $ hg tracked
+  I path:dir/src/f10
+
+Making sure we have the correct set of requirements
+
+  $ cat .hg/requires
+  dotencode
+  fncache
+  generaldelta
+  narrowhg-experimental
+  revlogv1
+  store
+  treemanifest (tree !)
+
+Making sure store has the required files
+
+  $ ls .hg/store/
+  00changelog.i
+  00manifest.i
+  data
+  fncache
+  meta (tree !)
+  narrowspec
+  undo
+  undo.backupfiles
+  undo.phaseroots
+
+Checking that repository has all the required data and not broken
+
+  $ hg verify
+  checking changesets
+  checking manifests
+  checking directory manifests (tree !)
+  crosschecking files in changesets and manifests
+  checking files
+  checked 40 changesets with 1 changes to 1 files
diff --git a/mercurial/streamclone.py b/mercurial/streamclone.py
--- a/mercurial/streamclone.py
+++ b/mercurial/streamclone.py
@@ -545,10 +545,6 @@
     Returns a 3-tuple of (file count, file size, data iterator).
     """
 
-    # temporarily raise error until we add storage level logic
-    if includes or excludes:
-        raise error.Abort(_("support for narrow stream clones is missing"))
-
     with repo.lock():
 
         entries = []
diff --git a/mercurial/store.py b/mercurial/store.py
--- a/mercurial/store.py
+++ b/mercurial/store.py
@@ -24,6 +24,20 @@
 
 parsers = policy.importmod(r'parsers')
 
+def _matchtrackedpath(path, matcher):
+    """parses a fncache entry and returns whether the entry is tracking a path
+    matched by matcher or not.
+
+    If matcher is None, returns True"""
+
+    if matcher is None:
+        return True
+    path = decodefilename(path)
+    if path.startswith('data/'):
+        return matcher(path[len('data/'):-len('.i')])
+    elif path.startswith('meta/'):
+        return matcher.visitdir(path[len('meta/'):-len('/00manifest.i')] or '.')
+
 # This avoids a collision between a file named foo and a dir named
 # foo.i or foo.d
 def _encodedir(path):
@@ -413,6 +427,8 @@
 
     def datafiles(self, matcher=None):
         for a, b, size in super(encodedstore, self).datafiles():
+            if not _matchtrackedpath(a, matcher):
+                continue
             try:
                 a = decodefilename(a)
             except KeyError:
@@ -542,6 +558,8 @@
 
     def datafiles(self, matcher=None):
         for f in sorted(self.fncache):
+            if not _matchtrackedpath(f, matcher):
+                continue
             ef = self.encode(f)
             try:
                 yield f, ef, self.getsize(ef)



To: pulkit, durin42, #hg-reviewers
Cc: mjpieters, mercurial-devel