Re: [PATCH 2 of 5 RFC] lfs: add the '{oid}' template keyword to '{lfs_files}'
On Sun, 14 Jan 2018 23:58:49 -0500, Matt Harbison wrote:
# HG changeset patch
# User Matt Harbison
# Date 1515963382 18000
#      Sun Jan 14 15:56:22 2018 -0500
# Node ID 44ce4d93f9dce5393d0e2456ed89c7858dece71f
# Parent  47840d8f396120e9fbfe74f874abd6b34725d807
lfs: add the '{oid}' template keyword to '{lfs_files}'

The RFCs here are really only 3 and 5. But in an early version of this, I tried a couple ways to test, and didn't clean it up completely from the previous one. Maybe this is a parser problem? (Notice the unbalanced ')' outside of the quotes, which then seems to make both it and the closing '}' visible.)

diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -859,10 +859,14 @@
   oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
   size 29
   x-is-binary 0
-  $ hg --cwd convert_lfs log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}\n"}
-  0: a1
-  1: a2
-  2: a2
+  $ hg --cwd convert_lfs \
+  > log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}: {oid}\n")}\n'
+  0: a1: 5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
+  )}
+  1: a2: 5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
+  )}
+  2: a2: 876dadc86a8542f9798048f2c47f51dbf8e4359aed883e8ec80c5db825f0d943
+  )}
 
   $ grep 'lfs' convert_lfs/.hg/requires
   lfs
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1860: dispatch: handle IOError when writing to stderr
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  Previously, an attempt to write to stderr in dispatch.run() could raise an exception. This would likely be handled by Python's default exception handler, which would print the exception and exit 1.

  Code in this function is already catching IOError for stdout failures and converting to exit code 255 (-1 & 255 == 255). Why we weren't doing the same for stderr for the sake of consistency, I don't know. I do know that chg and hg diverged in behavior here (as the changed test-basic.t shows).

  After this commit, we catch I/O failure on stderr and change the exit code to 255. chg and hg now behave consistently. As a bonus, Rust hg also now passes this test.

  I'm skeptical of changing the exit code due to failures this late in the process. I think we should consider preserving the current exit code, assuming it is non-0. And we may want to preserve the exit code completely if the I/O error is EPIPE (and potentially other special error classes). There's definitely room to tweak behavior. But for now, let's at least prevent the uncaught exception.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1860

AFFECTED FILES
  mercurial/dispatch.py
  tests/test-basic.t

CHANGE DETAILS

diff --git a/tests/test-basic.t b/tests/test-basic.t
--- a/tests/test-basic.t
+++ b/tests/test-basic.t
@@ -34,15 +34,7 @@
   [255]
 #endif
 
-#if devfull no-chg
-  $ hg status >/dev/full 2>&1
-  [1]
-
-  $ hg status ENOENT 2>/dev/full
-  [1]
-#endif
-
-#if devfull chg
+#if devfull
   $ hg status >/dev/full 2>&1
   [255]
 
diff --git a/mercurial/dispatch.py b/mercurial/dispatch.py
--- a/mercurial/dispatch.py
+++ b/mercurial/dispatch.py
@@ -96,10 +96,16 @@
             err = e
             status = -1
     if util.safehasattr(req.ui, 'ferr'):
-        if err is not None and err.errno != errno.EPIPE:
-            req.ui.ferr.write('abort: %s\n' %
-                              encoding.strtolocal(err.strerror))
-        req.ui.ferr.flush()
+        try:
+            if err is not None and err.errno != errno.EPIPE:
+                req.ui.ferr.write('abort: %s\n' %
+                                  encoding.strtolocal(err.strerror))
+            req.ui.ferr.flush()
+        # There's not much we can do about an I/O error here. So (possibly)
+        # change the status code and move on.
+        except IOError:
+            status = -1
+
     sys.exit(status & 255)
 
 def _initstdio():

To: indygreg, #hg-reviewers
Cc: mercurial-devel
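
The behavior change in D1860 can be tried outside Mercurial with a small stand-alone sketch. Everything named here (`report_and_exit_code`, `FullDevice`) is hypothetical and only mirrors the shape of the patched block in dispatch.run(); it takes the stream as an argument instead of using a real ui object, and it skips the `encoding.strtolocal()` call.

```python
import errno
import io


def report_and_exit_code(ferr, err, status):
    """Write the abort message for ``err`` to ``ferr`` and return an exit code.

    Hypothetical helper, not part of Mercurial. It follows the patched
    logic: an IOError raised while writing or flushing is swallowed and
    turned into a failure status instead of escaping as a traceback.
    """
    try:
        if err is not None and err.errno != errno.EPIPE:
            ferr.write('abort: %s\n' % err.strerror)
        ferr.flush()
    except IOError:
        # stderr itself is unusable, so only the status can signal failure.
        status = -1
    # -1 & 255 == 255, so the failure above maps to exit code 255.
    return status & 255


class FullDevice(object):
    """Stand-in for a stderr redirected to /dev/full: every call fails."""

    def write(self, data):
        raise IOError(errno.ENOSPC, 'No space left on device')

    def flush(self):
        raise IOError(errno.ENOSPC, 'No space left on device')


# A healthy stream receives the message and the incoming status is kept.
healthy = io.StringIO()
code_ok = report_and_exit_code(
    healthy, IOError(errno.ENOENT, 'no such file'), 1)

# A broken stream no longer raises; the failure becomes exit code 255.
code_broken = report_and_exit_code(
    FullDevice(), IOError(errno.ENOENT, 'no such file'), 1)
```

Note that this sketch overwrites a non-zero incoming status when stderr fails, which is exactly the behavior the summary above is skeptical about.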
D1859: commandserver: restore cwd in case of exception
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  The order of the statements was also changed a bit. But it shouldn't matter.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1859

AFFECTED FILES
  mercurial/commandserver.py

CHANGE DETAILS

diff --git a/mercurial/commandserver.py b/mercurial/commandserver.py
--- a/mercurial/commandserver.py
+++ b/mercurial/commandserver.py
@@ -247,13 +247,13 @@
         req = dispatch.request(args[:], copiedui, self.repo, self.cin,
                                self.cout, self.cerr)
 
-        ret = (dispatch.dispatch(req) or 0) & 255 # might return None
-
-        # restore old cwd
-        if '--cwd' in args:
-            os.chdir(self.cwd)
-
-        self.cresult.write(struct.pack('>i', int(ret)))
+        try:
+            ret = (dispatch.dispatch(req) or 0) & 255 # might return None
+            self.cresult.write(struct.pack('>i', int(ret)))
+        finally:
+            # restore old cwd
+            if '--cwd' in args:
+                os.chdir(self.cwd)
 
     def getencoding(self):
         """ writes the current encoding to the result channel """

To: indygreg, #hg-reviewers
Cc: mercurial-devel
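
The try/finally pattern from D1859 can be sketched in isolation. `run_in` and `boom` are hypothetical names used only for illustration; the real code restores `self.cwd` on the command server object rather than a saved `os.getcwd()`.

```python
import os
import tempfile


def run_in(cwd, command):
    """Run ``command`` inside ``cwd`` and always restore the old directory.

    Hypothetical helper, not the commandserver API. The ``finally``
    clause runs whether ``command`` returns or raises, which is the
    point of the patch.
    """
    oldcwd = os.getcwd()
    os.chdir(cwd)
    try:
        return (command() or 0) & 255  # command might return None
    finally:
        os.chdir(oldcwd)


def boom():
    raise RuntimeError('command failed')


before = os.getcwd()
try:
    run_in(tempfile.gettempdir(), boom)
except RuntimeError:
    pass
after = os.getcwd()  # same directory as ``before`` despite the exception
```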
[PATCH 4 of 5 RFC] lfs: move the tracked file function creation to a method
# HG changeset patch
# User Matt Harbison
# Date 1515909885 18000
#      Sun Jan 14 01:04:45 2018 -0500
# Node ID 7e5d513b38c169856b51b6207d992abbef10d2d5
# Parent  197a87c27995bc3c59f0e22165ff60c7d43d2efc
lfs: move the tracked file function creation to a method

Once a committable file format for tracked config is agreed upon, I can't see any reason to have a config based way to control this. (Other than convert. That will be necessary to override the file when converting to normal files. Also, converting to lfs needs this if not splicing the file in at the beginning. So maybe the existing config option should be `convert` specific.)

Looking to hgeol for precedent, it looks like policy that affects how items are stored is handled only by the tracked file, while policy that affects the checkout can be handled by either a user config or the tracked file (but the latter takes precedence). We probably need a transition period, so this transition policy can be controlled by the function. Additionally, it provides a place for convert to wrap to override the file based config.

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -123,15 +123,7 @@
     if not repo.local():
         return
 
-    trackspec = repo.ui.config('lfs', 'track')
-
-    # deprecated config: lfs.threshold
-    threshold = repo.ui.configbytes('lfs', 'threshold')
-    if threshold:
-        fileset.parse(trackspec)  # make sure syntax errors are confined
-        trackspec = "(%s) | size('>%d')" % (trackspec, threshold)
-
-    repo.svfs.options['lfstrack'] = minifileset.compile(trackspec)
+    repo.svfs.options['lfstrack'] = _trackedmatcher(repo)
     repo.svfs.lfslocalblobstore = blobstore.local(repo)
     repo.svfs.lfsremoteblobstore = blobstore.remote(repo)
 
@@ -157,6 +149,19 @@
         ui.setconfig('hooks', 'commit.lfs', checkrequireslfs, 'lfs')
         ui.setconfig('hooks', 'pretxnchangegroup.lfs', checkrequireslfs, 'lfs')
 
+def _trackedmatcher(repo):
+    """Return a function (path, size) -> bool indicating whether or not to
+    track a given file with lfs."""
+    trackspec = repo.ui.config('lfs', 'track')
+
+    # deprecated config: lfs.threshold
+    threshold = repo.ui.configbytes('lfs', 'threshold')
+    if threshold:
+        fileset.parse(trackspec)  # make sure syntax errors are confined
+        trackspec = "(%s) | size('>%d')" % (trackspec, threshold)
+
+    return minifileset.compile(trackspec)
+
 def wrapfilelog(filelog):
     wrapfunction = extensions.wrapfunction
 
[PATCH 5 of 5 RFC] lfs: control tracked file selection via a tracked file
# HG changeset patch
# User Matt Harbison
# Date 1515971571 18000
#      Sun Jan 14 18:12:51 2018 -0500
# Node ID 3b651cef0884ad8108a19c6354d53103e378e12e
# Parent  7e5d513b38c169856b51b6207d992abbef10d2d5
lfs: control tracked file selection via a tracked file

Since the lfs tracking policy can dramatically affect the repository, it makes more sense to have the policy file checked in, than to rely on all developers configuring their .hgrc properly. The inspiration for this is the .hgeol file. The configuration lives under '[track]', so that other things can be added in the future. Eventually, the config option should be limited to `convert` only.

I'm sure that this is going to take a few iterations, so I'm ignoring the documentation for now. The tests have a bit of commentary, but the general idea is to put the most specific config first. Walk that list, and when the first key matches, the value on the right side becomes the deciding expression. That means a general catchall rule can be added at the end. See the test for an example.

My initial thought was to read the file and change each "key = value" line into "((key) & (value))", so that each line could be ORed together, and make a single pass at compiling. Unfortunately, that prevents exclusions if there's a catchall rule. Consider what happens to a large *.c file here:

  [track]
  **.c = none()
  ** = size('>1MB')

  # ((**.c) & (none())) | ((**) & (size('>1MB'))) => anything > 1MB

I also thought about having separate [include] and [exclude] sections. But that just seems to open things up to user mistakes. Consider:

  [include]
  **.zip = all()
  **.php = size('>10MB')

  [exclude]
  **.zip = all()   # Who wins?
  **.php = none()  # Effectively 'all()' (i.e. nothing excluded), or >10MB ?

Therefore, it just compiles each key and value separately, and walks until the key matches something. I'm not sure how to enforce just file patterns on LHS without leaking knowledge about the minifileset here. That means this will allow odd looking lines like this:

  [track]
  **.c | **.txt = none()

But that's also fewer lines to compile, so slightly more efficient? Some things like 'none()' won't work as expected on LHS though, because that won't match, so that line is skipped.

Jun previously expressed concern about efficiency when scaling to large repos, so I tried avoiding 'repo[None]'. (localrepo.commit() gets repo[None] already, but doesn't tie it to the workingcommitctx used here.) Therefore, I looked at the passed context for 'AMR' status. But that doesn't help with the normal case where the policy file is tracked, but clean. That requires looking up p1() to read the file. I don't see any way to get the content of one file without first creating the full parent context.

I'm a bit puzzled by the way eol handles this. It loads the file for p1() in a preupdate hook (what if the update fails?). It also directly reads the filesystem in cases where it wants to examine 'repo[None]' (what if the file is deleted or not tracked?).

There's more for me to figure out here, but I wanted to float this trial balloon now, in case I'm off in the weeds. It would be nice to land this functionality before the freeze.
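
The difference between the "first matching key decides" walk and the rejected "OR every line together" approach can be shown with a small sketch. This is not the lfs extension's code: it assumes `fnmatch` globs as a stand-in for minifileset patterns, and `first_match`, `ored`, `none` and `larger_than` are hypothetical names.

```python
from fnmatch import fnmatch

MB = 1024 * 1024


def none(path, size):
    """Right-hand side that never tracks the file."""
    return False


def larger_than(limit):
    """Right-hand side that tracks files bigger than ``limit`` bytes."""
    return lambda path, size: size > limit


# Most specific rule first, catchall last, as in the [track] example.
RULES = [
    ('*.c', none),
    ('*', larger_than(1 * MB)),
]


def first_match(rules, path, size):
    """Walk the rules; the first key that matches decides."""
    for pattern, decide in rules:
        if fnmatch(path, pattern):
            return decide(path, size)
    return False


def ored(rules, path, size):
    """The rejected approach: ((key) & (value)) ORed over all lines."""
    return any(fnmatch(path, pattern) and decide(path, size)
               for pattern, decide in rules)


big_c_first = first_match(RULES, 'src/big.c', 5 * MB)  # excluded by *.c
big_c_ored = ored(RULES, 'src/big.c', 5 * MB)          # caught by catchall
```

With the walk, the `*.c` exclusion wins for a large C file; with the ORed form, the catchall rule pulls the same file back in, which is the problem described above.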
diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py --- a/hgext/lfs/__init__.py +++ b/hgext/lfs/__init__.py @@ -52,6 +52,7 @@ from mercurial import ( bundle2, changegroup, +config, context, exchange, extensions, @@ -123,13 +124,21 @@ if not repo.local(): return -repo.svfs.options['lfstrack'] = _trackedmatcher(repo) repo.svfs.lfslocalblobstore = blobstore.local(repo) repo.svfs.lfsremoteblobstore = blobstore.remote(repo) # Push hook repo.prepushoutgoinghooks.add('lfs', wrapper.prepush) +class lfsrepo(repo.__class__): +@localrepo.unfilteredmethod +def commitctx(self, ctx, error=False): +# TODO: Ensure this gets called for import +repo.svfs.options['lfstrack'] = _trackedmatcher(self, ctx) +return super(lfsrepo, self).commitctx(ctx, error) + +repo.__class__ = lfsrepo + if 'lfs' not in repo.requirements: def checkrequireslfs(ui, repo, **kwargs): if 'lfs' not in repo.requirements: @@ -149,18 +158,52 @@ ui.setconfig('hooks', 'commit.lfs', checkrequireslfs, 'lfs') ui.setconfig('hooks', 'pretxnchangegroup.lfs', checkrequireslfs, 'lfs') -def _trackedmatcher(repo): +def _trackedmatcher(repo, ctx): """Return a function (path, size) -> bool indicating whether or not to track a given file with lfs.""" -trackspec = repo.ui.config('lfs', 'track') +data = '' + +if '.hglfs' in ctx.added() or '.hglfs' in ctx.modified(): +data = ctx['.hglfs'].data() +elif '.hglfs' not in ctx.removed(): +p1 = repo['.'] + +if '.hglfs' not in p1: +# No '.hglfs' in wdir or in parent. Fallback to config +
[PATCH 3 of 5 RFC] lfs: add the '{raw}' template keyword to '{lfs_files}'
# HG changeset patch
# User Matt Harbison
# Date 1515967224 18000
#      Sun Jan 14 17:00:24 2018 -0500
# Node ID 197a87c27995bc3c59f0e22165ff60c7d43d2efc
# Parent  44ce4d93f9dce5393d0e2456ed89c7858dece71f
lfs: add the '{raw}' template keyword to '{lfs_files}'

Even though it is (probably) weird to have multiline output from a keyword, something similar to this is useful as the public interface to dump the raw pointer content. I still haven't figured out how to use `hg debugdata` in a non trivial repo, and that will just be a point of aggravation when debugging a problem.

One problem will be that {lfs_files} only finds files that were added or modified in a revision. So maybe this is better on cat? But I can't think of any extensions that extend the generic templater like this, to figure out how to do that.

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -231,6 +231,7 @@
     makemap = lambda v: {
         'lfs_file': v,
         'oid': pointers[v].oid(),
+        'raw': pointers[v].rawtext(),
     }
 
     # TODO: make the separator ', '?
diff --git a/hgext/lfs/pointer.py b/hgext/lfs/pointer.py
--- a/hgext/lfs/pointer.py
+++ b/hgext/lfs/pointer.py
@@ -24,11 +24,14 @@
     def __init__(self, *args, **kwargs):
         self['version'] = self.VERSION
         super(gitlfspointer, self).__init__(*args, **kwargs)
+        self._rawtext = ''
 
     @classmethod
     def deserialize(cls, text):
         try:
-            return cls(l.split(' ', 1) for l in text.splitlines()).validate()
+            p = cls(l.split(' ', 1) for l in text.splitlines()).validate()
+            p._rawtext = text # Exclude from dict so it doesn't serialize
+            return p
         except ValueError: # l.split returns 1 item instead of 2
             raise InvalidPointer(_('cannot parse git-lfs text: %r') % text)
 
@@ -43,6 +46,11 @@
     def size(self):
         return int(self['size'])
 
+    def rawtext(self):
+        """The raw text read from the pointer file to create object. This will
+        be empty for objects instantiated from key/values for serialization."""
+        return self._rawtext
+
 # regular expressions used by _validate
 # see https://github.com/git-lfs/git-lfs/blob/master/docs/spec.md
 _keyre = re.compile(r'\A[a-z0-9.-]+\Z')
diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -859,6 +859,12 @@
   oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
   size 29
   x-is-binary 0
+  $ hg --cwd convert_lfs log -r 0 -T '{lfs_files % "{raw}\n"}'
+  version https://git-lfs.github.com/spec/v1
+  oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
+  size 29
+  x-is-binary 0
+  
   $ hg --cwd convert_lfs \
   > log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}: {oid}\n"}'
   0: a1: 5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
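
The idea of keeping the raw pointer text next to the parsed dict, without letting it leak into the serialized keys, can be sketched without Mercurial. The `Pointer` class below is a hypothetical, stripped-down stand-in for `gitlfspointer`: it has no validation, and its `oid()` simply drops everything up to the last ':' on the assumption that this is how the 'sha256:' prefix is skipped.

```python
class Pointer(dict):
    """Minimal stand-in for a git-lfs pointer (illustration only)."""

    def __init__(self, *args, **kwargs):
        super(Pointer, self).__init__(*args, **kwargs)
        # Stored as an attribute, not a dict key, so it is never serialized.
        self._rawtext = ''

    @classmethod
    def deserialize(cls, text):
        # Each line is "key value"; a line without a space raises ValueError.
        p = cls(line.split(' ', 1) for line in text.splitlines())
        p._rawtext = text
        return p

    def oid(self):
        return self['oid'].split(':')[-1]

    def rawtext(self):
        """Raw text the pointer was parsed from; '' if built from keys."""
        return self._rawtext


TEXT = (
    'version https://git-lfs.github.com/spec/v1\n'
    'oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024\n'
    'size 29\n'
    'x-is-binary 0\n'
)

parsed = Pointer.deserialize(TEXT)
built = Pointer(size='29')
```

Here `parsed.rawtext()` returns the text exactly as read, while `built.rawtext()` stays empty because no pointer file was involved.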
[PATCH 1 of 5 RFC] lfs: convert '{lfs_files}' keyword to a hybrid list
# HG changeset patch
# User Matt Harbison
# Date 1515962350 18000
#      Sun Jan 14 15:39:10 2018 -0500
# Node ID 47840d8f396120e9fbfe74f874abd6b34725d807
# Parent  3f5167faeb5d1f28939eaf2c2825bb65f67a2458
lfs: convert '{lfs_files}' keyword to a hybrid list

This will allow more attributes about the file to be queried.

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -61,9 +61,11 @@
     localrepo,
     minifileset,
     node,
+    pycompat,
     registrar,
     revlog,
     scmutil,
+    templatekw,
     upgrade,
     vfs as vfsmod,
     wireproto,
@@ -221,8 +223,18 @@
 @templatekeyword('lfs_files')
 def lfsfiles(repo, ctx, **args):
     """List of strings. LFS files added or modified by the changeset."""
+    args = pycompat.byteskwargs(args)
+
     pointers = wrapper.pointersfromctx(ctx) # {path: pointer}
-    return sorted(pointers.keys())
+    files = sorted(pointers.keys())
+
+    makemap = lambda v: {
+        'lfs_file': v,
+    }
+
+    # TODO: make the separator ', '?
+    f = templatekw._showlist('lfs_file', files, args)
+    return templatekw._hybrid(f, files, makemap, pycompat.identity)
 
 @command('debuglfsupload',
          [('r', 'rev', [], _('upload large files introduced by REV'))])
diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -859,6 +859,11 @@
   oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
   size 29
   x-is-binary 0
+  $ hg --cwd convert_lfs log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}\n"}'
+  0: a1
+  1: a2
+  2: a2
+
   $ grep 'lfs' convert_lfs/.hg/requires
   lfs
 
[PATCH 2 of 5 RFC] lfs: add the '{oid}' template keyword to '{lfs_files}'
# HG changeset patch
# User Matt Harbison
# Date 1515963382 18000
#      Sun Jan 14 15:56:22 2018 -0500
# Node ID 44ce4d93f9dce5393d0e2456ed89c7858dece71f
# Parent  47840d8f396120e9fbfe74f874abd6b34725d807
lfs: add the '{oid}' template keyword to '{lfs_files}'

The 'sha256:' prefix is skipped because this seems like the most convenient way to consume it. Maybe we should also add a '{oid_type}' keyword? Then again, that can be added in the future if a different algorithm is supported.

diff --git a/hgext/lfs/__init__.py b/hgext/lfs/__init__.py
--- a/hgext/lfs/__init__.py
+++ b/hgext/lfs/__init__.py
@@ -230,6 +230,7 @@
 
     makemap = lambda v: {
         'lfs_file': v,
+        'oid': pointers[v].oid(),
     }
 
     # TODO: make the separator ', '?
diff --git a/tests/test-lfs.t b/tests/test-lfs.t
--- a/tests/test-lfs.t
+++ b/tests/test-lfs.t
@@ -859,10 +859,11 @@
   oid sha256:5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
   size 29
   x-is-binary 0
-  $ hg --cwd convert_lfs log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}\n"}'
-  0: a1
-  1: a2
-  2: a2
+  $ hg --cwd convert_lfs \
+  > log -r 'all()' -T '{rev}: {lfs_files % "{lfs_file}: {oid}\n"}'
+  0: a1: 5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
+  1: a2: 5bb8341bee63b3649f222b2215bde37322bea075a30575aa685d8f8d21c77024
+  2: a2: 876dadc86a8542f9798048f2c47f51dbf8e4359aed883e8ec80c5db825f0d943
 
   $ grep 'lfs' convert_lfs/.hg/requires
   lfs
D1856: wireproto: server-side support for pullbundles
joerg.sonnenberger updated this revision to Diff 4826. REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1856?vs=4816=4826 REVISION DETAIL https://phab.mercurial-scm.org/D1856 AFFECTED FILES mercurial/configitems.py mercurial/help/config.txt mercurial/wireproto.py tests/test-pull-r.t CHANGE DETAILS diff --git a/tests/test-pull-r.t b/tests/test-pull-r.t --- a/tests/test-pull-r.t +++ b/tests/test-pull-r.t @@ -145,3 +145,59 @@ $ cd .. $ killdaemons.py + +Test pullbundle functionality + + $ cd repo + $ cat < .hg/hgrc + > [server] + > pullbundle = True + > EOF + $ hg bundle --base null -r 0 .hg/0.hg + 1 changesets found + $ hg bundle --base 0 -r 1 .hg/1.hg + 1 changesets found + $ hg bundle --base 1 -r 2 .hg/2.hg + 1 changesets found + $ cat < .hg/pullbundles.manifest + > 2.hg heads=effea6de0384e684f44435651cb7bd70b8735bd4 bases=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > 1.hg heads=ed1b79f46b9a29f5a6efa59cf12fcfca43bead5a bases=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > 0.hg heads=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > EOF + $ hg serve --debug -p $HGPORT2 --pid-file=../repo.pid > ../repo-server.txt 2>&1 & + $ while ! grep listening ../repo-server.txt > /dev/null; do sleep 1; done + $ cat ../repo.pid >> $DAEMON_PIDS + $ cd .. 
+ $ hg clone -r 0 http://localhost:$HGPORT2/ repo.pullbundle + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + new changesets bbd179dfa0a7 + updating to branch default + 1 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ cd repo.pullbundle + $ hg pull -r 1 + pulling from http://localhost:$HGPORT2/ + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + new changesets ed1b79f46b9a + (run 'hg update' to get a working copy) + $ hg pull -r 2 + pulling from http://localhost:$HGPORT2/ + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files (+1 heads) + new changesets effea6de0384 + (run 'hg heads' to see heads, 'hg merge' to merge) + $ cd .. + $ killdaemons.py + $ grep 'sending pullbundle ' repo-server.txt + sending pullbundle "0.hg" + sending pullbundle "1.hg" + sending pullbundle "2.hg" diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py --- a/mercurial/wireproto.py +++ b/mercurial/wireproto.py @@ -831,6 +831,64 @@ opts = options('debugwireargs', ['three', 'four'], others) return repo.debugwireargs(one, two, **pycompat.strkwargs(opts)) +def find_pullbundle(repo, opts, clheads, heads, common): +"""Return a file object for the first matching pullbundle. + +Pullbundles are specified in .hg/pullbundles.manifest similar to +clonebundles. +For each entry, the bundle specification is checked for compatibility: +- Client features vs the BUNDLESPEC. +- Revisions shared with the clients vs base revisions of the bundle. + A bundle can be applied only if all its base revisions are known by + the client. +- At least one leaf of the bundle's DAG is missing on the client. +- Every leaf of the bundle's DAG is part of node set the client wants. + E.g. do not send a bundle of all changes if the client wants only + one specific branch of many. 
+""" +def decodehexstring(s): +return set([h.decode('hex') for h in s.split(';')]) + +manifest = repo.vfs.tryread('pullbundles.manifest') +if not manifest: +return None +res = exchange.parseclonebundlesmanifest(repo, manifest) +res = exchange.filterclonebundleentries(repo, res) +if not res: +return None +cl = repo.changelog +heads_anc = cl.ancestors([cl.rev(rev) for rev in heads], inclusive=True) +common_anc = cl.ancestors([cl.rev(rev) for rev in common], inclusive=True) +for entry in res: +if 'heads' in entry: +try: +bundle_heads = decodehexstring(entry['heads']) +except TypeError: +# Bad heads entry +continue +if bundle_heads.issubset(common): +continue # Nothing new +if all(cl.rev(rev) in common_anc for rev in bundle_heads): +continue # Still nothing new +if any(cl.rev(rev) not in heads_anc for rev in bundle_heads): +continue +if 'bases' in entry: +try: +bundle_bases = decodehexstring(entry['bases']) +except TypeError: +# Bad bases entry +continue +if not all(cl.rev(rev) in common_anc for rev in bundle_bases): +continue +path = entry['URL'] +repo.ui.debug('sending pullbundle "%s"\n' % path) +try: +return repo.vfs.open(path) +except IOError: +
D1858: tests: make hg frame optional
mharbison72 added a comment.

  I haven't been paying attention to the rust threads, but is there an hghave test for it? I don't see one locally. If so, then all that should be needed is appending ' (no-rust !)'. (Note the spaces.) I think this works for #test-cases too.

  Note that this is *not* an "if and only if" test. The listed feature(s) are evaluated: when the result is true ('(true !)'), the line is required, and when it is false ('(false !)'), the line reverts to optional, like (?). I had a patch to make it IFF, because that's how it seems to be used in practice. But somehow I came up with a test that succeeded outside test-run-tests.t, but failed inside it.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1858

To: indygreg, #hg-reviewers
Cc: mharbison72, mercurial-devel
D1856: wireproto: server-side support for pullbundles
joerg.sonnenberger added a comment.

  > I wish we could find a way to send multiple, inline, pre-generated bundles in one response. However, the existing design of compression within the bundle2 format doesn't easily facilitate this. We should think long and hard about whether to implement this feature as partial pull or extend bundle2 to allow us to do nice things and avoid the extra client-server roundtrips.

  I was looking at both parts initially. The existing bundle2 compression doesn't work very well for this purpose as it doesn't allow switching compression in the middle. It might be possible to hack something together with a flag to `hg bundle` to skip the final zero record and depend on the ability of all currently supported compression engines to handle concatenated streams, but that feels like a crude hack. Doing extra client-server roundtrips is a bit annoying and costly, but it also makes the functionality a bit more forgiving of mistakes in the bundle specification. Unlike clonebundles, it requires the records to be much more precise and we certainly don't want to parse the bundles themselves over and over again. It might be nice for some future bundle version to allow easy access to the roots and leaves of the DAG.

  > Speaking of compression, we probably need some more code around e.g. the mismatch between on-disk compression and wire compression. There is a potential for local bundles to be stored uncompressed and for the server to compress this data as part of streaming it out. That involves CPU. We still come out ahead because it avoids having to generate the bundle content from revlogs. But it isn't as ideal as streaming an already compressed bundle over the wire. We may want to open the bundle2 files and look at their compression info and make sure we don't do stupid things. It should be fast to parse the bundle2 header to obtain this knowledge.

  The bundle will be sent without wire compression if supported.
  A v1 HTTP client will still get the basic zlib framing, but that's not really avoidable. Otherwise, BUNDLESPEC is supposed to make sure that the client knows how to deal with the bundle. The idea is explicitly that the administrator has a free choice over the compression format and the data is sent as is. Only crypto overhead should apply.

INLINE COMMENTS

> indygreg wrote in wireproto.py:854-862
> I worry about performance issues here. You were making some noise about this
> on IRC. So it sounds like it is on your radar.
>
> I'm just not sure how to make things more efficient. I just know that doing
> so many DAG tests feels like it will lead to performance problems for repos
> with hundreds of thousands or millions of changesets.
>
> But we shouldn't let performance issues derail this feature. Pull bundles
> have the potential to offload tens of seconds of CPU time from the server. So
> even if DAG operations consume a few seconds of CPU, we come out ahead. It
> would be nice to get that down to 100's of milliseconds at most though. But
> this feels like the territory of follow-up work, possibly involving caching
> or more efficient stores of which revisions are in which bundles.

  As I wrote in the comments on the client side, it seems to be OK now. The heavy lifting is the initial computation of the ancestor sets outside the loop, which is in the worst case an enumeration of all local revisions. That by itself doesn't seem to result in a big performance difference. The rest of the loop depends mostly on the number of roots and leaves of the bundle DAGs. My test case had a total of ~800 and the impact on normal pull operations was in the noise.

> indygreg wrote in wireproto.py:874-877
> Using a `URL` field for a vfs path (which doesn't have nor handle protocol
> schemes - e.g. `file://` IIRC) feels wrong. I think the entry in the
> manifest should be `PATH`.

  The URL field is synthesized by `parseclonebundlesmanifest` for the first entry. It isn't tagged in the actual file. As long as the format of the files is otherwise the same, I'd just keep it identical.

> indygreg wrote in wireproto.py:918
> I'd consider enabling this feature by default. Checking for the presence of a
> `pullbundles.manifest` should be a cheap operation. Especially when you
> consider that serving a bundle will open likely dozens of revlog files.
>
> Another idea to consider is to tie this feature into the `clonebundles.py`
> extension. We already have extensive documentation in that extension for
> offloading server load via pre-generated bundles. I view this feature as a
> close sibling. I think they could live under the same umbrella. But because
> "pull bundles" are part of the `getbundle` wire protocol command, the bulk of
> the server code needs to remain in core.

  The bigger question is whether it should have an associated bundle or protocol capability. The client needs to be able to deal with the partial replies. If we go that way, I
D1858: tests: make hg frame optional
indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  When `hg` is a Rust binary, the `hg` frame doesn't exist because an `hg` Python script doesn't exist. This commit updates expected test output to make the `hg` frame optional.

  There /might/ be a way to do this more accurately with the "(feature !)" syntax in .t files. However, I poked at it for a few minutes and couldn't get it to work. Worst case with using (?) is we drop the frame from output for Python `hg`. The `hg` frame isn't terribly important. So the worst case doesn't feel that bad. If someone wants to enlighten me on how to use "(feature !)" for optional output based on hghave features, I'd be more than willing to update this.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1858

AFFECTED FILES
  tests/test-devel-warnings.t

CHANGE DETAILS

diff --git a/tests/test-devel-warnings.t b/tests/test-devel-warnings.t
--- a/tests/test-devel-warnings.t
+++ b/tests/test-devel-warnings.t
@@ -99,7 +99,7 @@
 #if no-chg
   $ hg buggylocking --traceback
   devel-warn: "wlock" acquired after "lock" at:
-   */hg:* in <module> (glob)
+   */hg:* in <module> (glob) (?)
    */mercurial/dispatch.py:* in run (glob)
    */mercurial/dispatch.py:* in dispatch (glob)
    */mercurial/dispatch.py:* in _runcatch (glob)
@@ -115,7 +115,7 @@
 #else
   $ hg buggylocking --traceback
   devel-warn: "wlock" acquired after "lock" at:
-   */hg:* in <module> (glob)
+   */hg:* in <module> (glob) (?)
    */mercurial/dispatch.py:* in run (glob)
    */mercurial/dispatch.py:* in dispatch (glob)
    */mercurial/dispatch.py:* in _runcatch (glob)
@@ -177,7 +177,7 @@
   $ hg oldanddeprecated --traceback
   devel-warn: foorbar is deprecated, go shopping
   (compatibility will be dropped after Mercurial-42.1337, update your code.) at:
-   */hg:* in <module> (glob)
+   */hg:* in <module> (glob) (?)
    */mercurial/dispatch.py:* in run (glob)
    */mercurial/dispatch.py:* in dispatch (glob)
    */mercurial/dispatch.py:* in _runcatch (glob)
@@ -238,7 +238,7 @@
   1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> oldanddeprecated --traceback
   1970/01/01 00:00:00 bob @cb9a9f314b8b07ba71012fcdbc544b5a4d82ff5b (5000)> devel-warn: foorbar is deprecated, go shopping
   (compatibility will be dropped after Mercurial-42.1337, update your code.) at:
-   */hg:* in <module> (glob)
+   */hg:* in <module> (glob) (?)
    */mercurial/dispatch.py:* in run (glob)
    */mercurial/dispatch.py:* in dispatch (glob)
    */mercurial/dispatch.py:* in _runcatch (glob)

To: indygreg, #hg-reviewers
Cc: mercurial-devel
D1856: wireproto: server-side support for pullbundles
indygreg added a comment.

  Another idea to consider is storing changegroup data or bundle2 parts on disk instead of full bundles. Then we could stream multiple changegroup parts into a larger bundle2 payload. That does mean we'd have to compress outgoing data over the wire, so it is not ideal if minimizing CPU usage is your goal. But we still come out ahead by not having to generate the changegroup data.

  This would require a mechanism to generate files of that format, however. Not a huge amount of work. But still work.

  I'm just not sure what the sweet spot for this feature is. Is it enough to eliminate changegroup generation overhead? Or are we striving for ~0 CPU usage to service `hg pull` operations (like what clone bundles achieve)?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1856

To: joerg.sonnenberger, #hg-reviewers, indygreg
Cc: indygreg, durin42, mercurial-devel
D1857: pull: re-run discovery and pullbundle2 if server didn't send all heads
joerg.sonnenberger updated this revision to Diff 4824. REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1857?vs=4813=4824 REVISION DETAIL https://phab.mercurial-scm.org/D1857 AFFECTED FILES mercurial/exchange.py mercurial/wireproto.py tests/test-pull-r.t CHANGE DETAILS diff --git a/tests/test-pull-r.t b/tests/test-pull-r.t --- a/tests/test-pull-r.t +++ b/tests/test-pull-r.t @@ -201,3 +201,40 @@ sending pullbundle "0.hg" sending pullbundle "1.hg" sending pullbundle "2.hg" + +Test pullbundle functionality for incremental pulls + + $ cd repo + $ hg serve --debug -p $HGPORT2 --pid-file=../repo.pid > ../repo-server.txt 2>&1 & + $ while ! grep listening ../repo-server.txt > /dev/null; do sleep 1; done + $ cat ../repo.pid >> $DAEMON_PIDS + $ cd .. + $ hg clone http://localhost:$HGPORT2/ repo.pullbundle2 + requesting all changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files (+1 heads) + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + new changesets bbd179dfa0a7:66e3ba28d0d7 + updating to branch default + 3 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ killdaemons.py + $ grep 'sending pullbundle ' repo-server.txt + sending pullbundle "0.hg" + sending pullbundle "2.hg" + sending pullbundle "1.hg" diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py --- a/mercurial/wireproto.py +++ b/mercurial/wireproto.py @@ -915,8 +915,7 @@ common = set(opts.get('common', set())) common.discard(nullid) -if repo.ui.configbool('server', 'pullbundle') and - self.capable('partial-pull'): +if repo.ui.configbool('server', 
'pullbundle'): # Check if a pre-built bundle covers this request. bundle = find_pullbundle(repo, opts, clheads, heads, common) if bundle: diff --git a/mercurial/exchange.py b/mercurial/exchange.py --- a/mercurial/exchange.py +++ b/mercurial/exchange.py @@ -1351,9 +1351,24 @@ # before discovery to avoid extra work. _maybeapplyclonebundle(pullop) streamclone.maybeperformlegacystreamclone(pullop) -_pulldiscovery(pullop) -if pullop.canusebundle2: +while True: +_pulldiscovery(pullop) +if not pullop.canusebundle2: +break _pullbundle2(pullop) +# The server may send a partial reply, i.e. when inlining +# pre-computed bundles. In that case, re-run the discovery +# phase and bundle again. There are two indicators that the +# process is finished: +# - no changes have been received (cgresult) +# - all remote heads are known locally +# The head check must use the unfiltered view as obsoletion +# markers can hide heads. +if not pullop.cgresult: +break +unficl = repo.unfiltered().changelog +if all(unficl.hasnode(n) for n in pullop.rheads): +break _pullchangeset(pullop) _pullphase(pullop) _pullbookmarks(pullop) To: joerg.sonnenberger, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1856: wireproto: server-side support for pullbundles
indygreg added subscribers: durin42, indygreg. indygreg requested changes to this revision. indygreg added a comment. This revision now requires changes to proceed. This patch needs a lot of work. But I'm very supportive of the feature and the preliminary implementation! I wish we could find a way to send multiple, inline, pre-generated bundles in one response. However, the existing design of compression within the bundle2 format doesn't easily facilitate this. We should think long and hard about whether to implement this feature as //partial pull// or extend bundle2 to allow us to do nice things and avoid the extra client-server roundtrips. Speaking of compression, we probably need some more code around e.g. the mismatch between on-disk compression and wire compression. There is a potential for local bundles to be stored uncompressed and for the server to compress this data as part of streaming it out. That involves CPU. We still come out ahead because it avoids having to generate the bundle content from revlogs. But it isn't as ideal as streaming an already compressed bundle over the wire. We may want to open the bundle2 files and look at their compression info and make sure we don't do stupid things. It should be fast to parse the bundle2 header to obtain this knowledge. My biggest concern with the architecture of this feature is the multiple roundtrips. I **really** wish we could stream multiple bundles off disk to the wire with no decompression/compression involved. That would mean storing compressed bundles on disk. But this would require some additional bundle2 magic. The existing solution is simple and elegant. I do like that. I'd very much like to get the opinion of someone like @durin42 (who also likes designing protocols). INLINE COMMENTS > wireproto.py:834 > > +def find_pullbundle(repo, opts, clheads, heads, common): > +def decodehexstring(s): This function needs a docstring. 
And the expected behavior of this function needs to be documented in the docstring and/or inline as comments. > wireproto.py:836 > +def decodehexstring(s): > +return set([h.decode('hex') for h in s.split(':')]) > + `:` as a delimiter for nodes doesn't feel right because `:` has meaning for revsets. I would use `,` or `;`. That being said, it may make the manifest format slightly less human friendly because URI encoding may come into play. That should be fine though: we already escape values for `BUNDLESPEC` when using `packed1` bundles. > wireproto.py:838 > + > +manifest = repo.vfs.tryread('pullbundles.manifest') > +res = exchange.parseclonebundlesmanifest(repo, manifest) I think there should be an `if not manifest: return` here to handle the common case. > wireproto.py:848 > +for entry in res: > +if 'heads' in entry: > +try: We'll need documentation of the fields in the manifest somewhere. If we combine this feature with the `clonebundles.py` extension, that seems like the logical place to document things. > wireproto.py:854-862 > +if len(bundle_heads) > len(heads): > +# Client wants less heads than the bundle contains > +continue > +if bundle_heads.issubset(common): > +continue # Nothing new > +if all(cl.rev(rev) in common_anc for rev in bundle_heads): > +continue # Still nothing new I worry about performance issues here. You were making some noise about this on IRC. So it sounds like it is on your radar. I'm just not sure how to make things more efficient. I just know that doing so many DAG tests feels like it will lead to performance problems for repos with hundreds of thousands or millions of changesets. But we shouldn't let performance issues derail this feature. Pull bundles have the potential to offload tens of seconds of CPU time from the server. So even if DAG operations consume a few seconds of CPU, we come out ahead. It would be nice to get that down to 100's of milliseconds at most though. 
But this feels like the territory of follow-up work, possibly involving caching or more efficient stores of which revisions are in which bundles. > wireproto.py:874-877 > +path = entry['URL'] > +repo.ui.debug('sending pullbundle "%s"\n' % path) > +try: > +return repo.vfs.open(path) Using a `URL` field for a vfs path (which doesn't have nor handle protocol schemes - e.g. `file://`` IIRC) feels wrong. I think the entry in the manifest should be `PATH`. > wireproto.py:918 > + > +if repo.ui.configbool('server', 'pullbundle') and > + self.capable('partial-pull'): I'd consider enabling this feature by default. Checking for the presence of a `pullbundles.manifest` should be a cheap operation. Especially when you consider that serving a bundle will open likely dozens of revlog files. Another idea to consider is to tie this feature into the `clonebundles.py`
D1857: pull: re-run discovery and pullbundle2 if server didn't send all heads
indygreg requested changes to this revision.
indygreg added a comment.
This revision now requires changes to proceed.

  I'm generally in favor of this functionality. It enables some interesting server features (such as pullbundles).

  This review needs to get hooked up to the same Phabricator "stack" as the pullbundles patch(es). Using `hg phabsend A::B` should do that.

  I'd like to see a test around a server not sending changegroup data. I /think/ the existing code will abort the `while True` loop in this case. But I don't fully understand when various attributes on `pullop` are updated.

  I'd also feel more comfortable about things if there were an explicit check in the loop that the tip of changelog increased if revisions were requested. I'm worried about getting into an infinite loop due to a misbehaving server. I /think/ I'd like to see the establishment of a //contract// that if revision data is requested, the server **MUST** respond with revision data.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1857

To: joerg.sonnenberger, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel
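The progress check requested in this review could look roughly like the sketch below. It is not the code from D1857: `pull_until_complete`, `fetch_round`, and `max_rounds` are hypothetical names, and plain sets of node names stand in for the changelog. The only point is that each round must either make all remote heads known or add at least one revision, otherwise the loop aborts instead of spinning.

```python
def pull_until_complete(fetch_round, local_nodes, max_rounds=100):
    """Repeat discovery + pull rounds until all remote heads are known.

    fetch_round(local_nodes) is a hypothetical callable standing in for one
    discovery/bundle2 round trip. It returns (new_nodes, remote_heads).
    local_nodes is a set that is updated in place.
    Returns the number of rounds that were needed.
    """
    rounds = 0
    while True:
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError('too many pull rounds')
        new_nodes, remote_heads = fetch_round(local_nodes)
        before = len(local_nodes)
        local_nodes.update(new_nodes)
        # Finished: every head advertised by the server is known locally.
        if all(head in local_nodes for head in remote_heads):
            return rounds
        # Contract check: heads are still missing, so the server must have
        # sent at least one new revision in this round.
        if len(local_nodes) == before:
            raise RuntimeError('server sent no new revisions')


def make_partial_server(chain):
    """Toy server that hands out one missing node of `chain` per round."""
    def fetch_round(local_nodes):
        missing = [n for n in chain if n not in local_nodes]
        return missing[:1], [chain[-1]]
    return fetch_round
```

With the toy server, a three-node chain takes three rounds, while a server that keeps advertising an unknown head without sending data triggers the error on the first round.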
D1850: hgweb: when no agreement on compression can be found, fail for v2
indygreg added a comment.

  In https://phab.mercurial-scm.org/D1850#31404, @joerg.sonnenberger wrote:

  > Which status code shall we use then, just plain 400?

  Good question. We use `400` for parts of hgweb, but not for the wire protocol parts. And the use of `400` should arguably be avoided because it is supposed to mean the HTTP request message itself was malformed. A lot of people (including our uses in hgweb) extend `400` to mean things like the query string parameters are invalid. RFC 7231 is still vague in its wording and does seem to allow liberal use of this status code. That's probably because of common use of `400` in the wild.

  In the wire protocol today, it looks like we use `200` and a `Content-Type: application/hg-error` to indicate error. I think this is what we should use here (assuming existing client code handles the error sanely). It should, since `httppeer.py` always checks for this content-type as part of processing every HTTP response.

  For the next submission, please add a test showing that an `hg` operation hitting this error in the context of `hg pull` or `hg clone` does something reasonable.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1850

To: joerg.sonnenberger, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel
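To illustrate the convention described above (status `200` plus `Content-Type: application/hg-error`), here is a minimal sketch of a client-side check. It is not the logic in `httppeer.py`; `classify_response` and its return shape are assumptions made for this example, and only the content-type string comes from the comment.

```python
HG_ERROR_TYPE = 'application/hg-error'


def classify_response(status, headers, body):
    """Classify a wire protocol HTTP response (illustrative helper only).

    headers is a plain dict and body is bytes. Returns ('error', message)
    when the server signalled an error, otherwise ('ok', body).
    """
    # Ignore parameters such as "; charset=..." and letter case.
    ctype = headers.get('Content-Type', '').split(';', 1)[0].strip().lower()
    if ctype == HG_ERROR_TYPE:
        # The error is carried in the body, even though the status is 200.
        return ('error', body.decode('utf-8', 'replace').strip())
    if status != 200:
        return ('error', 'unexpected HTTP status %d' % status)
    return ('ok', body)
```

Checking the content type before the status code mirrors the point made in the review: the status alone cannot distinguish success from a protocol-level error.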
Re: [PATCH 1 of 5] log: make opt2revset table a module constant
On Thu, Jan 11, 2018 at 5:58 AM, Yuya Nishiharawrote: > # HG changeset patch > # User Yuya Nishihara > # Date 1514879917 -32400 > # Tue Jan 02 16:58:37 2018 +0900 > # Node ID da12c978eafe1b414122213c75ce149a5e8d8b5b > # Parent 4b68ca118d8d316cff1fbfe260e8fdb0dae3e26a > log: make opt2revset table a module constant > Queued this series. Nice cleanup for the revspecs. > > Just makes it clear that the table isn't updated in _makelogrevset(). > > diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py > --- a/mercurial/cmdutil.py > +++ b/mercurial/cmdutil.py > @@ -2338,6 +2338,24 @@ def _makenofollowlogfilematcher(repo, pa > '''hook for extensions to override the filematcher for non-follow > cases''' > return None > > +_opt2logrevset = { > +'no_merges':('not merge()', None), > +'only_merges': ('merge()', None), > +'_ancestors': ('ancestors(%(val)s)', None), > +'_fancestors': ('_firstancestors(%(val)s)', None), > +'_descendants': ('descendants(%(val)s)', None), > +'_fdescendants':('_firstdescendants(%(val)s)', None), > +'_matchfiles': ('_matchfiles(%(val)s)', None), > +'date': ('date(%(val)r)', None), > +'branch': ('branch(%(val)r)', ' or '), > +'_patslog': ('filelog(%(val)r)', ' or '), > +'_patsfollow': ('follow(%(val)r)', ' or '), > +'_patsfollowfirst': ('_followfirst(%(val)r)', ' or '), > +'keyword': ('keyword(%(val)r)', ' or '), > +'prune':('not (%(val)r or ancestors(%(val)r))', ' and '), > +'user': ('user(%(val)r)', ' or '), > +} > + > def _makelogrevset(repo, pats, opts, revs): > """Return (expr, filematcher) where expr is a revset string built > from log options and file patterns or None. If --stat or --patch > @@ -2345,24 +2363,6 @@ def _makelogrevset(repo, pats, opts, rev > taking a revision number and returning a match objects filtering > the files to be detailed when displaying the revision. 
> """ > -opt2revset = { > -'no_merges':('not merge()', None), > -'only_merges': ('merge()', None), > -'_ancestors': ('ancestors(%(val)s)', None), > -'_fancestors': ('_firstancestors(%(val)s)', None), > -'_descendants': ('descendants(%(val)s)', None), > -'_fdescendants':('_firstdescendants(%(val)s)', None), > -'_matchfiles': ('_matchfiles(%(val)s)', None), > -'date': ('date(%(val)r)', None), > -'branch': ('branch(%(val)r)', ' or '), > -'_patslog': ('filelog(%(val)r)', ' or '), > -'_patsfollow': ('follow(%(val)r)', ' or '), > -'_patsfollowfirst': ('_followfirst(%(val)r)', ' or '), > -'keyword': ('keyword(%(val)r)', ' or '), > -'prune':('not (%(val)r or ancestors(%(val)r))', ' and > '), > -'user': ('user(%(val)r)', ' or '), > -} > - > opts = dict(opts) > # follow or not follow? > follow = opts.get('follow') or opts.get('follow_first') > @@ -2471,9 +2471,9 @@ def _makelogrevset(repo, pats, opts, rev > for op, val in sorted(opts.iteritems()): > if not val: > continue > -if op not in opt2revset: > +if op not in _opt2logrevset: > continue > -revop, andor = opt2revset[op] > +revop, andor = _opt2logrevset[op] > if '%(val)' not in revop: > expr.append(revop) > else: > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
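For readers unfamiliar with how such a table is consumed, the following is a simplified sketch of the loop at the end of the quoted diff. It uses a subset of the table and the hypothetical names `OPT2REVSET` and `build_revset`; quoting values with `%r` is a simplification here and should not be read as how Mercurial escapes revset arguments.

```python
# Subset of the option table from the patch: option name -> (template, joiner).
OPT2REVSET = {
    'no_merges': ('not merge()', None),
    'only_merges': ('merge()', None),
    'date': ('date(%(val)r)', None),
    'branch': ('branch(%(val)r)', ' or '),
    'keyword': ('keyword(%(val)r)', ' or '),
    'user': ('user(%(val)r)', ' or '),
}


def build_revset(opts):
    """Build a revset expression string from log-style options, or None."""
    parts = []
    for op, val in sorted(opts.items()):
        if not val or op not in OPT2REVSET:
            continue
        revop, andor = OPT2REVSET[op]
        if '%(val)' not in revop:
            # Flag-style option: the template takes no value.
            parts.append(revop)
        elif isinstance(val, list):
            # Repeatable option: join one term per value with the joiner.
            terms = [revop % {'val': v} for v in val]
            parts.append('(' + andor.join(terms) + ')')
        else:
            parts.append(revop % {'val': val})
    if not parts:
        return None
    return '(' + ' and '.join(parts) + ')'
```

Because the table is only read, never written, keeping it at module level (as the patch does) changes nothing about this loop.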
Re: [PATCH 2 of 8] _addrevision: choose between ifh and dfh once for all
On Sun, Jan 14, 2018 at 2:28 AM, Paul Morellewrote: > # HG changeset patch > # User Paul Morelle > # Date 1515771775 -3600 > # Fri Jan 12 16:42:55 2018 +0100 > # Node ID 84eb864137a7b27e2357eb4f6d465f726670dc98 > # Parent 7526dfca3d32e7c51864c21de2c2f4735c4cade6 > # EXP-Topic refactor-revlog > # Available At https://bitbucket.org/octobus/mercurial-devel/ > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r > 84eb864137a7 > _addrevision: choose between ifh and dfh once for all > Queued parts 2-8. The entire series is now queued. FWIW, I was thinking about enabling aggressivemergedeltas by default. Perf work around bdiff optimization in the past ~1 year has made it fast enough that only very large fulltexts have noticeable performance loss from enabling the feature. If you make delta generation faster in the remainder of this series, I think there should be little reason to not enable aggressivemergedeltas by default. > > diff -r 7526dfca3d32 -r 84eb864137a7 mercurial/revlog.py > --- a/mercurial/revlog.py Thu Jan 11 11:59:02 2018 +0100 > +++ b/mercurial/revlog.py Fri Jan 12 16:42:55 2018 +0100 > @@ -1901,6 +1901,11 @@ > raise RevlogError(_("%s: attempt to add wdir revision") % >(self.indexfile)) > > +if self._inline: > +fh = ifh > +else: > +fh = dfh > + > btext = [rawtext] > def buildtext(): > if btext[0] is not None: > @@ -1915,10 +1920,6 @@ > len(delta) - hlen): > btext[0] = delta[hlen:] > else: > -if self._inline: > -fh = ifh > -else: > -fh = dfh > basetext = self.revision(baserev, _df=fh, raw=True) > btext[0] = mdiff.patch(basetext, delta) > > @@ -1947,10 +1948,6 @@ > header = mdiff.replacediffheader(self.rawsize(rev), > len(t)) > delta = header + t > else: > -if self._inline: > -fh = ifh > -else: > -fh = dfh > ptext = self.revision(rev, _df=fh, raw=True) > delta = mdiff.textdiff(ptext, t) > header, data = self.compress(delta) > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > 
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 7 of 8] _builddeltainfo: rename 'rev' to 'base', as it is the base revision
On Sun, Jan 14, 2018 at 2:28 AM, Paul Morellewrote: > # HG changeset patch > # User Paul Morelle > # Date 1515918647 -3600 > # Sun Jan 14 09:30:47 2018 +0100 > # Node ID 9f916b7bc16409831776b50d6f400a41fdfbbcb7 > # Parent d321149c4918b0c008fc38f318c4759c7c29ba80 > # EXP-Topic refactor-revlog > # Available At https://bitbucket.org/octobus/mercurial-devel/ > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r > 9f916b7bc164 > _builddeltainfo: rename 'rev' to 'base', as it is the base revision > Nit: "revlog:" Also, this commit should ideally have been split into two to make review easier since there was an existing "base" variable that was also renamed. > > diff -r d321149c4918 -r 9f916b7bc164 mercurial/revlog.py > --- a/mercurial/revlog.py Sat Jan 13 12:55:18 2018 +0100 > +++ b/mercurial/revlog.py Sun Jan 14 09:30:47 2018 +0100 > @@ -1948,26 +1948,26 @@ > > return delta > > -def _builddeltainfo(self, node, rev, p1, p2, btext, cachedelta, fh, > flags): > +def _builddeltainfo(self, node, base, p1, p2, btext, cachedelta, fh, > flags): > # can we use the cached delta? 
> -if cachedelta and cachedelta[0] == rev: > +if cachedelta and cachedelta[0] == base: > delta = cachedelta[1] > else: > -delta = self._builddeltadiff(rev, node, p1, p2, btext, > cachedelta, > +delta = self._builddeltadiff(base, node, p1, p2, btext, > cachedelta, > fh, flags) > header, data = self.compress(delta) > deltalen = len(header) + len(data) > -chainbase = self.chainbase(rev) > +chainbase = self.chainbase(base) > offset = self.end(len(self) - 1) > dist = deltalen + offset - self.start(chainbase) > if self._generaldelta: > -base = rev > +deltabase = base > else: > -base = chainbase > -chainlen, compresseddeltalen = self._chaininfo(rev) > +deltabase = chainbase > +chainlen, compresseddeltalen = self._chaininfo(base) > chainlen += 1 > compresseddeltalen += deltalen > -return _deltainfo(dist, deltalen, (header, data), base, > +return _deltainfo(dist, deltalen, (header, data), deltabase, > chainbase, chainlen, compresseddeltalen) > > def _addrevision(self, node, rawtext, transaction, link, p1, p2, > flags, > ___ > Mercurial-devel mailing list > Mercurial-devel@mercurial-scm.org > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 8] _addrevision: refactor out the selection of candidate revisions
On Sun, Jan 14, 2018 at 2:28 AM, Paul Morellewrote: > # HG changeset patch > # User Paul Morelle > # Date 1515668342 -3600 > # Thu Jan 11 11:59:02 2018 +0100 > # Node ID 7526dfca3d32e7c51864c21de2c2f4735c4cade6 > # Parent 4b68ca118d8d316cff1fbfe260e8fdb0dae3e26a > # EXP-Topic refactor-revlog > # Available At https://bitbucket.org/octobus/mercurial-devel/ > # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r > 7526dfca3d32 > _addrevision: refactor out the selection of candidate revisions > This refactor seems to preserve the existing behavior AFAICT. The code is not easy to read. But that was a preexisting problem. Hopefully the end state of this series improves matters more. If not, that can be done in follow-ups. Queued part 1. Will look at the subsequent parts soon... Also, please use "revlog:" in the commit message. > > The new function will be useful to retrieve all the revisions which will be > needed to determine the best delta, and parallelize the computation of the > necessary diffs. > > diff -r 4b68ca118d8d -r 7526dfca3d32 mercurial/revlog.py > --- a/mercurial/revlog.py Thu Jan 11 11:57:59 2018 + > +++ b/mercurial/revlog.py Thu Jan 11 11:59:02 2018 +0100 > @@ -1844,6 +1844,44 @@ > > return True > > +def _getcandidaterevs(self, p1, p2, cachedelta): > +""" > +Provides revisions that present an interest to be diffed against, > +grouped by level of easiness. > +""" > I'd appreciate a bit more documentation around the intended use of the generator stream. Essentially, it is emitting iterables of revs and the first group that yields a rev with a suitable delta is used and the client stops processing. That's kinda a weird pattern. But it is how the existing logic works. Using a generator is a clever way to codify that logic. > +curr = len(self) > +prev = curr - 1 > +p1r, p2r = self.rev(p1), self.rev(p2) > curr, p1r, and p2r are already calculated by the caller. I'd pass them into this function. Please send a follow-up. 
> + > +# should we try to build a delta? > +if prev != nullrev and self.storedeltachains: > We could change this to `if pre == nullrev or not self.storedeltachains: return` and dedent the following block. That would make things a bit easier to read. > +tested = set() > +# This condition is true most of the time when processing > +# changegroup data into a generaldelta repo. The only time it > +# isn't true is if this is the first revision in a delta chain > +# or if ``format.generaldelta=true`` disabled > ``lazydeltabase``. > +if cachedelta and self._generaldelta and self._lazydeltabase: > +# Assume what we received from the server is a good choice > +# build delta will reuse the cache > +yield (cachedelta[0],) > +tested.add(cachedelta[0]) > + > +if self._generaldelta: > +# exclude already lazy tested base if any > +parents = [p for p in (p1r, p2r) > + if p != nullrev and p not in tested] > +if parents and not self._aggressivemergedeltas: > +# Pick whichever parent is closer to us (to minimize > the > +# chance of having to build a fulltext). > +parents = [max(parents)] > +tested.update(parents) > +yield parents > + > +if prev not in tested: > +# other approach failed try against prev to hopefully > save us a > +# fulltext. > +yield (prev,) > Having looked at this code in detail as part of the review, it seems nonsensical to me to emit prev as a candidate revision when generaldelta is being used. I'd refactor the code to use separate branches for generaldelta and non-generaldelta scenarios. But this can be done as follow-up (if it isn't addressed by later patches in this series). > def _addrevision(self, node, rawtext, transaction, link, p1, p2, > flags, > cachedelta, ifh, dfh, alwayscache=False): > """internal function to add revisions to the log > @@ -1943,42 +1981,16 @@ > else: > textlen = len(rawtext) > > -# should we try to build a delta? 
> -if prev != nullrev and self.storedeltachains: > -tested = set() > -# This condition is true most of the time when processing > -# changegroup data into a generaldelta repo. The only time it > -# isn't true is if this is the first revision in a delta chain > -# or if ``format.generaldelta=true`` disabled > ``lazydeltabase``. > -if cachedelta and self._generaldelta and self._lazydeltabase: > -# Assume what we received from
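The generator pattern discussed in this review (emit groups of candidate revisions, stop at the first group that produces a usable delta) can be sketched on its own, outside of revlog. The names `candidate_groups`, `pick_base`, and `is_suitable` are hypothetical, revisions are plain integers, and the suitability test is a stand-in for the real delta checks.

```python
def candidate_groups(prev, parents, cached_base=None, aggressive=False):
    """Yield tuples of candidate base revisions, cheapest guess first."""
    tested = set()
    if cached_base is not None:
        # Trust the base that came with the incoming delta first.
        yield (cached_base,)
        tested.add(cached_base)
    remaining = [p for p in parents if p is not None and p not in tested]
    if remaining and not aggressive:
        # Only try the parent closest to the new revision.
        remaining = [max(remaining)]
    if remaining:
        tested.update(remaining)
        yield tuple(remaining)
    if prev not in tested:
        # Last resort before storing a full text.
        yield (prev,)


def pick_base(groups, is_suitable):
    """Return the first suitable revision, consuming groups lazily."""
    for group in groups:
        for rev in group:
            if is_suitable(rev):
                return rev
    return None
```

Because `candidate_groups` is a generator, later groups are never computed once `pick_base` returns, which is the property the review describes.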
Re: [PATCH] py3: use email.parser module to parse email messages
On Sun, Jan 14, 2018 at 10:48 AM, Pulkit Goyal <7895pul...@gmail.com> wrote:

> # HG changeset patch
> # User Pulkit Goyal <7895pul...@gmail.com>
> # Date 1514573036 -19800
> #      Sat Dec 30 00:13:56 2017 +0530
> # Node ID 9c8cc14cd05fa3420b1549c5369bf9b3623bd5ee
> # Parent  390f860228ba909499093e0e8861c908fe15a2d0
> # EXP-Topic py3
> py3: use email.parser module to parse email messages

Queued.

This was a top 5 crasher for Python 3 in the test harness. I suspect some tests started passing as a result of this!

> Before this patch we use email.Parser.Parser() from the email module which is
> not available on Python 3.
>
> On Python 2:
>
> >>> import email
> >>> import email.parser as emailparser
> >>> email.Parser.Parser is emailparser.Parser
> True
>
> diff --git a/hgext/convert/gnuarch.py b/hgext/convert/gnuarch.py
> --- a/hgext/convert/gnuarch.py
> +++ b/hgext/convert/gnuarch.py
> @@ -7,7 +7,7 @@
>  # GNU General Public License version 2 or any later version.
>  from __future__ import absolute_import
>
> -import email
> +import email.parser as emailparser
>  import os
>  import shutil
>  import stat
> @@ -63,7 +63,7 @@
>          self.changes = {}
>          self.parents = {}
>          self.tags = {}
> -        self.catlogparser = email.Parser.Parser()
> +        self.catlogparser = emailparser.Parser()
>          self.encoding = encoding.encoding
>          self.archives = []
>
> diff --git a/hgext/notify.py b/hgext/notify.py
> --- a/hgext/notify.py
> +++ b/hgext/notify.py
> @@ -135,6 +135,7 @@
>  from __future__ import absolute_import
>
>  import email
> +import email.parser as emailparser
>  import fnmatch
>  import socket
>  import time
> @@ -339,7 +340,7 @@
>                            'and revset\n')
>              return
>
> -        p = email.Parser.Parser()
> +        p = emailparser.Parser()
>          try:
>              msg = p.parsestr(data)
>          except email.Errors.MessageParseError as inst:
> diff --git a/mercurial/patch.py b/mercurial/patch.py
> --- a/mercurial/patch.py
> +++ b/mercurial/patch.py
> @@ -12,6 +12,7 @@
>  import copy
>  import difflib
>  import email
> +import email.parser as emailparser
>  import errno
>  import hashlib
>  import os
> @@ -108,7 +109,7 @@
>              cur.append(line)
>          c = chunk(cur)
>
> -        m = email.Parser.Parser().parse(c)
> +        m = emailparser.Parser().parse(c)
>          if not m.is_multipart():
>              yield msgfp(m)
>          else:
> @@ -217,7 +218,7 @@
>      fd, tmpname = tempfile.mkstemp(prefix='hg-patch-')
>      tmpfp = os.fdopen(fd, pycompat.sysstr('w'))
>      try:
> -        msg = email.Parser.Parser().parse(fileobj)
> +        msg = emailparser.Parser().parse(fileobj)
>
>          subject = msg['Subject'] and mail.headdecode(msg['Subject'])
>          data['user'] = msg['From'] and mail.headdecode(msg['From'])
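As a quick illustration of the import style this patch switches to, the sketch below parses a small message with `email.parser.Parser`, which is available on both Python 2 and Python 3. The helper name `parse_patch_mail` and the sample message are made up for this example and are not part of Mercurial.

```python
import email.parser as emailparser


def parse_patch_mail(text):
    """Parse a patch email given as text and return a few fields."""
    msg = emailparser.Parser().parsestr(text)
    return {
        'subject': msg['Subject'],
        'user': msg['From'],
        'multipart': msg.is_multipart(),
        'body': msg.get_payload(),
    }


sample = (
    'From: Example Author <author@example.com>\n'
    'Subject: [PATCH] example change\n'
    '\n'
    'diff body goes here\n'
)
fields = parse_patch_mail(sample)
```

The same call shape (`Parser().parse(fileobj)` or `Parser().parsestr(data)`) is what the hunks above use after the rename.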
D1478: py3: cast error message to localstr in blackbox.py
indygreg updated this revision to Diff 4823. indygreg edited the summary of this revision. indygreg retitled this revision from "py3: cast error message to bytes in blackbox.py" to "py3: cast error message to localstr in blackbox.py". REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1478?vs=3718=4823 REVISION DETAIL https://phab.mercurial-scm.org/D1478 AFFECTED FILES hgext/blackbox.py CHANGE DETAILS diff --git a/hgext/blackbox.py b/hgext/blackbox.py --- a/hgext/blackbox.py +++ b/hgext/blackbox.py @@ -44,6 +44,7 @@ from mercurial.node import hex from mercurial import ( +encoding, registrar, ui as uimod, util, @@ -182,7 +183,7 @@ fp.write(fmt % args) except (IOError, OSError) as err: self.debug('warning: cannot write to blackbox.log: %s\n' % - err.strerror) + encoding.strtolocal(err.strerror)) # do not restore _bbinlog intentionally to avoid failed # logging again else: To: indygreg, #hg-reviewers, yuja Cc: yuja, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1479: py3: use byteskwargs in templatekw.showobsfate()
indygreg abandoned this revision. indygreg added a comment. This was addressed by https://phab.mercurial-scm.org/D1536. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D1479 To: indygreg, #hg-reviewers, pulkit Cc: pulkit, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
Re: [PATCH 1 of 2] rust: extract function to convert Path to platform CString
On Fri, Jan 12, 2018 at 6:22 AM, Yuya Nishihara wrote:

> # HG changeset patch
> # User Yuya Nishihara
> # Date 1515762574 -32400
> #      Fri Jan 12 22:09:34 2018 +0900
> # Node ID 44289d889542a3c559c424fa1f2d85cb7e16
> # Parent  ea9bd35529f231c438630071119a309ba84dcc77
> rust: extract function to convert Path to platform CString

Queued this series, thanks.

FWIW, I was thinking that on Windows we could pass the raw/native Windows args array into a Python list on the mercurial module or something. This would bypass the CPython APIs to set argv. If we go through the CPython APIs and use sys.argv, we need to normalize to char*. The advantage here is Mercurial would be able to access the raw byte sequence and apply an appropriate encoding. In other words, bypassing the restrictive Python 2.7 C API could result in less data loss.

> It can be better on Unix.
>
> diff --git a/rust/hgcli/src/main.rs b/rust/hgcli/src/main.rs
> --- a/rust/hgcli/src/main.rs
> +++ b/rust/hgcli/src/main.rs
> @@ -14,7 +14,7 @@ use libc::{c_char, c_int};
>
>  use std::env;
>  use std::path::PathBuf;
> -use std::ffi::CString;
> +use std::ffi::{CString, OsStr};
>  #[cfg(target_family = "unix")]
>  use std::os::unix::ffi::OsStringExt;
>
> @@ -62,6 +62,10 @@ fn get_environment() -> Environment {
>      }
>  }
>
> +fn cstring_from_os<T: AsRef<OsStr>>(s: T) -> CString {
> +    CString::new(s.as_ref().to_str().unwrap()).unwrap()
> +}
> +
>  // On UNIX, argv starts as an array of char*. So it is easy to convert
>  // to C strings.
>  #[cfg(target_family = "unix")]
> @@ -86,9 +90,7 @@ fn args_to_cstrings() -> Vec<CString> {
>  }
>
>  fn set_python_home(env: &Environment) {
> -    let raw = CString::new(env.python_home.to_str().unwrap())
> -        .unwrap()
> -        .into_raw();
> +    let raw = cstring_from_os(&env.python_home).into_raw();
>      unsafe {
>          python27_sys::Py_SetPythonHome(raw);
>      }
> @@ -133,9 +135,7 @@ fn run() -> Result<(), i32> {
>      // Python files. Apparently we could define our own ``Py_GetPath()``
>      // implementation. But this may require statically linking Python, which is
>      // not desirable.
> -    let program_name = CString::new(env.python_exe.to_str().unwrap())
> -        .unwrap()
> -        .as_ptr();
> +    let program_name = cstring_from_os(&env.python_exe).as_ptr();
>      unsafe {
>          python27_sys::Py_SetProgramName(program_name as *mut i8);
>      }
D1481: py3: ensure hashes are bytes in sparse.py
indygreg abandoned this revision. indygreg added a comment. This was addressed in https://phab.mercurial-scm.org/D1792. REPOSITORY rHG Mercurial REVISION DETAIL https://phab.mercurial-scm.org/D1481 To: indygreg, #hg-reviewers, yuja Cc: yuja, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1833: style: remove multiple statement on a single line in zeroconf
This revision was automatically updated to reflect the committed changes. Closed by commit rHG31451f3f4b56: style: remove multiple statement on a single line in zeroconf (authored by lothiraldan, committed by ). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1833?vs=4750=4821 REVISION DETAIL https://phab.mercurial-scm.org/D1833 AFFECTED FILES hgext/zeroconf/Zeroconf.py CHANGE DETAILS diff --git a/hgext/zeroconf/Zeroconf.py b/hgext/zeroconf/Zeroconf.py --- a/hgext/zeroconf/Zeroconf.py +++ b/hgext/zeroconf/Zeroconf.py @@ -1613,7 +1613,8 @@ _DNS_TTL, service.address)) service = self.services.get(question.name.lower(), None) -if not service: continue +if not service: +continue if (question.type == _TYPE_SRV or question.type == _TYPE_ANY): To: lothiraldan, #hg-reviewers, pulkit, indygreg Cc: mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1832: style: remove multiple statement on a single line
This revision was automatically updated to reflect the committed changes. Closed by commit rHGab11af15a149: style: remove multiple statement on a single line (authored by lothiraldan, committed by ). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1832?vs=4749=4820 REVISION DETAIL https://phab.mercurial-scm.org/D1832 AFFECTED FILES hgext/convert/git.py CHANGE DETAILS diff --git a/hgext/convert/git.py b/hgext/convert/git.py --- a/hgext/convert/git.py +++ b/hgext/convert/git.py @@ -342,13 +342,15 @@ p = v.split() tm, tz = p[-2:] author = " ".join(p[:-2]) -if author[0] == "<": author = author[1:-1] +if author[0] == "<": +author = author[1:-1] author = self.recode(author) if n == "committer": p = v.split() tm, tz = p[-2:] committer = " ".join(p[:-2]) -if committer[0] == "<": committer = committer[1:-1] +if committer[0] == "<": +committer = committer[1:-1] committer = self.recode(committer) if n == "parent": parents.append(v) To: lothiraldan, #hg-reviewers, pulkit, indygreg Cc: mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1834: pylint: add a check for multiple statement on a single line
This revision was automatically updated to reflect the committed changes. Closed by commit rHG6061e54ff81d: pylint: add a check for multiple statement on a single line (authored by lothiraldan, committed by ). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1834?vs=4751=4822 REVISION DETAIL https://phab.mercurial-scm.org/D1834 AFFECTED FILES tests/test-check-pylint.t CHANGE DETAILS diff --git a/tests/test-check-pylint.t b/tests/test-check-pylint.t --- a/tests/test-check-pylint.t +++ b/tests/test-check-pylint.t @@ -11,7 +11,7 @@ $ touch $TESTTMP/fakerc $ pylint --rcfile=$TESTTMP/fakerc --disable=all \ - > --enable=W0102 \ + > --enable=W0102,C0321 \ > --reports=no \ > --ignore=thirdparty \ > mercurial hgdemandimport hgext hgext3rd To: lothiraldan, #hg-reviewers, pulkit, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1831: pylint: split command line argument on multiple lines
This revision was automatically updated to reflect the committed changes. Closed by commit rHG882998f08c3c: pylint: split command line argument on multiple lines (authored by lothiraldan, committed by ). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1831?vs=4748=4819 REVISION DETAIL https://phab.mercurial-scm.org/D1831 AFFECTED FILES tests/test-check-pylint.t CHANGE DETAILS diff --git a/tests/test-check-pylint.t b/tests/test-check-pylint.t --- a/tests/test-check-pylint.t +++ b/tests/test-check-pylint.t @@ -11,7 +11,8 @@ $ touch $TESTTMP/fakerc $ pylint --rcfile=$TESTTMP/fakerc --disable=all \ - > --enable=W0102 --reports=no \ + > --enable=W0102 \ + > --reports=no \ > --ignore=thirdparty \ > mercurial hgdemandimport hgext hgext3rd (?) To: lothiraldan, #hg-reviewers, pulkit, indygreg Cc: mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
D1801: Use named group for parsing differential reviews lines.
This revision was automatically updated to reflect the committed changes. Closed by commit rHGa0d33f4ddff9: phabricator: use named group for parsing differential reviews lines (authored by tom.prince, committed by ). REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1801?vs=4680=4818 REVISION DETAIL https://phab.mercurial-scm.org/D1801 AFFECTED FILES contrib/phabricator.py CHANGE DETAILS diff --git a/contrib/phabricator.py b/contrib/phabricator.py --- a/contrib/phabricator.py +++ b/contrib/phabricator.py @@ -166,7 +166,7 @@ _differentialrevisiontagre = re.compile('\AD([1-9][0-9]*)\Z') _differentialrevisiondescre = re.compile( -'^Differential Revision:\s*(?:.*)D([1-9][0-9]*)$', re.M) +'^Differential Revision:\s*(?P(?:.*)D(?P[1-9][0-9]*))$', re.M) def getoldnodedrevmap(repo, nodelist): """find previous nodes that has been sent to Phabricator @@ -207,7 +207,7 @@ # Check commit message m = _differentialrevisiondescre.search(ctx.description()) if m: -toconfirm[node] = (1, set(precnodes), int(m.group(1))) +toconfirm[node] = (1, set(precnodes), int(m.group('id'))) # Double check if tags are genuine by collecting all old nodes from # Phabricator, and expect precursors overlap with it. @@ -442,7 +442,7 @@ # Create a local tag to note the association, if commit message # does not have it already m = _differentialrevisiondescre.search(ctx.description()) -if not m or int(m.group(1)) != newrevid: +if not m or int(m.group('id')) != newrevid: tagname = 'D%d' % newrevid tags.tag(repo, tagname, ctx.node(), message=None, user=None, date=None, local=True) To: tom.prince, #hg-reviewers, indygreg Cc: indygreg, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
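The switch from `m.group(1)` to `m.group('id')` can be sketched with the standard `re` module alone. The group name `id` is taken from the patch; the pattern below is a simplified stand-in (the archived diff has lost the angle-bracketed group names, so the outer group is left out here), and the sample commit message is made up for illustration.

```python
import re

# Simplified stand-in for the pattern in contrib/phabricator.py.
# 'id' is the group name used by the patch; the rest is illustrative.
DESC_RE = re.compile(
    r'^Differential Revision:\s*(?:.*)D(?P<id>[1-9][0-9]*)$', re.M)


def revision_id(description):
    """Return the Differential revision number found in a commit message,
    or None when the message has no 'Differential Revision:' line."""
    m = DESC_RE.search(description)
    if m is None:
        return None
    # Looking the group up by name keeps working if more groups are
    # added to the pattern later, unlike a positional m.group(1).
    return int(m.group('id'))


# Hypothetical commit message, for illustration only.
message = (
    "phabricator: use named group\n"
    "\n"
    "Differential Revision: https://phab.mercurial-scm.org/D1801\n"
)
print(revision_id(message))
```

Naming the group means call sites do not need to change when the pattern gains or reorders groups, which seems to be the motivation for the patch.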
D1834: pylint: add a check for multiple statement on a single line
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.

  Nice cleanup. Always happy to turn on more lint checks.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1834

To: lothiraldan, #hg-reviewers, pulkit, indygreg
Cc: indygreg, mercurial-devel
D1802: Add a template item for linking to a differential review.
indygreg requested changes to this revision.
indygreg added a comment.
This revision now requires changes to proceed.

  Please also remove the double newlines and fix the commit message to abide
  by our message standards.

  From the `tests/` directory, run `./run-tests.py -j8 test-check-*` to run
  the static analysis checks.

INLINE COMMENTS

> phabricator.py:871-872
> +
> +def extsetup(ui):
> +    templatekw.keywords['phabreview'] = template_review
> +

  `registrar.templatekeyword()` returns a function that can be used as a
  decorator. This is the preferred mechanism to define templates in
  extensions. See `hgext/transplant.py` for a simple example. Please switch
  to that API.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1802

To: tom.prince, #hg-reviewers, indygreg
Cc: indygreg, pulkit, mercurial-devel
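The review comment contrasts assigning into a keyword table from `extsetup()` with registering through a decorator. The following is only a toy model of that decorator pattern: `KeywordRegistrar` is a hypothetical stand-in written for this sketch, not Mercurial's `registrar` module, and the keyword function takes a plain string rather than the arguments a real template keyword receives.

```python
class KeywordRegistrar(object):
    """Hypothetical stand-in for a keyword registrar (not Mercurial's)."""

    def __init__(self):
        self.table = {}

    def __call__(self, name):
        # Calling the registrar returns a decorator that records the
        # function under `name` and hands it back unchanged.
        def decorator(func):
            self.table[name] = func
            return func
        return decorator


templatekeyword = KeywordRegistrar()


@templatekeyword('phabreview')
def template_review(description):
    """Toy keyword: report whether the description mentions a review."""
    return 'Differential Revision:' in description


print(sorted(templatekeyword.table))
```

With this shape the registration sits next to the function definition, so no separate setup hook has to be kept in sync with the list of keywords.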
D1801: Use named group for parsing differential reviews lines.
indygreg accepted this revision.
indygreg added a comment.
This revision is now accepted and ready to land.

  Thanks for the improvement.

  For future patches, please make the commit message of the form
  `topic: short summary of changes`. `hg log` should give you plenty of
  examples. I'll tweak the commit message as part of landing this.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1801

To: tom.prince, #hg-reviewers, indygreg
Cc: indygreg, mercurial-devel
D1729: githelp: don't reference 3rd party commands for `git show`
indygreg updated this revision to Diff 4817. REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1729?vs=4597=4817 REVISION DETAIL https://phab.mercurial-scm.org/D1729 AFFECTED FILES hgext/githelp.py tests/test-githelp.t CHANGE DETAILS diff --git a/tests/test-githelp.t b/tests/test-githelp.t --- a/tests/test-githelp.t +++ b/tests/test-githelp.t @@ -177,39 +177,39 @@ githelp for git show --name-status $ hg githelp -- git show --name-status - hg log --style status -r tip + hg log --style status -r . githelp for git show --pretty=format: --name-status $ hg githelp -- git show --pretty=format: --name-status - hg stat --change tip + hg status --change . githelp for show with no arguments $ hg githelp -- show - hg show + hg export githelp for show with a path $ hg githelp -- show test_file - hg show . test_file + hg cat test_file githelp for show with not a path: $ hg githelp -- show rev - hg show rev + hg export rev githelp for show with many arguments $ hg githelp -- show argone argtwo - hg show argone argtwo + hg export argone argtwo $ hg githelp -- show test_file argone argtwo - hg show . test_file argone argtwo + hg cat test_file argone argtwo githelp for show with --unified options $ hg githelp -- show --unified=10 - hg show --config diff.unified=10 + hg export --config diff.unified=10 $ hg githelp -- show -U100 - hg show --config diff.unified=100 + hg export --config diff.unified=100 githelp for show with a path and --unified $ hg githelp -- show -U20 test_file - hg show . 
test_file --config diff.unified=20 + hg cat test_file --config diff.unified=20 githelp for stash drop without name $ hg githelp -- git stash drop diff --git a/hgext/githelp.py b/hgext/githelp.py --- a/hgext/githelp.py +++ b/hgext/githelp.py @@ -877,23 +877,27 @@ ] args, opts = parseoptions(ui, cmdoptions, args) -cmd = Command('show') if opts.get('name_status'): if opts.get('pretty') == 'format:': -cmd = Command('stat') -cmd['--change'] = 'tip' +cmd = Command('status') +cmd['--change'] = '.' else: cmd = Command('log') cmd.append('--style status') -cmd.append('-r tip') +cmd.append('-r .') elif len(args) > 0: if ispath(repo, args[0]): -cmd.append('.') +cmd = Command('cat') +else: +cmd = Command('export') cmd.extend(args) if opts.get('unified'): cmd.append('--config diff.unified=%d' % (opts['unified'],)) elif opts.get('unified'): +cmd = Command('export') cmd.append('--config diff.unified=%d' % (opts['unified'],)) +else: +cmd = Command('export') ui.status((str(cmd)), "\n") To: indygreg, #hg-reviewers, durin42, pulkit, krbullock Cc: krbullock, pulkit, mercurial-devel ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
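The branching in the `git show` translation above can be summarised in a small standalone function. This is a sketch under simplifying assumptions: `translate_show` and its `ispath` callback are hypothetical names, the command is built as a plain list of strings instead of githelp's `Command` class, and options arrive as an already-parsed dict.

```python
def translate_show(args, opts, ispath):
    """Return the hg command suggested for `git show`, as a string.

    args:   positional arguments given to `git show`
    opts:   dict of parsed options ('name_status', 'pretty', 'unified')
    ispath: callable telling whether an argument names a path
            (stand-in for githelp's ispath(repo, ...))
    """
    if opts.get('name_status'):
        if opts.get('pretty') == 'format:':
            parts = ['status', '--change', '.']
        else:
            parts = ['log', '--style status', '-r .']
    else:
        # A leading path means "print the file"; anything else is treated
        # as a revision and exported as a patch.
        if args and ispath(args[0]):
            parts = ['cat']
        else:
            parts = ['export']
        parts.extend(args)
        if opts.get('unified'):
            parts.append('--config diff.unified=%d' % opts['unified'])
    return ' '.join(['hg'] + parts)


def is_test_file(arg):
    # Toy path check used only for this example.
    return arg == 'test_file'


print(translate_show([], {}, is_test_file))
print(translate_show(['test_file'], {'unified': 20}, is_test_file))
```

Every branch now yields a built-in command (`status`, `log`, `cat` or `export`), which matches the patch's goal of not pointing users at the third-party `hg show`.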
[PATCH] bookmarks: calculate visibility exceptions only once
# HG changeset patch
# User Pulkit Goyal <7895pul...@gmail.com>
# Date 1515955571 -19800
#      Mon Jan 15 00:16:11 2018 +0530
# Node ID a1551e6da839be8d51dc2371520de836ba0f0dba
# Parent  390f860228ba909499093e0e8861c908fe15a2d0
bookmarks: calculate visibility exceptions only once

In the loop "for mark in names", the rev is the same in each iteration, so it
does not make sense to call unhidehashlikerevs multiple times.

Thanks to Yuya for spotting this.

diff --git a/mercurial/bookmarks.py b/mercurial/bookmarks.py
--- a/mercurial/bookmarks.py
+++ b/mercurial/bookmarks.py
@@ -830,7 +830,12 @@
     cur = repo.changectx('.').node()
     newact = None
     changes = []
-    hiddenrevs = set()
+    hiddenrev = None
+
+    # unhide revs if any
+    if rev:
+        repo = scmutil.unhidehashlikerevs(repo, [rev], 'nowarn')
+
     for mark in names:
         mark = checkformat(repo, mark)
         if newact is None:
@@ -840,17 +845,16 @@
             return
         tgt = cur
         if rev:
-            repo = scmutil.unhidehashlikerevs(repo, [rev], 'nowarn')
             ctx = scmutil.revsingle(repo, rev)
             if ctx.hidden():
-                hiddenrevs.add(ctx.hex()[:12])
+                hiddenrev = ctx.hex()[:12]
             tgt = ctx.node()
         for bm in marks.checkconflict(mark, force, tgt):
             changes.append((bm, None))
         changes.append((mark, tgt))
-    if hiddenrevs:
-        repo.ui.warn(_("bookmarking hidden changeset %s\n") %
-                     ', '.join(hiddenrevs))
+
+    if hiddenrev:
+        repo.ui.warn(_("bookmarking hidden changeset %s\n") % hiddenrev)
     marks.applychanges(repo, tr, changes)
     if not inactive and cur == marks[newact] and not rev:
         activate(repo, newact)
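The patch hoists a call that does not depend on the loop variable out of the loop. A toy illustration of the effect, with every name below (`unhide`, `add_marks_before`, `add_marks_after`) invented for this sketch rather than taken from Mercurial:

```python
def unhide(repo, rev, calls):
    """Hypothetical stand-in for an expensive lookup that depends only on rev."""
    calls.append(rev)
    return repo


def add_marks_before(repo, names, rev):
    """Old shape: the rev-only work is repeated for every bookmark name."""
    calls = []
    for _mark in names:
        if rev:
            repo = unhide(repo, rev, calls)
    return len(calls)


def add_marks_after(repo, names, rev):
    """New shape: the rev-only work runs once, before the loop."""
    calls = []
    if rev:
        repo = unhide(repo, rev, calls)
    for _mark in names:
        pass  # per-bookmark work would go here
    return len(calls)


names = ['a', 'b', 'c']
print(add_marks_before('repo', names, 'abc123'))
print(add_marks_after('repo', names, 'abc123'))
```

Because only one rev is ever involved, a single `hiddenrev` value is enough, which is why the patch also replaces the `hiddenrevs` set with a scalar.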
[PATCH] py3: use email.parser module to parse email messages
# HG changeset patch # User Pulkit Goyal <7895pul...@gmail.com> # Date 1514573036 -19800 # Sat Dec 30 00:13:56 2017 +0530 # Node ID 9c8cc14cd05fa3420b1549c5369bf9b3623bd5ee # Parent 390f860228ba909499093e0e8861c908fe15a2d0 # EXP-Topic py3 py3: use email.parser module to parse email messages Before this patch we use email.Parser.Parser() from the email module which is not available on Python 3. On Python 2: >>> import email >>> import email.parser as emailparser >>> email.Parser.Parser is emailparser.Parser True diff --git a/hgext/convert/gnuarch.py b/hgext/convert/gnuarch.py --- a/hgext/convert/gnuarch.py +++ b/hgext/convert/gnuarch.py @@ -7,7 +7,7 @@ # GNU General Public License version 2 or any later version. from __future__ import absolute_import -import email +import email.parser as emailparser import os import shutil import stat @@ -63,7 +63,7 @@ self.changes = {} self.parents = {} self.tags = {} -self.catlogparser = email.Parser.Parser() +self.catlogparser = emailparser.Parser() self.encoding = encoding.encoding self.archives = [] diff --git a/hgext/notify.py b/hgext/notify.py --- a/hgext/notify.py +++ b/hgext/notify.py @@ -135,6 +135,7 @@ from __future__ import absolute_import import email +import email.parser as emailparser import fnmatch import socket import time @@ -339,7 +340,7 @@ 'and revset\n') return -p = email.Parser.Parser() +p = emailparser.Parser() try: msg = p.parsestr(data) except email.Errors.MessageParseError as inst: diff --git a/mercurial/patch.py b/mercurial/patch.py --- a/mercurial/patch.py +++ b/mercurial/patch.py @@ -12,6 +12,7 @@ import copy import difflib import email +import email.parser as emailparser import errno import hashlib import os @@ -108,7 +109,7 @@ cur.append(line) c = chunk(cur) -m = email.Parser.Parser().parse(c) +m = emailparser.Parser().parse(c) if not m.is_multipart(): yield msgfp(m) else: @@ -217,7 +218,7 @@ fd, tmpname = tempfile.mkstemp(prefix='hg-patch-') tmpfp = os.fdopen(fd, pycompat.sysstr('w')) try: -msg = 
email.Parser.Parser().parse(fileobj) +msg = emailparser.Parser().parse(fileobj) subject = msg['Subject'] and mail.headdecode(msg['Subject']) data['user'] = msg['From'] and mail.headdecode(msg['From']) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
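For reference, a minimal sketch of the lowercase `email.parser` spelling that the patch switches to. `Parser().parsestr()` is part of the standard library; the helper name `parse_message` and the sample message are made up for illustration.

```python
import email.parser as emailparser


def parse_message(raw):
    """Parse an RFC 822 style message held in a text string."""
    return emailparser.Parser().parsestr(raw)


# Hypothetical message, for illustration only.
raw = (
    "From: Alice Example <alice@example.com>\n"
    "Subject: [PATCH] demo\n"
    "\n"
    "patch body\n"
)
msg = parse_message(raw)
print(msg['Subject'])
print(msg.is_multipart())
```

Importing the submodule under an alias, as the patch does, leaves the existing `import email` in place for the other names those modules still use.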
D1856: wireproto: server-side support for pullbundles
joerg.sonnenberger added a comment.

  For my test case, which is one bundle of all changes in the NetBSD repo
  before 2014, then a yearly bundle up to 2018/1/1, and a normal pull for the
  rest, find_pullbundle needs less than 0.5s of CPU time in this iteration
  when it matches. After the initial clone and with additional available
  changes, pull time is:

  With pullbundles.manifest: 0.42s
  Without pullbundles.manifest: 0.41s

  i.e. the difference is within the noise level. Further benchmarks by others
  would be appreciated.

  The biggest remaining question for me is whether we want to introduce a
  capability for the client to mark that it is willing to do multiple pull
  rounds.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1856

To: joerg.sonnenberger, #hg-reviewers
Cc: mercurial-devel
D1856: wireproto: server-side support for pullbundles
joerg.sonnenberger updated this revision to Diff 4816. REPOSITORY rHG Mercurial CHANGES SINCE LAST UPDATE https://phab.mercurial-scm.org/D1856?vs=4811=4816 REVISION DETAIL https://phab.mercurial-scm.org/D1856 AFFECTED FILES mercurial/configitems.py mercurial/help/config.txt mercurial/wireproto.py tests/test-pull-r.t CHANGE DETAILS diff --git a/tests/test-pull-r.t b/tests/test-pull-r.t --- a/tests/test-pull-r.t +++ b/tests/test-pull-r.t @@ -145,3 +145,59 @@ $ cd .. $ killdaemons.py + +Test pullbundle functionality + + $ cd repo + $ cat < .hg/hgrc + > [server] + > pullbundle = True + > EOF + $ hg bundle --base null -r 0 .hg/0.hg + 1 changesets found + $ hg bundle --base 0 -r 1 .hg/1.hg + 1 changesets found + $ hg bundle --base 1 -r 2 .hg/2.hg + 1 changesets found + $ cat < .hg/pullbundles.manifest + > 2.hg heads=effea6de0384e684f44435651cb7bd70b8735bd4 bases=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > 1.hg heads=ed1b79f46b9a29f5a6efa59cf12fcfca43bead5a bases=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > 0.hg heads=bbd179dfa0a71671c253b3ae0aa1513b60d199fa + > EOF + $ hg serve --debug -p $HGPORT2 --pid-file=../repo.pid > ../repo-server.txt 2>&1 & + $ while ! grep listening ../repo-server.txt > /dev/null; do sleep 1; done + $ cat ../repo.pid >> $DAEMON_PIDS + $ cd .. 
+ $ hg clone -r 0 http://localhost:$HGPORT2/ repo.pullbundle + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + new changesets bbd179dfa0a7 + updating to branch default + 1 files updated, 0 files merged, 0 files removed, 0 files unresolved + $ cd repo.pullbundle + $ hg pull -r 1 + pulling from http://localhost:$HGPORT2/ + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files + new changesets ed1b79f46b9a + (run 'hg update' to get a working copy) + $ hg pull -r 2 + pulling from http://localhost:$HGPORT2/ + searching for changes + adding changesets + adding manifests + adding file changes + added 1 changesets with 1 changes to 1 files (+1 heads) + new changesets effea6de0384 + (run 'hg heads' to see heads, 'hg merge' to merge) + $ cd .. + $ killdaemons.py + $ grep 'sending pullbundle ' repo-server.txt + sending pullbundle "0.hg" + sending pullbundle "1.hg" + sending pullbundle "2.hg" diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py --- a/mercurial/wireproto.py +++ b/mercurial/wireproto.py @@ -831,6 +831,55 @@ opts = options('debugwireargs', ['three', 'four'], others) return repo.debugwireargs(one, two, **pycompat.strkwargs(opts)) +def find_pullbundle(repo, opts, clheads, heads, common): +def decodehexstring(s): +return set([h.decode('hex') for h in s.split(':')]) + +manifest = repo.vfs.tryread('pullbundles.manifest') +res = exchange.parseclonebundlesmanifest(repo, manifest) +res = exchange.filterclonebundleentries(repo, res) +cl = repo.changelog +if res: +heads_anc = cl.ancestors([cl.rev(rev) for rev in heads], + inclusive=True) +common_anc = cl.ancestors([cl.rev(rev) for rev in common], + inclusive=True) +for entry in res: +if 'heads' in entry: +try: +bundle_heads = decodehexstring(entry['heads']) +except TypeError: +# Bad heads entry +continue +if len(bundle_heads) > len(heads): +# Client wants less heads than the bundle 
contains +continue +if bundle_heads.issubset(common): +continue # Nothing new +if all(cl.rev(rev) in common_anc for rev in bundle_heads): +continue # Still nothing new +if any(cl.rev(rev) not in heads_anc for rev in bundle_heads): +continue +if 'bases' in entry: +try: +bundle_bases = decodehexstring(entry['bases']) +except TypeError: +# Bad bases entry +continue +if len(bundle_bases) > len(common): +# Client is missing a revision the bundle requires +continue +if not all(cl.rev(rev) in common_anc for rev in bundle_bases): +continue +path = entry['URL'] +repo.ui.debug('sending pullbundle "%s"\n' % path) +try: +return repo.vfs.open(path) +except IOError: +repo.ui.debug('pullbundle "%s" not accessible\n' % path) +continue +return None + @wireprotocommand('getbundle', '*') def getbundle(repo, proto, others): opts = options('getbundle', gboptsmap.keys(), others) @@ -861,12 +910,21 @@ hint=bundle2requiredhint) try: +clheads = set(repo.changelog.heads()) +heads = set(opts.get('heads', set())) +
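The selection rule in `find_pullbundle` can be paraphrased with plain sets. This is a simplified sketch, not the patch's code: nodes are short strings instead of binary hashes, the ancestor sets are passed in ready-made, the cheap length checks are omitted, and `bundle_applies` and `pick_bundle` are names invented here.

```python
def bundle_applies(bundle_heads, bundle_bases, heads_anc, common_anc):
    """Decide whether a pre-built bundle can answer a pull request.

    bundle_heads: heads contained in the bundle
    bundle_bases: revisions the client must already have
    heads_anc:    requested heads plus all their ancestors
    common_anc:   revisions the client already has, plus ancestors
    """
    if bundle_heads <= common_anc:
        return False  # nothing new for this client
    if not bundle_heads <= heads_anc:
        return False  # bundle would send more than was asked for
    if not bundle_bases <= common_anc:
        return False  # client lacks a revision the bundle builds on
    return True


def pick_bundle(manifest, heads_anc, common_anc):
    """Return the name of the first applicable manifest entry, or None."""
    for name, bundle_heads, bundle_bases in manifest:
        if bundle_applies(bundle_heads, bundle_bases, heads_anc, common_anc):
            return name
    return None


# Toy linear history n0 <- n1 <- n2, listed newest first as in the test.
manifest = [
    ('2.hg', {'n2'}, {'n1'}),
    ('1.hg', {'n1'}, {'n0'}),
    ('0.hg', {'n0'}, set()),
]
# The client already has n0 and asks for n1.
print(pick_bundle(manifest, heads_anc={'n0', 'n1'}, common_anc={'n0'}))
```

Because only the first match is returned, a client that is far behind would need several pull rounds to collect every bundle.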
[PATCH 8 of 8] _addrevision: group revision info into a dedicated structure
# HG changeset patch # User Paul Morelle# Date 1515919891 -3600 # Sun Jan 14 09:51:31 2018 +0100 # Node ID 6e287bddaacd03378c8fcde174dd1668211673e1 # Parent 9f916b7bc16409831776b50d6f400a41fdfbbcb7 # EXP-Topic refactor-revlog # Available At https://bitbucket.org/octobus/mercurial-devel/ # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 6e287bddaacd _addrevision: group revision info into a dedicated structure diff -r 9f916b7bc164 -r 6e287bddaacd mercurial/revlog.py --- a/mercurial/revlog.py Sun Jan 14 09:30:47 2018 +0100 +++ b/mercurial/revlog.py Sun Jan 14 09:51:31 2018 +0100 @@ -264,6 +264,24 @@ chainlen = attr.ib() compresseddeltalen = attr.ib() +@attr.s(slots=True, frozen=True) +class _revisioninfo(object): +"""Information about a revision that allows building its fulltext +node: expected hash of the revision +p1, p2: parent revs of the revision +btext: built text cache consisting of a one-element list +cachedelta: (baserev, uncompressed_delta) or None +flags: flags associated to the revision storage + +One of btext[0] or cachedelta must be set. +""" +node = attr.ib() +p1 = attr.ib() +p2 = attr.ib() +btext = attr.ib() +cachedelta = attr.ib() +flags = attr.ib() + # index v0: # 4 bytes: offset # 4 bytes: compressed length @@ -1894,21 +1912,21 @@ # fulltext. yield (prev,) -def _buildtext(self, node, p1, p2, btext, cachedelta, fh, flags): +def _buildtext(self, revinfo, fh): """Builds a fulltext version of a revision -node: expected hash of the revision -p1, p2: parent revs of the revision -btext: built text cache consisting of a one-element list -cachedelta: (baserev, uncompressed_delta) or None -fh: file handle to either the .i or the .d revlog file, -depending on whether it is inlined or not -flags: flags associated to the revision storage - -One of btext[0] or cachedelta must be set. 
+revinfo: _revisioninfo instance that contains all needed info +fh: file handle to either the .i or the .d revlog file, + depending on whether it is inlined or not """ +btext = revinfo.btext if btext[0] is not None: return btext[0] + +cachedelta = revinfo.cachedelta +flags = revinfo.flags +node = revinfo.node + baserev = cachedelta[0] delta = cachedelta[1] # special case deltas which replace entire base; no need to decode @@ -1926,7 +1944,7 @@ res = self._processflags(btext[0], flags, 'read', raw=True) btext[0], validatehash = res if validatehash: -self.checkhash(btext[0], node, p1=p1, p2=p2) +self.checkhash(btext[0], node, p1=revinfo.p1, p2=revinfo.p2) if flags & REVIDX_ISCENSORED: raise RevlogError(_('node %s is not censored') % node) except CensoredNodeError: @@ -1935,8 +1953,8 @@ raise return btext[0] -def _builddeltadiff(self, base, node, p1, p2, btext, cachedelta, fh, flags): -t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags) +def _builddeltadiff(self, base, revinfo, fh): +t = self._buildtext(revinfo, fh) if self.iscensored(base): # deltas based on a censored revision must replace the # full content in one patch, so delta works everywhere @@ -1948,13 +1966,12 @@ return delta -def _builddeltainfo(self, node, base, p1, p2, btext, cachedelta, fh, flags): +def _builddeltainfo(self, revinfo, base, fh): # can we use the cached delta? 
-if cachedelta and cachedelta[0] == base: -delta = cachedelta[1] +if revinfo.cachedelta and revinfo.cachedelta[0] == base: +delta = revinfo.cachedelta[1] else: -delta = self._builddeltadiff(base, node, p1, p2, btext, cachedelta, - fh, flags) +delta = self._builddeltadiff(base, revinfo, fh) header, data = self.compress(delta) deltalen = len(header) + len(data) chainbase = self.chainbase(base) @@ -2010,12 +2027,11 @@ else: textlen = len(rawtext) +revinfo = _revisioninfo(node, p1, p2, btext, cachedelta, flags) for candidaterevs in self._getcandidaterevs(p1, p2, cachedelta): nominateddeltas = [] for candidaterev in candidaterevs: -candidatedelta = self._builddeltainfo(node, candidaterev, p1, - p2, btext, cachedelta, - fh, flags) +candidatedelta = self._builddeltainfo(revinfo,
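The refactoring replaces a long positional argument list with one immutable record. A rough sketch of that idea follows, using the standard library's `dataclasses` as a stand-in for the `attr` package used in the patch (so it needs Python 3.7 or later); `RevisionInfo` and `build_text` are illustrative names, and the "apply delta" step is faked.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RevisionInfo:
    """Fields mirror the patch's _revisioninfo docstring."""
    node: bytes
    p1: bytes
    p2: bytes
    btext: List[Optional[bytes]]             # one-element fulltext cache
    cachedelta: Optional[Tuple[int, bytes]]  # (baserev, delta) or None
    flags: int


def build_text(revinfo):
    """Toy counterpart of _buildtext: return the cached fulltext, or
    derive it from cachedelta and remember it in the cache."""
    if revinfo.btext[0] is not None:
        return revinfo.btext[0]
    _baserev, delta = revinfo.cachedelta
    # Real code would patch `delta` onto the base text; this sketch just
    # stores the delta. Mutating the list is allowed even though the
    # record is frozen, because the attribute itself is not rebound.
    revinfo.btext[0] = delta
    return revinfo.btext[0]


info = RevisionInfo(b'node', b'p1', b'p2', [None], (0, b'delta'), 0)
print(build_text(info))
```

Helpers such as `_builddeltadiff` and `_builddeltainfo` then only need the record plus whatever varies per call (`base`, `fh`), which is what shortens the signatures in the diff.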
[PATCH 1 of 8] _addrevision: refactor out the selection of candidate revisions
# HG changeset patch # User Paul Morelle# Date 1515668342 -3600 # Thu Jan 11 11:59:02 2018 +0100 # Node ID 7526dfca3d32e7c51864c21de2c2f4735c4cade6 # Parent 4b68ca118d8d316cff1fbfe260e8fdb0dae3e26a # EXP-Topic refactor-revlog # Available At https://bitbucket.org/octobus/mercurial-devel/ # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 7526dfca3d32 _addrevision: refactor out the selection of candidate revisions The new function will be useful to retrieve all the revisions which will be needed to determine the best delta, and parallelize the computation of the necessary diffs. diff -r 4b68ca118d8d -r 7526dfca3d32 mercurial/revlog.py --- a/mercurial/revlog.py Thu Jan 11 11:57:59 2018 + +++ b/mercurial/revlog.py Thu Jan 11 11:59:02 2018 +0100 @@ -1844,6 +1844,44 @@ return True +def _getcandidaterevs(self, p1, p2, cachedelta): +""" +Provides revisions that present an interest to be diffed against, +grouped by level of easiness. +""" +curr = len(self) +prev = curr - 1 +p1r, p2r = self.rev(p1), self.rev(p2) + +# should we try to build a delta? +if prev != nullrev and self.storedeltachains: +tested = set() +# This condition is true most of the time when processing +# changegroup data into a generaldelta repo. The only time it +# isn't true is if this is the first revision in a delta chain +# or if ``format.generaldelta=true`` disabled ``lazydeltabase``. +if cachedelta and self._generaldelta and self._lazydeltabase: +# Assume what we received from the server is a good choice +# build delta will reuse the cache +yield (cachedelta[0],) +tested.add(cachedelta[0]) + +if self._generaldelta: +# exclude already lazy tested base if any +parents = [p for p in (p1r, p2r) + if p != nullrev and p not in tested] +if parents and not self._aggressivemergedeltas: +# Pick whichever parent is closer to us (to minimize the +# chance of having to build a fulltext). 
+parents = [max(parents)] +tested.update(parents) +yield parents + +if prev not in tested: +# other approach failed try against prev to hopefully save us a +# fulltext. +yield (prev,) + def _addrevision(self, node, rawtext, transaction, link, p1, p2, flags, cachedelta, ifh, dfh, alwayscache=False): """internal function to add revisions to the log @@ -1943,42 +1981,16 @@ else: textlen = len(rawtext) -# should we try to build a delta? -if prev != nullrev and self.storedeltachains: -tested = set() -# This condition is true most of the time when processing -# changegroup data into a generaldelta repo. The only time it -# isn't true is if this is the first revision in a delta chain -# or if ``format.generaldelta=true`` disabled ``lazydeltabase``. -if cachedelta and self._generaldelta and self._lazydeltabase: -# Assume what we received from the server is a good choice -# build delta will reuse the cache -candidatedelta = builddelta(cachedelta[0]) -tested.add(cachedelta[0]) +for candidaterevs in self._getcandidaterevs(p1, p2, cachedelta): +nominateddeltas = [] +for candidaterev in candidaterevs: +candidatedelta = builddelta(candidaterev) if self._isgooddelta(candidatedelta, textlen): -delta = candidatedelta -if delta is None and self._generaldelta: -# exclude already lazy tested base if any -parents = [p for p in (p1r, p2r) - if p != nullrev and p not in tested] -if parents and not self._aggressivemergedeltas: -# Pick whichever parent is closer to us (to minimize the -# chance of having to build a fulltext). -parents = [max(parents)] -tested.update(parents) -pdeltas = [] -for p in parents: -pd = builddelta(p) -if self._isgooddelta(pd, textlen): -pdeltas.append(pd) -if pdeltas: -delta = min(pdeltas, key=lambda x: x[1]) -if delta is None and prev not in tested: -# other approach failed try against prev to hopefully save us a -# fulltext. -candidatedelta = builddelta(prev) -if
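The shape of `_getcandidaterevs` and its caller can be shown with a small generator. This is a simplified sketch: the revlog configuration flags are left out, empty groups are skipped rather than yielded, and `candidate_groups` and `choose_base` are names made up for the example.

```python
def candidate_groups(cached_base, parents, prev):
    """Yield groups of candidate base revisions, most promising first."""
    tested = set()
    if cached_base is not None:
        yield (cached_base,)
        tested.add(cached_base)
    remaining = [p for p in parents if p not in tested]
    if remaining:
        tested.update(remaining)
        yield tuple(remaining)
    if prev not in tested:
        # last resort before falling back to storing a fulltext
        yield (prev,)


def choose_base(groups, delta_size, is_good):
    """Return the smallest good candidate from the first group that has
    one; later groups are never computed if an earlier one succeeds."""
    for group in groups:
        good = [rev for rev in group if is_good(rev)]
        if good:
            return min(good, key=delta_size)
    return None


sizes = {5: 100, 7: 10, 9: 1}
best = choose_base(candidate_groups(5, [5, 7], 9),
                   delta_size=sizes.get,
                   is_good=lambda rev: rev != 5)
print(best)
```

Separating "which revisions to try" from "how to evaluate them" is what would allow the diffs within one group to be computed in parallel, as the commit message suggests.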
[PATCH 6 of 8] _builddeltainfo: separate diff computation from the collection of other info
# HG changeset patch # User Paul Morelle# Date 1515844518 -3600 # Sat Jan 13 12:55:18 2018 +0100 # Node ID d321149c4918b0c008fc38f318c4759c7c29ba80 # Parent 6e83370fc8befdebc523b92f6f4ff6ce009c97ad # EXP-Topic refactor-revlog # Available At https://bitbucket.org/octobus/mercurial-devel/ # hg pull https://bitbucket.org/octobus/mercurial-devel/ -r d321149c4918 _builddeltainfo: separate diff computation from the collection of other info diff -r 6e83370fc8be -r d321149c4918 mercurial/revlog.py --- a/mercurial/revlog.py Fri Jan 12 18:58:44 2018 +0100 +++ b/mercurial/revlog.py Sat Jan 13 12:55:18 2018 +0100 @@ -1935,20 +1935,26 @@ raise return btext[0] +def _builddeltadiff(self, base, node, p1, p2, btext, cachedelta, fh, flags): +t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags) +if self.iscensored(base): +# deltas based on a censored revision must replace the +# full content in one patch, so delta works everywhere +header = mdiff.replacediffheader(self.rawsize(base), len(t)) +delta = header + t +else: +ptext = self.revision(base, _df=fh, raw=True) +delta = mdiff.textdiff(ptext, t) + +return delta + def _builddeltainfo(self, node, rev, p1, p2, btext, cachedelta, fh, flags): # can we use the cached delta? if cachedelta and cachedelta[0] == rev: delta = cachedelta[1] else: -t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags) -if self.iscensored(rev): -# deltas based on a censored revision must replace the -# full content in one patch, so delta works everywhere -header = mdiff.replacediffheader(self.rawsize(rev), len(t)) -delta = header + t -else: -ptext = self.revision(rev, _df=fh, raw=True) -delta = mdiff.textdiff(ptext, t) +delta = self._builddeltadiff(rev, node, p1, p2, btext, cachedelta, + fh, flags) header, data = self.compress(delta) deltalen = len(header) + len(data) chainbase = self.chainbase(rev) ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
[PATCH 4 of 8] revlog: extract 'builddelta' closure function from _addrevision
# HG changeset patch
# User Paul Morelle
# Date 1515777003 -3600
#      Fri Jan 12 18:10:03 2018 +0100
# Node ID c9069bebf72b906229e740bf8fe4beee37570dc9
# Parent  2f39856d4feee57695b05c9298a3bf1789edf173
# EXP-Topic refactor-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r c9069bebf72b
revlog: extract 'builddelta' closure function from _addrevision

diff -r 2f39856d4fee -r c9069bebf72b mercurial/revlog.py
--- a/mercurial/revlog.py	Fri Jan 12 15:55:25 2018 +0100
+++ b/mercurial/revlog.py	Fri Jan 12 18:10:03 2018 +0100
@@ -1923,6 +1923,35 @@
                 raise
         return btext[0]
 
+    def _builddelta(self, node, rev, p1, p2, btext, cachedelta, fh, flags):
+        # can we use the cached delta?
+        if cachedelta and cachedelta[0] == rev:
+            delta = cachedelta[1]
+        else:
+            t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags)
+            if self.iscensored(rev):
+                # deltas based on a censored revision must replace the
+                # full content in one patch, so delta works everywhere
+                header = mdiff.replacediffheader(self.rawsize(rev), len(t))
+                delta = header + t
+            else:
+                ptext = self.revision(rev, _df=fh, raw=True)
+                delta = mdiff.textdiff(ptext, t)
+        header, data = self.compress(delta)
+        deltalen = len(header) + len(data)
+        chainbase = self.chainbase(rev)
+        offset = self.end(len(self) - 1)
+        dist = deltalen + offset - self.start(chainbase)
+        if self._generaldelta:
+            base = rev
+        else:
+            base = chainbase
+        chainlen, compresseddeltalen = self._chaininfo(rev)
+        chainlen += 1
+        compresseddeltalen += deltalen
+        return (dist, deltalen, (header, data), base,
+                chainbase, chainlen, compresseddeltalen)
+
     def _addrevision(self, node, rawtext, transaction, link, p1, p2, flags,
                      cachedelta, ifh, dfh, alwayscache=False):
         """internal function to add revisions to the log
@@ -1949,34 +1978,6 @@
 
         btext = [rawtext]
 
-        def builddelta(rev):
-            # can we use the cached delta?
-            if cachedelta and cachedelta[0] == rev:
-                delta = cachedelta[1]
-            else:
-                t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags)
-                if self.iscensored(rev):
-                    # deltas based on a censored revision must replace the
-                    # full content in one patch, so delta works everywhere
-                    header = mdiff.replacediffheader(self.rawsize(rev), len(t))
-                    delta = header + t
-                else:
-                    ptext = self.revision(rev, _df=fh, raw=True)
-                    delta = mdiff.textdiff(ptext, t)
-            header, data = self.compress(delta)
-            deltalen = len(header) + len(data)
-            chainbase = self.chainbase(rev)
-            dist = deltalen + offset - self.start(chainbase)
-            if self._generaldelta:
-                base = rev
-            else:
-                base = chainbase
-            chainlen, compresseddeltalen = self._chaininfo(rev)
-            chainlen += 1
-            compresseddeltalen += deltalen
-            return (dist, deltalen, (header, data), base,
-                    chainbase, chainlen, compresseddeltalen)
-
         curr = len(self)
         prev = curr - 1
         offset = self.end(prev)
@@ -1994,7 +1995,9 @@
             for candidaterevs in self._getcandidaterevs(p1, p2, cachedelta):
                 nominateddeltas = []
                 for candidaterev in candidaterevs:
-                    candidatedelta = builddelta(candidaterev)
+                    candidatedelta = self._builddelta(node, candidaterev, p1, p2,
+                                                      btext, cachedelta, fh,
+                                                      flags)
                     if self._isgooddelta(candidatedelta, textlen):
                         nominateddeltas.append(candidatedelta)
                 if nominateddeltas:
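For readers unfamiliar with this kind of refactoring, here is a minimal sketch of the pattern the patch applies. It is not code from revlog.py: the class `Store` and every name in it are hypothetical. The closure reads `node` and `cache` from the enclosing call, while the extracted method receives them as explicit parameters.

```python
class Store:
    """Toy stand-in for a revlog-like class; every name here is hypothetical."""

    def __init__(self, texts):
        self.texts = list(texts)

    # Before: the helper is a closure and silently captures `node` and
    # `cache` from the enclosing call.
    def add_with_closure(self, node, cache):
        def builddelta(rev):
            if cache and cache[0] == rev:
                return cache[1]
            return "%s-vs-%d" % (node, rev)
        return [builddelta(rev) for rev in range(len(self.texts))]

    # After: the helper is a method, so its inputs are spelled out and it
    # can be called on its own, without going through add_with_method().
    def _builddelta(self, node, rev, cache):
        if cache and cache[0] == rev:
            return cache[1]
        return "%s-vs-%d" % (node, rev)

    def add_with_method(self, node, cache):
        return [self._builddelta(node, rev, cache)
                for rev in range(len(self.texts))]
```

The cost is a longer argument list at each call site, which is visible in the last hunk of the patch; the benefit is that the helper no longer depends on the local state of the enclosing function.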
[PATCH 2 of 8] _addrevision: choose between ifh and dfh once for all
# HG changeset patch
# User Paul Morelle
# Date 1515771775 -3600
#      Fri Jan 12 16:42:55 2018 +0100
# Node ID 84eb864137a7b27e2357eb4f6d465f726670dc98
# Parent  7526dfca3d32e7c51864c21de2c2f4735c4cade6
# EXP-Topic refactor-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 84eb864137a7
_addrevision: choose between ifh and dfh once for all

diff -r 7526dfca3d32 -r 84eb864137a7 mercurial/revlog.py
--- a/mercurial/revlog.py	Thu Jan 11 11:59:02 2018 +0100
+++ b/mercurial/revlog.py	Fri Jan 12 16:42:55 2018 +0100
@@ -1901,6 +1901,11 @@
             raise RevlogError(_("%s: attempt to add wdir revision")
                               % (self.indexfile))
 
+        if self._inline:
+            fh = ifh
+        else:
+            fh = dfh
+
         btext = [rawtext]
         def buildtext():
             if btext[0] is not None:
@@ -1915,10 +1920,6 @@
                                                        len(delta) - hlen):
                 btext[0] = delta[hlen:]
             else:
-                if self._inline:
-                    fh = ifh
-                else:
-                    fh = dfh
                 basetext = self.revision(baserev, _df=fh, raw=True)
                 btext[0] = mdiff.patch(basetext, delta)
 
@@ -1947,10 +1948,6 @@
                     header = mdiff.replacediffheader(self.rawsize(rev), len(t))
                     delta = header + t
                 else:
-                    if self._inline:
-                        fh = ifh
-                    else:
-                        fh = dfh
                     ptext = self.revision(rev, _df=fh, raw=True)
                     delta = mdiff.textdiff(ptext, t)
             header, data = self.compress(delta)
[PATCH 3 of 8] revlog: extract 'buildtext' closure function from _addrevision
# HG changeset patch
# User Paul Morelle
# Date 1515768925 -3600
#      Fri Jan 12 15:55:25 2018 +0100
# Node ID 2f39856d4feee57695b05c9298a3bf1789edf173
# Parent  84eb864137a7b27e2357eb4f6d465f726670dc98
# EXP-Topic refactor-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 2f39856d4fee
revlog: extract 'buildtext' closure function from _addrevision

diff -r 84eb864137a7 -r 2f39856d4fee mercurial/revlog.py
--- a/mercurial/revlog.py	Fri Jan 12 16:42:55 2018 +0100
+++ b/mercurial/revlog.py	Fri Jan 12 15:55:25 2018 +0100
@@ -1882,6 +1882,47 @@
                 # fulltext.
                 yield (prev,)
 
+    def _buildtext(self, node, p1, p2, btext, cachedelta, fh, flags):
+        """Builds a fulltext version of a revision
+
+        node: expected hash of the revision
+        p1, p2: parent revs of the revision
+        btext: built text cache consisting of a one-element list
+        cachedelta: (baserev, uncompressed_delta) or None
+        fh: file handle to either the .i or the .d revlog file,
+            depending on whether it is inlined or not
+        flags: flags associated to the revision storage
+
+        One of btext[0] or cachedelta must be set.
+        """
+        if btext[0] is not None:
+            return btext[0]
+        baserev = cachedelta[0]
+        delta = cachedelta[1]
+        # special case deltas which replace entire base; no need to decode
+        # base revision. this neatly avoids censored bases, which throw when
+        # they're decoded.
+        hlen = struct.calcsize(">lll")
+        if delta[:hlen] == mdiff.replacediffheader(self.rawsize(baserev),
+                                                   len(delta) - hlen):
+            btext[0] = delta[hlen:]
+        else:
+            basetext = self.revision(baserev, _df=fh, raw=True)
+            btext[0] = mdiff.patch(basetext, delta)
+
+        try:
+            res = self._processflags(btext[0], flags, 'read', raw=True)
+            btext[0], validatehash = res
+            if validatehash:
+                self.checkhash(btext[0], node, p1=p1, p2=p2)
+            if flags & REVIDX_ISCENSORED:
+                raise RevlogError(_('node %s is not censored') % node)
+        except CensoredNodeError:
+            # must pass the censored index flag to add censored revisions
+            if not flags & REVIDX_ISCENSORED:
+                raise
+        return btext[0]
+
     def _addrevision(self, node, rawtext, transaction, link, p1, p2, flags,
                      cachedelta, ifh, dfh, alwayscache=False):
         """internal function to add revisions to the log
@@ -1907,41 +1948,13 @@
             fh = dfh
 
         btext = [rawtext]
-        def buildtext():
-            if btext[0] is not None:
-                return btext[0]
-            baserev = cachedelta[0]
-            delta = cachedelta[1]
-            # special case deltas which replace entire base; no need to decode
-            # base revision. this neatly avoids censored bases, which throw when
-            # they're decoded.
-            hlen = struct.calcsize(">lll")
-            if delta[:hlen] == mdiff.replacediffheader(self.rawsize(baserev),
-                                                       len(delta) - hlen):
-                btext[0] = delta[hlen:]
-            else:
-                basetext = self.revision(baserev, _df=fh, raw=True)
-                btext[0] = mdiff.patch(basetext, delta)
-
-            try:
-                res = self._processflags(btext[0], flags, 'read', raw=True)
-                btext[0], validatehash = res
-                if validatehash:
-                    self.checkhash(btext[0], node, p1=p1, p2=p2)
-                if flags & REVIDX_ISCENSORED:
-                    raise RevlogError(_('node %s is not censored') % node)
-            except CensoredNodeError:
-                # must pass the censored index flag to add censored revisions
-                if not flags & REVIDX_ISCENSORED:
-                    raise
-            return btext[0]
 
         def builddelta(rev):
             # can we use the cached delta?
             if cachedelta and cachedelta[0] == rev:
                 delta = cachedelta[1]
             else:
-                t = buildtext()
+                t = self._buildtext(node, p1, p2, btext, cachedelta, fh, flags)
                 if self.iscensored(rev):
                     # deltas based on a censored revision must replace the
                     # full content in one patch, so delta works everywhere
@@ -1991,7 +2004,8 @@
         if delta is not None:
             dist, l, data, base, chainbase, chainlen, compresseddeltalen = delta
         else:
-            rawtext = buildtext()
+            rawtext = self._buildtext(node, p1, p2, btext, cachedelta, fh,
+                                      flags)
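The "special case deltas which replace entire base" comment in _buildtext can be illustrated on its own. The sketch below is a standalone approximation, not the revlog code: `replace_header` is assumed to mirror what mdiff.replacediffheader produces (a single ">lll" hunk header covering the whole base), and `full_replacement_text` is a hypothetical helper name.

```python
import struct

HEADER_FORMAT = ">lll"  # three big-endian 32-bit ints, as in the patch


def replace_header(oldlen, newlen):
    # Assumption: mirrors mdiff.replacediffheader, i.e. one hunk that
    # replaces bytes [0, oldlen) of the base with newlen bytes of new data.
    return struct.pack(HEADER_FORMAT, 0, oldlen, newlen)


def full_replacement_text(delta, base_rawsize):
    """Return the new text if delta replaces the whole base, else None."""
    hlen = struct.calcsize(HEADER_FORMAT)
    if delta[:hlen] == replace_header(base_rawsize, len(delta) - hlen):
        # The payload after the header already is the full text, so the
        # base revision never has to be read or decoded.
        return delta[hlen:]
    return None
```

Because the check only needs the raw size of the base, it works even when the base itself cannot be decoded, which is the situation the comment describes for censored bases.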
[PATCH 7 of 8] _builddeltainfo: rename 'rev' to 'base', as it is the base revision
# HG changeset patch
# User Paul Morelle
# Date 1515918647 -3600
#      Sun Jan 14 09:30:47 2018 +0100
# Node ID 9f916b7bc16409831776b50d6f400a41fdfbbcb7
# Parent  d321149c4918b0c008fc38f318c4759c7c29ba80
# EXP-Topic refactor-revlog
# Available At https://bitbucket.org/octobus/mercurial-devel/
#              hg pull https://bitbucket.org/octobus/mercurial-devel/ -r 9f916b7bc164
_builddeltainfo: rename 'rev' to 'base', as it is the base revision

diff -r d321149c4918 -r 9f916b7bc164 mercurial/revlog.py
--- a/mercurial/revlog.py	Sat Jan 13 12:55:18 2018 +0100
+++ b/mercurial/revlog.py	Sun Jan 14 09:30:47 2018 +0100
@@ -1948,26 +1948,26 @@
 
         return delta
 
-    def _builddeltainfo(self, node, rev, p1, p2, btext, cachedelta, fh, flags):
+    def _builddeltainfo(self, node, base, p1, p2, btext, cachedelta, fh, flags):
         # can we use the cached delta?
-        if cachedelta and cachedelta[0] == rev:
+        if cachedelta and cachedelta[0] == base:
             delta = cachedelta[1]
         else:
-            delta = self._builddeltadiff(rev, node, p1, p2, btext, cachedelta,
+            delta = self._builddeltadiff(base, node, p1, p2, btext, cachedelta,
                                          fh, flags)
         header, data = self.compress(delta)
         deltalen = len(header) + len(data)
-        chainbase = self.chainbase(rev)
+        chainbase = self.chainbase(base)
         offset = self.end(len(self) - 1)
         dist = deltalen + offset - self.start(chainbase)
         if self._generaldelta:
-            base = rev
+            deltabase = base
         else:
-            base = chainbase
-        chainlen, compresseddeltalen = self._chaininfo(rev)
+            deltabase = chainbase
+        chainlen, compresseddeltalen = self._chaininfo(base)
         chainlen += 1
         compresseddeltalen += deltalen
-        return _deltainfo(dist, deltalen, (header, data), base,
+        return _deltainfo(dist, deltalen, (header, data), deltabase,
                           chainbase, chainlen, compresseddeltalen)
 
     def _addrevision(self, node, rawtext, transaction, link, p1, p2, flags,
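To make the quantities computed in _builddeltainfo easier to follow, here is a toy model of the bookkeeping. It is a simplified sketch under stated assumptions, not the revlog implementation: the `Entry` tuple, the list-based index and the function names are all hypothetical, and the way the chain's compressed size is counted here (deltas only, excluding the snapshot) is an assumption that may differ from what _chaininfo actually does.

```python
from collections import namedtuple

# Hypothetical index entry: where the revision's data starts in the file,
# its compressed length, and the revision its delta applies to
# (an entry whose deltabase is itself is a full snapshot).
Entry = namedtuple("Entry", "start length deltabase")


def chainbase(index, rev):
    # Walk delta parents until reaching a full snapshot.
    while index[rev].deltabase != rev:
        rev = index[rev].deltabase
    return rev


def chaininfo(index, rev):
    # Number of deltas in the chain ending at rev, and their compressed size.
    chainlen = 0
    compressed = 0
    while index[rev].deltabase != rev:
        chainlen += 1
        compressed += index[rev].length
        rev = index[rev].deltabase
    return chainlen, compressed


def deltametrics(index, base, deltalen):
    """Metrics for storing a new delta of `deltalen` bytes against `base`."""
    last = index[-1]
    offset = last.start + last.length  # end of the existing data
    cbase = chainbase(index, base)
    # Span of file data to read, from the chain's snapshot to the new delta.
    dist = deltalen + offset - index[cbase].start
    chainlen, compressed = chaininfo(index, base)
    return dist, chainlen + 1, compressed + deltalen
```

In this model `base` is the candidate revision the new delta would apply to, which is the meaning the rename in this patch makes explicit, while the chain base is the snapshot at the bottom of that revision's chain.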
Re: [PATCH 1 of 7] share: use context manager or utility function to write file
On Sat, 13 Jan 2018 23:24:11 -0800, Gregory Szorc wrote:
> On Fri, Jan 12, 2018 at 9:02 PM, Yuya Nishihara wrote:
>
> > # HG changeset patch
> > # User Yuya Nishihara
> > # Date 1515817396 -32400
> > #      Sat Jan 13 13:23:16 2018 +0900
> > # Node ID 2eeaf96c20fce19c8edccf4936aceee4ce651de9
> > # Parent  991f0be9dc39c402d63a4a8f19cde052095c4689
> > share: use context manager or utility function to write file
> >
>
> Queued this series.
>
> I'm personally not a huge fan of the helper APIs in the vfs/io layer:

Neither am I, at least for write/append operations. But I don't have a strong
preference.