[PATCH 1 of 2] tests: add commits to test-casecollision.t

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471495316 25200
#  Wed Aug 17 21:41:56 2016 -0700
# Node ID 04ba5f1f6c28674ed2df750a8acca02021b2698c
# Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
tests: add commits to test-casecollision.t

This demonstrates that we don't alert when committing a case collision.

diff --git a/tests/test-casecollision.t b/tests/test-casecollision.t
--- a/tests/test-casecollision.t
+++ b/tests/test-casecollision.t
@@ -24,46 +24,65 @@ test file addition with colliding case
   $ hg forget A
   $ hg st
   A a
   ? A
   $ hg add --config ui.portablefilenames=no A
   $ hg st
   A A
   A a
+  $ hg commit -m 'add A a'
+
   $ mkdir b
   $ touch b/c b/D
   $ hg add b
   adding b/D
   adding b/c
+  $ hg commit -m 'add b/c b/D'
+
   $ touch b/d b/C
   $ hg add b/C
   warning: possible case-folding collision for b/C
   $ hg add b/d
   warning: possible case-folding collision for b/d
+  $ hg commit -m 'add b/C b/d'
+
   $ touch b/a1 b/a2
   $ hg add b
   adding b/a1
   adding b/a2
+  $ hg commit -m 'add b/a1 b/a2'
+
   $ touch b/A2 b/a1.1
   $ hg add b/a1.1 b/A2
   warning: possible case-folding collision for b/A2
+  $ hg commit -m 'add b/A2 b/a1.1'
+
   $ touch b/f b/F
   $ hg add b/f b/F
   warning: possible case-folding collision for b/f
+  $ hg commit -m 'add b/f b/F'
+
   $ touch g G
   $ hg add g G
   warning: possible case-folding collision for g
+  $ hg commit -m 'add g G'
+
   $ mkdir h H
   $ touch h/x H/x
   $ hg add h/x H/x
   warning: possible case-folding collision for h/x
+  $ hg commit -m 'add h/x H/x'
+
   $ touch h/s H/s
   $ hg add h/s
   $ hg add H/s
   warning: possible case-folding collision for H/s
+  $ hg commit -m 'add h/s H/s'
 
 case changing rename must not warn or abort
 
   $ echo c > c
   $ hg ci -qAmx
   $ hg mv c C
+  $ hg commit -m 'mv c C'
+
   $ cd ..
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 2 of 2] localrepo: check for case collisions at commit time (issue4665) (BC)

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471497059 25200
#  Wed Aug 17 22:10:59 2016 -0700
# Node ID e8d4db665bf172344d2354c17a0eb8f69ef265c7
# Parent  04ba5f1f6c28674ed2df750a8acca02021b2698c
localrepo: check for case collisions at commit time (issue4665) (BC)

Before, case collisions were only reported when performing `hg add`.
This meant that many commands that could add or rename files (including
graft, rebase, histedit, backout, etc) weren't audited for case
collisions.

This patch adds case collision detection to the lower-level
repo.commit() so that any time a commit is performed from the working
directory state we look for case collisions.

As the test changes show, this results in a warning or abort at
`hg commit` time.

It would arguably be even better to perform case collision detection
in localrepository.commitctx(), which is the lowest level function
for adding a new changeset. But I'm not sure if that's appropriate.

The added code in scmutil.py could likely be consolidated with existing
code. It's short enough that I'm fine with the DRY violation. But if
someone wants to push back, I'll understand.
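
For readers skimming the thread, the check described above can be sketched outside of Mercurial roughly as follows. This is a minimal illustration, not the patch itself: the class name ``CaseCollisionSketch`` is hypothetical, ``str.lower()`` stands in for ``encoding.lower()``, ``ValueError`` stands in for ``error.Abort``, and warnings are collected in a list instead of being written through ``ui.warn()``.

```python
class CaseCollisionSketch(object):
    """Illustrative only: flag paths whose lowercased form was already seen."""

    def __init__(self, existing_paths, abort=False):
        self._abort = abort
        # Assumption: str.lower() stands in for Mercurial's encoding.lower().
        self._lowered = set(p.lower() for p in existing_paths)
        self.warnings = []

    def __call__(self, path):
        lowered = path.lower()
        if lowered in self._lowered:
            msg = 'possible case-folding collision for %s' % path
            if self._abort:
                # Assumption: ValueError stands in for error.Abort.
                raise ValueError(msg)
            self.warnings.append('warning: %s' % msg)
        # Remember the path so later additions are checked against it too.
        self._lowered.add(lowered)


# Paths already in the parent manifest, then the files being added.
audit = CaseCollisionSketch(['b/c', 'b/D'])
for added in ['b/C', 'b/d', 'b/e']:
    audit(added)
```

Seeding the set from the parent manifest, rather than from the dirstate, is what lets the same check run for any commit made from the working directory.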

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -1581,16 +1581,24 @@ class localrepository(object):
 '.hgsubstate' not in (status.modified + status.added +
   status.removed)):
 status.removed.insert(0, '.hgsubstate')
 
 # make sure all explicit patterns are matched
 if not force:
 self.checkcommitpatterns(wctx, vdirs, match, status, fail)
 
+# Audit added files for case collisions.
+caseabort, casewarn = scmutil.checkportabilityalert(self.ui)
+if caseabort or casewarn:
+cca = scmutil.ctxcasecollisionauditor(self.ui, wctx.p1(),
+  caseabort)
+for f in status.added:
+cca(f)
+
 cctx = context.workingcommitctx(self, status,
 text, user, date, extra)
 
 # internal config: ui.allowemptycommit
 allowemptycommit = (wctx.branch() != wctx.p1().branch()
 or extra.get('close') or merge or cctx.files()
 or self.ui.configbool('ui', 'allowemptycommit'))
 if not allowemptycommit:
diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
--- a/mercurial/scmutil.py
+++ b/mercurial/scmutil.py
@@ -202,16 +202,33 @@ class casecollisionauditor(object):
         if fl in self._loweredfiles and f not in self._dirstate:
             msg = _('possible case-folding collision for %s') % f
             if self._abort:
                 raise error.Abort(msg)
             self._ui.warn(_("warning: %s\n") % msg)
         self._loweredfiles.add(fl)
         self._newfiles.add(f)
 
+class ctxcasecollisionauditor(object):
+    """A case collision auditor that works on a changectx."""
+    def __init__(self, ui, ctx, abort):
+        self._ui = ui
+        self._abort = abort
+        self._loweredfiles = set(encoding.lower(p) for p in ctx.manifest())
+
+    def __call__(self, path):
+        l = encoding.lower(path)
+        if l in self._loweredfiles:
+            msg = _('possible case-folding collision for %s') % path
+            if self._abort:
+                raise error.Abort(msg)
+            self._ui.warn(_('warning: %s\n') % msg)
+
+        self._loweredfiles.add(l)
+
 def filteredhash(repo, maxrev):
     """build hash of filtered revisions in the current repoview.
 
     Multiple caches perform up-to-date validation by checking that the
     tiprev and tipnode stored in the cache file match the current repository.
     However, this is not sufficient for validating repoviews because the set
     of revisions in the view may change without the repository tiprev and
     tipnode changing.
diff --git a/tests/test-casecollision.t b/tests/test-casecollision.t
--- a/tests/test-casecollision.t
+++ b/tests/test-casecollision.t
@@ -24,65 +24,88 @@ test file addition with colliding case
   $ hg forget A
   $ hg st
   A a
   ? A
   $ hg add --config ui.portablefilenames=no A
   $ hg st
   A A
   A a
+
+Abort at commit time when case-folding collisions aren't allowed
+
+  $ hg --config ui.portablefilenames=abort commit -m 'add A a'
+  abort: possible case-folding collision for a
+  [255]
+
   $ hg commit -m 'add A a'
+  warning: possible case-folding collision for a
 
   $ mkdir b
   $ touch b/c b/D
   $ hg add b
   adding b/D
   adding b/c
   $ hg commit -m 'add b/c b/D'
 
   $ touch b/d b/C
   $ hg add b/C
   warning: possible case-folding collision for b/C
   $ hg add b/d
   warning: possible case-folding collision for b/d
-  $ hg commit -m 'add b/C b/d'
+  $ hg --config ui.portablefilenames=no commit -m 

[PATCH V2] help: internals topic for wire protocol

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471489767 25200
#  Wed Aug 17 20:09:27 2016 -0700
# Node ID 4ff105e8cfc56a643e55b079c0d01d7e86d43879
# Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
help: internals topic for wire protocol

The Mercurial wire protocol is under-documented. This includes a lack
of source docstrings and comments as well as pages on the official
wiki.

This patch adds the beginnings of "internals" documentation on the
wire protocol.

The documentation, while fairly comprehensive, is far from exhaustive.
Known missing pieces include a detailed overview of bundle2 (including
the "push back" protocol) as well as higher-level details, such as
how mechanisms like discovery work. But you have to start somewhere.

The documentation should have nearly complete coverage on the
lower-level parts of the protocol, such as the different transport
mechanisms, how commands and arguments are sent, capabilities, and,
of course, the commands themselves.

As part of writing this documentation, I discovered a number of
deficiencies in the protocol and bugs in the implementation. I've
started sending patches for some of the issues. I hope to send a lot
more.

I'm sure this documentation could be bikeshedded for months. Unless
there are factual errors, I'd prefer to have it land and iterate on
improving it rather than wait for perfection.

diff --git a/contrib/wix/help.wxs b/contrib/wix/help.wxs
--- a/contrib/wix/help.wxs
+++ b/contrib/wix/help.wxs
@@ -37,16 +37,17 @@
 
 
 
   
 
 
 
 
+
   
 
 
   
 
   
 
 
diff --git a/mercurial/help.py b/mercurial/help.py
--- a/mercurial/help.py
+++ b/mercurial/help.py
@@ -187,16 +187,18 @@ internalstable = sorted([
 (['bundles'], _('Bundles'),
  loaddoc('bundles', subdir='internals')),
 (['changegroups'], _('Changegroups'),
  loaddoc('changegroups', subdir='internals')),
 (['requirements'], _('Repository Requirements'),
  loaddoc('requirements', subdir='internals')),
 (['revlogs'], _('Revision Logs'),
  loaddoc('revlogs', subdir='internals')),
+(['wireprotocol'], _('Wire Protocol'),
+ loaddoc('wireprotocol', subdir='internals')),
 ])
 
 def internalshelp(ui):
 """Generate the index for the "internals" topic."""
 lines = []
 for names, header, doc in internalstable:
 lines.append(' :%s: %s\n' % (names[0], header))
 
diff --git a/mercurial/help/internals/wireprotocol.txt b/mercurial/help/internals/wireprotocol.txt
new file mode 100644
--- /dev/null
+++ b/mercurial/help/internals/wireprotocol.txt
@@ -0,0 +1,773 @@
+The Mercurial wire protocol is a request-response based protocol
+with multiple wire representations.
+
+Each request is modeled as a command name, a dictionary of arguments, and
+optional raw input. Command arguments and their types are intrinsic
+properties of commands. So is the response type of the command. This means
+clients can't always send arbitrary arguments to servers and servers can't
+return multiple response types.
+
+The protocol is synchronous and does not support multiplexing (concurrent
+commands).
+
+Transport Protocols
+===
+
+HTTP Transport
+--
+
+Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
+sent to the base URL of the repository with the command name sent in
+the ``cmd`` query string parameter. e.g.
+``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
+or ``POST`` depending on the command and whether there is a request
+body.
+
+Command arguments can be sent multiple ways.
+
+The simplest is part of the URL query string using ``x-www-form-urlencoded``
+encoding (see Python's ``urllib.urlencode()``). However, many servers impose
+length limitations on the URL. So this mechanism is typically only used if
+the server doesn't support other mechanisms.
+
+If the server supports the ``httpheader`` capability, command arguments can
+be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
+integer starting at 1. An ``x-www-form-urlencoded`` representation of the
+arguments is obtained. This full string is then split into chunks and sent
+in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
+is defined by the server in the ``httpheader`` capability value, which defaults
+to ``1024``. The server reassembles the encoded arguments string by
+concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
+dictionary.
+
+The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
+header to instruct caches to take these headers into consideration when caching
+requests.
+
+If the server supports the ``httppostargs`` capability, the client
+may send command arguments in the HTTP request body as part of an
+HTTP POST request. The command arguments will be URL encoded just like
+they would for sending them via HTTP headers. 
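
The header-based argument mechanism described in the documentation above (URL-encode, split into chunks, send as numbered ``X-HgArg-<N>`` headers, then concatenate and decode on the server) can be sketched as below. This is an illustration under stated assumptions rather than Mercurial's implementation: the function names are hypothetical, Python 3's ``urllib.parse`` is used in place of ``urllib.urlencode()``, and the length limit is applied to the header value only, since the excerpt does not say whether the header name counts toward it.

```python
from urllib.parse import parse_qsl, urlencode


def encode_arg_headers(args, chunk_size=1024):
    """Split URL-encoded arguments across numbered X-HgArg-<N> headers."""
    encoded = urlencode(sorted(args.items()))
    chunks = [encoded[i:i + chunk_size]
              for i in range(0, len(encoded), chunk_size)]
    # Header numbering starts at 1, as described in the text.
    return [('X-HgArg-%d' % n, chunk) for n, chunk in enumerate(chunks, 1)]


def decode_arg_headers(headers):
    """Reassemble the encoded string in header order, then URL-decode it."""
    def index(header):
        return int(header[0].rsplit('-', 1)[1])

    encoded = ''.join(value for _name, value in sorted(headers, key=index))
    return dict(parse_qsl(encoded, keep_blank_values=True))


# Hypothetical arguments; a small chunk size forces several headers.
args = {'namespace': 'bookmarks', 'key': 'a b&c', 'old': '0' * 40}
headers = encode_arg_headers(args, chunk_size=16)
roundtrip = decode_arg_headers(headers)
```

Because the split happens on the already-encoded string, a chunk boundary may fall inside a percent-escape; that is harmless as long as the server concatenates all chunks before decoding.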

[PATCH 3 of 5 V2] manifest: introduce manifestlog and manifestctx classes

2016-08-17 Thread Durham Goode
# HG changeset patch
# User Durham Goode 
# Date 1471465513 25200
#  Wed Aug 17 13:25:13 2016 -0700
# Node ID 00f8d3832aad368660c69eff4be3e03a5568aebe
# Parent  8ddbe86953023c40beb7be9dcb8025d1055813c5
manifest: introduce manifestlog and manifestctx classes

This is the start of a large refactoring of the manifest class. It introduces
the new manifestlog and manifestctx classes which will represent the collection
of all manifests and individual instances, respectively.

Future patches will begin to convert usages of repo.manifest to
repo.manifestlog, adding the necessary functionality to manifestlog and
manifestctx as it is needed.

diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
--- a/mercurial/localrepo.py
+++ b/mercurial/localrepo.py
@@ -504,6 +504,10 @@ class localrepository(object):
     def manifest(self):
         return manifest.manifest(self.svfs)
 
+    @storecache('00manifest.i')
+    def manifestlog(self):
+        return manifest.manifestlog(self.svfs, self.manifest)
+
     @repofilecache('dirstate')
     def dirstate(self):
         return dirstate.dirstate(self.vfs, self.ui, self.root,
diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -914,6 +914,70 @@ class manifestrevlog(revlog.revlog):
         super(manifestrevlog, self).clearcaches()
         self._fulltextcache.clear()
 
+class manifestlog(object):
+    """A collection class representing the collection of manifest snapshots
+    referenced by commits in the repository.
+
+    In this situation, 'manifest' refers to the abstract concept of a snapshot
+    of the list of files in the given commit. Consumers of the output of this
+    class do not care about the implementation details of the actual manifests
+    they receive (i.e. tree or flat or lazily loaded, etc)."""
+    def __init__(self, opener, oldmanifest):
+        self._revlog = oldmanifest
+
+        # We'll separate this into its own cache once oldmanifest is no longer
+        # used
+        self._mancache = oldmanifest._mancache
+
+        # _revlog is the same as _oldmanifest right now, but we eventually want
+        # to delete _oldmanifest while still allowing manifestlog to access the
+        # revlog specific apis.
+        self._oldmanifest = oldmanifest
+
+    def __getitem__(self, node):
+        """Retrieves the manifest instance for the given node. Throws a KeyError
+        if not found.
+        """
+        if (self._oldmanifest._treeondisk
+            or self._oldmanifest._treeinmem):
+            # TODO: come back and support tree manifests directly
+            return self._oldmanifest.read(node)
+
+        if node == revlog.nullid:
+            return manifestdict()
+        if node in self._mancache:
+            cachemf = self._mancache[node]
+            # The old manifest may put non-ctx manifests in the cache, so skip
+            # those since they don't implement the full api.
+            if isinstance(cachemf, manifestctx):
+                return cachemf
+
+        m = manifestctx(self._revlog, node)
+        self._mancache[node] = m
+        return m
+
+class manifestctx(manifestdict):
+    """A class representing a single revision of a manifest, including its
+    contents, its parent revs, and its linkrev.
+    """
+    def __init__(self, revlog, node):
+        self._revlog = revlog
+
+        self._node = node
+        self.p1, self.p2 = revlog.parents(node)
+        rev = revlog.rev(node)
+        self.linkrev = revlog.linkrev(rev)
+
+        # This should eventually be made lazy loaded, so consumers can access
+        # the node/p1/linkrev data without having to parse the whole manifest.
+        data = revlog.revision(node)
+        arraytext = array.array('c', data)
+        revlog._fulltextcache[node] = arraytext
+        super(manifestctx, self).__init__(data)
+
+    def node(self):
+        return self._node
+
 class manifest(manifestrevlog):
     def __init__(self, opener, dir='', dirlogcache=None):
         '''The 'dir' and 'dirlogcache' arguments are for internal use by


[PATCH 1 of 5 V2] manifest: break mancache into two caches

2016-08-17 Thread Durham Goode
# HG changeset patch
# User Durham Goode 
# Date 1471465513 25200
#  Wed Aug 17 13:25:13 2016 -0700
# Node ID b5eca5d531599261641e1d2d3f9f2e60b6538da7
# Parent  997e8cf4d0a29d28759e38659736cb3d1cf9ef3f
manifest: break mancache into two caches

The old manifest cache cached both the in-memory representation and the raw
text. As part of the manifest refactor we want to separate the storage format
from the in-memory representation, so let's split this cache into two caches.

This will let other manifest implementations participate in the in-memory cache,
while allowing the revlog based implementations to still depend on the full text
caching where necessary.
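
As a rough illustration of the split described above, the sketch below keeps parsed manifests and raw texts in separate mappings. It is not the patch: plain dicts stand in for ``util.lrucachedict``, the ``parse()`` helper and the ``SplitCaches`` class are hypothetical, and the text is stored as a plain string rather than an ``array.array``.

```python
def parse(text):
    """Toy parser: one 'path<space>hash' entry per line."""
    return dict(line.split(' ', 1) for line in text.splitlines())


class SplitCaches(object):
    def __init__(self):
        # Assumption: plain dicts stand in for util.lrucachedict.
        self.mancache = {}        # node -> in-memory manifest
        self.fulltextcache = {}   # node -> raw text (revlog-specific)

    def read(self, node, text):
        if node in self.mancache:
            return self.mancache[node]
        m = parse(text)
        # Previously a single entry held (m, text); now each cache has its own.
        self.mancache[node] = m
        self.fulltextcache[node] = text
        return m

    def can_fast_delta(self, p1):
        # The delta fast path only needs the raw text of the first parent.
        return p1 in self.fulltextcache


caches = SplitCaches()
first = caches.read('n1', 'a 111\nb 222')
# Another manifest implementation can populate only the in-memory cache.
caches.mancache['n2'] = {'x': '999'}
```

With the caches separated, an entry such as ``n2`` can live in the in-memory cache without anything pretending to have its full text.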

diff --git a/mercurial/bundlerepo.py b/mercurial/bundlerepo.py
--- a/mercurial/bundlerepo.py
+++ b/mercurial/bundlerepo.py
@@ -205,7 +205,7 @@ class bundlemanifest(bundlerevlog, manif
 node = self.node(node)
 
 if node in self._mancache:
-result = self._mancache[node][0].text()
+result = self._mancache[node].text()
 else:
 result = manifest.manifest.revision(self, nodeorrev)
 return result
diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -908,6 +908,7 @@ class manifest(revlog.revlog):
 usetreemanifest = opts.get('treemanifest', usetreemanifest)
 usemanifestv2 = opts.get('manifestv2', usemanifestv2)
 self._mancache = util.lrucachedict(cachesize)
+self._fulltextcache = util.lrucachedict(cachesize)
 self._treeinmem = usetreemanifest
 self._treeondisk = usetreemanifest
 self._usemanifestv2 = usemanifestv2
@@ -1000,7 +1001,7 @@ class manifest(revlog.revlog):
 if node == revlog.nullid:
 return self._newmanifest() # don't upset local cache
 if node in self._mancache:
-return self._mancache[node][0]
+return self._mancache[node]
 if self._treeondisk:
 def gettext():
 return self.revision(node)
@@ -1014,7 +1015,8 @@ class manifest(revlog.revlog):
 text = self.revision(node)
 m = self._newmanifest(text)
 arraytext = array.array('c', text)
-self._mancache[node] = (m, arraytext)
+self._mancache[node] = m
+self._fulltextcache[node] = arraytext
 return m
 
 def readshallow(self, node):
@@ -1034,7 +1036,7 @@ class manifest(revlog.revlog):
 return None, None
 
 def add(self, m, transaction, link, p1, p2, added, removed):
-if (p1 in self._mancache and not self._treeinmem
+if (p1 in self._fulltextcache and not self._treeinmem
 and not self._usemanifestv2):
 # If our first parent is in the manifest cache, we can
 # compute a delta here using properties we know about the
@@ -1046,7 +1048,7 @@ class manifest(revlog.revlog):
 work = heapq.merge([(x, False) for x in added],
[(x, True) for x in removed])
 
-arraytext, deltatext = m.fastdelta(self._mancache[p1][1], work)
+arraytext, deltatext = m.fastdelta(self._fulltextcache[p1], work)
 cachedelta = self.rev(p1), deltatext
 text = util.buffer(arraytext)
 n = self.addrevision(text, transaction, link, p1, p2, cachedelta)
@@ -1065,7 +1067,8 @@ class manifest(revlog.revlog):
 n = self.addrevision(text, transaction, link, p1, p2)
 arraytext = array.array('c', text)
 
-self._mancache[n] = (m, arraytext)
+self._mancache[n] = m
+self._fulltextcache[n] = arraytext
 
 return n
 
@@ -1092,5 +1095,6 @@ class manifest(revlog.revlog):
 
 def clearcaches(self):
 super(manifest, self).clearcaches()
+self._fulltextcache.clear()
 self._mancache.clear()
 self._dirlogcache = {'': self}


[PATCH 2 of 5 V2] manifest: make manifest derive from manifestrevlog

2016-08-17 Thread Durham Goode
# HG changeset patch
# User Durham Goode 
# Date 1471465513 25200
#  Wed Aug 17 13:25:13 2016 -0700
# Node ID 8ddbe86953023c40beb7be9dcb8025d1055813c5
# Parent  b5eca5d531599261641e1d2d3f9f2e60b6538da7
manifest: make manifest derive from manifestrevlog

As part of our refactoring to split the manifest concept from its storage, we
need to start moving the revlog specific parts of the manifest implementation to
a new class. This patch creates manifestrevlog and moves the fulltextcache onto
the base class.

diff --git a/mercurial/manifest.py b/mercurial/manifest.py
--- a/mercurial/manifest.py
+++ b/mercurial/manifest.py
@@ -890,7 +890,31 @@ class treemanifest(object):
 subp1, subp2 = subp2, subp1
 writesubtree(subm, subp1, subp2)
 
-class manifest(revlog.revlog):
+class manifestrevlog(revlog.revlog):
+    '''A revlog that stores manifest texts. This is responsible for caching the
+    full-text manifest contents.
+    '''
+    def __init__(self, opener, indexfile):
+        super(manifestrevlog, self).__init__(opener, indexfile)
+
+        # During normal operations, we expect to deal with not more than four
+        # revs at a time (such as during commit --amend). When rebasing large
+        # stacks of commits, the number can go up, hence the config knob below.
+        cachesize = 4
+        opts = getattr(opener, 'options', None)
+        if opts is not None:
+            cachesize = opts.get('manifestcachesize', cachesize)
+        self._fulltextcache = util.lrucachedict(cachesize)
+
+    @property
+    def fulltextcache(self):
+        return self._fulltextcache
+
+    def clearcaches(self):
+        super(manifestrevlog, self).clearcaches()
+        self._fulltextcache.clear()
+
+class manifest(manifestrevlog):
     def __init__(self, opener, dir='', dirlogcache=None):
         '''The 'dir' and 'dirlogcache' arguments are for internal use by
         manifest.manifest only. External users should create a root manifest
@@ -908,7 +932,6 @@ class manifest(revlog.revlog):
 usetreemanifest = opts.get('treemanifest', usetreemanifest)
 usemanifestv2 = opts.get('manifestv2', usemanifestv2)
 self._mancache = util.lrucachedict(cachesize)
-self._fulltextcache = util.lrucachedict(cachesize)
 self._treeinmem = usetreemanifest
 self._treeondisk = usetreemanifest
 self._usemanifestv2 = usemanifestv2
@@ -918,7 +941,7 @@ class manifest(revlog.revlog):
 if not dir.endswith('/'):
 dir = dir + '/'
 indexfile = "meta/" + dir + "00manifest.i"
-revlog.revlog.__init__(self, opener, indexfile)
+super(manifest, self).__init__(opener, indexfile)
 self._dir = dir
 # The dirlogcache is kept on the root manifest log
 if dir:
@@ -1016,7 +1039,7 @@ class manifest(revlog.revlog):
 m = self._newmanifest(text)
 arraytext = array.array('c', text)
 self._mancache[node] = m
-self._fulltextcache[node] = arraytext
+self.fulltextcache[node] = arraytext
 return m
 
 def readshallow(self, node):
@@ -1036,7 +1059,7 @@ class manifest(revlog.revlog):
 return None, None
 
 def add(self, m, transaction, link, p1, p2, added, removed):
-if (p1 in self._fulltextcache and not self._treeinmem
+if (p1 in self.fulltextcache and not self._treeinmem
 and not self._usemanifestv2):
 # If our first parent is in the manifest cache, we can
 # compute a delta here using properties we know about the
@@ -1048,7 +1071,7 @@ class manifest(revlog.revlog):
 work = heapq.merge([(x, False) for x in added],
[(x, True) for x in removed])
 
-arraytext, deltatext = m.fastdelta(self._fulltextcache[p1], work)
+arraytext, deltatext = m.fastdelta(self.fulltextcache[p1], work)
 cachedelta = self.rev(p1), deltatext
 text = util.buffer(arraytext)
 n = self.addrevision(text, transaction, link, p1, p2, cachedelta)
@@ -1068,7 +1091,7 @@ class manifest(revlog.revlog):
 arraytext = array.array('c', text)
 
 self._mancache[n] = m
-self._fulltextcache[n] = arraytext
+self.fulltextcache[n] = arraytext
 
 return n
 
@@ -1095,6 +1118,5 @@ class manifest(revlog.revlog):
 
 def clearcaches(self):
 super(manifest, self).clearcaches()
-self._fulltextcache.clear()
 self._mancache.clear()
 self._dirlogcache = {'': self}


[Bug 5330] New: Create a steering committee e-mail list

2016-08-17 Thread bugzilla
https://bz.mercurial-scm.org/show_bug.cgi?id=5330

Bug ID: 5330
   Summary: Create a steering committee e-mail list
   Product: Mercurial
   Version: unspecified
  Hardware: PC
OS: Mac OS
Status: UNCONFIRMED
  Severity: feature
  Priority: wish
 Component: infrastructure
  Assignee: bugzi...@selenic.com
  Reporter: kbullock+mercur...@ringworld.org
CC: kbullock+mercur...@ringworld.org,
mercurial-de...@selenic.com

Create steer...@mercurial-scm.org with the steering committee as members.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


[PATCH] hgweb: add inheritance support to style maps

2016-08-17 Thread Matt Mackall
# HG changeset patch
# User Matt Mackall 
# Date 1471459227 18000
#  Wed Aug 17 13:40:27 2016 -0500
# Node ID f2bb8352d994be9bb9ca55d49dacba35c996d8cf
# Parent  73ff159923c1f05899c27238409ca398342d9ae0
hgweb: add inheritance support to style maps

We can now specify a base map file:

__base__ = path/to/map/file

That map file will be read and used to populate unset elements of the
current map. Unlike using %include, elements in the inherited class
will be read relative to that path.

This makes it much easier to make custom local tweaks to a style.
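
The lookup order this gives can be sketched in isolation as follows. This is only an illustration of "populate unset elements from the base map": ``read_map()`` is a hypothetical helper, the ``files`` dict stands in for map files on disk, and the sketch does not model how template file paths inside the inherited map are resolved.

```python
import posixpath


def read_map(path, files):
    """Return the entries of a map, filling unset keys from its __base__."""
    entries = dict(files[path])
    base = entries.pop('__base__', None)
    if base is not None:
        # The base path is taken relative to the directory of the current map.
        basepath = posixpath.normpath(
            posixpath.join(posixpath.dirname(path), base))
        for key, value in read_map(basepath, files).items():
            # Only keys the current map left unset are inherited.
            entries.setdefault(key, value)
    return entries


# Hypothetical style layout: a local style overriding one element of "paper".
files = {
    'templates/paper/map': {'header': 'paper-header',
                            'footer': 'paper-footer'},
    'templates/local/map': {'__base__': '../paper/map',
                            'footer': 'local-footer'},
}
merged = read_map('templates/local/map', files)
```

The local map only has to name the elements it changes; everything else comes from the base.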

diff -r 73ff159923c1 -r f2bb8352d994 mercurial/templater.py
--- a/mercurial/templater.py	Mon Aug 01 13:14:13 2016 -0400
+++ b/mercurial/templater.py	Wed Aug 17 13:40:27 2016 -0500
@@ -1026,6 +1026,16 @@
 raise error.ParseError(_('unmatched quotes'),
conf.source('', key))
 cache[key] = unquotestring(val)
+elif key == "__base__":
+# treat as a pointer to a base class for this style
+path = util.normpath(os.path.join(base, val))
+bcache, btmap = _readmapfile(path)
+for k in bcache:
+if k not in cache:
+cache[k] = bcache[k]
+for k in btmap:
+if k not in tmap:
+tmap[k] = btmap[k]
 else:
 val = 'default', val
 if ':' in val[1]:


[Bug 5329] New: Setup a buildbot with Python 3.5

2016-08-17 Thread bugzilla
https://bz.mercurial-scm.org/show_bug.cgi?id=5329

Bug ID: 5329
   Summary: Setup a buildbot with Python 3.5
   Product: Mercurial
   Version: default branch
  Hardware: PC
OS: Linux
Status: UNCONFIRMED
  Severity: feature
  Priority: wish
 Component: infrastructure
  Assignee: bugzi...@selenic.com
  Reporter: 7895pul...@gmail.com
CC: kbullock+mercur...@ringworld.org,
mercurial-de...@selenic.com

We have come some way in porting to Python 3, and we should have our buildbot
cover Python 3 so we can keep track of what is passing and what is not. It
will also push people to write py3-compatible code. We are only supporting
Python 3.5.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Re: [PATCH] py3: handle os.environ.get() case in module loader

2016-08-17 Thread Gregory Szorc
On Wed, Aug 17, 2016 at 10:04 AM, Siddharth Agarwal 
wrote:

> On 8/4/16 20:26, Siddharth Agarwal wrote:
>
>>
>> I agree with Greg -- this makes things more complicated than necessary.
>> We should just have a helper (e.g. util.environ) that gets assigned to
>> os.environ on py2 and os.environb on py3. (And with possibly different
>> behavior on Windows, similar to filenames.)
>>
>
> Pulkit asked me to elaborate a bit:
>
> For Python 3 on Unix (including OS X), byte strings (UTF-8 encoded byte
> strings on OS X) are as high fidelity a way to talk to the native OS APIs
> as Unicode strings. So on Unix, you don't lose any information by using
> os.environb versus os.environ.
>
> For Python 3 on Windows, byte strings are a *lower* fidelity way to talk
> to the native OS APIs than Unicode strings. So os.environb gives you
> potentially less information than os.environ.
>
> This behavior is identical to the way filesystem APIs work on Windows. See
> https://www.mercurial-scm.org/wiki/WindowsUTF8Plan for how Mercurial has
> tackled/plans to tackle this problem.


WindowsUTF8Plan has existed for years without any movement. Addressing that
will require a massive refactor. Since it isn't on our radar, I think it is
out of scope for Python 3 porting work. Can we just use os.environb/bytes
paths on Python 3 for now and deal with Windows paths compatibility another
time? Worst case this doesn't make Mercurial usable under Python 3 on
Windows. I'm not sure anyone will care until Mercurial actually works with
Python 3. And that's a way off. I think I'm fine kicking the can down the
road to unblock the overall Python 3 porting effort.
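
For concreteness, the helper being discussed could look something like the sketch below. This is an assumption about its shape, not code from any patch in this thread: the module-level name ``environ`` mirrors the suggested ``util.environ``, and falling back to ``os.environ`` when ``os.environb`` is unavailable (as on Windows) is just one way to defer the Windows question.

```python
import os
import sys

# On Python 3 with a bytes environment available, use os.environb so keys and
# values are bytes, matching what Python 2's os.environ provides.
if sys.version_info[0] >= 3 and os.supports_bytes_environ:
    environ = os.environb
else:
    # Python 2, or Python 3 on Windows where os.environb does not exist.
    environ = os.environ
```

Callers would then index ``environ`` with byte strings wherever the bytes mapping is in use.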


Re: [PATCH] py3: handle os.environ.get() case in module loader

2016-08-17 Thread Siddharth Agarwal

On 8/4/16 20:26, Siddharth Agarwal wrote:


I agree with Greg -- this makes things more complicated than 
necessary. We should just have a helper (e.g. util.environ) that gets 
assigned to os.environ on py2 and os.environb on py3. (And with 
possibly different behavior on Windows, similar to filenames.)


Pulkit asked me to elaborate a bit:

For Python 3 on Unix (including OS X), byte strings (UTF-8 encoded byte 
strings on OS X) are as high fidelity a way to talk to the native OS 
APIs as Unicode strings. So on Unix, you don't lose any information by 
using os.environb versus os.environ.


For Python 3 on Windows, byte strings are a *lower* fidelity way to talk 
to the native OS APIs than Unicode strings. So os.environb gives you 
potentially less information than os.environ.


This behavior is identical to the way filesystem APIs work on Windows. 
See https://www.mercurial-scm.org/wiki/WindowsUTF8Plan for how Mercurial 
has tackled/plans to tackle this problem.


- Siddharth


[PATCH 09 of 10 V2] tests: explicitly use ls profiler

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471449135 25200
#  Wed Aug 17 08:52:15 2016 -0700
# Node ID e262b893e1fb54af71123a1e77956f24a9cfea52
# Parent  b42fbfe6196215490d6f3a395440d908babcc0e9
tests: explicitly use ls profiler

In preparation for making the statprof profiler the default.

diff --git a/tests/test-profile.t b/tests/test-profile.t
--- a/tests/test-profile.t
+++ b/tests/test-profile.t
@@ -3,44 +3,46 @@ test --time
   $ hg --time help -q help 2>&1 | grep time > /dev/null
   $ hg init a
   $ cd a
 
 #if lsprof
 
 test --profile
 
-  $ hg --profile st 2>../out
+  $ prof='hg --config profiling.type=ls --profile'
+
+  $ $prof st 2>../out
   $ grep CallCount ../out > /dev/null || cat ../out
 
-  $ hg --profile --config profiling.output=../out st
+  $ $prof --config profiling.output=../out st
   $ grep CallCount ../out > /dev/null || cat ../out
 
-  $ hg --profile --config profiling.output=blackbox --config extensions.blackbox= st
+  $ $prof --config profiling.output=blackbox --config extensions.blackbox= st
   $ grep CallCount .hg/blackbox.log > /dev/null || cat .hg/blackbox.log
 
-  $ hg --profile --config profiling.format=text st 2>../out
+  $ $prof --config profiling.format=text st 2>../out
   $ grep CallCount ../out > /dev/null || cat ../out
 
   $ echo "[profiling]" >> $HGRCPATH
   $ echo "format=kcachegrind" >> $HGRCPATH
 
-  $ hg --profile st 2>../out
+  $ $prof st 2>../out
   $ grep 'events: Ticks' ../out > /dev/null || cat ../out
 
-  $ hg --profile --config profiling.output=../out st
+  $ $prof --config profiling.output=../out st
   $ grep 'events: Ticks' ../out > /dev/null || cat ../out
 
 #endif
 
 #if lsprof serve
 
 Profiling of HTTP requests works
 
-  $ hg --profile --config profiling.format=text --config profiling.output=../profile.log serve -d -p $HGPORT --pid-file ../hg.pid -A ../access.log
+  $ $prof --config profiling.format=text --config profiling.output=../profile.log serve -d -p $HGPORT --pid-file ../hg.pid -A ../access.log
   $ cat ../hg.pid >> $DAEMON_PIDS
   $ hg -q clone -U http://localhost:$HGPORT ../clone
 
 A single profile is logged because file logging doesn't append
   $ grep CallCount ../profile.log | wc -l
   \s*1 (re)
 
 #endif


[PATCH 10 of 10 V2] profiling: make statprof the default profiler (BC)

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471449279 25200
#  Wed Aug 17 08:54:39 2016 -0700
# Node ID d20e3064df489d1a29c5186d3cb77eec3fdf03b4
# Parent  e262b893e1fb54af71123a1e77956f24a9cfea52
profiling: make statprof the default profiler (BC)

The statprof sampling profiler runs with significantly less overhead.
Its data is therefore more useful. Furthermore, its default output
shows the hotpath by default, which I've found to be way more useful
than the default profiler's function time table.
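
As a rough illustration of the behaviour change, the selection order in
profile() can be sketched as a standalone function. The name chooseprofiler,
the plain dict standing in for ui.config('profiling', ...), and the warn
callback are assumptions made for this sketch; they are not part of
Mercurial's API.

```python
import os

KNOWN_PROFILERS = ('ls', 'stat', 'flame')

def chooseprofiler(config, environ=None, warn=None):
    """Pick a profiler name: HGPROF first, then config, then 'stat'.

    config is a plain dict (hypothetical stand-in for the [profiling]
    section); environ defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    profiler = environ.get('HGPROF')
    if profiler is None:
        # New default after this patch: 'stat' instead of 'ls'.
        profiler = config.get('type', 'stat')
    if profiler not in KNOWN_PROFILERS:
        if warn is not None:
            warn("unrecognized profiler '%s' - ignored\n" % profiler)
        # The fallback for unknown names also moves from 'ls' to 'stat'.
        profiler = 'stat'
    return profiler
```

Under these assumptions, a user who prefers the old behaviour would set
type to ls in the [profiling] section, or export HGPROF=ls.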

diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -1396,17 +1396,17 @@ profiling is done using lsprof.
 ``enabled``
 Enable the profiler.
 (default: false)
 
 This is equivalent to passing ``--profile`` on the command line.
 
 ``type``
 The type of profiler to use.
-(default: ls)
+(default: stat)
 
 ``ls``
   Use Python's built-in instrumenting profiler. This profiler
   works on all platforms, but each line number it reports is the
   first line of a function. This restriction makes it difficult to
   identify the expensive parts of a non-trivial function.
 ``stat``
   Use a statistical profiler, statprof. This profiler is most
diff --git a/mercurial/profiling.py b/mercurial/profiling.py
--- a/mercurial/profiling.py
+++ b/mercurial/profiling.py
@@ -113,20 +113,20 @@ def statprofile(ui, fp):
 def profile(ui):
 """Start profiling.
 
 Profiling is active when the context manager is active. When the context
 manager exits, profiling results will be written to the configured output.
 """
 profiler = os.getenv('HGPROF')
 if profiler is None:
-profiler = ui.config('profiling', 'type', default='ls')
+profiler = ui.config('profiling', 'type', default='stat')
 if profiler not in ('ls', 'stat', 'flame'):
 ui.warn(_("unrecognized profiler '%s' - ignored\n") % profiler)
-profiler = 'ls'
+profiler = 'stat'
 
 output = ui.config('profiling', 'output')
 
 if output == 'blackbox':
 fp = util.stringio()
 elif output:
 path = ui.expandpath(output)
 fp = open(path, 'wb')
diff --git a/tests/test-profile.t b/tests/test-profile.t
--- a/tests/test-profile.t
+++ b/tests/test-profile.t
@@ -42,9 +42,15 @@ Profiling of HTTP requests works
   $ hg -q clone -U http://localhost:$HGPORT ../clone
 
 A single profile is logged because file logging doesn't append
   $ grep CallCount ../profile.log | wc -l
   \s*1 (re)
 
 #endif
 
+statistical profiler works
+
+  $ hg --profile st 2>../out
+  $ grep Sample ../out
+  Sample count: \d+ (re)
+
   $ cd ..


[PATCH 04 of 10 V2] statprof: use absolute_imports

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471448311 25200
#  Wed Aug 17 08:38:31 2016 -0700
# Node ID 289db60b052ad08ba2b9850937025654d7b8ded4
# Parent  2828b7bd26694c78c9d6e0bcaa865b93530cc6e6
statprof: use absolute_imports

This entails switching to Mercurial's import convention.
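
A minimal sketch of what that convention looks like in practice: import
whole modules, one per line, and bind the few names the module body already
uses as module-level aliases. The counts and tagged names below are purely
illustrative, and the "from __future__" line from the patch is left out so
the snippet can be pasted anywhere.

```python
import collections
import contextlib

# Aliases replace "from collections import defaultdict" and
# "from contextlib import contextmanager"; existing call sites stay unchanged.
defaultdict = collections.defaultdict
contextmanager = contextlib.contextmanager

# Existing code keeps using the short names.
counts = defaultdict(int)
for name in ('a', 'b', 'a'):
    counts[name] += 1

@contextmanager
def tagged(log, label):
    """Illustrative context manager built with the aliased decorator."""
    log.append('enter ' + label)
    try:
        yield
    finally:
        log.append('exit ' + label)
```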

diff --git a/mercurial/statprof.py b/mercurial/statprof.py
--- a/mercurial/statprof.py
+++ b/mercurial/statprof.py
@@ -97,22 +97,32 @@ Threading
 
 Because signals only get delivered to the main thread in Python,
 statprof only profiles the main thread. However because the time
 reporting function uses per-process timers, the results can be
 significantly off if other threads' work patterns are not similar to the
 main thread's work patterns.
 """
 # no-check-code
-from __future__ import division
+from __future__ import absolute_import, division
 
-import inspect, json, os, signal, tempfile, sys, getopt, threading
+import collections
+import contextlib
+import getopt
+import inspect
+import json
+import os
+import signal
+import sys
+import tempfile
+import threading
 import time
-from collections import defaultdict
-from contextlib import contextmanager
+
+defaultdict = collections.defaultdict
+contextmanager = contextlib.contextmanager
 
 __all__ = ['start', 'stop', 'reset', 'display', 'profile']
 
 skips = set(["util.py:check", "extensions.py:closure",
  "color.py:colorcmd", "dispatch.py:checkargs",
  "dispatch.py:", "dispatch.py:_runcatch",
  "dispatch.py:_dispatch", "dispatch.py:_runcommand",
  "pager.py:pagecmd", "dispatch.py:run",
diff --git a/tests/test-check-py3-compat.t b/tests/test-check-py3-compat.t
--- a/tests/test-check-py3-compat.t
+++ b/tests/test-check-py3-compat.t
@@ -4,17 +4,16 @@
   $ cd "$TESTDIR"/..
 
   $ hg files 'set:(**.py)' | sed 's|\\|/|g' | xargs python contrib/check-py3-compat.py
   hgext/fsmonitor/pywatchman/__init__.py not using absolute_import
   hgext/fsmonitor/pywatchman/__init__.py requires print_function
   hgext/fsmonitor/pywatchman/capabilities.py not using absolute_import
   hgext/fsmonitor/pywatchman/pybser.py not using absolute_import
   i18n/check-translation.py not using absolute_import
-  mercurial/statprof.py not using absolute_import
   mercurial/statprof.py requires print_function
   setup.py not using absolute_import
   tests/test-demandimport.py not using absolute_import
 
 #if py3exe
   $ hg files 'set:(**.py)' | sed 's|\\|/|g' | xargs $PYTHON3 contrib/check-py3-compat.py
   doc/hgmanpage.py: invalid syntax: invalid syntax (, line *) (glob)
   hgext/acl.py: error importing:  str expected, not bytes (error at encoding.py:*) (glob)


[PATCH 07 of 10 V2] statprof: support stacked collection

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471449673 25200
#  Wed Aug 17 09:01:13 2016 -0700
# Node ID bf247c2f0136d23100b0098fe152cfabbb511c1b
# Parent  7409c675a2be190605f07277627b095976f56da8
statprof: support stacked collection

Before, statprof theoretically supported starting and stopping
profiling multiple times. But it didn't actually do anything
upon each "stacked" call to start()/stop().

This patch adds state tracking of sorts to start()/stop().

We also have stop() return a data structure with captured
data. This allows consumers to do things like display only the
just-collected data.

I'm not completely convinced this code works as advertised and does
the right things in all situations. But it does appear to get the
results I want when running the statistical profiler during `hg
serve`.
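
The idea of stacked collection can be sketched with a toy model that does no
real sampling. ToyProfiler and its record() method are hypothetical names for
this illustration only. The diff below is cut off before the new stop() body,
so restoring the outer collection on stop() is an assumption here, not a
description of what statprof actually does.

```python
class CollectedData(object):
    """Data captured between one start() and its matching stop()."""
    def __init__(self, samples, accumulated_time):
        self.samples = samples
        self.accumulated_time = accumulated_time

class ToyProfiler(object):
    """Toy stand-in for statprof's module-level state; no timers involved."""
    def __init__(self):
        self.level = 0
        self.samples = []
        self.accumulated_time = 0.0
        self._stack = []

    def start(self):
        if self.level > 0:
            # Save the outer collection before starting a fresh one.
            self._stack.append((self.samples, self.accumulated_time))
        self.samples = []
        self.accumulated_time = 0.0
        self.level += 1

    def record(self, sample, elapsed):
        # Pretend a sample arrived; the real profiler does this from a
        # signal handler or sampler thread.
        self.samples.append(sample)
        self.accumulated_time += elapsed

    def stop(self):
        self.level -= 1
        data = CollectedData(self.samples, self.accumulated_time)
        if self._stack:
            # Assumption: hand control back to the outer collection.
            self.samples, self.accumulated_time = self._stack.pop()
        return data

p = ToyProfiler()
p.start()
p.record('outer', 0.1)
p.start()
p.record('inner', 0.2)
inner = p.stop()   # only the just-collected inner data
outer = p.stop()   # the outer collection, restored from the stack
```

Returning the data from stop() is what lets a caller such as a request
handler display only the samples taken during that request.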

diff --git a/mercurial/statprof.py b/mercurial/statprof.py
--- a/mercurial/statprof.py
+++ b/mercurial/statprof.py
@@ -167,17 +167,17 @@ class ProfileState(object):
 
 def accumulate_time(self, stop_time):
 self.accumulated_time += stop_time - self.last_start_time
 
 def seconds_per_sample(self):
 return self.accumulated_time / len(self.samples)
 
 state = ProfileState()
-
+_statestack = []
 
 class CodeSite(object):
 cache = {}
 
 __slots__ = ('path', 'lineno', 'function', 'source')
 
 def __init__(self, path, lineno, function):
 self.path = path
@@ -244,69 +244,90 @@ class Sample(object):
 
 while frame:
 stack.append(CodeSite.get(frame.f_code.co_filename, frame.f_lineno,
   frame.f_code.co_name))
 frame = frame.f_back
 
 return Sample(stack, time)
 
+class CollectedData(object):
+"""Represents collected data for a sampling interval."""
+def __init__(self, samples, accumulated_time, sample_interval):
+self.samples = samples
+self.accumulated_time = accumulated_time
+self.sample_interval = sample_interval
+
 ###
 ## SIGPROF handler
 
 def profile_signal_handler(signum, frame):
 if state.profile_level > 0:
 state.accumulate_time(clock())
 
 state.samples.append(Sample.from_frame(frame, state.accumulated_time))
 
 signal.setitimer(signal.ITIMER_PROF,
 state.sample_interval, 0.0)
 state.last_start_time = clock()
 
 stopthread = threading.Event()
-def samplerthread(tid):
+def samplerthread():
 while not stopthread.is_set():
 state.accumulate_time(clock())
 
-frame = sys._current_frames()[tid]
+frame = sys._current_frames()[state.threadid]
 state.samples.append(Sample.from_frame(frame, state.accumulated_time))
 
 state.last_start_time = clock()
 time.sleep(state.sample_interval)
 
 stopthread.clear()
 
 ###
 ## Profiling API
 
 def is_active():
 return state.profile_level > 0
 
 lastmechanism = None
 def start(mechanism='thread'):
 '''Install the profiling signal handler, and start profiling.'''
+# Store old state if present.
+if state.profile_level > 0:
+laststate = {
+'samples': state.samples,
+'tid': state.threadid,
+'accumulated_time': state.accumulated_time,
+}
+_statestack.append(laststate)
+
+state.samples = []
+state.accumulated_time = 0.0
+frame = inspect.currentframe()
+tid = [k for k, f in sys._current_frames().items() if f == frame][0]
+state.threadid = tid
+
 state.profile_level += 1
 if state.profile_level == 1:
 state.last_start_time = clock()
 rpt = state.remaining_prof_time
 state.remaining_prof_time = None
 
 global lastmechanism
 lastmechanism = mechanism
 
 if mechanism == 'signal':
 signal.signal(signal.SIGPROF, profile_signal_handler)
 signal.setitimer(signal.ITIMER_PROF,
 rpt or state.sample_interval, 0.0)
 elif mechanism == 'thread':
-frame = inspect.currentframe()
-tid = [k for k, f in sys._current_frames().items() if f == frame][0]
+
 state.thread = threading.Thread(target=samplerthread,
- args=(tid,), name="samplerthread")
+name="samplerthread")
 state.thread.start()
 
 def stop():
 '''Stop profiling, and uninstall the profiling signal handler.'''
 state.profile_level -= 1
 if state.profile_level == 0:
 if lastmechanism == 'signal':
 rpt = signal.setitimer(signal.ITIMER_PROF, 0.0, 0.0)
@@ -317,16 +338,39 @@ def stop():
 state.thread.join()
 
 state.accumulate_time(clock())
 state.last_start_time = None
 statprofpath = os.environ.get('STATPROF_DEST')
 if 

[PATCH 05 of 10 V2] statprof: use print function

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471227612 25200
#  Sun Aug 14 19:20:12 2016 -0700
# Node ID 207736e7490b561b050d345914ade2aeefc2482b
# Parent  289db60b052ad08ba2b9850937025654d7b8ded4
statprof: use print function

diff --git a/mercurial/statprof.py b/mercurial/statprof.py
--- a/mercurial/statprof.py
+++ b/mercurial/statprof.py
@@ -97,17 +97,17 @@ Threading
 
 Because signals only get delivered to the main thread in Python,
 statprof only profiles the main thread. However because the time
 reporting function uses per-process timers, the results can be
 significantly off if other threads' work patterns are not similar to the
 main thread's work patterns.
 """
 # no-check-code
-from __future__ import absolute_import, division
+from __future__ import absolute_import, division, print_function
 
 import collections
 import contextlib
 import getopt
 import inspect
 import json
 import os
 import signal
@@ -427,17 +427,17 @@ class DisplayFormats:
 
 def display(fp=None, format=3, **kwargs):
 '''Print statistics, either to stdout or the given file object.'''
 
 if fp is None:
 import sys
 fp = sys.stdout
 if len(state.samples) == 0:
-print >> fp, ('No samples recorded.')
+print('No samples recorded.', file=fp)
 return
 
 if format == DisplayFormats.ByLine:
 display_by_line(fp)
 elif format == DisplayFormats.ByMethod:
 display_by_method(fp)
 elif format == DisplayFormats.AboutMethod:
 display_about_method(fp, **kwargs)
@@ -446,47 +446,48 @@ def display(fp=None, format=3, **kwargs)
 elif format == DisplayFormats.FlameGraph:
 write_to_flame(fp)
 elif format == DisplayFormats.Json:
 write_to_json(fp)
 else:
 raise Exception("Invalid display format")
 
 if format != DisplayFormats.Json:
-print >> fp, ('---')
-print >> fp, ('Sample count: %d' % len(state.samples))
-print >> fp, ('Total time: %f seconds' % state.accumulated_time)
+print('---', file=fp)
+print('Sample count: %d' % len(state.samples), file=fp)
+print('Total time: %f seconds' % state.accumulated_time, file=fp)
 
 def display_by_line(fp):
 '''Print the profiler data with each sample line represented
 as one row in a table.  Sorted by self-time per line.'''
 stats = SiteStats.buildstats(state.samples)
 stats.sort(reverse=True, key=lambda x: x.selfseconds())
 
-print >> fp, ('%5.5s %10.10s   %7.7s  %-8.8s' %
-  ('%  ', 'cumulative', 'self', ''))
-print >> fp, ('%5.5s  %9.9s  %8.8s  %-8.8s' %
-  ("time", "seconds", "seconds", "name"))
+print('%5.5s %10.10s   %7.7s  %-8.8s' %
+  ('%  ', 'cumulative', 'self', ''), file=fp)
+print('%5.5s  %9.9s  %8.8s  %-8.8s' %
+  ("time", "seconds", "seconds", "name"), file=fp)
 
 for stat in stats:
 site = stat.site
 sitelabel = '%s:%d:%s' % (site.filename(), site.lineno, site.function)
-print >> fp, ('%6.2f %9.2f %9.2f  %s' % (stat.selfpercent(),
- stat.totalseconds(),
- stat.selfseconds(),
- sitelabel))
+print('%6.2f %9.2f %9.2f  %s' % (stat.selfpercent(),
+ stat.totalseconds(),
+ stat.selfseconds(),
+ sitelabel),
+  file=fp)
 
 def display_by_method(fp):
 '''Print the profiler data with each sample function represented
 as one row in a table.  Important lines within that function are
 output as nested rows.  Sorted by self-time per line.'''
-print >> fp, ('%5.5s %10.10s   %7.7s  %-8.8s' %
-  ('%  ', 'cumulative', 'self', ''))
-print >> fp, ('%5.5s  %9.9s  %8.8s  %-8.8s' %
-  ("time", "seconds", "seconds", "name"))
+print('%5.5s %10.10s   %7.7s  %-8.8s' %
+  ('%  ', 'cumulative', 'self', ''), file=fp)
+print('%5.5s  %9.9s  %8.8s  %-8.8s' %
+  ("time", "seconds", "seconds", "name"), file=fp)
 
 stats = SiteStats.buildstats(state.samples)
 
 grouped = defaultdict(list)
 for stat in stats:
 grouped[stat.site.filename() + ":" + stat.site.function].append(stat)
 
 # compute sums for each function
@@ -507,29 +508,30 @@ def display_by_method(fp):
  sitestats))
 
 # sort by total self sec
 functiondata.sort(reverse=True, key=lambda x: x[2])
 
 for function in functiondata:
 if function[3] < 0.05:
 continue
-print >> fp, ('%6.2f %9.2f %9.2f  %s' % (function[3], # total percent
- function[1], # total cum sec
- function[2], # total self sec
-

[PATCH 02 of 10 V2] statprof: fix flake8 warnings

2016-08-17 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1471227212 25200
#  Sun Aug 14 19:13:32 2016 -0700
# Node ID 3f37aba5e38717c7373101e7c868d13f6d809574
# Parent  d9790aec4f500ab47c550b033f84fc090e537fc3
statprof: fix flake8 warnings

My local flake8 hook informed me of these warnings in the upstream
code. Fix them.

diff --git a/mercurial/statprof.py b/mercurial/statprof.py
--- a/mercurial/statprof.py
+++ b/mercurial/statprof.py
@@ -99,21 +99,20 @@ Because signals only get delivered to th
 statprof only profiles the main thread. However because the time
 reporting function uses per-process timers, the results can be
 significantly off if other threads' work patterns are not similar to the
 main thread's work patterns.
 """
 # no-check-code
 from __future__ import division
 
-import inspect, json, os, signal, tempfile, sys, getopt, threading, traceback
+import inspect, json, os, signal, tempfile, sys, getopt, threading
 import time
 from collections import defaultdict
 from contextlib import contextmanager
-from subprocess import call
 
 __all__ = ['start', 'stop', 'reset', 'display', 'profile']
 
 skips = set(["util.py:check", "extensions.py:closure",
  "color.py:colorcmd", "dispatch.py:checkargs",
  "dispatch.py:", "dispatch.py:_runcatch",
  "dispatch.py:_dispatch", "dispatch.py:_runcommand",
  "pager.py:pagecmd", "dispatch.py:run",
@@ -321,17 +320,17 @@ def save_data(path=None):
 for sample in state.samples:
 time = str(sample.time)
 stack = sample.stack
 sites = ['\1'.join([s.path, str(s.lineno), s.function])
  for s in stack]
 file.write(time + '\0' + '\0'.join(sites) + '\n')
 
 file.close()
-except (IOError, OSError) as ex:
+except (IOError, OSError):
 # The home directory probably didn't exist, or wasn't writable. Oh well.
 pass
 
 def load_data(path=None):
 path = path or (os.environ['HOME'] + '/statprof.data')
 lines = open(path, 'r').read().splitlines()
 
 state.accumulated_time = float(lines[0])


Re: [PATCH 2 of 2] match: remove matchessubrepo method

2016-08-17 Thread Augie Fackler
On Tue, Aug 16, 2016 at 03:42:34PM +, Hannes Oldenburg wrote:
> # HG changeset patch
> # User Hannes Oldenburg 
> # Date 1471335676 0
> #  Tue Aug 16 08:21:16 2016 +
> # Node ID 1dc53807ae73733ab5655e22f56ccf99ab7688ff
> # Parent  ad4abe10145930a1067660221fb8e06bb5d03995
> match: remove matchessubrepo method

Queued these, nice cleanup. Thanks!

>
> Since it is no longer used in cmdutil.{files,remove} and scmutil.addremove,
> we remove this method.
>
> diff -r ad4abe101459 -r 1dc53807ae73 mercurial/match.py
> --- a/mercurial/match.py  Tue Aug 16 08:15:12 2016 +
> +++ b/mercurial/match.py  Tue Aug 16 08:21:16 2016 +
> @@ -320,10 +320,6 @@
>  kindpats.append((kind, pat, ''))
>  return kindpats
>
> -def matchessubrepo(self, subpath):
> -return (self.exact(subpath)
> -or any(f.startswith(subpath + '/') for f in self.files()))
> -
>  def exact(root, cwd, files, badfn=None):
>  return match(root, cwd, files, exact=True, badfn=badfn)
>


Re: [PATCH 9 of 9 RFC] profiling: make statprof the default profiler (BC)

2016-08-17 Thread Augie Fackler
On Mon, Aug 15, 2016 at 10:25:16PM -0700, Gregory Szorc wrote:
> # HG changeset patch
> # User Gregory Szorc 
> # Date 1471221528 25200
> #  Sun Aug 14 17:38:48 2016 -0700
> # Node ID d88d80210ff4351734d63b50e1af75f398af8963
> # Parent  1975493743c5f68f28bf9dcf677df09d0265581a
> profiling: make statprof the default profiler (BC)

I'm going to go ahead and queue this. It kind of bums me out to vendor
the whole profiler, but I guess this is probably enough of a diagnostic
improvement for end users that we should just do it.

The management of https://github.com/bos/statprof.py (me!) would
appreciate it if any important improvements were sent upstream. Thanks!

>
> The statprof sampling profiler runs with significantly less overhead.
> Its data is therefore more useful. Furthermore, its default output
> shows the hotpath by default, which I've found to be way more useful
> than the default profiler's function time table.
>
> diff --git a/mercurial/profiling.py b/mercurial/profiling.py
> --- a/mercurial/profiling.py
> +++ b/mercurial/profiling.py
> @@ -114,20 +114,20 @@ def statprofile(ui, fp):
>  def profile(ui):
>  """Start profiling.
>
>  Profiling is active when the context manager is active. When the context
>  manager exits, profiling results will be written to the configured output.
>  """
>  profiler = os.getenv('HGPROF')
>  if profiler is None:
> -profiler = ui.config('profiling', 'type', default='ls')
> +profiler = ui.config('profiling', 'type', default='stat')
>  if profiler not in ('ls', 'stat', 'flame'):
>  ui.warn(_("unrecognized profiler '%s' - ignored\n") % profiler)
> -profiler = 'ls'
> +profiler = 'stat'
>
>  output = ui.config('profiling', 'output')
>
>  if output == 'blackbox':
>  fp = util.stringio()
>  elif output:
>  path = ui.expandpath(output)
>  fp = open(path, 'wb')


Re: [PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method

2016-08-17 Thread Augie Fackler
On Tue, Aug 16, 2016 at 06:39:38PM -0700, Gregory Szorc wrote:
> On Tue, Aug 16, 2016 at 5:22 PM, Pierre-Yves David wrote:
> > On 08/14/2016 07:17 PM, Gregory Szorc wrote:
[snip]
> >> I think introducing a new wire protocol command is the correct way to
> >> solve this problem (as opposed to introducing a new argument on the
> >> existing command).
> >>
> >> However, if we're introducing a new wire protocol command for obtaining
> >> pushkey values, I think we should improve deficiencies in the response
> >> encoding rather than propagate its problems.
> >>
> >> The "listkeys" response encoding can't transmit the full range of binary
> >> values. This can lead to larger (and slower) response sizes. For
> >> example, a number of pushkey namespaces exchange lists of nodes. These
> >> have to be represented as hex instead of binary. For pushkey namespaces
> >> like phases or obsolescence that can exchange hundreds or thousands of
> >> nodes, the overhead can add up.
> >>
> >> I think the response from a new listkeys command should be using framing
> >> to encode the key names and values so the full range of binary values
> >> can be efficiently transferred. We may also want a special mechanism to
> >> represent a list of nodes, as avoiding the overhead of framing on fixed
> >> width values would be desirable.
> >>
> >> Of course, at the point you introduce a new response encoding, we may
> >> want to call the command "listkeys2." If others agree, I can code up an
> >> implementation and you can add the patterns functionality on top of it.
> >>
> >
> > Sorry to be a bit late to the discussion.
> >
> > I don't think we should introduce a new wire-protocol command for this.
> >
> > Individual listkey calls have been a large source of race conditions and
> > related issues. Bundle2 is able to carry listkey.pushkey calls just fine and
> > I think we should prioritize its usage. As bundle2 already has framing, we
> > could just use your better encoding with bundle2 directly.
> > However, we should probably push things further and use dedicated parts for
> > commonly used requests. Having dedicated types will allow more semantic
> > replies and requests. The series we are commenting on is a good example of
> > that need: "pattern" here pretty much only applies to bookmarks, and having a
> > dedicated "channel" (within bundle2) for this would make it painless to add
> > any new arguments we could need.
> >
> > TL;DR: I don't think we should touch listkey.pushkey. Add a bundle2 part
> > dedicated to bookmarks instead.
> >
>
> I have serious objections to stuffing yet more functionality into the
> "getbundle" wire protocol command. That command is quickly becoming a "god
> object" [1]. I'd rather we freeze that wire protocol command and offer
> equivalent functionality under new, better designed commands.

In general, I agree with your feeling on the bundle kind of being a
god object, but for bookmarks in particular Pierre-Yves is absolutely
correct. I think pushkey makes a ton of sense for things like marking
code review state, which can sensibly be done outside a transaction
(perhaps only sensibly done outside a repo history transaction? I'm
not sure), but in general most of the things we historically used
pushkey for should be in the same transaction as the rest of a push.

As a bonus, a dedicated bundle2 part for bookmarks would let us
properly resolve https://bz.mercurial-scm.org/show_bug.cgi?id=5165, so
we should probably just do that. I hadn't thought of that as a
solution, but it seems like the clear winner for that case now that
someone's said it.

What are use cases for pushkey beyond things that should be part of a
repo transaction? Can we come up with enough of those that would like
binary data payloads to justify the new command?

>
> In general, I'm opposed to adding arguments to wire protocol commands.
> Introducing an argument requires advertising a server capability otherwise
> clients may send an unknown argument to an incompatible server, which the
> server will reject. At the point you introduce a new argument/capability,
> you've effectively introduced a new variation of the wire protocol command.
> You might as well call it something else so the semantics of each wire
> protocol are well-defined and constant over time. If nothing else, this
> makes the client/server code easier to understand. I only need to point out
> wireproto.getbundle() and wireproto.unbundle() for examples of overly
> complicated code as a result of introducing wire protocol command arguments
> (notably support for bundle2).
>
> I'm not strongly opposed to the idea of making bundle2 a generic framing
> protocol. I think we could do better. (I raised a number of concerns with
> bundle2 during its implementation phase. In fact, one of them was that
> bundle2 as a generic format for data in transport and at rest is not ideal:
> there are various aspects you want to optimize for and one format cannot
> 

Re: [PATCH 3 of 3] listkeypattern: add listkeypattern wireproto method

2016-08-17 Thread Pierre-Yves David



On 08/17/2016 03:39 AM, Gregory Szorc wrote:

On Tue, Aug 16, 2016 at 5:22 PM, Pierre-Yves David wrote:

On 08/14/2016 07:17 PM, Gregory Szorc wrote:

On Fri, Aug 12, 2016 at 5:09 AM, Stanislau Hlebik wrote:

# HG changeset patch
# User Stanislau Hlebik
# Date 1470999441 25200
#  Fri Aug 12 03:57:21 2016 -0700
# Node ID c2ee493e216c60ff439ab93cc1efe6ac5922d8eb
# Parent  fd2185d7c2f7aa529b2ad0a6584832fb2b1b4ecb
listkeypattern: add listkeypattern wireproto method

wireproto method to list remote keys by pattern

diff --git a/mercurial/wireproto.py b/mercurial/wireproto.py
--- a/mercurial/wireproto.py
+++ b/mercurial/wireproto.py
@@ -353,6 +353,22 @@
   % (namespace, len(d)))
 yield pushkeymod.decodekeys(d)

+@batchable
+def listkeypattern(self, namespace, patterns):
+if not self.capable('pushkey'):
+yield {}, None
+f = future()
+self.ui.debug('preparing listkeys for "%s" with pattern "%s"\n' %
+  (namespace, patterns))
+yield {
+'namespace': encoding.fromlocal(namespace),
+'patterns': encodelist(patterns)
+}, f
+d = f.value
+self.ui.debug('received listkey for "%s": %i bytes\n'
+  % (namespace, len(d)))
+yield pushkeymod.decodekeys(d)
+
 def stream_out(self):
 return self._callstream('stream_out')

@@ -676,7 +692,8 @@
 return repo.opener.tryread('clonebundles.manifest')

 wireprotocaps = ['lookup', 'changegroupsubset', 'branchmap', 'pushkey',
- 'known', 'getbundle', 'unbundlehash', 'batch']
+ 'known', 'getbundle', 'unbundlehash', 'batch',
+ 'listkeypattern']

 def _capabilities(repo, proto):
 """return a list of capabilities for a repo
@@ -791,6 +808,12 @@
 d = repo.listkeys(encoding.tolocal(namespace)).items()
 return pushkeymod.encodekeys(d)

+@wireprotocommand('listkeypattern', 'namespace patterns *')


Why the "*" here? "others" is not used in the function implementation.


+def listkeypattern(repo, proto, namespace, patterns, others):
+patterns = decodelist(patterns)
+d = repo.listkeys(encoding.tolocal(namespace), patterns=patterns).items()
+return pushkeymod.encodekeys(d)
+
 @wireprotocommand('lookup', 'key')
 def lookup(repo, proto, key):
 try:


I think introducing a new wire protocol command is the correct way to
solve this problem (as opposed to introducing a new argument on the
existing command).

However, if we're introducing a new wire protocol command for obtaining
pushkey values, I think we should improve deficiencies in the response
encoding rather than propagate its problems.

The "listkeys" response encoding can't transmit the full range of binary
values. This can lead to larger (and slower) response sizes. For
example, a number of pushkey namespaces exchange lists of nodes. These
have to be represented as hex instead of binary. For pushkey namespaces
like phases or obsolescence that can exchange hundreds or thousands of
nodes, the overhead can add up.
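
To put a number on that overhead, here is a small sketch comparing the raw
and hex sizes of a list of nodes. It assumes 20-byte SHA-1 nodes; the
function name hexoverhead is made up for this illustration, and separators
or other protocol overhead are ignored.

```python
import binascii
import os

NODE_LEN = 20  # bytes in a binary SHA-1 node (assumption for this sketch)

def hexoverhead(count):
    """Return (binary_size, hex_size) in bytes for count random nodes."""
    nodes = [os.urandom(NODE_LEN) for _ in range(count)]
    binsize = sum(len(n) for n in nodes)
    # Hex encoding spends two characters per byte.
    hexsize = sum(len(binascii.hexlify(n)) for n in nodes)
    return binsize, hexsize
```

Hex encoding doubles the payload, so a thousand nodes cost 40000 bytes
instead of 20000 before any compression.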

I think the response from a new listkeys command should be using framing
to encode the key names and values so the full range of binary values
can be efficiently transferred. We may also want a special mechanism to
represent a list of nodes, as avoiding the overhead of framing on fixed
width values would be desirable.
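
One way to picture such framing is a length-prefixed encoding, sketched
below. This is not bundle2's format or any format proposed in this thread;
the function names and the choice of two 32-bit big-endian lengths per
entry are assumptions made purely for illustration.

```python
import struct

HEADER = '>II'  # key length, value length (big-endian, unsigned 32-bit)

def encodeframes(items):
    """Encode (key, value) byte-string pairs as length-prefixed frames."""
    chunks = []
    for key, value in items:
        chunks.append(struct.pack(HEADER, len(key), len(value)))
        chunks.append(key)
        chunks.append(value)
    return b''.join(chunks)

def decodeframes(data):
    """Inverse of encodeframes(); returns a list of (key, value) pairs."""
    items = []
    offset = 0
    headersize = struct.calcsize(HEADER)
    while offset < len(data):
        keylen, vallen = struct.unpack_from(HEADER, data, offset)
        offset += headersize
        key = data[offset:offset + keylen]
        offset += keylen
        value = data[offset:offset + vallen]
        offset += vallen
        items.append((key, value))
    return items
```

Because lengths are explicit, values may contain newlines, tabs, or NUL
bytes without escaping. The eight-byte header per entry is the framing
overhead that a dedicated fixed-width node list would avoid.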

Of course, at the point you introduce a new response encoding, we may
want to call the command "listkeys2." If others agree, I can code up an
implementation and you can add the patterns functionality on top of it.


Sorry to be a bit late to the discussion.

I don't think we should introduce a new wire-protocol command for this.

Individual listkey