Re: File Name Patterns Plan

2016-11-24 Thread FUJIWARA Katsunori

At Fri, 25 Nov 2016 04:22:25 +0900,
FUJIWARA Katsunori wrote:
> 
> At Thu, 24 Nov 2016 17:04:38 +0100,
> Pierre-Yves David wrote:
> > 
> > Recently, Foozy created a Plan page for the matcher issues:
> > 
> > https://www.mercurial-scm.org/wiki/FileNamePatternsPlan
> > 
> > It is a good start but there is a couple of elements that are
> > missing or still a bit fuzzy to me.
> 
> Thank you for comments !
> 
> I'll investigate and update Wiki page later, to prevent my sleepy
> brain from incorrectly thinking :-)
> 
> This reply is just FYI about easy/clear points, to refer until
> updating Wiki page, even though you may know them already.
> 
> We should open new discussion thread citing some of points, which this
> mail metions, in devel-ml ASAP, to avoid scattering discussion log
> here and there, shouldn't we ?

Oops, I overlooked that original mail was posted not to devel-ml :-<
Please forget the last paragraph above !

--
[FUJIWARA Katsunori] fo...@lares.dti.ne.jp
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 5 of 7 upgraderepo V2] repair: migrate revlogs during upgrade

2016-11-24 Thread Gregory Szorc
On Thu, Nov 24, 2016 at 7:28 PM, Gregory Szorc 
wrote:

> # HG changeset patch
> # User Gregory Szorc 
> # Date 1480040072 28800
> #  Thu Nov 24 18:14:32 2016 -0800
> # Node ID a0a73b7b8699a3b831d8274dbfb0d7918b36be54
> # Parent  17c9861cd0084c99bac7071e07fc910373568df3
> repair: migrate revlogs during upgrade
>
> Our next step for in-place upgrade is to migrate store data. Revlogs
> are the biggest source of data within the store and a store is useless
> without them, so we implement their migration first.
>
> Our strategy for migrating revlogs is to walk the store and call
> `revlog.clone()` on each revlog. There are some minor complications.
>
> Because revlogs have different storage options (e.g. changelog has
> generaldelta and delta chains disabled), we need to obtain the
> correct class of revlog so inserted data is encoded properly for its
> type.
>
> Various attempts at implementing progress indicators that didn't lead
> to frustration from false "it's almost done" indicators were made.
>
> I initially used a single progress bar based on number of revlogs.
> However, this quickly churned through all filelogs, got to 99% then
> effectively froze at 99.99% when it got to the manifest.
>
> So I converted the progress bar to total revision count. This was a
> little bit better. But the manifest was still significantly slower
> than filelogs and it took forever to process the last few percent.
>
> I then tried both revision/chunk bytes and raw bytes as the
> denominator. This had the opposite effect: because so much data is in
> manifests, it would churn through filelogs without showing much
> progress. When it got to manifests, it would fill in 90+% of the
> progress bar.
>
> I finally gave up having a unified progress bar and instead implemented
> 3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and
> 1 for changelog revisions. I added extra messages indicating the total
> number of revisions of each so users know there are more progress bars
> coming.
>
> I also added extra messages before and after each stage to give extra
> details about what is happening. Strictly speaking, this isn't
> necessary. But the numbers are impressive. For example, when converting
> a non-generaldelta mozilla-central repository, the messages you see are:
>
>migrating 2475593 total revisions (1833043 in filelogs, 321156 in
> manifests, 321394 in changelog)
>migrating 1.67 GB in store; 2508 GB tracked data
>migrating 267868 filelogs containing 1833043 revisions (1.09 GB in
> store; 57.3 GB tracked data)
>finished migrating 1833043 filelog revisions across 267868 filelogs;
> change in size: -415776 bytes
>migrating 1 manifests containing 321156 revisions (518 MB in store;
> 2451 GB tracked data)
>
> That "2508 GB" figure really blew me away. I had no clue that the raw
> tracked data in mozilla-central was that large. Granted, 2451 GB is in
> the manifest and "only" 57.3 GB is in filelogs. But still.
>

While we're talking about numbers, revlog.clone() calls self.revision() for
every rev. And as part of that it needs to validate the SHA-1 of the
fulltext.

Since we have 2508 GB of data, the overhead of SHA-1 is not insignificant!

It took ~50m of CPU time to perform a `hg debugupgraderepo` on
mozilla-central on my i7-6700K. That's *without* any delta recomputation.

Assuming SHA-1 can hash at 1 GB/s on my machine (which seems to be the
ballbark I've benchmarked it at), that means 2508 of the ~3000s (83.6%) in
`hg debugupgraderepo` were spent doing SHA-1 verification!

On first glance, that percentage seems a bit high (perhaps I'm
underestimating SHA-1's speed).

The farthest I've let a conversion with statprof run before I kill things
is ~10m. But that statprof said we were spending ~32% of samples in
hashing. Considering 97.7% of the repo data is in the manifest, that
percentage can only go up since at least 40m of the ~50m wall time of an
upgrade was in the hash. So it is certainly plausible that SHA-1 hashing is
responsible for ~80% of CPU time during execution. Yikes.

I guess what I'm trying to say is a) really large manifests suck
performance via hashing (go treemanifests!) b) I'll probably do a follow-up
to mitigate (perhaps optionally) the impact of SHA-1 hashing when doing
repo upgrades.


>
> It's worth noting that gratuitous loading of source revlogs in order
> to display numbers and progress bars does serve a purpose: it ensures
> we can open all source revlogs. We don't want to spend several minutes
> copying revlogs only to encounter a permissions error or similar later.
>
> As part of this commit, we also add swapping of the store directory
> to the upgrade function. After revlogs are converted, we move the
> old store into the backup directory then move the temporary repo's
> store into the old store's location. On well-behaved systems, this
> should be 2 atomic operations and the window of inconsistency show be
> 

Re: [PATCH 1 of 7 upgraderepo V2] revlog: add clone method

2016-11-24 Thread Gregory Szorc
On Thu, Nov 24, 2016 at 7:28 PM, Gregory Szorc 
wrote:

> # HG changeset patch
> # User Gregory Szorc 
> # Date 1480031508 28800
> #  Thu Nov 24 15:51:48 2016 -0800
> # Node ID c60995fce14b0e34bd1416dab3712a6c33bb29bb
> # Parent  906a7d8e969552536fffe0df7a5e63bf5d79b34b
> revlog: add clone method
>

There were a ton of comments on version 1 of this series. I can't promise I
addressed them all.

There were a ton of other changes that went into version 2 of this series.

So if you are reviewing this series, please treat this as a version 1 and
look at everything in depth, as the delta since version 1 is very
significant.


>
> Upcoming patches will introduce functionality for in-place
> repository/store "upgrades." Copying the contents of a revlog
> feels sufficiently low-level to warrant being in the revlog
> class. So this commit implements that functionality.
>
> Because full delta recomputation can be *very* expensive (we're
> talking several hours on the Firefox repository), we support
> multiple modes of execution with regards to delta (re)use. This
> will allow repository upgrades to choose the "level" of
> processing/optimization they wish to perform when converting
> revlogs.
>
> It's not obvious from this commit, but "addrevisioncb" will be
> used for progress reporting.
>
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -1818,3 +1818,95 @@ class revlog(object):
>  if not self._inline:
>  res.append(self.datafile)
>  return res
> +
> +DELTAREUSEALWAYS = 'always'
> +DELTAREUSESAMEREVS = 'samerevs'
> +DELTAREUSENEVER = 'never'
> +
> +def clone(self, tr, destrevlog, addrevisioncb=None,
> +  deltareuse=DELTAREUSESAMEREVS, aggressivemergedeltas=None):
> +"""Copy this revlog to another, possibly with format changes.
> +
> +The destination revlog will contain the same revisions and nodes.
> +However, it may not be bit-for-bit identical due to e.g. delta
> encoding
> +differences.
> +
> +The ``deltareuse`` argument control how deltas from the existing
> revlog
> +are preserved in the destination revlog. The argument can have the
> +following values:
> +
> +DELTAREUSEALWAYS
> +   Deltas will always be reused (if possible), even if the
> destination
> +   revlog would not select the same revisions for the delta. This
> is the
> +   fastest mode of operation.
> +DELTAREUSESAMEREVS
> +   Deltas will be reused if the destination revlog would pick the
> same
> +   revisions for the delta. This mode strikes a balance between
> speed
> +   and optimization.
> +DELTAREUSENEVER
> +   Deltas will never be reused. This is the slowest mode of
> execution.
> +
> +Delta computation can be slow, so the choice of delta reuse
> policy can
> +significantly affect run time.
> +
> +The default policy (``DELTAREUSESAMEREVS``) strikes a balance
> between
> +two extremes. Deltas will be reused if they are appropriate. But
> if the
> +delta could choose a better revision, it will do so. This means
> if you
> +are converting a non-generaldelta revlog to a generaldelta revlog,
> +deltas will be recomputed if the delta's parent isn't a parent of
> the
> +revision.
> +
> +In addition to the delta policy, the ``aggressivemergedeltas``
> argument
> +controls whether to compute deltas against both parents for
> merges.
> +By default, the current default is used.
> +"""
> +if deltareuse not in (self.DELTAREUSEALWAYS,
> self.DELTAREUSESAMEREVS,
> +self.DELTAREUSENEVER):
> +raise ValueError(_('value for deltareuse invalid: %s') %
> deltareuse)
> +
> +if len(destrevlog):
> +raise ValueError(_('destination revlog is not empty'))
> +
> +# lazydeltabase controls whether to reuse a cached delta, if
> possible.
> +oldlazydeltabase = destrevlog._lazydeltabase
> +oldamd = destrevlog._aggressivemergedeltas
> +
> +if deltareuse == self.DELTAREUSEALWAYS:
> +destrevlog._lazydeltabase = True
> +elif deltareuse == self.DELTAREUSESAMEREVS:
> +destrevlog._lazydeltabase = False
> +
> +destrevlog._aggressivemergedeltas = aggressivemergedeltas or
> oldamd
> +
> +populatecachedelta = deltareuse in (self.DELTAREUSEALWAYS,
> +self.DELTAREUSESAMEREVS)
> +
> +try:
> +index = self.index
> +for rev in self:
> +entry = index[rev]
> +
> +# Some classes override linkrev to take filtered revs into
> +# account. Use raw entry from index.
> +linkrev = entry[4]
> +p1 = 

[PATCH 3 of 7 upgraderepo V2] repair: determine what upgrade will do

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480035243 28800
#  Thu Nov 24 16:54:03 2016 -0800
# Node ID 328d95ac1fd128359a2665f2959e030ee8cce046
# Parent  4fffbdc2bca22e666482b55562c0e592c8a68027
repair: determine what upgrade will do

This commit is rather large. It started off as multiple commits
but I couldn't figure out a good way to split up the functionality
while still conveying the overall strategy. To the reviewer, I
apologize.

This commit introducts a ton of functionality around determining
how a repository can be upgraded.

The first code you'll see is related to repository requirements and
updates. There are no less than 5 functions returnings sets of
requirements that control upgrading. Why so many functions?
Mainly extensions. These would ideally be module variables. But
extensions will need to inject themselves into the upgrade process.
And if you are going to make 1 set extensible, you might as well
make them all extensible.

Astute readers will see that we don't support "manifestv2" and
"treemanifest" requirements in the upgrade mechanism. I don't have
a great answer for why other than this is a complex set of patches
and I don't want to deal with the complexity of these experimental
features just yet. We can teach the upgrade mechanism about them
later, once the basic upgrade mechanism is in place.

The "upgradefindimprovements" function introduces a mechanism to
return a list of improvements that can be made to a repository.
Each improvement is effectively an action that an upgrade will
perform. Associated with each of these improvements is metadata
that will be used to inform users what's wrong and what an
upgrade will do.

Each "improvement" is categorized as a "deficiency" or an
"optimization." TBH, I'm not thrilled about the terminology and
am receptive to constructive bikeshedding. The main difference
between a "deficiency" and an "optimization" is a deficiency
is always corrected (if it deviates from the current config) and
an "optimization" is an optional action that goes above and beyond
to improve the state of the repository (usually by requiring more
CPU during upgrade).

Our initial set of improvements identifies missing repository
requirements, a single, easily correctable problem with
changelog storage, and a set of "optimizations" related to delta
recalculation.

The main "upgraderepo" function glues all of this code together.
It verifies that the repository is capable of being upgraded by
looking at the state of current and proposed requirements and
every difference therein. It then queries for the list of
improvements and determines which of them will run based on the
current repository state and user arguments. A non-trivial amount
of code is required to print all the improvements and what
actions will be taken by an upgrade. Some of this is factored into
inline functions to facilitate use in a subsequent commit.

I went through numerous iterations of the output format before
settling on a ReST-inspired definition list format. (I used
bulleted lists in the first submission of this commit and could
not get it to format just right.) Even with the various iterations,
I'm still not super thrilled with the format. But, this is a debug*
command, so that should mean we can refine the output without BC
concerns.

Rounding out the code is a series of new tests.

diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py
--- a/mercurial/debugcommands.py
+++ b/mercurial/debugcommands.py
@@ -34,6 +34,7 @@ from . import (
 localrepo,
 lock as lockmod,
 pycompat,
+repair,
 revlog,
 scmutil,
 setdiscovery,
@@ -873,4 +874,4 @@ def debugupgraderepo(ui, repo, run=False
 should complete almost instantaneously and the chances of a consumer being
 unable to access the repository should be low.
 """
-raise error.Abort(_('not yet implemented'))
+return repair.upgraderepo(ui, repo, run=run, optimize=optimize)
diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -360,3 +360,417 @@ def deleteobsmarkers(obsstore, indices):
 newobsstorefile.write(bytes)
 newobsstorefile.close()
 return n
+
+def upgraderequiredsourcerequirements(repo):
+"""Obtain requirements required to be present to upgrade a repo.
+
+An upgrade will not be allowed if the repository doesn't have the
+requirements returned by this function.
+"""
+return set([
+# Introduced in Mercurial 0.9.2.
+'revlogv1',
+# Introduced in Mercurial 0.9.2.
+'store',
+])
+
+def upgradeblocksourcerequirements(repo):
+"""Obtain requirements that will prevent an upgrade from occurring.
+
+An upgrade cannot be performed if the source repository contains a
+requirements in the returned set.
+"""
+return set([
+# The upgrade code does not yet support these experimental features.
+# This is an 

[PATCH 7 of 7 upgraderepo V2] repair: clean up stale lock file from store backup

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480041929 28800
#  Thu Nov 24 18:45:29 2016 -0800
# Node ID de44546b51998c59964a14b4dd470d2fafcdc8da
# Parent  65c8053026cdec8ad60e23caa55c14b05e9e25cb
repair: clean up stale lock file from store backup

Since we did a directory rename on the stores, the source
repository's lock path now references the dest repository's
lock path and the dest repository's lock path now references
a non-existent filename.

So releasing the lock on the source will unlock the dest and
releasing the lock on the dest will no-op because it fails due
to file not found. So we clean up the dest's lock manually.

diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -917,6 +917,12 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
'again\n'))
 scmutil.writerequires(srcrepo.vfs, requirements)
 
+# The lock file from the old store won't be removed because nothing has a
+# reference to its new location. So clean it up manually. Alternatively, we
+# could update srcrepo.svfs and other variables to point to the new
+# location. This is simpler.
+backupvfs.unlink('store/lock')
+
 return backuppath
 
 def upgraderepo(ui, repo, run=False, optimize=None):
diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
--- a/tests/test-upgrade-repo.t
+++ b/tests/test-upgrade-repo.t
@@ -303,7 +303,6 @@ old store should be backed up
   00manifest.i
   data
   fncache
-  lock
   phaseroots
   undo
   undo.backup.fncache
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 5 of 7 upgraderepo V2] repair: migrate revlogs during upgrade

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480040072 28800
#  Thu Nov 24 18:14:32 2016 -0800
# Node ID a0a73b7b8699a3b831d8274dbfb0d7918b36be54
# Parent  17c9861cd0084c99bac7071e07fc910373568df3
repair: migrate revlogs during upgrade

Our next step for in-place upgrade is to migrate store data. Revlogs
are the biggest source of data within the store and a store is useless
without them, so we implement their migration first.

Our strategy for migrating revlogs is to walk the store and call
`revlog.clone()` on each revlog. There are some minor complications.

Because revlogs have different storage options (e.g. changelog has
generaldelta and delta chains disabled), we need to obtain the
correct class of revlog so inserted data is encoded properly for its
type.

Various attempts at implementing progress indicators that didn't lead
to frustration from false "it's almost done" indicators were made.

I initially used a single progress bar based on number of revlogs.
However, this quickly churned through all filelogs, got to 99% then
effectively froze at 99.99% when it got to the manifest.

So I converted the progress bar to total revision count. This was a
little bit better. But the manifest was still significantly slower
than filelogs and it took forever to process the last few percent.

I then tried both revision/chunk bytes and raw bytes as the
denominator. This had the opposite effect: because so much data is in
manifests, it would churn through filelogs without showing much
progress. When it got to manifests, it would fill in 90+% of the
progress bar.

I finally gave up having a unified progress bar and instead implemented
3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and
1 for changelog revisions. I added extra messages indicating the total
number of revisions of each so users know there are more progress bars
coming.

I also added extra messages before and after each stage to give extra
details about what is happening. Strictly speaking, this isn't
necessary. But the numbers are impressive. For example, when converting
a non-generaldelta mozilla-central repository, the messages you see are:

   migrating 2475593 total revisions (1833043 in filelogs, 321156 in manifests, 
321394 in changelog)
   migrating 1.67 GB in store; 2508 GB tracked data
   migrating 267868 filelogs containing 1833043 revisions (1.09 GB in store; 
57.3 GB tracked data)
   finished migrating 1833043 filelog revisions across 267868 filelogs; change 
in size: -415776 bytes
   migrating 1 manifests containing 321156 revisions (518 MB in store; 2451 GB 
tracked data)

That "2508 GB" figure really blew me away. I had no clue that the raw
tracked data in mozilla-central was that large. Granted, 2451 GB is in
the manifest and "only" 57.3 GB is in filelogs. But still.

It's worth noting that gratuitous loading of source revlogs in order
to display numbers and progress bars does serve a purpose: it ensures
we can open all source revlogs. We don't want to spend several minutes
copying revlogs only to encounter a permissions error or similar later.

As part of this commit, we also add swapping of the store directory
to the upgrade function. After revlogs are converted, we move the
old store into the backup directory then move the temporary repo's
store into the old store's location. On well-behaved systems, this
should be 2 atomic operations and the window of inconsistency show be
very narrow.

There are still a few improvements to be made to store copying and
upgrading. But this commit gets the bulk of the work out of the way.

diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -11,15 +11,19 @@ from __future__ import absolute_import
 import errno
 import hashlib
 import tempfile
+import time
 
 from .i18n import _
 from .node import short
 from . import (
 bundle2,
 changegroup,
+changelog,
 error,
 exchange,
+manifest,
 obsolete,
+revlog,
 scmutil,
 util,
 )
@@ -638,6 +642,162 @@ def upgradedetermineactions(repo, improv
 
 return newactions
 
+def _revlogfrompath(repo, path):
+"""Obtain a revlog from a repo path.
+
+An instance of the appropriate class is returned.
+"""
+if path == '00changelog.i':
+return changelog.changelog(repo.svfs)
+elif path.endswith('00manifest.i'):
+mandir = path[:-len('00manifest.i')]
+return manifest.manifestrevlog(repo.svfs, dir=mandir)
+else:
+# Filelogs don't do anything special with settings. So we can use a
+# vanilla revlog.
+return revlog.revlog(repo.svfs, path)
+
+def _copyrevlogs(ui, srcrepo, dstrepo, tr, deltareuse, aggressivemergedeltas):
+"""Copy revlogs between 2 repos."""
+revcount = 0
+srcsize = 0
+srcrawsize = 0
+dstsize = 0
+fcount = 0
+frevcount = 0
+fsrcsize = 0
+frawsize = 0
+fdstsize = 0
+mcount = 0
+mrevcount = 0
+

[PATCH 2 of 7 upgraderepo V2] debugcommands: stub for debugupgraderepo command

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480033449 28800
#  Thu Nov 24 16:24:09 2016 -0800
# Node ID 4fffbdc2bca22e666482b55562c0e592c8a68027
# Parent  c60995fce14b0e34bd1416dab3712a6c33bb29bb
debugcommands: stub for debugupgraderepo command

Currently, if Mercurial introduces a new repository/store feature or
changes behavior of an existing feature, users must perform an
`hg clone` to create a new repository with hopefully the
correct/optimal settings. Unfortunately, even `hg clone` may not
give the correct results. For example, if you do a local `hg clone`,
you may get hardlinks to revlog files that inherit the old state.
If you `hg clone` from a remote or `hg clone --pull`, changegroup
application may bypass some optimization, such as converting to
generaldelta.

Optimizing a repository is harder than it seems and requires more
than a simple `hg` command invocation.

This commit starts the process of changing that. We introduce
`hg debugupgraderepo`, a command that performs an in-place upgrade
of a repository to use new, optimal features. The command is just
a stub right now. Features will be added in subsequent commits.

This commit does foreshadow some of the behavior of the new command,
notably that it doesn't do anything by default and that it takes
arguments that influence what actions it performs. These will be
explained more in subsequent commits.

diff --git a/mercurial/debugcommands.py b/mercurial/debugcommands.py
--- a/mercurial/debugcommands.py
+++ b/mercurial/debugcommands.py
@@ -849,3 +849,28 @@ def debugdeltachain(ui, repo, file_=None
  extradist=extradist, extraratio=extraratio)
 
 fm.end()
+
+@command('debugupgraderepo', [
+('o', 'optimize', [], _('extra optimization to perform'), _('NAME')),
+('', 'run', False, _('performs an upgrade')),
+])
+def debugupgraderepo(ui, repo, run=False, optimize=None):
+"""upgrade a repository to use different features
+
+If no arguments are specified, the repository is evaluated for upgrade
+and a list of problems and potential optimizations is printed.
+
+With ``--run``, a repository upgrade is performed. Behavior of the upgrade
+can be influenced via additional arguments. More details will be provided
+by the command output when run without ``--run``.
+
+During the upgrade, the repository will be locked and no writes will be
+allowed.
+
+At the end of the upgrade, the repository may not be readable while new
+repository data is swapped in. This window will be as long as it takes to
+rename some directories inside the ``.hg`` directory. On most machines, 
this
+should complete almost instantaneously and the chances of a consumer being
+unable to access the repository should be low.
+"""
+raise error.Abort(_('not yet implemented'))
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -109,6 +109,7 @@ Show debug commands if there are no othe
   debugsub
   debugsuccessorssets
   debugtemplate
+  debugupgraderepo
   debugwalk
   debugwireargs
 
@@ -274,6 +275,7 @@ Show all commands + options
   debugsub: rev
   debugsuccessorssets: 
   debugtemplate: rev, define
+  debugupgraderepo: optimize, run
   debugwalk: include, exclude
   debugwireargs: three, four, five, ssh, remotecmd, insecure
   files: rev, print0, include, exclude, template, subrepos
diff --git a/tests/test-help.t b/tests/test-help.t
--- a/tests/test-help.t
+++ b/tests/test-help.t
@@ -916,6 +916,8 @@ Test list of internal help commands
  show set of successors for revision
debugtemplate
  parse and apply a template
+   debugupgraderepo
+ upgrade a repository to use different features
debugwalk show how files match on given patterns
debugwireargs
  (no help text available)
diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
new file mode 100644
--- /dev/null
+++ b/tests/test-upgrade-repo.t
@@ -0,0 +1,5 @@
+  $ hg init empty
+  $ cd empty
+  $ hg debugupgraderepo
+  abort: not yet implemented
+  [255]
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH 4 of 7 upgraderepo V2] repair: begin implementation of in-place upgrading

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480042331 28800
#  Thu Nov 24 18:52:11 2016 -0800
# Node ID 17c9861cd0084c99bac7071e07fc910373568df3
# Parent  328d95ac1fd128359a2665f2959e030ee8cce046
repair: begin implementation of in-place upgrading

Now that all the upgrade planning work is in place, we can start
doing the real work: actually upgrading a repository.

The main goal of this commit is to get the "framework" for running
in-place upgrade actions in place.

Rather than get too clever and low-level with regards to in-place
upgrades, our strategy is to create a new, temporary repository,
copy data to it, then replace the old data with the new. This allows
us to reuse a lot of code in localrepo.py around store interaction,
which will eventually consume the bulk of the upgrade code.

But we have to start small. This patch implements adding new
repository requirements. But it still sets up a temporary
repository and locks it and the source repo before performing the
requirements file swap. This means all the plumbing is in place
to implement store copying in subsequent commits.

diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -10,6 +10,7 @@ from __future__ import absolute_import
 
 import errno
 import hashlib
+import tempfile
 
 from .i18n import _
 from .node import short
@@ -19,6 +20,7 @@ from . import (
 error,
 exchange,
 obsolete,
+scmutil,
 util,
 )
 
@@ -636,6 +638,50 @@ def upgradedetermineactions(repo, improv
 
 return newactions
 
+def _upgraderepo(ui, srcrepo, dstrepo, requirements, actions):
+"""Do the low-level work of upgrading a repository.
+
+The upgrade is effectively performed as a copy between a source
+repository and a temporary destination repository.
+
+The source repository is unmodified for as long as possible so the
+upgrade can abort at any time without causing loss of service for
+readers and without corrupting the source repository.
+"""
+assert srcrepo.currentwlock()
+assert dstrepo.currentwlock()
+
+# TODO copy store
+
+backuppath = tempfile.mkdtemp(prefix='upgradebackup.', dir=srcrepo.path)
+backupvfs = scmutil.vfs(backuppath)
+
+# Make a backup of requires file first, as it is the first to be modified.
+util.copyfile(srcrepo.join('requires'), backupvfs.join('requires'))
+
+# We install an arbitrary requirement that clients must not support
+# as a mechanism to lock out new clients during the data swap. This is
+# better than allowing a client to continue while the repository is in
+# an inconsistent state.
+ui.write(_('marking source repository as being upgraded; clients will be '
+   'unable to read from repository\n'))
+scmutil.writerequires(srcrepo.vfs,
+  srcrepo.requirements | set(['upgradeinprogress']))
+
+ui.write(_('starting in-place swap of repository data\n'))
+ui.write(_('replaced files will be backed up at %s\n') %
+ backuppath)
+
+# TODO do the store swap here.
+
+# We first write the requirements file. Any new requirements will lock
+# out legacy clients.
+ui.write(_('finalizing requirements file and making repository readable '
+   'again\n'))
+scmutil.writerequires(srcrepo.vfs, requirements)
+
+return backuppath
+
 def upgraderepo(ui, repo, run=False, optimize=None):
 """Upgrade a repository in place."""
 # Avoid cycle: cmdutil -> repair -> localrepo -> cmdutil
@@ -774,3 +820,42 @@ def upgraderepo(ui, repo, run=False, opt
 ui.write(_('%s\n   %s\n\n') % (i['name'], i['description']))
 
 return
+
+# Else we're in the run=true case.
+ui.write(_('upgrade will perform the following actions:\n\n'))
+printrequirements()
+printupgradeactions()
+
+ui.write(_('beginning upgrade...\n'))
+with repo.wlock():
+with repo.lock():
+ui.write(_('repository locked and read-only\n'))
+# Our strategy for upgrading the repository is to create a new,
+# temporary repository, write data to it, then do a swap of the
+# data. There are less heavyweight ways to do this, but it is 
easier
+# to create a new repo object than to instantiate all the 
components
+# (like the store) separately.
+tmppath = tempfile.mkdtemp(prefix='upgrade.', dir=repo.path)
+backuppath = None
+try:
+ui.write(_('creating temporary repository to stage migrated '
+   'data: %s\n') % tmppath)
+dstrepo = localrepo.localrepository(repo.baseui,
+path=tmppath,
+create=True)
+
+with dstrepo.wlock():
+with dstrepo.lock():
+backuppath = 

[PATCH 6 of 7 upgraderepo V2] repair: copy non-revlog store files during upgrade

2016-11-24 Thread Gregory Szorc
# HG changeset patch
# User Gregory Szorc 
# Date 1480041290 28800
#  Thu Nov 24 18:34:50 2016 -0800
# Node ID 65c8053026cdec8ad60e23caa55c14b05e9e25cb
# Parent  a0a73b7b8699a3b831d8274dbfb0d7918b36be54
repair: copy non-revlog store files during upgrade

The store contains more than just revlogs. This patch teaches the
upgrade code to copy regular files as well.

As the test changes demonstrate, the phaseroots file is now copied.

diff --git a/mercurial/repair.py b/mercurial/repair.py
--- a/mercurial/repair.py
+++ b/mercurial/repair.py
@@ -10,6 +10,7 @@ from __future__ import absolute_import
 
 import errno
 import hashlib
+import stat
 import tempfile
 import time
 
@@ -798,6 +799,44 @@ def _copyrevlogs(ui, srcrepo, dstrepo, t
 ui.write(_('finished migrating %d total revisions; total change in store '
'size: %s\n') % (revcount, util.bytecount(dstsize - srcsize)))
 
+def _upgradefilterstorefile(srcrepo, dstrepo, requirements, path, mode, st):
+"""Determine whether to copy a store file during upgrade.
+
+This function is called when migrating store files from ``srcrepo`` to
+``dstrepo`` as part of upgrading a repository.
+
+Args:
+  srcrepo: repo we are copying from
+  dstrepo: repo we are copying to
+  requirements: set of requirements for ``dstrepo``
+  path: store file being examined
+  mode: the ``ST_MODE`` file type of ``path``
+  st: ``stat`` data structure for ``path``
+
+Function should return ``True`` if the file is to be copied.
+"""
+# Skip revlogs.
+if path.endswith(('.i', '.d')):
+return False
+# Skip transaction related files.
+if path.startswith('undo'):
+return False
+# Only copy regular files.
+if mode != stat.S_IFREG:
+return False
+# Skip other skipped files.
+if path in ('lock', 'fncache'):
+return False
+
+return True
+
+def _upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements):
+"""Hook point for extensions to perform additional actions during upgrade.
+
+This function is called after revlogs and store files have been copied but
+before the new store is swapped into the original location.
+"""
+
 def _upgraderepo(ui, srcrepo, dstrepo, requirements, actions):
 """Do the low-level work of upgrading a repository.
 
@@ -827,7 +866,18 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
 _copyrevlogs(ui, srcrepo, dstrepo, tr, deltareuse,
  'redeltamultibase' in actions)
 
-# TODO copy non-revlog store files
+# Now copy other files in the store directory.
+for p, kind, st in srcrepo.store.vfs.readdir('', stat=True):
+if not _upgradefilterstorefile(srcrepo, dstrepo, requirements,
+   p, kind, st):
+continue
+
+srcrepo.ui.write(_('copying %s\n') % p)
+src = srcrepo.store.vfs.join(p)
+dst = dstrepo.store.vfs.join(p)
+util.copyfile(src, dst, copystat=True)
+
+_upgradefinishdatamigration(ui, srcrepo, dstrepo, requirements)
 
 ui.write(_('data fully migrated to temporary repository\n'))
 
diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
--- a/tests/test-upgrade-repo.t
+++ b/tests/test-upgrade-repo.t
@@ -241,6 +241,7 @@ Upgrading a repository to generaldelta w
   migrating changelog containing 3 revisions (184 bytes in store; 181 bytes 
tracked data)
   finished migrating 3 changelog revisions; change in size: 0 bytes
   finished migrating 9 total revisions; total change in store size: 0 bytes
+  copying phaseroots
   data fully migrated to temporary repository
   marking source repository as being upgraded; clients will be unable to read 
from repository
   starting in-place swap of repository data
@@ -276,6 +277,7 @@ store directory has files we expect
   00manifest.i
   data
   fncache
+  phaseroots
   undo
   undo.backupfiles
   undo.phaseroots
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines

2016-11-24 Thread Mike Hommey
On Thu, Nov 24, 2016 at 05:52:29PM +, Jun Wu wrote:
> Excerpts from Augie Fackler's message of 2016-11-17 12:42:26 -0500:
> > My own cursory perfbdiff runs suggest this is a perf wash (using
> > `perfbdiff -m 3041e4d59df2` in the mozilla repo). Queued. Thanks!
> 
> I'd mention this series changes the behavior of the diff output. The
> difference was caught by fastannotate test.
> 
> See the below table (old: e1d6aa0e4c3a, new: 8836f13e3c5b):
> 
>a | b | old | new
>   
>a | a |  a  | -a
>a | z | +z  |  a
>a | a |  a  | +z
>  |   | -a  |  a
>   
>a | a | a
>a | a | a
>a |   |-a
> 
> I think we would always prefer putting deletions at the end, to be consistent.

Wouldn't
 a
-a
+z
 a

Be preferable to both old and new? That's what plain diff does, by the
way.

Mike
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 05 of 10] repair: obtain and validate requirements for upgraded repo

2016-11-24 Thread Gregory Szorc
On Mon, Nov 21, 2016 at 12:50 PM, Augie Fackler  wrote:

> On Sat, Nov 05, 2016 at 09:40:21PM -0700, Gregory Szorc wrote:
> > # HG changeset patch
> > # User Gregory Szorc 
> > # Date 1478394961 25200
> > #  Sat Nov 05 18:16:01 2016 -0700
> > # Node ID b768004ef2db9c2e6dd267997e9e0c011f1b732a
> > # Parent  7518e68e2f8276e85fb68174b3055a9dd16c665d
> > repair: obtain and validate requirements for upgraded repo
> >
> > Not all existing repositories can be upgraded. Not all requirements
> > are supported in new repositories. Some transitions between
> > repository requirements (notably removing a requirement) are not
> > supported.
> >
> > This patch adds code for validating that the requirements transition
> > for repository upgrades is acceptable and aborts if it isn't.
> > Functionality is split into various functions to give extensions an
> > opportunity to monkeypatch.
> >
> > diff --git a/mercurial/repair.py b/mercurial/repair.py
> > --- a/mercurial/repair.py
> > +++ b/mercurial/repair.py
> > @@ -401,6 +401,85 @@ def upgradefinddeficiencies(repo):
> >
> >  return l, actions
> >
> > +def upgradesupporteddestrequirements(repo):
> > +"""Obtain requirements that upgrade supports in the destination.
> > +
> > +Extensions should monkeypatch this to add their custom requirements.
> > +"""
> > +return set([
> > +'dotencode',
> > +'fncache',
> > +'generaldelta',
> > +'manifestv2',
> > +'revlogv1',
> > +'store',
> > +'treemanifest',
> > +])
> > +
> > +def upgraderequiredsourcerequirements(repo):
> > +"""Obtain requirements that must be present in the source
> repository."""
> > +return set([
> > +'revlogv1',
> > +'store',
> > +])
> > +
> > +def upgradeallowednewrequirements(repo):
> > +"""Obtain requirements that can be added to a repository.
> > +
> > +This is used to disallow proposed requirements from being added when
> > +they weren't present before.
> > +
> > +We use a whitelist of allowed requirement additions instead of a
> > +blacklist of known bad additions because the whitelist approach is
> > +safer and will prevent future, unknown requirements from
> accidentally
> > +being added.
> > +"""
> > +return set([
> > +'dotencode',
> > +'fncache',
> > +'generaldelta',
> > +])
> > +
> > +def upgradereporequirements(repo):
> > +"""Obtain and validate requirements for repository after upgrade.
> > +
> > +Should raise ``Abort`` if existing or new requirements aren't
> sufficient.
> > +"""
> > +# Avoid cycle: cmdutil -> repair -> localrepo -> cmdutil
> > +from . import localrepo
>
> It looks like there should be relatively minimal work to break the dep
> from localrepo to cmdutil. That dependency definitely looks suspect to
> me (dirstateguard seems like it should be in dirstate.py just from the
> name, and checkunresolved probably belongs in mergemod? or dirstate?)
>
> Not a blocker for this series, but wanted to bring this up. I dream of
> a future without any imports hidden inside a function to avoid cycles.
>
> > +
> > +existing = repo.requirements
> > +
> > +missingreqs = upgraderequiredsourcerequirements(repo) - existing
> > +if missingreqs:
> > +raise error.Abort(_('cannot upgrade repository; requirement '
> > +'missing: %s') % ',
> '.join(sorted(missingreqs)))
> > +
> > +createreqs = localrepo.newreporequirements(repo)
> > +
> > +# This might be overly restrictive. It is definitely safer to
> enforce this
> > +# as it prevents unwanted deletion of requirements.
>
> I can't think of a world where we'd want to allow discarding
> requirements on upgrade.
>

I can :)

In the not so future world, revlogs will integrate with the compression
engine API and revlogs can store data that isn't zlib compressed. This will
require a repository requirement so clients not supporting a compression
engine don't attempt to read unknown data formats. If you convert a repo
from say zstd revlogs to lz4 revlogs, you will almost certainly drop a
requirement related to zstd. This raises an interesting question around how
we define the preferred compression format for future revlog entries. But
that's for a different series :)

TBH, I'm not yet sure how to handle this scenario in the repo upgrade code.
I'll try to come up with something future proof in v2, since this future
world is not so far off.


>
> > +removedreqs = existing - createreqs
> > +if removedreqs:
> > +raise error.Abort(_('cannot upgrade repository; requirement
> would '
> > +'be removed: %s') % ',
> '.join(sorted(removedreqs)))
> > +
> > +unsupportedreqs = createreqs - upgradesupporteddestrequiremen
> ts(repo)
> > +if unsupportedreqs:
> > +raise error.Abort(_('cannot upgrade repository; new requirement
> not '
> 

Re: [PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines

2016-11-24 Thread Gregory Szorc
On Thu, Nov 24, 2016 at 9:52 AM, Jun Wu  wrote:

> Excerpts from Augie Fackler's message of 2016-11-17 12:42:26 -0500:
> > My own cursory perfbdiff runs suggest this is a perf wash (using
> > `perfbdiff -m 3041e4d59df2` in the mozilla repo). Queued. Thanks!
>
> I'd mention this series changes the behavior of the diff output. The
> difference was caught by fastannotate test.
>
> See the below table (old: e1d6aa0e4c3a, new: 8836f13e3c5b):
>
>a | b | old | new
>   
>a | a |  a  | -a
>a | z | +z  |  a
>a | a |  a  | +z
>  |   | -a  |  a
>   
>a | a | a
>a | a | a
>a |   |-a
>
> I think we would always prefer putting deletions at the end, to be
> consistent.
>

And herein lies an important distinction: we currently rely on bdiff for
both internal revlog and changegroup delta storage and user presentation
(we iterate bdiff blocks to drive the unified diff printing in
mdiff._unidiff). These are obviously vastly different use cases, with
internal used wanting to optimize for minimal resource consumption in many
cases and user-facing wanting to optimize for human readability.

We can preserve the format of the user-facing diffs by making the bdiff
behavior conditional on various flags or by post-processing the bdiff
output.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: File Name Patterns Plan

2016-11-24 Thread FUJIWARA Katsunori
At Thu, 24 Nov 2016 17:04:38 +0100,
Pierre-Yves David wrote:
> 
> Recently, Foozy created a Plan page for the matcher issues:
> 
> https://www.mercurial-scm.org/wiki/FileNamePatternsPlan
> 
> It is a good start but there is a couple of elements that are
> missing or still a bit fuzzy to me.

Thank you for comments !

I'll investigate and update Wiki page later, to prevent my sleepy
brain from incorrectly thinking :-)

This reply is just FYI about easy/clear points, to refer until
updating Wiki page, even though you may know them already.

We should open new discussion thread citing some of points, which this
mail metions, in devel-ml ASAP, to avoid scattering discussion log
here and there, shouldn't we ?


> 1) Default matcher:
> 
>What is the default pattern mode ?
> 
>When one do `hg files FOO` how is FOO processed. It seems like to be
>'relpath:'. Double checking this would be useful and mentioning it
>on the page is important.

Basically:

  ===  =
  (default)
  casetype recursion
  ===  =
  -I/-X   glob:   o
  "glob:" in hgignore relglob:o (*1)
  pattern in fileset  glob:   x (*2)
  other "glob:"   glob:   x (*2)
  otherwise   relpath:o (*2) (*3)
  ===  =

(*1) treated as "include" of match.match() internally
(*2) treated as "pats" of match.match() internally
(*3) usually, via scmutil.match() with default="relpath"

But:

> 2) Difference in command behavior:
> 
>There seems to be some commands behaving differently than other,
>notably `hg locates` have some strange kind of
>raw-non-recursive-any-rooted matching by default. It seems to go back to
>'relpath:' when using -I
> 
>I wonder if there is other commands like this. It might be useful to
>search for the default matcher on a command/flag basis.

Oh, I overlooked that:

  - "hg files" uses "relpath:" as default of scmutil.match(), but
  - "hg locate" uses "relglob:" explicitly

(early commits introducing "hg files" may know why)

I'll investigate.

 
> 3) Recursion behavior,
> 
> There is some data about this in the page, but I think we need more
> formal representation to have a clear view of the situation.
> 
> The existing 'path:' and 'relpath:' are recursive in all cases,
> while 'glob:' and 're:' variants are only recursive with -I/-E.
> This is a key point because as far as I understand fixing this is a
> core goal of the current plan.

  while 'glob:' variants are only recursive with -I/-X

  ('re:' is always recursive)

> However, Foozy point out that using 'set:' with -I disable the
> automatic recursion for 're' and 'glob', but not for 'path', so we
> have more "variants" here.

  using 'set:' with -I disable the automatic recursion for 'glob', but
  not for 're' and 'path'

  ('re:' is always recursive)

 
> (bonus point: Rodrigo use case can we fulfilled by adding 'set:' to
> his selector.)
> 
> I also wonder if there is other variants than "pattern", "-I" and
> "-I + set:".
> 
> Having a table with 'pattern-type / usage' listing the recursive
> case would probably be a good start.

I'll investigate.


> 4) Reading from file,
> 
>Foozy mention the pattern name in some file (hgignore) does not
>match pattern name on the command line.
> 
>I think it would be useful to be a bit more formal here. What kind
>of file do we read pattern from? Do we have difference from 1 file
>to another? what are the translation (and default), etc.

match.readpatternfile() substitutes pattern-type in files read in.

glob => relglob
re   => relre

https://www.mercurial-scm.org/repo/hg/file/4.0/mercurial/match.py#l666

In Mercurial core, .hgignore (and files indirectly included by it or
hgrc) is only one case.


> 5) Pattern-type table
> 
>Foozy made many table explaining how variants are covered by
>pattern type. Having a pattern centric summary will be useful.
> 
>Proposal for columns:
> 
>* pattern type;
>* from cli or file;
>* matching mode (raw, glob, or re),
>* rooting (root, cwd or any),
>* recursive when used as Pattern
>* recursive when used with -I
> 
>Having the same table for the proposed keyword would help to
>understand inconsistency and similarity with

I'll update Wiki page.


> 6) file:/dir:
> 
>I'm a bit confused here because Mercurial does not really track/work
>on directories. What is is benefit of 'dir:' ? 'dir:' seems very
>similar to 'path' am I missing something important?
> 
>As I understand 'file:' could be useful for the non-recursive
>part if we want to cover every single cases. Am I right?

Yes, 'file:' is used for strict non-recursive matching. 'dir:' is
listed as opposite of 'file:', for coverage :-)

I have only one example usecase for "dir:". If file and 

Re: [PATCH 1 of 2] debugcommands: introduce standalone module for debug commands

2016-11-24 Thread Gregory Szorc
On Thu, Nov 24, 2016 at 10:39 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 11/23/2016 06:03 PM, Pierre-Yves David wrote:
>
>>
>>
>> On 11/10/2016 06:50 PM, Gregory Szorc wrote:
>>
>>> I started this series a few months ago, told Pierre-Yves about it, and
>>> he encouraged me to start patchbombing.
>>>
>>> While I haven't completed the work, the remainder of what I've started
>>> can be pulled from
>>> https://hg.mozilla.org/users/gszorc_mozilla.com/hg/rev/916144fdac95. I
>>> don't plan on patchbombing the rest, as I don't want to be distracted
>>> from other work.
>>>
>>
>> I've pushed 5 more of these move. I'll keep pushing small piece of it
>> from time to time.
>>
>
> All of them should be in hg-committed by now. Some debug command still
> remains in the commands modules and needs to be migrated.
>

If anyone does that work, I was using the opportunity of the migration to
reorder functions so they are in alphabetical order. I'm not sure if I was
consistent in that endeavor or if Pierre-Yves preserved that property. But
it's something we may want to continue. We may also want to establish a
linting check for it.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] debugcommands: introduce standalone module for debug commands

2016-11-24 Thread Pierre-Yves David



On 11/23/2016 06:03 PM, Pierre-Yves David wrote:



On 11/10/2016 06:50 PM, Gregory Szorc wrote:

I started this series a few months ago, told Pierre-Yves about it, and
he encouraged me to start patchbombing.

While I haven't completed the work, the remainder of what I've started
can be pulled from
https://hg.mozilla.org/users/gszorc_mozilla.com/hg/rev/916144fdac95. I
don't plan on patchbombing the rest, as I don't want to be distracted
from other work.


I've pushed 5 more of these move. I'll keep pushing small piece of it
from time to time.


All of them should be in hg-committed by now. Some debug command still 
remains in the commands modules and needs to be migrated.


Cheers,

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] bdiff: early pruning of common prefix before doing expensive computations

2016-11-24 Thread Pierre-Yves David



On 11/17/2016 10:30 PM, Augie Fackler wrote:

On Thu, Nov 17, 2016 at 08:29:46PM +0100, Mads Kiilerich wrote:

On 11/17/2016 07:53 PM, Gregory Szorc wrote:

On Thu, Nov 17, 2016 at 9:16 AM, Mads Kiilerich > wrote:

   # HG changeset patch
   # User Mads Kiilerich >
   # Date 1479321935 -3600
   #  Wed Nov 16 19:45:35 2016 +0100
   # Node ID 7f39bccc1c96bffc83f3c6e774da57ecd38982a7
   # Parent  fe6a3576b863955a6c40ca46bd1d6c8e5384dedf
   bdiff: early pruning of common prefix before doing expensive
   computations


[...]


   diff --git a/mercurial/bdiff_module.c b/mercurial/bdiff_module.c
   --- a/mercurial/bdiff_module.c
   +++ b/mercurial/bdiff_module.c
   @@ -61,12 +61,12 @@ nomem:

static PyObject *bdiff(PyObject *self, PyObject *args)


Implementing this in the Python module means that CFFI/PyPy won't be able
to realize the perf wins :(

How do you feel about implementing this in bdiff.c?


I guess other logic also should move from bdiff_module to bdiff.c? This was
just the easy way to hook in before the two sides got split into lines.


If you're willing, I'd be a big fan of this change happening in such a
way that pypy gets the wins too. What say you?


So, what is the status of this? Should we expect a V2 with the code 
living in bdiff.c?


Cheers,

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] posix: _really_ make sure .hg/cache/checklink points at a real file

2016-11-24 Thread Pierre-Yves David



On 11/24/2016 12:15 AM, Mads Kiilerich wrote:

On 11/23/2016 11:24 PM, Pierre-Yves David wrote:



On 11/23/2016 11:09 PM, Mads Kiilerich wrote:

# HG changeset patch
# User Mads Kiilerich 
# Date 1479938505 -3600
#  Wed Nov 23 23:01:45 2016 +0100
# Node ID 2841e0a6f97ba09dff5ffe7f42ac8c6e1b23338f
# Parent  8836f13e3c5b8eae765372708b659c55a044cbb4
posix: _really_ make sure .hg/cache/checklink points at a real file

8836f13e3c5b failed to do what it said; it did leave a dangling
symlink. As
promised, that broke setup.py sdist. It also broke stuff on Solaris
where "cp
-r" by default follows symlinks.

Instead, make it point at ../00changelog.i, which is the file that is
most
likely to exist. This adds some extra layering violation ... but not
much, in
an innocent way, and it works ...


Could we just create a empty file right next to this symlink and point
to that? That would seems more independant/robust.



Right.


If I'm understanding your reply correctly, we should expect a V2. I'm 
dropping this from patchwork


--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 7 of 7 v2] bdiff: give slight preference to removing trailing lines

2016-11-24 Thread Jun Wu
Excerpts from Augie Fackler's message of 2016-11-17 12:42:26 -0500:
> My own cursory perfbdiff runs suggest this is a perf wash (using
> `perfbdiff -m 3041e4d59df2` in the mozilla repo). Queued. Thanks!

I'd mention this series changes the behavior of the diff output. The
difference was caught by fastannotate test.

See the below table (old: e1d6aa0e4c3a, new: 8836f13e3c5b):

   a | b | old | new
  
   a | a |  a  | -a
   a | z | +z  |  a
   a | a |  a  | +z
 |   | -a  |  a
  
   a | a | a
   a | a | a
   a |   |-a

I think we would always prefer putting deletions at the end, to be consistent.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH evolve-ext] evolve: improve error message if unstable changes are disallowed

2016-11-24 Thread Pulkit Goyal
On Thu, Nov 24, 2016 at 11:01 PM, Pierre-Yves David
 wrote:
>
>
> On 11/23/2016 05:19 PM, Mateusz Kwapich wrote:
>>
>> Excerpts from Pulkit Goyal's message of 2016-11-23 21:06:29 +0530:
>>>
>>> # HG changeset patch
>>> # User Pulkit Goyal <7895pul...@gmail.com>
>>> # Date 1479915042 -19800
>>> #  Wed Nov 23 21:00:42 2016 +0530
>>> # Node ID 32083f1f0c67341e5b4c22e880b70ccc4e0fc088
>>> # Parent  cb2bac3253fbd52894ffcb4719a148fe6a3da38b
>>> evolve: improve error message if unstable changes are disallowed
>>>
>>> I saw a question on stackoverflow why evolve reports something like
>>> cannot
>>> fold chain not ending with head. Even I was confused the first time about
>>> the
>>> behavior. The error message can be improved to avoid confusion to people
>>> who
>>> are unaware about the config in future.
>>
>>
>> That sounds like a very useful information. It sucks that the errors
>> have newlines in them. I think we can avoid it - see my inline comments.
>
>
> Yet, improving that message is a good idea. As mateuz pointed, we avoid new
> line in error message:
>
>   https://www.mercurial-scm.org/wiki/UIGuideline#error_message
>
> In this case pointing to some help about the config would be a good idea
> (this might involve creating such help o:-) )

Well I send a V2 incorporating Mateusz feedback. I also thought of
referring to experimental.evolution but I couldn't find a good way, or
error message. If config thing is better, drop the V2, I will find a
way to refer to config thing. :)

> Thanks for finding it and working on this.
>
> --
> Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


[PATCH V2] evolve: improve error message if unstable changes are disallowed

2016-11-24 Thread Pulkit Goyal
# HG changeset patch
# User Pulkit Goyal <7895pul...@gmail.com>
# Date 1479915042 -19800
#  Wed Nov 23 21:00:42 2016 +0530
# Node ID 920d5946d13339d9cf4828f678fb55063cd8
# Parent  cb2bac3253fbd52894ffcb4719a148fe6a3da38b
evolve: improve error message if unstable changes are disallowed

I saw a question on stackoverflow why evolve reports something like cannot
fold chain not ending with head. Even I was confused the first time about the
behavior. The error message can be improved to avoid confusion to people who
are unaware about the config in future.

diff -r cb2bac3253fb -r 920d5946d133 hgext/evolve.py
--- a/hgext/evolve.py   Wed Nov 02 18:56:44 2016 +0100
+++ b/hgext/evolve.py   Wed Nov 23 21:00:42 2016 +0530
@@ -2514,7 +2514,8 @@
 raise error.Abort('nothing to prune')
 
 if _disallowednewunstable(repo, revs):
-raise error.Abort(_("cannot prune in the middle of a stack"))
+raise error.Abort(_("cannot prune in the middle of a stack"),
+hint = _("new unstable changesets are not allowed"))
 
 # defines successors changesets
 sucs = scmutil.revrange(repo, succs)
@@ -3234,8 +3235,9 @@
 newunstable = _disallowednewunstable(repo, revs)
 if newunstable:
 raise error.Abort(
-_('cannot edit commit information in the middle of a 
stack'),
-hint=_('%s will be affected') % repo[newunstable.first()])
+_('cannot edit commit information in the middle of a '\
+'stack'), hint=_('%s will become unstable and new 
unstable'\
+' changes are not allowed') % repo[newunstable.first()])
 root = head = repo[revs.first()]
 
 wctx = repo[None]
@@ -3299,7 +3301,8 @@
 head = repo[heads.first()]
 if _disallowednewunstable(repo, revs):
 raise error.Abort(_("cannot fold chain not ending with a head "\
-"or with branching"))
+"or with branching"), hint = _("new unstable"\
+" changesets are not allowed"))
 return root, head
 
 def _disallowednewunstable(repo, revs):
diff -r cb2bac3253fb -r 920d5946d133 tests/test-evolve.t
--- a/tests/test-evolve.t   Wed Nov 02 18:56:44 2016 +0100
+++ b/tests/test-evolve.t   Wed Nov 23 21:00:42 2016 +0530
@@ -1301,9 +1301,11 @@
   created new head
   $ hg prune '26 + 27'
   abort: cannot prune in the middle of a stack
+  (new unstable changesets are not allowed)
   [255]
   $ hg prune '19::28'
   abort: cannot prune in the middle of a stack
+  (new unstable changesets are not allowed)
   [255]
   $ hg prune '26::'
   3 changesets pruned
@@ -1338,9 +1340,11 @@
 
   $ hg fold --exact "19 + 18"
   abort: cannot fold chain not ending with a head or with branching
+  (new unstable changesets are not allowed)
   [255]
   $ hg fold --exact "18::29"
   abort: cannot fold chain not ending with a head or with branching
+  (new unstable changesets are not allowed)
   [255]
   $ hg fold --exact "19::"
   2 changesets folded
@@ -1483,10 +1487,11 @@
 check that metaedit respects allowunstable
   $ hg metaedit '.^' --config 'experimental.evolution=createmarkers, 
allnewcommands'
   abort: cannot edit commit information in the middle of a stack
-  (c904da5245b0 will be affected)
+  (c904da5245b0 will become unstable and new unstable changes are not allowed)
   [255]
   $ hg metaedit '18::20' --fold --config 
'experimental.evolution=createmarkers, allnewcommands'
   abort: cannot fold chain not ending with a head or with branching
+  (new unstable changesets are not allowed)
   [255]
   $ hg metaedit --user foobar
   0 files updated, 0 files merged, 0 files removed, 0 files unresolved
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH v2] censor: flag internal documentation

2016-11-24 Thread Pierre-Yves David



On 11/23/2016 06:39 PM, Remi Chaintron wrote:

# HG changeset patch
# User Remi Chaintron 
# Date 1479922595 0
#  Wed Nov 23 17:36:35 2016 +
# Branch stable
# Node ID 24bcb76c5e6633e740f7dcec8f1ca96b06bcc536
# Parent  819f96b82fa4c4c6d07840a2b180d112b524103f
censor: flag internal documentation


This one is pushed, thanks for the update.

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH evolve-ext] evolve: improve error message if unstable changes are disallowed

2016-11-24 Thread Pierre-Yves David



On 11/23/2016 05:19 PM, Mateusz Kwapich wrote:

Excerpts from Pulkit Goyal's message of 2016-11-23 21:06:29 +0530:

# HG changeset patch
# User Pulkit Goyal <7895pul...@gmail.com>
# Date 1479915042 -19800
#  Wed Nov 23 21:00:42 2016 +0530
# Node ID 32083f1f0c67341e5b4c22e880b70ccc4e0fc088
# Parent  cb2bac3253fbd52894ffcb4719a148fe6a3da38b
evolve: improve error message if unstable changes are disallowed

I saw a question on stackoverflow why evolve reports something like cannot
fold chain not ending with head. Even I was confused the first time about the
behavior. The error message can be improved to avoid confusion to people who
are unaware about the config in future.


That sounds like a very useful information. It sucks that the errors
have newlines in them. I think we can avoid it - see my inline comments.


Yet, improving that message is a good idea. As mateuz pointed, we avoid 
new line in error message:


  https://www.mercurial-scm.org/wiki/UIGuideline#error_message

In this case pointing to some help about the config would be a good idea 
(this might involve creating such help o:-) )


Thanks for finding it and working on this.

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 5 v4] revlog: merge hash checking subfunctions

2016-11-24 Thread Rémi Chaintron
On Thu, 24 Nov 2016 at 16:30 Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 11/23/2016 06:39 PM, Remi Chaintron wrote:
> > # HG changeset patch
> > # User Remi Chaintron 
> > # Date 1479916365 0
> > #  Wed Nov 23 15:52:45 2016 +
> > # Branch stable
> > # Node ID e908dd63d485424df4c4a4482b742d82652e2893
> > # Parent  24bcb76c5e6633e740f7dcec8f1ca96b06bcc536
> > revlog: merge hash checking subfunctions
> >
> > This patch factors the behavior of both methods into 'checkhash' and
> returns the
> > text in order to allow subclasses of revlog and extensions to extend hash
> > computation and handle hash mismatches.
>
> Having something named "checkhash" return the text seems strange to me.
> to me, the 'check' part of the name implies that the function is read
> only//checking a state no extra logic should apply here,
>
> If I remember our sprint discussion right, this text return have been
> introduced for censors. I can think of two ways to move forward, either
> change the 'checkhash' name to refer to this text computation part or
> changing censors to not abuse the 'check' method. What do you think?
>
>
This is a mistake on my part: Although the merge of the 'checkhash'
subfunctions has been updated to not return the text following your
feedback, I forgot to update the commit summary.


> Cheers,
>
> > diff --git a/contrib/perf.py b/contrib/perf.py
> > --- a/contrib/perf.py
> > +++ b/contrib/perf.py
> > @@ -861,7 +861,7 @@
> >  def dohash(text):
> >  if not cache:
> >  r.clearcaches()
> > -r._checkhash(text, node, rev)
> > +r.checkhash(text, node, rev=rev)
> >
> >  def dorevision():
> >  if not cache:
> > diff --git a/mercurial/bundlerepo.py b/mercurial/bundlerepo.py
> > --- a/mercurial/bundlerepo.py
> > +++ b/mercurial/bundlerepo.py
> > @@ -147,7 +147,7 @@
> >  delta = self._chunk(chain.pop())
> >  text = mdiff.patches(text, [delta])
> >
> > -self._checkhash(text, node, rev)
> > +self.checkhash(text, node, rev=rev)
> >  self._cache = (node, rev, text)
> >  return text
> >
> > diff --git a/mercurial/filelog.py b/mercurial/filelog.py
> > --- a/mercurial/filelog.py
> > +++ b/mercurial/filelog.py
> > @@ -104,9 +104,9 @@
> >
> >  return True
> >
> > -def checkhash(self, text, p1, p2, node, rev=None):
> > +def checkhash(self, text, node, p1=None, p2=None, rev=None):
> >  try:
> > -super(filelog, self).checkhash(text, p1, p2, node, rev=rev)
> > +super(filelog, self).checkhash(text, node, p1=p1, p2=p2,
> rev=rev)
> >  except error.RevlogError:
> >  if _censoredtext(text):
> >  raise error.CensoredNodeError(self.indexfile, node,
> text)
> > diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> > --- a/mercurial/revlog.py
> > +++ b/mercurial/revlog.py
> > @@ -1221,9 +1221,7 @@
> >  bins = bins[1:]
> >
> >  text = mdiff.patches(text, bins)
> > -
> > -text = self._checkhash(text, node, rev)
> > -
> > +self.checkhash(text, node, rev=rev)
> >  self._cache = (node, rev, text)
> >  return text
> >
> > @@ -1235,12 +1233,14 @@
> >  """
> >  return hash(text, p1, p2)
> >
> > -def _checkhash(self, text, node, rev):
> > -p1, p2 = self.parents(node)
> > -self.checkhash(text, p1, p2, node, rev)
> > -return text
> > +def checkhash(self, text, node, p1=None, p2=None, rev=None):
> > +"""Check node hash integrity.
> >
> > -def checkhash(self, text, p1, p2, node, rev=None):
> > +Available as a function so that subclasses can extend hash
> mismatch
> > +behaviors as needed.
> > +"""
> > +if p1 is None and p2 is None:
> > +p1, p2 = self.parents(node)
> >  if node != self.hash(text, p1, p2):
> >  revornode = rev
> >  if revornode is None:
> > @@ -1415,7 +1415,7 @@
> >  basetext = self.revision(self.node(baserev), _df=fh)
> >  btext[0] = mdiff.patch(basetext, delta)
> >  try:
> > -self.checkhash(btext[0], p1, p2, node)
> > +self.checkhash(btext[0], node, p1=p1, p2=p2)
> >  if flags & REVIDX_ISCENSORED:
> >  raise RevlogError(_('node %s is not censored') %
> node)
> >  except CensoredNodeError:
> > ___
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> >
>
> --
> Pierre-Yves David
> ___
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
-- 
Rémi
___
Mercurial-devel mailing list

Re: [PATCH 1 of 5 v4] revlog: merge hash checking subfunctions

2016-11-24 Thread Pierre-Yves David



On 11/23/2016 06:39 PM, Remi Chaintron wrote:

# HG changeset patch
# User Remi Chaintron 
# Date 1479916365 0
#  Wed Nov 23 15:52:45 2016 +
# Branch stable
# Node ID e908dd63d485424df4c4a4482b742d82652e2893
# Parent  24bcb76c5e6633e740f7dcec8f1ca96b06bcc536
revlog: merge hash checking subfunctions

This patch factors the behavior of both methods into 'checkhash' and returns the
text in order to allow subclasses of revlog and extensions to extend hash
computation and handle hash mismatches.


Having something named "checkhash" return the text seems strange to me. 
to me, the 'check' part of the name implies that the function is read 
only//checking a state no extra logic should apply here,


If I remember our sprint discussion right, this text return have been 
introduced for censors. I can think of two ways to move forward, either 
change the 'checkhash' name to refer to this text computation part or 
changing censors to not abuse the 'check' method. What do you think?


Cheers,


diff --git a/contrib/perf.py b/contrib/perf.py
--- a/contrib/perf.py
+++ b/contrib/perf.py
@@ -861,7 +861,7 @@
 def dohash(text):
 if not cache:
 r.clearcaches()
-r._checkhash(text, node, rev)
+r.checkhash(text, node, rev=rev)

 def dorevision():
 if not cache:
diff --git a/mercurial/bundlerepo.py b/mercurial/bundlerepo.py
--- a/mercurial/bundlerepo.py
+++ b/mercurial/bundlerepo.py
@@ -147,7 +147,7 @@
 delta = self._chunk(chain.pop())
 text = mdiff.patches(text, [delta])

-self._checkhash(text, node, rev)
+self.checkhash(text, node, rev=rev)
 self._cache = (node, rev, text)
 return text

diff --git a/mercurial/filelog.py b/mercurial/filelog.py
--- a/mercurial/filelog.py
+++ b/mercurial/filelog.py
@@ -104,9 +104,9 @@

 return True

-def checkhash(self, text, p1, p2, node, rev=None):
+def checkhash(self, text, node, p1=None, p2=None, rev=None):
 try:
-super(filelog, self).checkhash(text, p1, p2, node, rev=rev)
+super(filelog, self).checkhash(text, node, p1=p1, p2=p2, rev=rev)
 except error.RevlogError:
 if _censoredtext(text):
 raise error.CensoredNodeError(self.indexfile, node, text)
diff --git a/mercurial/revlog.py b/mercurial/revlog.py
--- a/mercurial/revlog.py
+++ b/mercurial/revlog.py
@@ -1221,9 +1221,7 @@
 bins = bins[1:]

 text = mdiff.patches(text, bins)
-
-text = self._checkhash(text, node, rev)
-
+self.checkhash(text, node, rev=rev)
 self._cache = (node, rev, text)
 return text

@@ -1235,12 +1233,14 @@
 """
 return hash(text, p1, p2)

-def _checkhash(self, text, node, rev):
-p1, p2 = self.parents(node)
-self.checkhash(text, p1, p2, node, rev)
-return text
+def checkhash(self, text, node, p1=None, p2=None, rev=None):
+"""Check node hash integrity.

-def checkhash(self, text, p1, p2, node, rev=None):
+Available as a function so that subclasses can extend hash mismatch
+behaviors as needed.
+"""
+if p1 is None and p2 is None:
+p1, p2 = self.parents(node)
 if node != self.hash(text, p1, p2):
 revornode = rev
 if revornode is None:
@@ -1415,7 +1415,7 @@
 basetext = self.revision(self.node(baserev), _df=fh)
 btext[0] = mdiff.patch(basetext, delta)
 try:
-self.checkhash(btext[0], p1, p2, node)
+self.checkhash(btext[0], node, p1=p1, p2=p2)
 if flags & REVIDX_ISCENSORED:
 raise RevlogError(_('node %s is not censored') % node)
 except CensoredNodeError:
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel



--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


File Name Patterns Plan

2016-11-24 Thread Pierre-Yves David

Recently, Foozy created a Plan page for the matcher issues:

https://www.mercurial-scm.org/wiki/FileNamePatternsPlan

It is a good start but there is a couple of elements that are
missing or still a bit fuzzy to me.


1) Default matcher:

  What is the default pattern mode ?

  When one do `hg files FOO` how is FOO processed. It seems like to be
  'relpath:'. Double checking this would be useful and mentioning it
  on the page is important.

2) Difference in command behavior:

  There seems to be some commands behaving differently than other,
  notably `hg locates` have some strange kind of
  raw-non-recursive-any-rooted matching by default. It seems to go back to
  'relpath:' when using -I

  I wonder if there is other commands like this. It might be useful to
  search for the default matcher on a command/flag basis.

3) Recursion behavior,

   There is some data about this in the page, but I think we need more
   formal representation to have a clear view of the situation.

   The existing 'path:' and 'relpath:' are recursive in all cases,
   while 'glob:' and 're:' variants are only recursive with -I/-E.
   This is a key point because as far as I understand fixing this is a
   core goal of the current plan.

   However, Foozy point out that using 'set:' with -I disable the
   automatic recursion for 're' and 'glob', but not for 'path', so we
   have more "variants" here.

   (bonus point: Rodrigo use case can we fulfilled by adding 'set:' to
   his selector.)

   I also wonder if there is other variants than "pattern", "-I" and
   "-I + set:".

   Having a table with 'pattern-type / usage' listing the recursive
   case would probably be a good start.

4) Reading from file,

  Foozy mention the pattern name in some file (hgignore) does not
  match pattern name on the command line.

  I think it would be useful to be a bit more formal here. What kind
  of file do we read pattern from? Do we have difference from 1 file
  to another? what are the translation (and default), etc.

5) Pattern-type table

  Foozy made many table explaining how variants are covered by
  pattern type. Having a pattern centric summary will be useful.

  Proposal for columns:

  * pattern type;
  * from cli or file;
  * matching mode (raw, glob, or re),
  * rooting (root, cwd or any),
  * recursive when used as Pattern
  * recursive when used with -I

  Having the same table for the proposed keyword would help to
  understand inconsistency and similarity with

6) file:/dir:

  I'm a bit confused here because Mercurial does not really track/work
  on directories. What is is benefit of 'dir:' ? 'dir:' seems very
  similar to 'path' am I missing something important?

  As I understand 'file:' could be useful for the non-recursive
  part if we want to cover every single cases. Am I right?

7) compatibility conclusion

  Getting a whole new set of matcher is a big step that have a
  significant confusion step, we have to get it right

  We cannot change the default behavior (raw string) and this is what
  people will find the most. So we have to be careful about
  inconsistency here because we cannot change the behavior of this
  current default. For example it is probably better that all the new
  matcher very consistent with each other and that the behavior
  mismatch between raw and the new official one is simple to grasp.

  In the same way, I do not think we'll be able to alias the old
  pattern-type to the new-ones. Because we cannot fix recursion
  behavior of the old ones.
  There will be online material with the old one and we won't be able
  to fix them. This is a lesser issue but we should probably keep it
  in mind. (Without any serious backing I expect that pattern for
  hgignore are probably the most documented online).

Cheers,

--
Pierre-Yves David
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2016-11-24 Thread FUJIWARA Katsunori
At Wed, 23 Nov 2016 19:55:16 -0800,
Rodrigo Damazio wrote:
> 
> Hi guys - any comments on the preferred way forward?
> 
> (I do have a follow-up patch for optimizing visitdir accordingly, but don't
> want to send it until this one is agreed upon)

Sorry for long interval !

> On Thu, Nov 17, 2016 at 1:19 PM, Rodrigo Damazio 
> wrote:
> 
> >
> >
> > On Thu, Nov 17, 2016 at 7:52 AM, FUJIWARA Katsunori  > > wrote:
> >
> >>
> >> (sorry for late reply)
> >>
> >> At Wed, 26 Oct 2016 14:02:48 -0700,
> >> Rodrigo Damazio wrote:
> >> >
> >> > On Wed, Oct 26, 2016 at 12:17 AM, FUJIWARA Katsunori <
> >> fo...@lares.dti.ne.jp>
> >> > wrote:
> >> >
> >> > >
> >> > > At Tue, 25 Oct 2016 19:51:59 -0700,
> >> > > Rodrigo Damazio wrote:
> >> > > >
> >> > > > On Tue, Oct 25, 2016 at 4:31 PM, FUJIWARA Katsunori <
> >> > > fo...@lares.dti.ne.jp>
> >> > > > wrote:
> >> > > >
> >> > > > >
> >> > > > > At Mon, 24 Oct 2016 10:34:52 -0700,
> >> > > > > Rodrigo Damazio wrote:
> >>
> >> [snip]
> >>
> >> > > > On the other hand, you assume that newly introduced *path syntaxes
> >> > > > > will be recursive, as below. Would you assume that default
> >> > > > > recursive-ness is different between *glob and *path syntaxes ?
> >> > > > >
> >> > > >
> >> > > > path would be recursive, as will glob that ends with ** or regex
> >> that
> >> > > ends
> >> > > > with .*
> >> > > >
> >> > > >
> >> > > > > > Also, for discussion: I assume the *path patterns will be
> >> recursive
> >> > > when
> >> > > > > > they reference a directory. Do we also want a non-recursive
> >> > > equivalent
> >> > > > > > (rootexact, rootfiles, rootnonrecursive or something like that)?
> >> > >
> >> > > How about adding syntax type "file"/"dir" ?
> >> > >
> >> > >   = = =
> >> > >   type  for recursive for non-recursive
> >> > >   = = =
> >> > >   glob  use "**"  use "*"
> >> > >   reomit "$"  append "$"
> >> > >   path  always(*1)
> >> > >   file    always
> >> > >   dir   always(*2)
> >> > >   = = =
> >> > >
> >> > >   (*1) match against both file and directory
> >> > >   (*2) match against only directory
> >> > >
> >> > > "dir" might be overkill, though :-) (is it useful in resolving name
> >> > > collision at merging or so ?)
> >> > >
> >> >
> >> > foozy, thanks so much for the review and discussion.
> >> > Sounds like we do agree about the glob behavior then, so let me know if
> >> > you'd like any changes to the latest version of this patch, other than
> >> > improving documentation. I'm happy to send an updated version as soon as
> >> > someone is ready to review.
> >> >
> >> > I understand the difference between dir and path (and between the
> >> original
> >> > version of this patch and file) would be that they'd validate the type
> >> of
> >> > entry being matched (so that passing a filename to dir or dir name to
> >> file
> >> > would be an error) - is that what you have in mind?
> >>
> >> Yes > "passing a filename to dir or dir name to file would be an error"
> >>
> >>
> >> > The current matchers
> >> > don't have a good mechanism to verify the type, so some significant
> >> > rewiring would need to be done to pass that information down.
> >>
> >> Current match implement uses two additional pattern suffix '(?:/|$)'
> >> and '$' to control recursive matching of "glob" and "path". The former
> >> allows to match recursively (for "glob" and "path"), and the latter
> >> doesn't (only for "glob").
> >>
> >> I simply think using this technique to implement pattern types "file"
> >> and "dir".
> >>
> >> path:PATTERN => ESCAPED-PATTERN(?:/|$)
> >> file:PATTERN => ESCAPED-PATTERN$
> >> dif:PATTERN  => ESCAPED-PATTERN/
> >>
> >
> > Yes, "files:" was the original version of this patch and the case I really
> > care about :) I changed it to rootglob after your comments.
> > Which way would be preferred to move forward?

"files:" is "path:" family, and "rootglob:" is "glob:" family. As we
concluded before, "path:" itself can't control recursion of matching
well.

Therefore, I think that "files:" should be implemented if needed,
regardless of implementing "rootglob:".

Of course, we need high point view of this area, at first :-)


BTW, it is a little ambiguous (at least, for me) that "files:foo"
matches against both file "foo" and files just under directory
"foo". Name other than "files:" may resolve this ambiguity, but I
don't have any better (and short enough) name :-<

  ==  === ===
  patternfoo  foo/bar foo/bar/baz
  ==  === ===
  path:fooo o o

  files:foo   o o x

  file:fooo x x
  dir:foo x o o
  ==  === ===


--
[FUJIWARA Katsunori] 

Re: [PATCH 1 of 2] dispatch: move part of callcatch to scmutil

2016-11-24 Thread Yuya Nishihara
On Thu, 24 Nov 2016 13:39:25 +, Jun Wu wrote:
> Excerpts from Yuya Nishihara's message of 2016-11-24 22:28:28 +0900:
> > This seems better belonging to the high-level callcatch(), but anyway it 
> > looks
> > like a moot. If C extensions aren't compiled at all, ImportError would be
> > raised earlier.
> 
> demandimport could be a source of (surprising) ImportErrors with 3rd-party
> code. Like hgsubversion imports svn binding, while the binding does not
> exist.

Yeah, but "hg" calls util.setbinary(), where ImportError would be raised if
osutil.so is missing.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 3 of 3] patchbomb: make --git-format-patch imply --plain

2016-11-24 Thread Yuya Nishihara
On Mon, 21 Nov 2016 08:15:44 +0100, Henning Schild wrote:
> Am Mon, 21 Nov 2016 14:17:32 +0900
> schrieb Yuya Nishihara :
> > On Sun, 20 Nov 2016 12:47:44 +0100, Henning Schild wrote:
> > > I have played with templates to get "Signed-off-by"s into commit
> > > messages. That works but you have to configure it somewhere. Meaning
> > > that every fresh clone will require dreadful customization, or that
> > > global configuration will impose on every cloned repo.
> > > To me it seems like maintaining my own templates is harder than
> > > maintaining an mq in a custom mercurial. Using them you are
> > > basically programming out of the tree, in a strange and limited
> > > language. With a lot of references of what you expect the hg code
> > > to be, but no merge conflicts on updates, no testing etc. Thinking
> > > about it this way, templating is a fine mechanism for under the
> > > hood, but should never have been exposed to the configs.
> > > 
> > > Lets assume supporting the git-format-patch format was a wanted
> > > feature, and let us further assume "hg email" already used
> > > templates. hg or hg-git would than come with a template for it, but
> > > how would that get selected?  
> > 
> > Maybe "hg email -Tgit" (assuming we have a stock "git" template.)
> 
> If we can agree on that or another command line option, it would just
> be a matter of changing the first patch. Actually using templates would
> be another series.

It sounds worse to me than introducing --git-format-patch option. -T is a
semi-global option, which should work in the same way across commands. (and
I bet no one would be invested to start implementing the full templating
support just to get rid of the temporary -Tgit hack.)

So IMHO, --git-format-patch is better if we need the git-style patch email
without templates. That isn't nice, but we can mark the option as "(DEPRECATED)"
later.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH shelve-ext v2] shelve: make --keep option survive user intevention (issue5431)

2016-11-24 Thread Yuya Nishihara
On Wed, 23 Nov 2016 15:02:44 -0800, Kostia Balytskyi wrote:
> # HG changeset patch
> # User Kostia Balytskyi 
> # Date 1479941932 28800
> #  Wed Nov 23 14:58:52 2016 -0800
> # Node ID 00a022e44b62cb9340dfac94096b14f77407cefb
> # Parent  db897ddf3a8ebb8df9556ce97de11f6380a9ef0b
> shelve: make --keep option survive user intevention (issue5431)

Queued per Mateusz' review, thanks.

> @@ -782,6 +787,8 @@ def _dounshelve(ui, repo, *shelved, **op
>  
>  try:
>  state = shelvedstate.load(repo)
> +if opts['keep'] is None:

Changed this to opts.get('keep') like other uses.

> +opts['keep'] = state.keep
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 1 of 2] dispatch: move part of callcatch to scmutil

2016-11-24 Thread Yuya Nishihara
On Thu, 24 Nov 2016 01:17:49 +, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu 
> # Date 1479948520 0
> #  Thu Nov 24 00:48:40 2016 +
> # Node ID c76f0d4bdee6bfbd7bda771d5c05939d1d4cb132
> # Parent  7ef2ebf5cdf91192a66b461ff56f653a65e65dd7
> # Available At https://bitbucket.org/quark-zju/hg-draft
> #  hg pull https://bitbucket.org/quark-zju/hg-draft -r 
> c76f0d4bdee6
> dispatch: move part of callcatch to scmutil

Queued these, thanks.

> -except ImportError as inst:
> -ui.warn(_("abort: %s!\n") % inst)
> -m = str(inst).split()[-1]
> -if m in "mpatch bdiff".split():
> -ui.warn(_("(did you forget to compile extensions?)\n"))
> -elif m in "zlib".split():
> -ui.warn(_("(is your Python install correct?)\n"))

This seems better belonging to the high-level callcatch(), but anyway it looks
like a moot. If C extensions aren't compiled at all, ImportError would be
raised earlier.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 01 of 10 layering] dirstateguard: move to new module so I can break some layering violations

2016-11-24 Thread Yuya Nishihara
On Wed, 23 Nov 2016 20:42:26 -0500, timeless wrote:
> Yuya Nishihara wrote:
> > Is copy-and-remove a preferred way for this kind of refactoring? I have
> > an unsent patch that splits revset.py, so I want to know which is better.
> 
> Augie Fackler wrote:
> > I'd prefer it slightly, because it makes the blame history a little bit 
> > more intact in the long run.
> 
> Me too.

Thanks, I'll update my local patches to do that.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] py3: use pycompat.getcwd() instead of os.getcwd()

2016-11-24 Thread Yuya Nishihara
On Wed, 23 Nov 2016 00:05:45 +0530, Pulkit Goyal wrote:
> # HG changeset patch
> # User Pulkit Goyal <7895pul...@gmail.com>
> # Date 1479839591 -19800
> #  Wed Nov 23 00:03:11 2016 +0530
> # Node ID a05a33c431e39ea250261287863f4b6ede91ca91
> # Parent  b8e4cce34013b1f564a334465813e8ef171e8834
> py3: use pycompat.getcwd() instead of os.getcwd()

and queued, thanks.

> --- a/mercurial/commands.py   Tue Nov 22 22:40:37 2016 +0530
> +++ b/mercurial/commands.py   Wed Nov 23 00:03:11 2016 +0530
> @@ -2036,7 +2036,8 @@
>  spaces = opts.get('spaces')
>  dots = opts.get('dots')
>  if file_:
> -rlog = revlog.revlog(scmutil.opener(os.getcwd(), audit=False), file_)
> +rlog = revlog.revlog(scmutil.opener(pycompat.getcwd(),
> +audit=False), file_)

I've moved this change to debugcommands.py.
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH shelve-ext v2] shelve: make --keep option survive user intevention (issue5431)

2016-11-24 Thread Mateusz Kwapich
LGTM

-- 
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding support for repository-root-based globs

2016-11-24 Thread Pierre-Yves David



On 11/24/2016 04:54 AM, Rodrigo Damazio wrote:

Pierre-Yves, hope you had a great time in Morocco :)


I had,


Any other comments here?


Yep, I'll start a thread about the Plan page soon, this topic is now on 
the tip of my "complex discussion" stack and unless Denmark is hit by an 
asteroid I should post something later today.



On Tue, Nov 15, 2016 at 6:21 AM, Pierre-Yves David
>
wrote:



On 11/15/2016 04:59 AM, Rodrigo Damazio Bovendorp via
Mercurial-devel wrote:

# HG changeset patch
# User Rodrigo Damazio Bovendorp >
# Date 1475944120  25200
#  Sat Oct 08 09:28:40 2016 -0700
# Node ID 93434cce258a797fcc3997c0af994a524695e273
# Parent  b032a7b676c6637b2ac6f3ef29431013b15a08f9
match: adding support for repository-root-based globs


I saw that Foozy created a plan page about this, it seem to have
good summary of the current matcher we have but from my reading it
is a bit fuzzy about the current variation we have in behavior from
one command to another and from flag usage. I think it is important
to have a global view of the situation here to be able to
efficiently tackle the issues at hand.

I'm traveling in Marocco with poor internet connectivity until the
end of the week. I would prefer if we could not take a final
discussion until I've time to discuss it more at the beginning of
next week. Sorry for the extra delay.

The broader plan is to add explicit base directories for all
patterns:
  === ===
pattern type root-ed  cwd-ed  any-of-path
  === ===
wildcard rootglob cwdglob anyglob
regexp   rootre   cwdre   anyre
raw string   rootpath cwdpath anypath
  === ===
(table by foozy)

I'm starting by adding rootglob.
One important characteristic and difference from the older glob
types is
that rootglob does a *full* match, meaning that a * at the end
will never
match recursively, even when the glob is used as an include pattern.

diff -r b032a7b676c6 -r 93434cce258a mercurial/help/patterns.txt
--- a/mercurial/help/patterns.txt   Tue Nov 01 18:54:03 2016
-0700
+++ b/mercurial/help/patterns.txt   Sat Oct 08 09:28:40 2016
-0700
@@ -40,6 +40,11 @@
 ``-I`` or ``-X`` options), can match also against directories:
files
 under matched directories are treated as matched.

+For ``-I`` and ``-X`` options, ``glob:`` will match directories
recursively.
+``rootglob:``, on the other end, does a full match, meaning
that all files, in
+directories or subdirectories, will only match if the entire
expression matches.
+In that case, ``**`` can be used to obtain recursiveness.
+
 Plain examples::

   path:foo/bar   a name bar in a directory named foo in the root
@@ -48,13 +53,18 @@

 Glob examples::

-  glob:*.c   any name ending in ".c" in the current directory
-  *.cany name ending in ".c" in the current directory
-  **.c   any name ending in ".c" in any subdirectory of the
- current directory including itself.
-  foo/*.cany name ending in ".c" in the directory foo
-  foo/**.c   any name ending in ".c" in any subdirectory of foo
- including itself.
+  glob:*.cany name ending in ".c" in the current directory
+  *.c any name ending in ".c" in the current directory
+  **.cany name ending in ".c" in any subdirectory
of the
+  current directory including itself.
+  foo/*   any file in directory foo plus all its
subdirectories,
+  recursively
+  foo/*.c any name ending in ".c" in the directory foo
+  foo/**.cany name ending in ".c" in any subdirectory
of foo
+  including itself.
+  rootglob:*.cany name ending in ".c" in the repository root
+  rootglob:foo/*  all files inside foo but not its subdirectories
+  rootglob:foo/** all files inside foo and its subdirectories

 Regexp examples::

diff -r b032a7b676c6 -r 93434cce258a mercurial/match.py
--- a/mercurial/match.pyTue Nov 01 18:54:03 2016 -0700
+++ b/mercurial/match.pySat Oct 08 09:28:40 2016 -0700
@@ -105,6 +105,8 @@
 'glob:' - a glob relative to cwd
 're:' - a regular expression
 'path:' - a