Re: [PATCH] match: adding non-recursive directory matching

2017-01-26 Thread Rodrigo Damazio via Mercurial-devel
All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM.


On Thu, Jan 26, 2017 at 11:24 AM, Martin von Zweigbergk <
martinv...@google.com> wrote:

> On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori
>  wrote:
> >
> > At Wed, 25 Jan 2017 20:54:37 -0800,
> > Martin von Zweigbergk wrote:
> >>
> >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel
> >>  wrote:
> >> > Getting back to this after the end-of-year hiatus (yes, I know it
> happens to
> >> > be during another code freeze :) I seem to have good timing).
> >> >
> >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David
> >> >  wrote:
> >> >>
> >> >>
> >> >>
> >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
> >> >>>
> >> >>> If I got these two pieces right, it looks like we could just
> apply
> >> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your
> >> >>> usecase filled while not jumping into UI changes. Would that
> work
> >> >>> for you ?
> >> >>>
> >> >>>
> >> >>> Not without a third set of changes, since set expansion doesn't use
> >> >>> visitdir (or the matcher being built) at all - the dependency is
> that
> >> >>> building the matcher depends on expanding the set (and thus the set
> >> >>> can't depend on the matcher).
> >> >>> It would technically be doable for re:, but I'm wary of getting
> into the
> >> >>> business of parsing and special-casing regexes to assume what they
> match
> >> >>> or don't.
> >> >>
> >> >>
> >> >> Rodrigo and I chatted directly about this a couple of days ago. Here
> is a
> >> >> quick summary of my new understanding of the situation.
> >> >>
> >> >> Fileset
> >> >> ---
> >> >>
> >> >> Fileset (behind "set:") can give the right result, but it is powered
> by
> >> >> not very modern code, it follow the old revset principle of "get
> everything
> >> >> and then run filters on that everything". That does not fit Rodrigo
> needs at
> >> >> all. It was easy to make 'set:' a bit smarter in the simple case but
> then we
> >> >> get into the issue that the matcher class is using 'set:' in a
> strange,
> >> >> non-lazy, way that does not use all the 'visitdir' feature
> Rodrigo/Google
> >> >> needs.
> >> >>
> >> >> So in short, fileset needs a rework before being usable in a
> demanding
> >> >> context.
> >> >>
> >> >>
> >> >> Current path restriction capability
> >> >> ---
> >> >>
> >> >> The 'Match' class already have logic to restrict the path visited
> >> >> (implemented in the 'visitdir' method). To clarify, this logic as no
> effect
> >> >> on the returned match but is only an optimization for the directory
> we
> >> >> visit. It seems to only kicks in when treemanifest is used.
> >> >> This logic already works with a couple of patterns type (all pattern
> use
> >> >> the same class). However, that logic currently do not support the
> case were
> >> >> one want to select some subdirectory and skips the rest of the
> subtrees
> >> >> under it.
> >> >
> >> >
> >> > That is correct.
> >> >
> >> >> note: Rodrigo, you seems to have a good understanding of the logic.
> Do you
> >> >> think you could document the involved attributes (_includeroots,
> >> >> _includedirs, _excluderoots, etc) That would help a lot the next
> poor souls
> >> >> looking at the code.
> >> >
> >> >
> >> > Sure. It took me a while to understand that "roots" means "recursive
> >> > directories" and "dirs" means "non-recursive directories" in that
> code - it
> >> > all became much more clear after that. I'll be sure to add comments
> in my
> >> > patch and/or rename the attributes.
> >> >
> >> >>
> >> >>
> >> >> Way forward
> >> >> ---
> >> >>
> >> >> That limitation in the matcher class optimization is the main
> blocker for
> >> >> Rodrigo/Google needs. The optimization is independent of the UI part
> we
> >> >> actually provides to user as all patterns use the same matcher class
> and
> >> >> some existing class could already benefit from this optimization.
> >> >>
> >> >> Rodrigo seems to have a patch to update the matcher code to track and
> >> >> optimize the "subdir-but-not-subtree" case. He has not submitted
> this patch
> >> >> yet. Submitting that patches seems the next step to me. It will get
> the
> >> >> matcher code in a state that can actually be used for the
> >> >> narrowhg+treemanifest usecase.
> >> >>
> >> >> Once that code is in, it seems easy to make sure various patterns
> use it
> >> >> basic, easily recognizable cases. We poked at updating the code to
> have
> >> >> basic regexp matching a subtree recognized as such and that was
> quite easy.
> >> >>
> >> >>
> >> >> Rodrigo, does that match your current understanding of the situation?
> >> >
> >> >
> >> > It does.
> >> > And just to clarify on the patches - I sent an initial patch, then
> after
> >> > comments changes it significantly, so those are two different changes:
> >> 

Re: [PATCH] match: adding non-recursive directory matching

2017-01-26 Thread Martin von Zweigbergk via Mercurial-devel
On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori
 wrote:
>
> At Wed, 25 Jan 2017 20:54:37 -0800,
> Martin von Zweigbergk wrote:
>>
>> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel
>>  wrote:
>> > Getting back to this after the end-of-year hiatus (yes, I know it happens 
>> > to
>> > be during another code freeze :) I seem to have good timing).
>> >
>> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David
>> >  wrote:
>> >>
>> >>
>> >>
>> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
>> >>>
>> >>> If I got these two pieces right, it looks like we could just apply
>> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your
>> >>> usecase filled while not jumping into UI changes. Would that work
>> >>> for you ?
>> >>>
>> >>>
>> >>> Not without a third set of changes, since set expansion doesn't use
>> >>> visitdir (or the matcher being built) at all - the dependency is that
>> >>> building the matcher depends on expanding the set (and thus the set
>> >>> can't depend on the matcher).
>> >>> It would technically be doable for re:, but I'm wary of getting into the
>> >>> business of parsing and special-casing regexes to assume what they match
>> >>> or don't.
>> >>
>> >>
>> >> Rodrigo and I chatted directly about this a couple of days ago. Here is a
>> >> quick summary of my new understanding of the situation.
>> >>
>> >> Fileset
>> >> ---
>> >>
>> >> Fileset (behind "set:") can give the right result, but it is powered by
>> >> not very modern code, it follow the old revset principle of "get 
>> >> everything
>> >> and then run filters on that everything". That does not fit Rodrigo needs 
>> >> at
>> >> all. It was easy to make 'set:' a bit smarter in the simple case but then 
>> >> we
>> >> get into the issue that the matcher class is using 'set:' in a strange,
>> >> non-lazy, way that does not use all the 'visitdir' feature Rodrigo/Google
>> >> needs.
>> >>
>> >> So in short, fileset needs a rework before being usable in a demanding
>> >> context.
>> >>
>> >>
>> >> Current path restriction capability
>> >> ---
>> >>
>> >> The 'Match' class already have logic to restrict the path visited
>> >> (implemented in the 'visitdir' method). To clarify, this logic as no 
>> >> effect
>> >> on the returned match but is only an optimization for the directory we
>> >> visit. It seems to only kicks in when treemanifest is used.
>> >> This logic already works with a couple of patterns type (all pattern use
>> >> the same class). However, that logic currently do not support the case 
>> >> were
>> >> one want to select some subdirectory and skips the rest of the subtrees
>> >> under it.
>> >
>> >
>> > That is correct.
>> >
>> >> note: Rodrigo, you seems to have a good understanding of the logic. Do you
>> >> think you could document the involved attributes (_includeroots,
>> >> _includedirs, _excluderoots, etc) That would help a lot the next poor 
>> >> souls
>> >> looking at the code.
>> >
>> >
>> > Sure. It took me a while to understand that "roots" means "recursive
>> > directories" and "dirs" means "non-recursive directories" in that code - it
>> > all became much more clear after that. I'll be sure to add comments in my
>> > patch and/or rename the attributes.
>> >
>> >>
>> >>
>> >> Way forward
>> >> ---
>> >>
>> >> That limitation in the matcher class optimization is the main blocker for
>> >> Rodrigo/Google needs. The optimization is independent of the UI part we
>> >> actually provides to user as all patterns use the same matcher class and
>> >> some existing class could already benefit from this optimization.
>> >>
>> >> Rodrigo seems to have a patch to update the matcher code to track and
>> >> optimize the "subdir-but-not-subtree" case. He has not submitted this 
>> >> patch
>> >> yet. Submitting that patches seems the next step to me. It will get the
>> >> matcher code in a state that can actually be used for the
>> >> narrowhg+treemanifest usecase.
>> >>
>> >> Once that code is in, it seems easy to make sure various patterns use it
>> >> basic, easily recognizable cases. We poked at updating the code to have
>> >> basic regexp matching a subtree recognized as such and that was quite 
>> >> easy.
>> >>
>> >>
>> >> Rodrigo, does that match your current understanding of the situation?
>> >
>> >
>> > It does.
>> > And just to clarify on the patches - I sent an initial patch, then after
>> > comments changes it significantly, so those are two different changes:
>> >
>> > The first implements a "files:" matcher which matches all files inside a
>> > directory, non-recursively. This has no wildcards, so special-casing it in
>> > visitdir and any other places needed results in clean and simple code ("if
>> > it's files:, don't recurse").
>> > The second implements "rootglob:" which allows any number of wildcards at
>> > point in 

[PATCH 1 of 2 remotenames-ext] lazyremotenamedict: make iteritems able to not check whether nodes exist

2017-01-26 Thread Kostia Balytskyi
# HG changeset patch
# User Kostia Balytskyi 
# Date 1485451944 28800
#  Thu Jan 26 09:32:24 2017 -0800
# Node ID 904e442f868455ba9cb66347733804ff3a6d98e0
# Parent  090d16362edd806ec4843c1f0b49d331bec6e524
lazyremotenamedict: make iteritems able to not check whether nodes exist

Similar to `keys()` method, we don't always want to resolve every node
in the changelog, sometimes all we're interested in is the node hash.

diff --git a/remotenames.py b/remotenames.py
--- a/remotenames.py
+++ b/remotenames.py
@@ -367,6 +367,19 @@ class lazyremotenamedict(UserDict.DictMi
 return self.cache.keys()
 return self.potentialentries.keys()
 
+"""Iterate over (name, node) tuples
+
+`resolvekeys` has the same meaning as in `keys()` method"""
+if not self.loaded:
+self._load()
+if resolvekeys is None:
+resolvekeys = self._repo.ui.configbool("remotenames",
+   "resolvekeys", True)
+for k, vtup in self.potentialentries.iteritems():
+if resolvekeys:
+self._fetchandcache(k)
+yield (k, [bin(vtup[0])])
+
 class remotenames(dict):
 """This class encapsulates all the remotenames state. It also contains
 methods to access that state in convenient ways. Remotenames are lazy
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel