Re: [PATCH] match: adding non-recursive directory matching
All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM. On Thu, Jan 26, 2017 at 11:24 AM, Martin von Zweigbergk < martinv...@google.com> wrote: > On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori >wrote: > > > > At Wed, 25 Jan 2017 20:54:37 -0800, > > Martin von Zweigbergk wrote: > >> > >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel > >> wrote: > >> > Getting back to this after the end-of-year hiatus (yes, I know it > happens to > >> > be during another code freeze :) I seem to have good timing). > >> > > >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David > >> > wrote: > >> >> > >> >> > >> >> > >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote: > >> >>> > >> >>> If I got these two pieces right, it looks like we could just > apply > >> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your > >> >>> usecase filled while not jumping into UI changes. Would that > work > >> >>> for you ? > >> >>> > >> >>> > >> >>> Not without a third set of changes, since set expansion doesn't use > >> >>> visitdir (or the matcher being built) at all - the dependency is > that > >> >>> building the matcher depends on expanding the set (and thus the set > >> >>> can't depend on the matcher). > >> >>> It would technically be doable for re:, but I'm wary of getting > into the > >> >>> business of parsing and special-casing regexes to assume what they > match > >> >>> or don't. > >> >> > >> >> > >> >> Rodrigo and I chatted directly about this a couple of days ago. Here > is a > >> >> quick summary of my new understanding of the situation. > >> >> > >> >> Fileset > >> >> --- > >> >> > >> >> Fileset (behind "set:") can give the right result, but it is powered > by > >> >> not very modern code, it follow the old revset principle of "get > everything > >> >> and then run filters on that everything". That does not fit Rodrigo > needs at > >> >> all. It was easy to make 'set:' a bit smarter in the simple case but > then we > >> >> get into the issue that the matcher class is using 'set:' in a > strange, > >> >> non-lazy, way that does not use all the 'visitdir' feature > Rodrigo/Google > >> >> needs. > >> >> > >> >> So in short, fileset needs a rework before being usable in a > demanding > >> >> context. > >> >> > >> >> > >> >> Current path restriction capability > >> >> --- > >> >> > >> >> The 'Match' class already have logic to restrict the path visited > >> >> (implemented in the 'visitdir' method). To clarify, this logic as no > effect > >> >> on the returned match but is only an optimization for the directory > we > >> >> visit. It seems to only kicks in when treemanifest is used. > >> >> This logic already works with a couple of patterns type (all pattern > use > >> >> the same class). However, that logic currently do not support the > case were > >> >> one want to select some subdirectory and skips the rest of the > subtrees > >> >> under it. > >> > > >> > > >> > That is correct. > >> > > >> >> note: Rodrigo, you seems to have a good understanding of the logic. > Do you > >> >> think you could document the involved attributes (_includeroots, > >> >> _includedirs, _excluderoots, etc) That would help a lot the next > poor souls > >> >> looking at the code. > >> > > >> > > >> > Sure. It took me a while to understand that "roots" means "recursive > >> > directories" and "dirs" means "non-recursive directories" in that > code - it > >> > all became much more clear after that. I'll be sure to add comments > in my > >> > patch and/or rename the attributes. > >> > > >> >> > >> >> > >> >> Way forward > >> >> --- > >> >> > >> >> That limitation in the matcher class optimization is the main > blocker for > >> >> Rodrigo/Google needs. The optimization is independent of the UI part > we > >> >> actually provides to user as all patterns use the same matcher class > and > >> >> some existing class could already benefit from this optimization. > >> >> > >> >> Rodrigo seems to have a patch to update the matcher code to track and > >> >> optimize the "subdir-but-not-subtree" case. He has not submitted > this patch > >> >> yet. Submitting that patches seems the next step to me. It will get > the > >> >> matcher code in a state that can actually be used for the > >> >> narrowhg+treemanifest usecase. > >> >> > >> >> Once that code is in, it seems easy to make sure various patterns > use it > >> >> basic, easily recognizable cases. We poked at updating the code to > have > >> >> basic regexp matching a subtree recognized as such and that was > quite easy. > >> >> > >> >> > >> >> Rodrigo, does that match your current understanding of the situation? > >> > > >> > > >> > It does. > >> > And just to clarify on the patches - I sent an initial patch, then > after > >> > comments changes it significantly, so those are two different changes: > >>
Re: [PATCH] match: adding non-recursive directory matching
On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunoriwrote: > > At Wed, 25 Jan 2017 20:54:37 -0800, > Martin von Zweigbergk wrote: >> >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel >> wrote: >> > Getting back to this after the end-of-year hiatus (yes, I know it happens >> > to >> > be during another code freeze :) I seem to have good timing). >> > >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David >> > wrote: >> >> >> >> >> >> >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote: >> >>> >> >>> If I got these two pieces right, it looks like we could just apply >> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your >> >>> usecase filled while not jumping into UI changes. Would that work >> >>> for you ? >> >>> >> >>> >> >>> Not without a third set of changes, since set expansion doesn't use >> >>> visitdir (or the matcher being built) at all - the dependency is that >> >>> building the matcher depends on expanding the set (and thus the set >> >>> can't depend on the matcher). >> >>> It would technically be doable for re:, but I'm wary of getting into the >> >>> business of parsing and special-casing regexes to assume what they match >> >>> or don't. >> >> >> >> >> >> Rodrigo and I chatted directly about this a couple of days ago. Here is a >> >> quick summary of my new understanding of the situation. >> >> >> >> Fileset >> >> --- >> >> >> >> Fileset (behind "set:") can give the right result, but it is powered by >> >> not very modern code, it follow the old revset principle of "get >> >> everything >> >> and then run filters on that everything". That does not fit Rodrigo needs >> >> at >> >> all. It was easy to make 'set:' a bit smarter in the simple case but then >> >> we >> >> get into the issue that the matcher class is using 'set:' in a strange, >> >> non-lazy, way that does not use all the 'visitdir' feature Rodrigo/Google >> >> needs. >> >> >> >> So in short, fileset needs a rework before being usable in a demanding >> >> context. >> >> >> >> >> >> Current path restriction capability >> >> --- >> >> >> >> The 'Match' class already have logic to restrict the path visited >> >> (implemented in the 'visitdir' method). To clarify, this logic as no >> >> effect >> >> on the returned match but is only an optimization for the directory we >> >> visit. It seems to only kicks in when treemanifest is used. >> >> This logic already works with a couple of patterns type (all pattern use >> >> the same class). However, that logic currently do not support the case >> >> were >> >> one want to select some subdirectory and skips the rest of the subtrees >> >> under it. >> > >> > >> > That is correct. >> > >> >> note: Rodrigo, you seems to have a good understanding of the logic. Do you >> >> think you could document the involved attributes (_includeroots, >> >> _includedirs, _excluderoots, etc) That would help a lot the next poor >> >> souls >> >> looking at the code. >> > >> > >> > Sure. It took me a while to understand that "roots" means "recursive >> > directories" and "dirs" means "non-recursive directories" in that code - it >> > all became much more clear after that. I'll be sure to add comments in my >> > patch and/or rename the attributes. >> > >> >> >> >> >> >> Way forward >> >> --- >> >> >> >> That limitation in the matcher class optimization is the main blocker for >> >> Rodrigo/Google needs. The optimization is independent of the UI part we >> >> actually provides to user as all patterns use the same matcher class and >> >> some existing class could already benefit from this optimization. >> >> >> >> Rodrigo seems to have a patch to update the matcher code to track and >> >> optimize the "subdir-but-not-subtree" case. He has not submitted this >> >> patch >> >> yet. Submitting that patches seems the next step to me. It will get the >> >> matcher code in a state that can actually be used for the >> >> narrowhg+treemanifest usecase. >> >> >> >> Once that code is in, it seems easy to make sure various patterns use it >> >> basic, easily recognizable cases. We poked at updating the code to have >> >> basic regexp matching a subtree recognized as such and that was quite >> >> easy. >> >> >> >> >> >> Rodrigo, does that match your current understanding of the situation? >> > >> > >> > It does. >> > And just to clarify on the patches - I sent an initial patch, then after >> > comments changes it significantly, so those are two different changes: >> > >> > The first implements a "files:" matcher which matches all files inside a >> > directory, non-recursively. This has no wildcards, so special-casing it in >> > visitdir and any other places needed results in clean and simple code ("if >> > it's files:, don't recurse"). >> > The second implements "rootglob:" which allows any number of wildcards at >> > point in
[PATCH 1 of 2 remotenames-ext] lazyremotenamedict: make iteritems able to not check whether nodes exist
# HG changeset patch # User Kostia Balytskyi# Date 1485451944 28800 # Thu Jan 26 09:32:24 2017 -0800 # Node ID 904e442f868455ba9cb66347733804ff3a6d98e0 # Parent 090d16362edd806ec4843c1f0b49d331bec6e524 lazyremotenamedict: make iteritems able to not check whether nodes exist Similar to `keys()` method, we don't always want to resolve every node in the changelog, sometimes all we're interested in is the node hash. diff --git a/remotenames.py b/remotenames.py --- a/remotenames.py +++ b/remotenames.py @@ -367,6 +367,19 @@ class lazyremotenamedict(UserDict.DictMi return self.cache.keys() return self.potentialentries.keys() +"""Iterate over (name, node) tuples + +`resolvekeys` has the same meaning as in `keys()` method""" +if not self.loaded: +self._load() +if resolvekeys is None: +resolvekeys = self._repo.ui.configbool("remotenames", + "resolvekeys", True) +for k, vtup in self.potentialentries.iteritems(): +if resolvekeys: +self._fetchandcache(k) +yield (k, [bin(vtup[0])]) + class remotenames(dict): """This class encapsulates all the remotenames state. It also contains methods to access that state in convenient ways. Remotenames are lazy ___ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel