At Wed, 25 Jan 2017 20:54:37 -0800, Martin von Zweigbergk wrote: > > On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel > <mercurial-devel@mercurial-scm.org> wrote: > > Getting back to this after the end-of-year hiatus (yes, I know it happens to > > be during another code freeze :) I seem to have good timing). > > > > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David > > <pierre-yves.da...@ens-lyon.org> wrote: > >> > >> > >> > >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote: > >>> > >>> If I got these two pieces right, it looks like we could just apply > >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your > >>> usecase filled while not jumping into UI changes. Would that work > >>> for you ? > >>> > >>> > >>> Not without a third set of changes, since set expansion doesn't use > >>> visitdir (or the matcher being built) at all - the dependency is that > >>> building the matcher depends on expanding the set (and thus the set > >>> can't depend on the matcher). > >>> It would technically be doable for re:, but I'm wary of getting into the > >>> business of parsing and special-casing regexes to assume what they match > >>> or don't. > >> > >> > >> Rodrigo and I chatted directly about this a couple of days ago. Here is a > >> quick summary of my new understanding of the situation. > >> > >> Fileset > >> ------- > >> > >> Fileset (behind "set:") can give the right result, but it is powered by > >> not very modern code, it follow the old revset principle of "get everything > >> and then run filters on that everything". That does not fit Rodrigo needs > >> at > >> all. It was easy to make 'set:' a bit smarter in the simple case but then > >> we > >> get into the issue that the matcher class is using 'set:' in a strange, > >> non-lazy, way that does not use all the 'visitdir' feature Rodrigo/Google > >> needs. > >> > >> So in short, fileset needs a rework before being usable in a demanding > >> context. > >> > >> > >> Current path restriction capability > >> ----------------------------------- > >> > >> The 'Match' class already have logic to restrict the path visited > >> (implemented in the 'visitdir' method). To clarify, this logic as no effect > >> on the returned match but is only an optimization for the directory we > >> visit. It seems to only kicks in when treemanifest is used. > >> This logic already works with a couple of patterns type (all pattern use > >> the same class). However, that logic currently do not support the case were > >> one want to select some subdirectory and skips the rest of the subtrees > >> under it. > > > > > > That is correct. > > > >> note: Rodrigo, you seems to have a good understanding of the logic. Do you > >> think you could document the involved attributes (_includeroots, > >> _includedirs, _excluderoots, etc) That would help a lot the next poor souls > >> looking at the code. > > > > > > Sure. It took me a while to understand that "roots" means "recursive > > directories" and "dirs" means "non-recursive directories" in that code - it > > all became much more clear after that. I'll be sure to add comments in my > > patch and/or rename the attributes. > > > >> > >> > >> Way forward > >> ----------- > >> > >> That limitation in the matcher class optimization is the main blocker for > >> Rodrigo/Google needs. The optimization is independent of the UI part we > >> actually provides to user as all patterns use the same matcher class and > >> some existing class could already benefit from this optimization. > >> > >> Rodrigo seems to have a patch to update the matcher code to track and > >> optimize the "subdir-but-not-subtree" case. He has not submitted this patch > >> yet. Submitting that patches seems the next step to me. It will get the > >> matcher code in a state that can actually be used for the > >> narrowhg+treemanifest usecase. > >> > >> Once that code is in, it seems easy to make sure various patterns use it > >> basic, easily recognizable cases. We poked at updating the code to have > >> basic regexp matching a subtree recognized as such and that was quite easy. > >> > >> > >> Rodrigo, does that match your current understanding of the situation? > > > > > > It does. > > And just to clarify on the patches - I sent an initial patch, then after > > comments changes it significantly, so those are two different changes: > > > > The first implements a "files:" matcher which matches all files inside a > > directory, non-recursively. This has no wildcards, so special-casing it in > > visitdir and any other places needed results in clean and simple code ("if > > it's files:, don't recurse"). > > The second implements "rootglob:" which allows any number of wildcards at > > point in the path, and is part of Foozy's plan for the new set of matchers. > > This adds some complexity in splitting dirs and roots (mentioned above) by > > having to parse the wildcards, and then the visitdir change looks less clean > > ("if it's a rootglob that has a single /* wildcard at the end, then don't > > recurse" - other cases are possible but start to get more complex). > > > > For these reasons, I'd still prefer to get "files:" or similar in, but I'm > > open for doing it either way. Please advise on the preferred way and I'll > > send an updated patch (2 patches really - one for the matcher, one for the > > visitdir optimization which makes it work with narrow). > > I'm fine with not doing rootglob:, but if I read foozy's proposal > right, the proposed files: will be what he would call rootfiles:. I > liked his proposal for a systematic naming, and if I got that right, I > think we should call it that from the beginning so we don't end up > with more aliases.
Yeah, we should avoid confusion of naming ! > I'd also like "rootfiles:foo" to *not* match the > file "foo", but only files in the directory "foo/". I mention that > because last I heard, he was unsure about that himself. Foozy, do you > agree? I don't have strong opinion against mode "XXX:", which matches against both "just this file" and "files directly under this directory" Therefore, I agree with adding new mode "XXX:", if it is needed (and Rodrigo/Google think so). But, name "files:" doesn't seem to suit for "XXX:" mode (at least, for me). Even if it matches against only "files directly under this directory", "files:" doesn't yet. Maybe, root cause of my bad feel is that "foo" of "files:foo" should be the directory in both cases, even though mode name "files:" has less "(under) this directory" flavor. If it is possible to combine 2 modes below for solving issues of Rodrigo/Google, I'm +1 for splitting "XXX:" into them, because naming "YYY:" should be easier than naming "XXX:". - "file:" matching against "just this file" - "YYY:" matching against "files directly under this directory" Are there any better (and short enough) names for "XXX:" or "YYY:" than "files:" ? - "filesin:" (Files-In) for "YYY:" (<=> "files under" as "dir:") ? - "fileorin:" (File-Or-(Files-)In) for "XXX:" ? Of course, I'm also OK with naming "XXX:" or "YYY:" as "files:", to go forward :-) > > > > > Thanks > > Rodrigo > > > > > > _______________________________________________ > > Mercurial-devel mailing list > > Mercurial-devel@mercurial-scm.org > > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel > > > -- ---------------------------------------------------------------------- [FUJIWARA Katsunori] fo...@lares.dti.ne.jp _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel