Re: Proposal for cleaning up error reporting

2020-07-21 Thread Rodrigo Damazio via Mercurial-devel
On Thu, Jul 16, 2020 at 9:09 AM Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

> The first time I saw this proposal I though:
>
> - That is a great idea, I have been unhappy about these return for a
> long time

- The backward compatibility breakage implication might be huge. We
> probably need to gate this behind config at first.
>

I'm fine with gating some part of it by a config knob, and will leave it up
to the community to decide when to break backwards compatibility in the
future.


> - The proposal is over all good but I have a couple of adjustement and
> improvement in mind.
>

Would love to hear your thoughts.

And from Google's perspective, I don't feel too strongly about the return
codes - I added it because it made sense to have an end-to-end design.
What we'll likely do is intercept the exceptions in our extension and
report any failures by type to our servers, rather than relying on the
actual codes.

Then the paperwork gods swiftly jumped back at me and I got drown in
> end-of-fiscal-year stuff and forgot to reply.
>
> Since this was 1 month ago, I am sending this small answer with my
> general sentiment to avoid letting this fell through the crack again.
> I'll try to find time to give actual feedback and change proposal on
> this soon.
>
> On 6/13/20 2:10 AM, Rodrigo Damazio via Mercurial-devel wrote:
> > Hi everyone.
> >
> > Any comments on this early proposal are most welcome:
> > https://www.mercurial-scm.org/wiki/ErrorCategoriesPlan
> >
> > Our motivation is to measure the user experience our users are actually
> > getting, and have realtime pager alerts if e.g. a new client or server
> > release causes a big change, but not count or alert on errors that are
> > "not our fault", such as bad user input.
> > For instance, we have up to 70% error rates on weekends because some
> > people leave something like "hg status" or "hg log" running in a loop on
> > a terminal, and their credentials expire - those should not be counted
> > at all. Even in the middle of workdays, our running error rate is way
> > more than it "should" be, because we currently measure all non-zero
> > status codes.
> >
> > Thanks
> > Rodrigo
> >
> >
> > ___
> > Mercurial-devel mailing list
> > Mercurial-devel@mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> >
>
> --
> Pierre-Yves David
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: Proposal for cleaning up error reporting

2020-07-15 Thread Rodrigo Damazio via Mercurial-devel
Thanks Augie. Me too.
If nobody has thoughts I'll start sending (large, invasive) code changes
soon :)


On Wed, Jul 15, 2020 at 9:20 AM Augie Fackler  wrote:

> I think this makes sense, but I'd love to hear feedback from non-Google
> people...
>
> On Jun 12, 2020, at 20:10, Rodrigo Damazio via Mercurial-devel <
> mercurial-devel@mercurial-scm.org> wrote:
>
> Hi everyone.
>
> Any comments on this early proposal are most welcome:
> https://www.mercurial-scm.org/wiki/ErrorCategoriesPlan
>
> Our motivation is to measure the user experience our users are actually
> getting, and have realtime pager alerts if e.g. a new client or server
> release causes a big change, but not count or alert on errors that are "not
> our fault", such as bad user input.
> For instance, we have up to 70% error rates on weekends because some
> people leave something like "hg status" or "hg log" running in a loop on a
> terminal, and their credentials expire - those should not be counted at
> all. Even in the middle of workdays, our running error rate is way more
> than it "should" be, because we currently measure all non-zero status codes.
>
> Thanks
> Rodrigo
>
> ___
> Mercurial-devel mailing list
> Mercurial-devel@mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
>
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Proposal for cleaning up error reporting

2020-06-12 Thread Rodrigo Damazio via Mercurial-devel
Hi everyone.

Any comments on this early proposal are most welcome:
https://www.mercurial-scm.org/wiki/ErrorCategoriesPlan

Our motivation is to measure the user experience our users are actually
getting, and have realtime pager alerts if e.g. a new client or server
release causes a big change, but not count or alert on errors that are "not
our fault", such as bad user input.
For instance, we have up to 70% error rates on weekends because some people
leave something like "hg status" or "hg log" running in a loop on a
terminal, and their credentials expire - those should not be counted at
all. Even in the middle of workdays, our running error rate is way more
than it "should" be, because we currently measure all non-zero status codes.

Thanks
Rodrigo


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH 5 of 5] log: populate keywords if specified in custom -Tjson(...) or -Tcbor(...)

2019-10-08 Thread Rodrigo Damazio via Mercurial-devel
Thanks for doing this, it'll be very useful :)

On Tue, Oct 8, 2019 at 10:33 AM Augie Fackler  wrote:

> (+rdamazio because I believe this is of interest)
>
> > On Oct 6, 2019, at 16:00, Yuya Nishihara  wrote:
> >
> > # HG changeset patch
> > # User Yuya Nishihara 
> > # Date 1570388321 <(15)%207038-8321> 14400
> > #  Sun Oct 06 14:58:41 2019 -0400 <(41)%202019-0400>
> > # Node ID debe9628570ddfd9728f2097eb67ea0febaffd22
> > # Parent  7260fcfed7308f15554bbb0ed86e5e36f1a420fd
> > log: populate keywords if specified in custom -Tjson(...) or -Tcbor(...)
>
> These are queued, many many thanks.
>
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH v2] fancyopts: making config defaults actually override defaults

2018-03-03 Thread Rodrigo Damazio via Mercurial-devel
FYI I'm getting back to this old patch, but I'll send it through
phabricator this time.


On Sat, Apr 1, 2017 at 11:07 AM, Yuya Nishihara  wrote:

> On Fri, 24 Mar 2017 00:32:00 -0700, Rodrigo Damazio Bovendorp via
> Mercurial-devel wrote:
> > # HG changeset patch
> > # User Rodrigo Damazio 
> > # Date 1490340211 25200
> > #  Fri Mar 24 00:23:31 2017 -0700
> > # Node ID 60b3222e01f96f91ece7eda9681a89bf3bb930a6
> > # Parent  df82f375fa005b9c71b463182e6b54aa47fa5999
> > fancyopts: making config defaults actually override defaults
>
> Sorry for late review.
>
> This looks generally good to me. I have a few comments inline.
>
> >  c = list(entry[1])
> > +
> > +# override new-style defaults from config file by actually
> changing the
> > +# option defaults.
> > +for idx, opt in enumerate(c):
> > +optname = opt[1]
> > +shortname = opt[0]
> > +defaulttype = type(opt[2])
> > +rawdefault = (
> > +ui.config("commands", "%s.default.%s" % (cmd, optname))
> or
> > +ui.config("commands", "%s.default.%s" % (cmd,
> shortname)))
>
> I prefer not supporting "default." because shortname doesn't
> look
> like a config name. And if we do support it, an empty longname value should
> supersede a shortname:
>
>   ui.config(, default=ui.config())
>
> > +if rawdefault:
> > +# parse the new default using the type of the original
> default.
> > +default = fancyopts.parsevalue(optname, rawdefault,
> defaulttype,
> > +
>  util.parsebool(rawdefault))
>
> Can we use ui.configbool/list/int() instead of parsevalue() ? That will
> provide a slightly better error message.
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH v2] templates: add substring and string length operations

2017-07-18 Thread Rodrigo Damazio via Mercurial-devel
Hold on, I have a newer version I'm finishing which will support slicing
over lists as well, as Yuya suggested.


On Tue, Jul 18, 2017 at 2:01 PM, Kevin Bullock <
kbullock+mercur...@ringworld.org> wrote:

> > On Jul 14, 2017, at 18:42, Rodrigo Damazio Bovendorp via Mercurial-devel
>  wrote:
> >
> > # HG changeset patch
> > # User Rodrigo Damazio Bovendorp 
> > # Date 1500075683 25200
> > #  Fri Jul 14 16:41:23 2017 -0700
> > # Node ID c4bac4ea7b1ea923d6ba4299cd9c974469b39cb0
> > # Parent  c0d8de2724ce6240d2a4241aff78ce2ee92359c2
> > templates: add substring and string length operations
> >
> > This will allow substr(text, start, end) and strlen(text) in templates,
> > permitting various text formatting, such as making a (non-graphing) log
> line be
> > limited to terminal width ("substr(desc, 0, termwidth)")
>
> Looks straightforward, queued, thanks.
>
> pacem in terris / мир / शान्ति / ‎‫سَلاَم‬ / 平和
> Kevin R. Bullock
>
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] templates: add substring and string length operations

2017-07-14 Thread Rodrigo Damazio via Mercurial-devel
Thanks for the review.
We're seeing some weirdness in our script to use gmail's backends directly
- I sent v2 but it apparently never got to the list, so I'll probably try
to switch to Phabricator for v2/v3.

On Fri, Jul 14, 2017 at 7:27 PM, Yuya Nishihara <y...@tcha.org> wrote:

> On Fri, 14 Jul 2017 16:27:49 -0700, Rodrigo Damazio via Mercurial-devel
> wrote:
> > On Fri, Jul 14, 2017 at 3:48 PM, Rodrigo Damazio Bovendorp <
> > rdama...@google.com> wrote:
> > > # HG changeset patch
> > > # User Rodrigo Damazio Bovendorp <rdama...@google.com>
> > > # Date 1500072378 25200
> > > #  Fri Jul 14 15:46:18 2017 -0700
> > > # Node ID 0ccebbd04efbd672fc71df7f52ec243057cbed7d
> > > # Parent  c0d8de2724ce6240d2a4241aff78ce2ee92359c2
> > > templates: add substring and string length operations
>
> > > +@templatefilter('strlen')
> > > +def stringlen(text):
> > > +"""Any text. Turns the value into its length."""
> > > +return len(text)
>
> You can use "str|count" instead.
>

Ah, didn't see that one - I'll remove this then.

> > +@templatefunc('substr(text, start[, end])')
>
> I think substr() generally takes offset and length, not start:end range.
>

Yes, I considered that the start:end (like Python has) was more powerful
because it allows negative numbers to reference the end, whereas offset and
length needs to be calculated in those cases. Would you like me to switch?


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] templates: add substring and string length operations

2017-07-14 Thread Rodrigo Damazio via Mercurial-devel
Hmm hold off, there's an issue here.

On Fri, Jul 14, 2017 at 3:48 PM, Rodrigo Damazio Bovendorp <
rdama...@google.com> wrote:

> # HG changeset patch
> # User Rodrigo Damazio Bovendorp 
> # Date 1500072378 25200
> #  Fri Jul 14 15:46:18 2017 -0700
> # Node ID 0ccebbd04efbd672fc71df7f52ec243057cbed7d
> # Parent  c0d8de2724ce6240d2a4241aff78ce2ee92359c2
> templates: add substring and string length operations
>
> This will allow substr(text, start, end) and strlen(text) in templates,
> permitting various text formatting, such as making a (non-graphing) log
> line be
> limited to terminal width ("substr(desc, 0, termwidth)")
>
> diff --git a/mercurial/templatefilters.py b/mercurial/templatefilters.py
> --- a/mercurial/templatefilters.py
> +++ b/mercurial/templatefilters.py
> @@ -362,6 +362,11 @@
>  return ""
>  return pycompat.bytestr(thing)
>
> +@templatefilter('strlen')
> +def stringlen(text):
> +"""Any text. Turns the value into its length."""
> +return len(text)
> +
>  @templatefilter('stripdir')
>  def stripdir(text):
>  """Treat the text as path and strip a directory level, if
> diff --git a/mercurial/templater.py b/mercurial/templater.py
> --- a/mercurial/templater.py
> +++ b/mercurial/templater.py
> @@ -1015,6 +1015,25 @@
>  # i18n: "sub" is a keyword
>  raise error.ParseError(_("sub got an invalid replacement: %s") %
> rpl)
>
> +@templatefunc('substr(text, start[, end])')
> +def substring(context, mapping, args):
> +"""Returns a substring of the given text. Negative indices reference
> the end
> +of the string."""
> +if len(args) < 2 or len(args) > 3:
> +  raise error.ParseError(_("substring takes 2 or 3 arguments"))
> +
> +text = evalstring(context, mapping, args[0])
> +textlen = len(text)
> +start = evalinteger(context, mapping, args[1],
> +  _("start expects an integer index"))
> +end = -1
> +if len(args) > 2:
> +  end = evalinteger(context, mapping, args[2],
> +_("end expects an integer index"))
> +
> +# Python's [] already handles start and end boundary conditions.
> +return text[start:end]
> +
>  @templatefunc('startswith(pattern, text)')
>  def startswith(context, mapping, args):
>  """Returns the value from the "text" argument
> diff --git a/tests/test-command-template.t b/tests/test-command-template.t
> --- a/tests/test-command-template.t
> +++ b/tests/test-command-template.t
> @@ -4011,6 +4011,35 @@
>o  line 1
>   line 2
>
> +Test stringlen and substring
> +Full desc is "Modify, add, remove, rename".
> +String idxs:  012345678901
> +Reverse string idxs:  10987654321
> +
> +  $ hg log -R a -r . --template '{desc|strlen}\n'
> +  27
> +  $ hg log -R a -r . --template '{substr(desc, 5, 10)}\n'
> +  y, ad
> +  $ hg log -R a -r . --template '{substr(desc, 5, -10)}\n'
> +  y, add, remo
> +  $ hg log -R a -r . --template '{substr(desc, 5, strlen(desc) - 10)}\n'
> +  y, add, remo
> +  $ hg log -R a -r . --template '{substr(desc, -10, -3)}\n'
> +  ve, ren
> +
> +Test substr with invalid indices
> +
> +  $ hg log -R a -r . --template '{substr(desc, 5, 200)}\n'
> +  y, add, remove, rename
> +  $ hg log -R a -r . --template '{substr(desc, 10, 5)}\n'
> +
> +  $ hg log -R a -r . --template '{substr(desc, 100, 200)}\n'
> +
> +  $ hg log -R a -r . --template '{substr(desc, -100, -50)}\n'
> +
> +  $ hg log -R a -r . --template '{substr(desc, -50, -100)}\n'
> +
> +
>  Test bad template with better error message
>
>$ hg log -Gv -R a --template '{desc|user()}'
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] fancyopts: making config defaults actually override defaults

2017-03-24 Thread Rodrigo Damazio via Mercurial-devel
On Thu, Mar 23, 2017 at 3:07 AM, Ryan McElroy <r...@fb.com> wrote:

> On 3/22/17 6:51 PM, Martin von Zweigbergk wrote:
>
>> On Wed, Mar 22, 2017 at 10:14 AM, Rodrigo Damazio <rdama...@google.com>
>> wrote:
>>
>>> On Wed, Mar 22, 2017 at 3:50 AM, Ryan McElroy <r...@fb.com> wrote:
>>>
>>>> Rodrigo: for some reason, patchwork thinks you are Martin. Any idea why?
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwo
>>>> rk.mercurial-2Dscm.org_patch_19133_=DwIBaQ=5VD0RTtNlTh3y
>>>> cd41b3MUw=Jw8rundaE7TbmqBYd1txIQ=qgti-7e8saOlUI0k7ljLZ6H
>>>> Ah4SsKVlgoONeD60WZRA=pKbjIO05blt_27t3IcMhFfzDFctfA-bMMaXrmGmiXJM=
>>>>
>>> No idea, but I'm using a script he created for mailing patches directly
>>> to
>>> gmail's backend servers (to avoid reformatting) - Martin, any chance you
>>> hardcoded your own address there? :)
>>>
>> Nope. IIUC, patchwork has decided to associate any email from
>> mercurial-devel@mercurial-scm.org with me, because it probably first
>> received one from there from me. And the reason it is from
>> mercurial-devel@mercurial-scm.org in the first place is because we
>> (Google) set the DMARC/SPF/whatever headers to prevent spoofing of
>> @google.com addresses, so the mailing list has no choice but to
>> rewrite it as coming from the mailing list itself. I may very well
>> have misunderstood how that works, but hopefully you get the idea
>> anyway.
>>
>> On 3/14/17 10:16 PM, Rodrigo Damazio via Mercurial-devel wrote:
>>>>
>>>> On Tue, Mar 14, 2017 at 12:50 PM, David Soria Parra
>>>> <d...@experimentalworks.net> wrote:
>>>>
>>>>> On Sat, Mar 11, 2017 at 06:03:30PM -0800, Rodrigo Damazio Bovendorp via
>>>>> Mercurial-devel wrote:
>>>>>
>>>>>> # HG changeset patch
>>>>>> # User Rodrigo Damazio <rdama...@google.com>
>>>>>> # Date 1489274373 28800
>>>>>> #  Sat Mar 11 15:19:33 2017 -0800
>>>>>> # Node ID 8c833b81a994e2d3304c3b06793f536355528aab
>>>>>> # Parent  62939e0148f170b67ca8c7374f36c413b67fd387
>>>>>> fancyopts: making config defaults actually override defaults
>>>>>>
>>>>>> Overall this LGTM, and clearly makes defaults much better :).  My
>>>>> concern
>>>>> is that we are encouraging the use of defaults again, while they are
>>>>> deprecated. Defaults have inherent problems that they overwrite
>>>>> arguments
>>>>> which might be mutable exclusive with others (e.g. --graph in incoming
>>>>> and outgoing), or lead to undesired behavior if it's set by an admin.
>>>>> an
>>>>> exmaple
>>>>> is if you would specifiy defaults.update.check=True, the user will not
>>>>> find an
>>>>> --no-check option in the help message or anywhere else. This is not a
>>>>> problem if
>>>>> we assume defaults are alway set by the user and he knows about them.
>>>>>
>>>>
>>>> Thanks for the review.
>>>>
>>>> Yes, we discussed the update --check case specifically during Sprint:
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__public.
>>>> etherpad-2Dmozilla.org_p_sprint-2Dhg4.2=DwIBaQ=5VD0RTtNl
>>>> Th3ycd41b3MUw=Jw8rundaE7TbmqBYd1txIQ=qgti-7e8saOlUI0k7lj
>>>> LZ6HAh4SsKVlgoONeD60WZRA=usophD9VL71sbr6iMbVTWMVQPbQLoK4f-
>>>> qe9si7v3rU=
>>>> (search for "Flags and defaults breakout")
>>>>
>>>>
>>>> Note that I copied the notes over the the wiki for posterity:
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.mer
>>>> curial-2Dscm.org_wiki_4.2sprint_Notes=DwIBaQ=5VD0RTtNlTh
>>>> 3ycd41b3MUw=Jw8rundaE7TbmqBYd1txIQ=qgti-7e8saOlUI0k7ljLZ
>>>> 6HAh4SsKVlgoONeD60WZRA=5pz07una5PeOy9ugeIPoZIk4uGiP7gE5cwdweE6HcTI=
>>>>
>>>>
>>>> If people do some cleanup passes and categorization, that would be
>>>> useful.
>>>> I may contribute here at some point as well.
>>>>
>>>>
>>>> The conclusion was that this gains us the ability to do proper
>>>> single-flag
>>>> overrides, which is good, but doesn't solve all the issues. There are
>>>> other
>>>> changes we also want to make flags and defaults useful again:
>>>> - make the passed-i

Re: [PATCH] fancyopts: making config defaults actually override defaults

2017-03-14 Thread Rodrigo Damazio via Mercurial-devel
On Tue, Mar 14, 2017 at 12:50 PM, David Soria Parra <
d...@experimentalworks.net> wrote:

> On Sat, Mar 11, 2017 at 06:03:30PM -0800, Rodrigo Damazio Bovendorp via
> Mercurial-devel wrote:
> > # HG changeset patch
> > # User Rodrigo Damazio 
> > # Date 1489274373 28800
> > #  Sat Mar 11 15:19:33 2017 -0800
> > # Node ID 8c833b81a994e2d3304c3b06793f536355528aab
> > # Parent  62939e0148f170b67ca8c7374f36c413b67fd387
> > fancyopts: making config defaults actually override defaults
> >
>
> Overall this LGTM, and clearly makes defaults much better :).  My concern
> is that we are encouraging the use of defaults again, while they are
> deprecated. Defaults have inherent problems that they overwrite arguments
> which might be mutable exclusive with others (e.g. --graph in incoming
> and outgoing), or lead to undesired behavior if it's set by an admin. an
> exmaple
> is if you would specifiy defaults.update.check=True, the user will not
> find an
> --no-check option in the help message or anywhere else. This is not a
> problem if
> we assume defaults are alway set by the user and he knows about them.
>

Thanks for the review.

Yes, we discussed the update --check case specifically during Sprint:
https://public.etherpad-mozilla.org/p/sprint-hg4.2
(search for "Flags and defaults breakout")

The conclusion was that this gains us the ability to do proper single-flag
overrides, which is good, but doesn't solve all the issues. There are other
changes we also want to make flags and defaults useful again:
- make the passed-in flag values not be simple primitive types, but rather
enhance them with addition information about where the value is coming
from, so commands like update can decide that an explicit --clean overrides
a system default of --check, and should only fail if both come from the
same level
- we want to add a --no- counterflag for every flag, not just booleans, as
a way to unset it (useful for revision-specifying flags for instance)
- we want to add environment variables to the stack of overrides along with
the different levels of config files and command-line arguments
- we want to try making all positional arguments map to flags (e.g. "hg
update 123" would be equivalent to "hg update -r 123" by making 123 be
passed to the command as the -r flag, also allowing config overrides for
those)
- we want to investigate why we support callables in flag defaults, and
remove support if that's not useful to anyone


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2017-02-06 Thread Rodrigo Damazio via Mercurial-devel
On Sun, Feb 5, 2017 at 4:01 AM, FUJIWARA Katsunori 
wrote:

> At Fri, 3 Feb 2017 20:26:04 -0800,
> Rodrigo Damazio wrote:
> >
> > [1  ]
> > [1.1  ]
> > Finally working on this again.
> > On point which I discussed with Martin offline - which feels more
> intuitive
> > as a prefix, "root" or "abs"? (so, "rootfilesin" or "absfilesin"?) We
> think
> > it's "abs", but wanted to make sure that's OK with others.
>
> In FileNamePatternsPlan page, I choose "root" as a name of point, to
> which patterns are relative ("root", "cwd", and "any").
>
> I'm OK with "abs", if other thinks that it is more intuitive for
> "relative to the root".
>

Alright, I'll keep "root", it sounds more consistent when put that way.
(Augie also seems to prefer root)

BTW, are "cwd" and "any" prefixes are OK with "abs" ?


> > Thanks
> >
> >
> > On Sun, Jan 29, 2017 at 3:15 AM, FUJIWARA Katsunori <
> fo...@lares.dti.ne.jp>
> > wrote:
> >
> > >
> > > At Fri, 27 Jan 2017 15:14:38 -0800,
> > > Rodrigo Damazio wrote:
> > > >
> > > > [1  ]
> > > > [1.1  ]
> > > > On Fri, Jan 27, 2017 at 1:03 AM, FUJIWARA Katsunori <
> > > fo...@lares.dti.ne.jp>
> > > > wrote:
> > > >
> > > > >
> > > > > At Thu, 26 Jan 2017 17:27:17 -0800,
> > > > > Rodrigo Damazio wrote:
> > > > > >
> > > > > > [1  ]
> > > > > > [1.1  ]
> > > > > > All sounds very reasonable, and "filesin:" or "rootfilesin:"
> LGTM.
> > > > >
> > > > > Is it OK for your solution that "rootfilesin:FOO" doesn't match
> > > > > against "file FOO", even though your patch posted in this thread
> made
> > > > > "files:FOO" do so ? or, is combining "rootfile:" and "rootfilesin"
> > > > > acceptable for your solution ?
> > > > >
> > > >
> > > > Yes, not matching files is fine, and actually the easiest to
> implement
> > > (the
> > > > regex is simpler and our custom server doesn't support files anyway).
> > > > For that, rootfilesin:foo/bar can produce regex ^foo/bar/[^/]+$ or
> > > similar
> > > > which would not match a file called bar. visitdir would have to be
> > > updated
> > > > accordingly, of course, but that shouldn't be too hard (and i can
> take
> > > the
> > > > opportunity to add some comments to the code).
> > > >
> > > > If that looks good to you, let me know and I'll send an updated
> patch.
> > >
> > > Sure, LGTM
> > >
> > > --
> > > --
> > > [FUJIWARA Katsunori] fo...@lares.dti.ne.jp
> > >
> > [1.2  ]
> >
> > [2 S/MIME Cryptographic Signature ]
> >
>
> --
> --
> [FUJIWARA Katsunori] fo...@lares.dti.ne.jp
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2017-02-03 Thread Rodrigo Damazio via Mercurial-devel
Finally working on this again.
On point which I discussed with Martin offline - which feels more intuitive
as a prefix, "root" or "abs"? (so, "rootfilesin" or "absfilesin"?) We think
it's "abs", but wanted to make sure that's OK with others.

Thanks


On Sun, Jan 29, 2017 at 3:15 AM, FUJIWARA Katsunori 
wrote:

>
> At Fri, 27 Jan 2017 15:14:38 -0800,
> Rodrigo Damazio wrote:
> >
> > [1  ]
> > [1.1  ]
> > On Fri, Jan 27, 2017 at 1:03 AM, FUJIWARA Katsunori <
> fo...@lares.dti.ne.jp>
> > wrote:
> >
> > >
> > > At Thu, 26 Jan 2017 17:27:17 -0800,
> > > Rodrigo Damazio wrote:
> > > >
> > > > [1  ]
> > > > [1.1  ]
> > > > All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM.
> > >
> > > Is it OK for your solution that "rootfilesin:FOO" doesn't match
> > > against "file FOO", even though your patch posted in this thread made
> > > "files:FOO" do so ? or, is combining "rootfile:" and "rootfilesin"
> > > acceptable for your solution ?
> > >
> >
> > Yes, not matching files is fine, and actually the easiest to implement
> (the
> > regex is simpler and our custom server doesn't support files anyway).
> > For that, rootfilesin:foo/bar can produce regex ^foo/bar/[^/]+$ or
> similar
> > which would not match a file called bar. visitdir would have to be
> updated
> > accordingly, of course, but that shouldn't be too hard (and i can take
> the
> > opportunity to add some comments to the code).
> >
> > If that looks good to you, let me know and I'll send an updated patch.
>
> Sure, LGTM
>
> --
> --
> [FUJIWARA Katsunori] fo...@lares.dti.ne.jp
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2017-01-27 Thread Rodrigo Damazio via Mercurial-devel
On Fri, Jan 27, 2017 at 1:03 AM, FUJIWARA Katsunori <fo...@lares.dti.ne.jp>
wrote:

>
> At Thu, 26 Jan 2017 17:27:17 -0800,
> Rodrigo Damazio wrote:
> >
> > [1  ]
> > [1.1  ]
> > All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM.
>
> Is it OK for your solution that "rootfilesin:FOO" doesn't match
> against "file FOO", even though your patch posted in this thread made
> "files:FOO" do so ? or, is combining "rootfile:" and "rootfilesin"
> acceptable for your solution ?
>

Yes, not matching files is fine, and actually the easiest to implement (the
regex is simpler and our custom server doesn't support files anyway).
For that, rootfilesin:foo/bar can produce regex ^foo/bar/[^/]+$ or similar
which would not match a file called bar. visitdir would have to be updated
accordingly, of course, but that shouldn't be too hard (and i can take the
opportunity to add some comments to the code).

If that looks good to you, let me know and I'll send an updated patch.

> On Thu, Jan 26, 2017 at 11:24 AM, Martin von Zweigbergk <
> > martinv...@google.com> wrote:
> >
> > > On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori
> > > <fo...@lares.dti.ne.jp> wrote:
> > > >
> > > > At Wed, 25 Jan 2017 20:54:37 -0800,
> > > > Martin von Zweigbergk wrote:
> > > >>
> > > >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel
> > > >> <mercurial-devel@mercurial-scm.org> wrote:
> > > >> > Getting back to this after the end-of-year hiatus (yes, I know it
> > > happens to
> > > >> > be during another code freeze :) I seem to have good timing).
> > > >> >
> > > >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David
> > > >> > <pierre-yves.da...@ens-lyon.org> wrote:
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
> > > >> >>>
> > > >> >>> If I got these two pieces right, it looks like we could just
> > > apply
> > > >> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have
> your
> > > >> >>> usecase filled while not jumping into UI changes. Would that
> > > work
> > > >> >>> for you ?
> > > >> >>>
> > > >> >>>
> > > >> >>> Not without a third set of changes, since set expansion doesn't
> use
> > > >> >>> visitdir (or the matcher being built) at all - the dependency is
> > > that
> > > >> >>> building the matcher depends on expanding the set (and thus the
> set
> > > >> >>> can't depend on the matcher).
> > > >> >>> It would technically be doable for re:, but I'm wary of getting
> > > into the
> > > >> >>> business of parsing and special-casing regexes to assume what
> they
> > > match
> > > >> >>> or don't.
> > > >> >>
> > > >> >>
> > > >> >> Rodrigo and I chatted directly about this a couple of days ago.
> Here
> > > is a
> > > >> >> quick summary of my new understanding of the situation.
> > > >> >>
> > > >> >> Fileset
> > > >> >> ---
> > > >> >>
> > > >> >> Fileset (behind "set:") can give the right result, but it is
> powered
> > > by
> > > >> >> not very modern code, it follow the old revset principle of "get
> > > everything
> > > >> >> and then run filters on that everything". That does not fit
> Rodrigo
> > > needs at
> > > >> >> all. It was easy to make 'set:' a bit smarter in the simple case
> but
> > > then we
> > > >> >> get into the issue that the matcher class is using 'set:' in a
> > > strange,
> > > >> >> non-lazy, way that does not use all the 'visitdir' feature
> > > Rodrigo/Google
> > > >> >> needs.
> > > >> >>
> > > >> >> So in short, fileset needs a rework before being usable in a
> > > demanding
> > > >> >> context.
> > > >> >>
> > > >> >>
> > > >> >

Re: [PATCH] match: adding non-recursive directory matching

2017-01-26 Thread Rodrigo Damazio via Mercurial-devel
All sounds very reasonable, and "filesin:" or "rootfilesin:" LGTM.


On Thu, Jan 26, 2017 at 11:24 AM, Martin von Zweigbergk <
martinv...@google.com> wrote:

> On Thu, Jan 26, 2017 at 11:19 AM, FUJIWARA Katsunori
> <fo...@lares.dti.ne.jp> wrote:
> >
> > At Wed, 25 Jan 2017 20:54:37 -0800,
> > Martin von Zweigbergk wrote:
> >>
> >> On Mon, Jan 23, 2017 at 5:02 PM, Rodrigo Damazio via Mercurial-devel
> >> <mercurial-devel@mercurial-scm.org> wrote:
> >> > Getting back to this after the end-of-year hiatus (yes, I know it
> happens to
> >> > be during another code freeze :) I seem to have good timing).
> >> >
> >> > On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David
> >> > <pierre-yves.da...@ens-lyon.org> wrote:
> >> >>
> >> >>
> >> >>
> >> >> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
> >> >>>
> >> >>> If I got these two pieces right, it looks like we could just
> apply
> >> >>> the improvement to 'visitdir' to 'set:your/glob/*' and have your
> >> >>> usecase filled while not jumping into UI changes. Would that
> work
> >> >>> for you ?
> >> >>>
> >> >>>
> >> >>> Not without a third set of changes, since set expansion doesn't use
> >> >>> visitdir (or the matcher being built) at all - the dependency is
> that
> >> >>> building the matcher depends on expanding the set (and thus the set
> >> >>> can't depend on the matcher).
> >> >>> It would technically be doable for re:, but I'm wary of getting
> into the
> >> >>> business of parsing and special-casing regexes to assume what they
> match
> >> >>> or don't.
> >> >>
> >> >>
> >> >> Rodrigo and I chatted directly about this a couple of days ago. Here
> is a
> >> >> quick summary of my new understanding of the situation.
> >> >>
> >> >> Fileset
> >> >> ---
> >> >>
> >> >> Fileset (behind "set:") can give the right result, but it is powered
> by
> >> >> not very modern code, it follow the old revset principle of "get
> everything
> >> >> and then run filters on that everything". That does not fit Rodrigo
> needs at
> >> >> all. It was easy to make 'set:' a bit smarter in the simple case but
> then we
> >> >> get into the issue that the matcher class is using 'set:' in a
> strange,
> >> >> non-lazy, way that does not use all the 'visitdir' feature
> Rodrigo/Google
> >> >> needs.
> >> >>
> >> >> So in short, fileset needs a rework before being usable in a
> demanding
> >> >> context.
> >> >>
> >> >>
> >> >> Current path restriction capability
> >> >> ---
> >> >>
> >> >> The 'Match' class already have logic to restrict the path visited
> >> >> (implemented in the 'visitdir' method). To clarify, this logic as no
> effect
> >> >> on the returned match but is only an optimization for the directory
> we
> >> >> visit. It seems to only kicks in when treemanifest is used.
> >> >> This logic already works with a couple of patterns type (all pattern
> use
> >> >> the same class). However, that logic currently do not support the
> case were
> >> >> one want to select some subdirectory and skips the rest of the
> subtrees
> >> >> under it.
> >> >
> >> >
> >> > That is correct.
> >> >
> >> >> note: Rodrigo, you seems to have a good understanding of the logic.
> Do you
> >> >> think you could document the involved attributes (_includeroots,
> >> >> _includedirs, _excluderoots, etc) That would help a lot the next
> poor souls
> >> >> looking at the code.
> >> >
> >> >
> >> > Sure. It took me a while to understand that "roots" means "recursive
> >> > directories" and "dirs" means "non-recursive directories" in that
> code - it
> >> > all became much more clear after that. I'll be sure to add comments
> in my
> >> > patch and/or rename the attributes.
> >> >
> >> >>
> >&g

Re: [PATCH] match: adding non-recursive directory matching

2017-01-23 Thread Rodrigo Damazio via Mercurial-devel
Getting back to this after the end-of-year hiatus (yes, I know it happens
to be during another code freeze :) I seem to have good timing).

On Tue, Dec 27, 2016 at 2:14 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 12/21/2016 04:21 AM, Rodrigo Damazio wrote:
>
>> If I got these two pieces right, it looks like we could just apply
>> the improvement to 'visitdir' to 'set:your/glob/*' and have your
>> usecase filled while not jumping into UI changes. Would that work
>> for you ?
>>
>>
>> Not without a third set of changes, since set expansion doesn't use
>> visitdir (or the matcher being built) at all - the dependency is that
>> building the matcher depends on expanding the set (and thus the set
>> can't depend on the matcher).
>> It would technically be doable for re:, but I'm wary of getting into the
>> business of parsing and special-casing regexes to assume what they match
>> or don't.
>>
>
> Rodrigo and I chatted directly about this a couple of days ago. Here is a
> quick summary of my new understanding of the situation.
>
> Fileset
> ---
>
> Fileset (behind "set:") can give the right result, but it is powered by
> not very modern code, it follow the old revset principle of "get everything
> and then run filters on that everything". That does not fit Rodrigo needs
> at all. It was easy to make 'set:' a bit smarter in the simple case but
> then we get into the issue that the matcher class is using 'set:' in a
> strange, non-lazy, way that does not use all the 'visitdir' feature
> Rodrigo/Google needs.
>
> So in short, fileset needs a rework before being usable in a demanding
> context.
>
>
> Current path restriction capability
> ---
>
> The 'Match' class already have logic to restrict the path visited
> (implemented in the 'visitdir' method). To clarify, this logic as no effect
> on the returned match but is only an optimization for the directory we
> visit. It seems to only kicks in when treemanifest is used.
> This logic already works with a couple of patterns type (all pattern use
> the same class). However, that logic currently do not support the case were
> one want to select some subdirectory and skips the rest of the subtrees
> under it.
>

That is correct.

note: Rodrigo, you seems to have a good understanding of the logic. Do you
> think you could document the involved attributes (_includeroots,
> _includedirs, _excluderoots, etc) That would help a lot the next poor souls
> looking at the code.
>

Sure. It took me a while to understand that "roots" means "recursive
directories" and "dirs" means "non-recursive directories" in that code - it
all became much more clear after that. I'll be sure to add comments in my
patch and/or rename the attributes.


>
> Way forward
> ---
>
> That limitation in the matcher class optimization is the main blocker for
> Rodrigo/Google needs. The optimization is independent of the UI part we
> actually provides to user as all patterns use the same matcher class and
> some existing class could already benefit from this optimization.
>
> Rodrigo seems to have a patch to update the matcher code to track and
> optimize the "subdir-but-not-subtree" case. He has not submitted this patch
> yet. Submitting that patches seems the next step to me. It will get the
> matcher code in a state that can actually be used for the
> narrowhg+treemanifest usecase.
>
> Once that code is in, it seems easy to make sure various patterns use it
> basic, easily recognizable cases. We poked at updating the code to have
> basic regexp matching a subtree recognized as such and that was quite easy.
>
>
> Rodrigo, does that match your current understanding of the situation?
>

It does.
And just to clarify on the patches - I sent an initial patch, then after
comments changes it significantly, so those are two different changes:

   - The first implements a "files:" matcher which matches all files inside
   a directory, non-recursively. This has no wildcards, so special-casing it
   in visitdir and any other places needed results in clean and simple code
   ("if it's files:, don't recurse").
   - The second implements "rootglob:" which allows any number of wildcards
   at point in the path, and is part of Foozy's plan for the new set of
   matchers. This adds some complexity in splitting dirs and roots (mentioned
   above) by having to parse the wildcards, and then the visitdir change looks
   less clean ("if it's a rootglob that has a single /* wildcard at the end,
   then don't recurse" - other cases are possible but start to get more
   complex).

For these reasons, I'd still prefer to get "files:" or similar in, but I'm
open for doing it either way. Please advise on the preferred way and I'll
send an updated patch (2 patches really - one for the matcher, one for the
visitdir optimization which makes it work with narrow).

Thanks
Rodrigo


smime.p7s
Description: S/MIME Cryptographic Signature

Re: [PATCH] match: adding non-recursive directory matching

2016-12-20 Thread Rodrigo Damazio via Mercurial-devel
On Tue, Dec 20, 2016 at 5:47 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 12/20/2016 06:00 AM, Rodrigo Damazio wrote:
>
>> Unfortunately, while set would match the right files, because of the way
>> the code is structured, it provides no way to not try visiting the
>> directories inside the non-recursive match - the set needs to first
>> collect all the files in all subdirectories (match.py, _expandset) and
>> then filter that down to the desired ones. In plain hg repos, that's
>> just much slower - in the context of narrowhg, the repo will simply not
>> have the manifests for those subdirectories and trying to visit them
>> will crash.
>>
>
> Okay, so this seems like the current tools allow you to specify the right
> request but shortcoming of the -implementation- are preventing that request
> to work probably with narrowhg (and have performance impacts)
>
> Did I got that right ?


Yes.

The follow-up change to this one (which I haven't sent yet but is
>> written) is updating visitdir to allow non-recursiveness, which btw
>> makes something like "hg files -I rootglob:browser/*" about 4-5x faster
>> in the firefox repo.
>>
>
> And, If I read you right, the implementation of 'rootglob:' you provided
> in your patch have the same implementation issue, but you have another
> patch to improve the implementation to behave a way you can use (and is
> faster).
>
> Did I got that right too ?
>

Yes.

If I got these two pieces right, it looks like we could just apply the
> improvement to 'visitdir' to 'set:your/glob/*' and have your usecase filled
> while not jumping into UI changes. Would that work for you ?
>

Not without a third set of changes, since set expansion doesn't use
visitdir (or the matcher being built) at all - the dependency is that
building the matcher depends on expanding the set (and thus the set can't
depend on the matcher).
It would technically be doable for re:, but I'm wary of getting into the
business of parsing and special-casing regexes to assume what they match or
don't.


> On Fri, Dec 16, 2016 at 6:21 AM, Pierre-Yves David
>> >
>> wrote:
>>
>>
>>
>> On 12/16/2016 02:19 AM, Augie Fackler wrote:
>>
>>
>> On Nov 24, 2016, at 10:28 AM, FUJIWARA Katsunori
>> > wrote:
>>
>> Yes, "files:" was the original version of this patch
>> and the case I really
>> care about :) I changed it to rootglob after your
>> comments.
>> Which way would be preferred to move forward?
>>
>>
>> "files:" is "path:" family, and "rootglob:" is "glob:"
>> family. As we
>> concluded before, "path:" itself can't control recursion of
>> matching
>> well.
>>
>> Therefore, I think that "files:" should be implemented if
>> needed,
>> regardless of implementing "rootglob:".
>>
>> Of course, we need high point view of this area, at first :-)
>>
>>
>> BTW, it is a little ambiguous (at least, for me) that
>> "files:foo"
>> matches against both file "foo" and files just under directory
>> "foo". Name other than "files:" may resolve this ambiguity,
>> but I
>> don't have any better (and short enough) name :-<
>>
>>  ==  === ===
>>  patternfoo  foo/bar foo/bar/baz
>>  ==  === ===
>>  path:fooo o o
>>
>>  files:foo   o o x
>>
>>  file:fooo x x
>>  dir:foo x o o
>>  ==  === ===
>>
>>
>> Scanning the plan page, I see that there’s a *lot* of work that
>> could be done and no consensus as yet, but that the only
>> immediate use case seems to be the rootfile/rootglob case. Is
>> there some path forward we could agree on that would unblock
>> those immediate needs for narrowhg and not make things harder in
>> the future?
>>
>> Alternatively, would we be okay with a slight refactor of the
>> matcher so that narrowhg can introduce a custom filesonly:
>> matcher for the time being so we can keep making forward
>> progress there?  I don’t know the matcher code well enough to be
>> able to guess if this is a reasonable path so we can be unblocked.
>>
>> (It’s very hard for to justify the amount of work implied by
>> reaching consensus on FileNamePatternsPlan and then executing
>> the entire thing when what we need is solvable today with a
>> sub-hour patch to existing code, thus my trying to find a
>> solution we can all live with.)
>>
>>

Re: [PATCH] match: adding non-recursive directory matching

2016-12-19 Thread Rodrigo Damazio via Mercurial-devel
Unfortunately, while set would match the right files, because of the way
the code is structured, it provides no way to not try visiting the
directories inside the non-recursive match - the set needs to first collect
all the files in all subdirectories (match.py, _expandset) and then filter
that down to the desired ones. In plain hg repos, that's just much slower -
in the context of narrowhg, the repo will simply not have the manifests for
those subdirectories and trying to visit them will crash.

The follow-up change to this one (which I haven't sent yet but is written)
is updating visitdir to allow non-recursiveness, which btw makes something
like "hg files -I rootglob:browser/*" about 4-5x faster in the firefox repo.


On Fri, Dec 16, 2016 at 6:21 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 12/16/2016 02:19 AM, Augie Fackler wrote:
>
>>
>> On Nov 24, 2016, at 10:28 AM, FUJIWARA Katsunori 
>>> wrote:
>>>
>>> Yes, "files:" was the original version of this patch and the case I
> really
> care about :) I changed it to rootglob after your comments.
> Which way would be preferred to move forward?
>

>>> "files:" is "path:" family, and "rootglob:" is "glob:" family. As we
>>> concluded before, "path:" itself can't control recursion of matching
>>> well.
>>>
>>> Therefore, I think that "files:" should be implemented if needed,
>>> regardless of implementing "rootglob:".
>>>
>>> Of course, we need high point view of this area, at first :-)
>>>
>>>
>>> BTW, it is a little ambiguous (at least, for me) that "files:foo"
>>> matches against both file "foo" and files just under directory
>>> "foo". Name other than "files:" may resolve this ambiguity, but I
>>> don't have any better (and short enough) name :-<
>>>
>>>  ==  === ===
>>>  patternfoo  foo/bar foo/bar/baz
>>>  ==  === ===
>>>  path:fooo o o
>>>
>>>  files:foo   o o x
>>>
>>>  file:fooo x x
>>>  dir:foo x o o
>>>  ==  === ===
>>>
>>>
>> Scanning the plan page, I see that there’s a *lot* of work that could be
>> done and no consensus as yet, but that the only immediate use case seems to
>> be the rootfile/rootglob case. Is there some path forward we could agree on
>> that would unblock those immediate needs for narrowhg and not make things
>> harder in the future?
>>
>> Alternatively, would we be okay with a slight refactor of the matcher so
>> that narrowhg can introduce a custom filesonly: matcher for the time being
>> so we can keep making forward progress there?  I don’t know the matcher
>> code well enough to be able to guess if this is a reasonable path so we can
>> be unblocked.
>>
>> (It’s very hard for to justify the amount of work implied by reaching
>> consensus on FileNamePatternsPlan and then executing the entire thing when
>> what we need is solvable today with a sub-hour patch to existing code, thus
>> my trying to find a solution we can all live with.)
>>
>
> As far as I understand, Foozy finding shows that the feature narrowhg
> needs is already there and nothing new is necessary.
>
> You can add "set:" in front of your glob to make it non recursive in all
> cases "set:your/directory/you/want/to/match/files/in/*"
>
> If this does not fits your needs, this probably mean I got your usecase
> wrong. In that case can you re-explain the issue you are trying to solve
> here?
>
>
> At the project level, it will make sense to clean up the Pattern Matching
> at some point, and Foozy wiki work will help us to do that.
>
> Cheers.
>
> --
> Pierre-Yves David
>
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2016-11-23 Thread Rodrigo Damazio via Mercurial-devel
Hi guys - any comments on the preferred way forward?

(I do have a follow-up patch for optimizing visitdir accordingly, but don't
want to send it until this one is agreed upon)


On Thu, Nov 17, 2016 at 1:19 PM, Rodrigo Damazio 
wrote:

>
>
> On Thu, Nov 17, 2016 at 7:52 AM, FUJIWARA Katsunori  > wrote:
>
>>
>> (sorry for late reply)
>>
>> At Wed, 26 Oct 2016 14:02:48 -0700,
>> Rodrigo Damazio wrote:
>> >
>> > On Wed, Oct 26, 2016 at 12:17 AM, FUJIWARA Katsunori <
>> fo...@lares.dti.ne.jp>
>> > wrote:
>> >
>> > >
>> > > At Tue, 25 Oct 2016 19:51:59 -0700,
>> > > Rodrigo Damazio wrote:
>> > > >
>> > > > On Tue, Oct 25, 2016 at 4:31 PM, FUJIWARA Katsunori <
>> > > fo...@lares.dti.ne.jp>
>> > > > wrote:
>> > > >
>> > > > >
>> > > > > At Mon, 24 Oct 2016 10:34:52 -0700,
>> > > > > Rodrigo Damazio wrote:
>>
>> [snip]
>>
>> > > > On the other hand, you assume that newly introduced *path syntaxes
>> > > > > will be recursive, as below. Would you assume that default
>> > > > > recursive-ness is different between *glob and *path syntaxes ?
>> > > > >
>> > > >
>> > > > path would be recursive, as will glob that ends with ** or regex
>> that
>> > > ends
>> > > > with .*
>> > > >
>> > > >
>> > > > > > Also, for discussion: I assume the *path patterns will be
>> recursive
>> > > when
>> > > > > > they reference a directory. Do we also want a non-recursive
>> > > equivalent
>> > > > > > (rootexact, rootfiles, rootnonrecursive or something like that)?
>> > >
>> > > How about adding syntax type "file"/"dir" ?
>> > >
>> > >   = = =
>> > >   type  for recursive for non-recursive
>> > >   = = =
>> > >   glob  use "**"  use "*"
>> > >   reomit "$"  append "$"
>> > >   path  always(*1)
>> > >   file    always
>> > >   dir   always(*2)
>> > >   = = =
>> > >
>> > >   (*1) match against both file and directory
>> > >   (*2) match against only directory
>> > >
>> > > "dir" might be overkill, though :-) (is it useful in resolving name
>> > > collision at merging or so ?)
>> > >
>> >
>> > foozy, thanks so much for the review and discussion.
>> > Sounds like we do agree about the glob behavior then, so let me know if
>> > you'd like any changes to the latest version of this patch, other than
>> > improving documentation. I'm happy to send an updated version as soon as
>> > someone is ready to review.
>> >
>> > I understand the difference between dir and path (and between the
>> original
>> > version of this patch and file) would be that they'd validate the type
>> of
>> > entry being matched (so that passing a filename to dir or dir name to
>> file
>> > would be an error) - is that what you have in mind?
>>
>> Yes > "passing a filename to dir or dir name to file would be an error"
>>
>>
>> > The current matchers
>> > don't have a good mechanism to verify the type, so some significant
>> > rewiring would need to be done to pass that information down.
>>
>> Current match implement uses two additional pattern suffix '(?:/|$)'
>> and '$' to control recursive matching of "glob" and "path". The former
>> allows to match recursively (for "glob" and "path"), and the latter
>> doesn't (only for "glob").
>>
>> I simply think using this technique to implement pattern types "file"
>> and "dir".
>>
>> path:PATTERN => ESCAPED-PATTERN(?:/|$)
>> file:PATTERN => ESCAPED-PATTERN$
>> dif:PATTERN  => ESCAPED-PATTERN/
>>
>
> Yes, "files:" was the original version of this patch and the case I really
> care about :) I changed it to rootglob after your comments.
> Which way would be preferred to move forward?
>
>


smime.p7s
Description: S/MIME Cryptographic Signature
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding support for repository-root-based globs

2016-11-02 Thread Rodrigo Damazio via Mercurial-devel
On Wed, Nov 2, 2016 at 7:58 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 11/01/2016 01:50 PM, Yuya Nishihara wrote:
>
>> On Mon, 31 Oct 2016 21:47:35 -0400, Augie Fackler wrote:
>>
>>> On Oct 28, 2016, at 4:40 AM, Pierre-Yves David <
 pierre-yves.da...@ens-lyon.org> wrote:
 On 10/25/2016 09:41 AM, Rodrigo Damazio Bovendorp via Mercurial-devel
 wrote:

> # HG changeset patch
> # User Rodrigo Damazio Bovendorp 
> # Date 1475944120 25200
> #  Sat Oct 08 09:28:40 2016 -0700
> # Node ID e8454de81600e092f05aa22ecbac32925b70d074
> # Parent  260af19891f2bed679a662be07d1379bb8207592
> match: adding support for repository-root-based globs
>
> The broader plan is to add explicit base directories for all patterns:
>   === ===
> pattern type root-ed  cwd-ed  any-of-path
>   === ===
> wildcard rootglob cwdglob anyglob
> regexp   rootre   cwdre   anyre
> raw string   rootpath cwdpath anypath
>   === ===
> (table by foozy)
>

 The subject seems complicated enough that creating a Plan page seems
 relevant. This would help other people to follow what the problem space.

 https://www.mercurial-scm.org/wiki/WriteANewFeaturePlan

>>>
>>> I’m not sure if it needs a plan page. It strikes me as believable
>>> (perhaps even likely?) that rootglob will be the only part of this
>>> implemented for a long time, perhaps ever. (Having the whole table makes
>>> good sense to me though, because now we’ve though through the future in
>>> case it ever comes.)
>>>
>>
> If I'm counting correctly, this is at least the third time matcher
> specification is touched in the history of the project. This instance have
> already created multiple email threads with interesting data "hidden" into
> them.
>
> I think a plan page is important to give clear picture of the situation
> and the proposed solution in a single place. This would help increased the
> number of eyes on this topic that history have proved complex.


> Especially give how often this topic came up I think it important to make
> sure we actually map the problem space and nail the issue once and for all.
> I'm a bit concerned we could be taking the minimal step to snipe a specific
> requirement here without actually moving toward a real closure providing a
> simplified solution that fit all all users.
>
> Regarding how far we'll go in that plan, I hope that at least
> "rootre:/rootpath:" will be implement for feature parity with glob.
>
> I'm starting by adding rootglob.
> One important characteristic and difference from the older glob types
> is
> that * will never match recursively, even when the glob is used as an
> include
> pattern.
>

 This seems a bit strange to me. Given that the current glob matcher is
 already not recursive, why don't we work on an alternative non recursive -I
 flag instead?

>>>
>>> That means we’ll have a new flag that alters the behavior of existing
>>> matchers in subtle ways. It also requires cooperation from every command
>>> that we want to add new globbing features to, including extensions, whereas
>>> if we can add a couple of new atoms (as outlined in the proposal from
>>> foozy) we get consistent behavior across all matchers.
>>>
>>
> Command using -I/-X option usually just add the 'walkopts' list to their
> option and then pass their 'opts' dict to 'scmutil.match(…)'. So improving
> the command flag situation in a unified way does not seems too scary.
>
> My main concern here is that the vast majority of user will still use the
> basic glob we expose without any prefix. If the most common matcher behave
> "differently" than all the others, that will get confusing.
> That said, the exact state of all existing matchers behavior is getting
> fuzzier in my head as the discussion progress. This is one of the reason I
> would like to see the current situation and the Foozy plan summarized into
> a plan page.
>
> A good way to test¹ design is "how embarrassing is it to explain to new
> user" (that is correlated to the simplicity of the design). The recent
> discussion and confusion around matcher show that we are not great for the
> moment.
>
> It’s a mistake that existing matchers have a recursive * behavior with
>>> --include, but it’s too late to change that. As such, I’d much rather have
>>> this proposal as currently stated.
>>>
>>
> Given the current globsuffix thing is just for backward compatibility, new
>> rootglob prefix makes sense to me. Fileset, which is relatively new,
>> behaves
>> in that way.
>>
>
> On one hand, this extra bits from Yuya increase my wish for a nice summary
> of the current situation + objective. On the other hand there it start to
> have enough core people who seems to know what they are doing on this topic
> to 

Re: [PATCH] match: adding non-recursive directory matching

2016-10-25 Thread Rodrigo Damazio via Mercurial-devel
Sending updated patch via pushgate (description changed).


On Mon, Oct 24, 2016 at 10:34 AM, Rodrigo Damazio 
wrote:

> It sounds like we'd like to do 3 somewhat orthogonal things:
> - allow user to specify the directory the pattern is relative to
> (root/cwd/any)
> - allow the user to specify recursiveness/non-recursiveness consistently
> (not covered by the *path patterns, but could be the defined behavior for
> the globs)
> - clean up the matcher API (discussed during Sprint)
>
> Doing all 3 together would probably take some time and a lot of
> back-and-forth, so I'm wondering if it'd be ok to start by updating this
> patch to implement "rootglob" with consistent recursiveness behavior, and
> we can then more slowly add the other patterns and converge on a cleaner
> API?
>
> Also, for discussion: I assume the *path patterns will be recursive when
> they reference a directory. Do we also want a non-recursive equivalent
> (rootexact, rootfiles, rootnonrecursive or something like that)?
>
> Thanks
> Rodrigo
>
>
>
> On Mon, Oct 24, 2016 at 6:21 AM, Pierre-Yves David <
> pierre-yves.da...@ens-lyon.org> wrote:
>
>>
>>
>> On 10/21/2016 05:13 PM, FUJIWARA Katsunori wrote:
>>
>>> At Tue, 18 Oct 2016 10:12:07 -0400,
>>> Augie Fackler wrote:
>>>

 On Tue, Oct 18, 2016 at 9:52 AM, Yuya Nishihara  wrote:

> On Tue, 18 Oct 2016 09:40:36 -0400, Augie Fackler wrote:
>
>> On Oct 18, 2016, at 09:38, Yuya Nishihara  wrote:
>>>
 After coordinating on irc to figure out what this proposal actually
 is, I've noticed that the semantics of this "exact" proposal are
 exactly what "glob" does today, which means (I think) that
 "files:foo/bar" should be representable as "glob:foo/bar/*" - what
 am
 I missing?

>>>
>>> Maybe we want a "glob" relative to the repo root?
>>>
>>
>> As far as I can tell, it already is. "relglob:" is relative to your
>> location in the repo according to the docs.
>>
>
> Unfortunately that isn't.
>
> 'glob:' - a glob relative to cwd
> 'relglob:' - an unrooted glob (*.c matches C files in
> all dirs)
>
> Don't ask me why. ;-)
>

 Oh wat. It looks like narrowhg might change this behavior in narrowed
 repositories, thus my additional confusion.

 Maybe we should add "absglob" that is always repo-root-absolute. How
 do we feel about that overall?

>>>
>>> FYI, current pattern matching is implemented as below. This was
>>> chatted in "non-recursive directory matching" session of 4.0 sprint,
>>> and sorry for my late posting of this translation from
>>> http://d.hatena.ne.jp/flying-foozy/20140107/1389087728 in Japanese, as
>>> my backlog of the last sprint.
>>>
>>>    === === ===
>>>   pattern type root-ed cwd-ed  any-of-path
>>>    === === ===
>>>   wildcard --- globrelglob
>>>   regexp   re  --- relre
>>>   raw string   pathrelpath ---
>>>    === === ===
>>>
>>>   If rule is read in from file (e.g. .hgignore):
>>>
>>> * "glob" is treated as "relglob"
>>> * "re" is treated as "relre"
>>>
>>>   This is mentioned in "hg help patterns" and "hg help hgignore", but
>>>   syntax name "relglob" and "relre" themselves aren't explained.
>>>
>>>   "end of name" matching is required:
>>>
>>> * for glob/relglob as PATTERN (e.g. argument in command line), but
>>> * not for glob/relglob as INCLUDES/EXCLUDES, or other pattern
>>> syntaxes
>>>
>>>   For example, file "foo/bar/baz" is:
>>>
>>> * not matched at "hg files glob:foo/bar"
>>> * but matched at "hg file -I glob:foo/bar"
>>>
>>>   This isn't mentioned in any help document :-<, and the latter seems
>>>   to cause the issue mentioned in this patch series.
>>>
>>> How about introducing new systematic names like below to re-organize
>>> current complicated mapping between names and matching ? (and enable
>>> "end of name" matching by "-eon" suffix or so)
>>>
>>>     === ===
>>>   pattern type root-ed  cwd-ed  any-of-path
>>>     === ===
>>>   wildcard rootglob cwdglob anyglob
>>>   regexp   rootre   cwdre   anyre
>>>   raw string   rootpath cwdpath anypath
>>>     === ===
>>>
>>
>> Moving toward a more regular and clear feature set and naming seems a
>> win. I'm +1 for moving in that direction.
>>
>> Cheers,
>>
>> --
>> Pierre-Yves David
>>
>
>
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2016-10-24 Thread Rodrigo Damazio via Mercurial-devel
It sounds like we'd like to do 3 somewhat orthogonal things:
- allow user to specify the directory the pattern is relative to
(root/cwd/any)
- allow the user to specify recursiveness/non-recursiveness consistently
(not covered by the *path patterns, but could be the defined behavior for
the globs)
- clean up the matcher API (discussed during Sprint)

Doing all 3 together would probably take some time and a lot of
back-and-forth, so I'm wondering if it'd be ok to start by updating this
patch to implement "rootglob" with consistent recursiveness behavior, and
we can then more slowly add the other patterns and converge on a cleaner
API?

Also, for discussion: I assume the *path patterns will be recursive when
they reference a directory. Do we also want a non-recursive equivalent
(rootexact, rootfiles, rootnonrecursive or something like that)?

Thanks
Rodrigo



On Mon, Oct 24, 2016 at 6:21 AM, Pierre-Yves David <
pierre-yves.da...@ens-lyon.org> wrote:

>
>
> On 10/21/2016 05:13 PM, FUJIWARA Katsunori wrote:
>
>> At Tue, 18 Oct 2016 10:12:07 -0400,
>> Augie Fackler wrote:
>>
>>>
>>> On Tue, Oct 18, 2016 at 9:52 AM, Yuya Nishihara  wrote:
>>>
 On Tue, 18 Oct 2016 09:40:36 -0400, Augie Fackler wrote:

> On Oct 18, 2016, at 09:38, Yuya Nishihara  wrote:
>>
>>> After coordinating on irc to figure out what this proposal actually
>>> is, I've noticed that the semantics of this "exact" proposal are
>>> exactly what "glob" does today, which means (I think) that
>>> "files:foo/bar" should be representable as "glob:foo/bar/*" - what am
>>> I missing?
>>>
>>
>> Maybe we want a "glob" relative to the repo root?
>>
>
> As far as I can tell, it already is. "relglob:" is relative to your
> location in the repo according to the docs.
>

 Unfortunately that isn't.

 'glob:' - a glob relative to cwd
 'relglob:' - an unrooted glob (*.c matches C files in all
 dirs)

 Don't ask me why. ;-)

>>>
>>> Oh wat. It looks like narrowhg might change this behavior in narrowed
>>> repositories, thus my additional confusion.
>>>
>>> Maybe we should add "absglob" that is always repo-root-absolute. How
>>> do we feel about that overall?
>>>
>>
>> FYI, current pattern matching is implemented as below. This was
>> chatted in "non-recursive directory matching" session of 4.0 sprint,
>> and sorry for my late posting of this translation from
>> http://d.hatena.ne.jp/flying-foozy/20140107/1389087728 in Japanese, as
>> my backlog of the last sprint.
>>
>>    === === ===
>>   pattern type root-ed cwd-ed  any-of-path
>>    === === ===
>>   wildcard --- globrelglob
>>   regexp   re  --- relre
>>   raw string   pathrelpath ---
>>    === === ===
>>
>>   If rule is read in from file (e.g. .hgignore):
>>
>> * "glob" is treated as "relglob"
>> * "re" is treated as "relre"
>>
>>   This is mentioned in "hg help patterns" and "hg help hgignore", but
>>   syntax name "relglob" and "relre" themselves aren't explained.
>>
>>   "end of name" matching is required:
>>
>> * for glob/relglob as PATTERN (e.g. argument in command line), but
>> * not for glob/relglob as INCLUDES/EXCLUDES, or other pattern syntaxes
>>
>>   For example, file "foo/bar/baz" is:
>>
>> * not matched at "hg files glob:foo/bar"
>> * but matched at "hg file -I glob:foo/bar"
>>
>>   This isn't mentioned in any help document :-<, and the latter seems
>>   to cause the issue mentioned in this patch series.
>>
>> How about introducing new systematic names like below to re-organize
>> current complicated mapping between names and matching ? (and enable
>> "end of name" matching by "-eon" suffix or so)
>>
>>     === ===
>>   pattern type root-ed  cwd-ed  any-of-path
>>     === ===
>>   wildcard rootglob cwdglob anyglob
>>   regexp   rootre   cwdre   anyre
>>   raw string   rootpath cwdpath anypath
>>     === ===
>>
>
> Moving toward a more regular and clear feature set and naming seems a win.
> I'm +1 for moving in that direction.
>
> Cheers,
>
> --
> Pierre-Yves David
>
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


Re: [PATCH] match: adding non-recursive directory matching

2016-10-20 Thread Rodrigo Damazio via Mercurial-devel
The issue is that glob:foo/* is recursive in some cases - e.g. "hg files -I
glob:contrib/*" in the hg repo gives me subdirectories of contrib
recursively (including e.g. contrib/docker/apache-server, two levels down).
After discussing a bit more offline with Martin: I'll check if that's a bug
in the matcher's visitdir (rather than a design limitation of glob) before
following up on this change.


On Tue, Oct 18, 2016 at 7:39 AM, Yuya Nishihara  wrote:

> On Tue, 18 Oct 2016 10:12:07 -0400, Augie Fackler wrote:
> > On Tue, Oct 18, 2016 at 9:52 AM, Yuya Nishihara  wrote:
> > > On Tue, 18 Oct 2016 09:40:36 -0400, Augie Fackler wrote:
> > >> > On Oct 18, 2016, at 09:38, Yuya Nishihara  wrote:
> > >> >> After coordinating on irc to figure out what this proposal actually
> > >> >> is, I've noticed that the semantics of this "exact" proposal are
> > >> >> exactly what "glob" does today, which means (I think) that
> > >> >> "files:foo/bar" should be representable as "glob:foo/bar/*" - what
> am
> > >> >> I missing?
> > >> >
> > >> > Maybe we want a "glob" relative to the repo root?
> > >>
> > >> As far as I can tell, it already is. "relglob:" is relative to your
> > >> location in the repo according to the docs.
> > >
> > > Unfortunately that isn't.
> > >
> > > 'glob:' - a glob relative to cwd
> > > 'relglob:' - an unrooted glob (*.c matches C files in
> all dirs)
> > >
> > > Don't ask me why. ;-)
> >
> > Oh wat. It looks like narrowhg might change this behavior in narrowed
> > repositories, thus my additional confusion.
> >
> > Maybe we should add "absglob" that is always repo-root-absolute. How
> > do we feel about that overall?
>
> Sounds good to me.
>
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel