Re: [PATCH] revset: introduce the summary predicate

Matt Harbison Tue, 10 Jan 2017 17:10:23 -0800

On Mon, 09 Jan 2017 23:33:17 -0500, Matt Harbison <mharbiso...@gmail.com> wrote:

On Mon, 09 Jan 2017 05:49:23 -0500, Pierre-Yves David <pierre-yves.da...@ens-lyon.org> wrote:
On 01/08/2017 09:34 PM, Matt Harbison wrote:
On Sun, 08 Jan 2017 07:59:36 -0500, Pierre-Yves David
<pierre-yves.da...@ens-lyon.org> wrote:
(ha, I wrote my previous reply in a train and it got sent when I
connected again (and received that one). I'm going to try to address
the new content in this email and sometimes repeat some of my other
reply content for clarity)

On 01/08/2017 04:23 AM, Matt Harbison wrote:
On Sat, 07 Jan 2017 02:56:48 -0500, Yuya Nishihara <y...@tcha.org>
wrote:
On Fri, 6 Jan 2017 21:29:43 -0500, Matt Harbison wrote:
> On Jan 6, 2017, at 11:19 AM, Pierre-Yves David
<pierre-yves.da...@ens-lyon.org> wrote:
>> On 01/04/2017 07:04 PM, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbi...@yahoo.com>
>> # Date 1483550016 18000
>> #      Wed Jan 04 12:13:36 2017 -0500
>> # Node ID 76d95ab94b9e206363629059fb7824002e19a9e5
>> # Parent  0064a1eb28e246ded9b726c696d048143d1b23f1
>> revset: introduce the summary predicate
>>
Perhaps stringmatcher can have 3 types, icase literal, literal, and re,
and the default of desc() is icase literal for backward compatibility.
You can build a case-insensitive regexp object from a literal pattern.

https://docs.python.org/2/library/re.html#re.I
Yep, that's the API I was thinking of.
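
For readers following along, here is a minimal sketch of building a case-insensitive regexp object from a literal pattern with re.I, as suggested above. The helper name icase_literal_regex is hypothetical and is not part of Mercurial; re.escape is used on the assumption that the literal should be matched verbatim rather than interpreted as a regexp.

```python
import re


def icase_literal_regex(literal):
    """Hypothetical helper: compile a literal as a case-insensitive regexp."""
    # re.escape neutralises regexp metacharacters so the text matches verbatim;
    # re.I makes the comparison case insensitive.
    return re.compile(re.escape(literal), re.I)


pat = icase_literal_regex("Issue(1337)")
print(bool(pat.search("fixes issue(1337) in revset")))  # True
print(bool(pat.search("fixes issue1337 in revset")))    # False: parentheses are literal
```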

I'm confused by the rest of your comments.  When I first skimmed your
message, adding support for 'icasere:' using this API popped into my
mind.  And that could support a case insensitive literal, because
'icasere:foo' should be equivalent to looking for the substring 'foo'
(leaving aside efficiency, how discoverable that is, and that
stringmatcher matches the whole string for literals). But you seem to
be suggesting adding 'icaseliteral:'.
I'm not 100% sure of what Yuya actually has in mind but here is my
understanding of the situation and how we could move forward.

Currently:
----------

   desc(X) → X is matched by custom code as a case insensitive literal,

   We have a "generic" pattern definition syntax used by various other
revsets (implemented in "stringmatcher")

     foo(X)
       → X is matched as a case sensitive literal
     foo('literal:X')
       → X is matched as a case sensitive literal (same as the above)
     foo('re:X')
       → X is matched as a regular expression (case sensitive)

Proposal: (might be what yuya says)
---------

extend the string matcher to

   foo('literal:X')
     → X is matched as a case sensitive literal
See the comment in the new patch I sent about 'user()' already
lowercasing 'literal:' and 're:'.  I'd consider it a bug, but it's been
in since mid 2012.  Attempting to channel Matt, I'm guessing we are
stuck with that since it is so old, but wanted to see what others think.
1) Yep, we are stuck with whatever existing behavior we have for
existing predicates because of BC. (but we can augment it)
2) Congratulations, you seem to have unearthed an area where we have
many predicates with close but slightly different behavior. At that
point I'll ask you for an inventory of what we currently have so that we
can devise a sound and as consistent as possible way forward.
   Can you provide us with a table that at least keep track of:

* predicate
* default behavior
* support 'rich' stringmatcher ?
* are 'literal:' case sensitive ?
* are 're:' case sensitive (and supported at all) ?
TL;DR: desc, grep, keyword, and author/user are the oddballs. grep and
keyword are the regex and literal halves, respectively, of the same
search.
After stripping out non string predicates, we basically end up with 4
groups:
- util.stringmatcher based:

     Predicate:            Case Sensitive?
     "author"                   N
     "user"                     N
     "bookmark"                 Y
     "branch"                   Y
     "extra"                    Y
     "named"                    Y
     "subrepo"  [1]             Y
     "tag"                      Y
These all support 'literal:' and 're:'. Case sensitivity applies the
same way to both prefixes and to the raw pattern.
[1] Not documented to support 'literal:' or 're:' prefixes.


- Local method implementation based:

     Predicate:            Case Sensitive?
     "desc"                     N
     "grep"                     Y
     "keyword"                  N
None of these support prefixes. The grep param is a regex without
're:', so it doesn't make a lot of sense to support stringmatcher here:
what stringmatcher thinks is literal is really regex. If we internally
bolt on 're:', it still can't support literal matches.
- match.py based (not relevant, but for completeness):

     "adds", "contains", "file", "filelog", "follow",
     "followlines", "modifies", "removes"

- "bisect" (I can't see how 're:' support here would be meaningful.)
From there we'll be able to see if a pattern emerges and pick the best
way to move forward.
The following thoughts come to mind:
A) author/user has been thoroughly corrupted with the lower casing.
Maybe come up with a 3rd one that follows modern rules, and
deprecate/hide these? Sad, because these seem natural. OTOH, having
raw pattern and prefixed patterns behave the same everywhere is elegant
and simple. Not sure what to call it. 'committer' comes to mind, but
that has git implications. That was suggested when I proposed recording
who performed a graft in the 'extra' dict. Then it was pointed out this
tracking was related to the chain of custody proposal by Greg, so I let
it drop [1]. My enthusiasm was dampened some when I saw templates also
have {author} and {user}, though the latter is a filter, not a reference
to the field.
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-April/068692.html
B) We should maybe fold grep and keyword into a new predicate
('search'?) that follows modern formatting, and deprecate/hide these
two. Simple, and drops the visible revset count by 1. I'm not sure if
'grep' was borrowed from git, or if it's just inspired by the unix
command.

I had second thoughts about this. I think we can just add documentation
to 'keyword' that says "If you want to search these fields by regex or
case sensitively, use 'grep'". Then the user can see how to get
everything stringmatcher will provide, and it makes it clear to
developers that 'keyword' doesn't need stringmatcher support. I'll
submit a patch during the freeze.

That leaves only 'author' and 'desc' as not providing the full
functionality of the others.

C) If we do A + B, that means 'desc' is the only oddball left. I don't
like the idea that case sensitivity for a raw pattern and a 'literal:'
prefixed pattern would differ. They are both literals in my mind, and
it would be the one remaining exception. The 're:' prefix could follow
regular rules.
D) I guess we could hide/replace 'desc' with something new as well.
('message'?) But I'm wondering about the BC rules. I think generally,
we don't change behavior without the user doing some action to opt in.
But the 're:' prefix was added after many of these predicates existed
for years. Obviously, that broke anybody's search that started with
're:', just by them upgrading Mercurial. If making them change the
query to prefix 'literal:' was acceptable, why would changing the case
sensitivity not be, assuming we provide 'icase-literal:'? Is it just a
general feeling that searches for 're:' are rare? I understand the
principle and motivation, but sometimes it's hard to figure out how they
are applied.
   foo('icase-literal:X')
     → X is matched as a case insensitive literal
   foo('re:X')
     → X is matched as a regular expression (case sensitive)

Then desc moves to using the string matcher (defaulting to "icase-literal").
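
To make the proposal concrete, here is a rough sketch of a matcher that understands the three kinds listed above. It is an illustration only and not the actual util.stringmatcher code: the name make_matcher, its default_kind parameter, and the returned (kind, predicate) pair are all hypothetical. It also assumes substring matching for literals, as desc() does today, whereas stringmatcher matches the whole string for literals.

```python
import re


def make_matcher(pattern, default_kind="literal"):
    """Hypothetical sketch: return (kind, predicate) for a prefixed pattern."""
    kind = default_kind
    for prefix in ("literal", "icase-literal", "re"):
        if pattern.startswith(prefix + ":"):
            # Strip the "kind:" prefix and remember which kind was requested.
            kind, pattern = prefix, pattern[len(prefix) + 1:]
            break
    if kind == "re":
        regex = re.compile(pattern)
        return kind, lambda s: regex.search(s) is not None
    if kind == "icase-literal":
        needle = pattern.lower()
        return kind, lambda s: needle in s.lower()
    # Plain literal: case sensitive substring test (an assumption, see above).
    return kind, lambda s: pattern in s


# desc() would pass default_kind="icase-literal" to keep its current behavior.
kind, match = make_matcher("Issue1337", default_kind="icase-literal")
print(kind, match("fix issue1337"))   # icase-literal True

kind, match = make_matcher("literal:Issue1337", default_kind="icase-literal")
print(kind, match("fix issue1337"))   # literal False
```

With this shape, a raw pattern and an explicit 'icase-literal:' pattern behave identically for desc(), while other predicates keep "literal" as their default.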

We do not need a 'icase-re:' spec, because one can easily achieve it
using 're:(?i)foo'
Ah! I missed the part in the docs where flags could be set in the string
with (?<flag>). I thought you needed to compile with re.FLAG.  When he
said string literal, my mind went right to the 'literal:' prefix.
Agreed, no need for 'icase-re:'.
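
A quick way to check the inline-flag point with plain Python, outside of Mercurial. The helper compile_re_pattern is hypothetical; it only strips the 're:' prefix and hands the remainder to re.compile, on the assumption that this is roughly what happens to an 're:' pattern.

```python
import re


def compile_re_pattern(spec):
    """Hypothetical helper: strip the 're:' prefix and compile the rest."""
    if not spec.startswith("re:"):
        raise ValueError("expected an 're:' prefixed pattern")
    return re.compile(spec[len("re:"):])


icase = compile_re_pattern("re:(?i)foo")   # flag set inside the pattern
sensitive = compile_re_pattern("re:foo")   # default: case sensitive

print(bool(icase.search("Some FOO here")))      # True
print(bool(sensitive.search("Some FOO here")))  # False
print(bool(icase.flags & re.I))                 # True: same flag as re.I
```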
Someone getting slightly confused with regular expressions? Impossible! ;-)
[…]
I'm about to submit a patch to add the current 're:' support to 'desc'
in the meantime, to hopefully move this along.
Great!
 I'd also be curious if
you have thoughts on how to conditionally limit this predicate to the
first line, without limiting future functionality.
So having dug into the regexp part a bit more, it seems like one could
just use 're:^.*issue1337' to match "issue1337" on the first line ('.'
does not match new-line by default).
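
The first-line trick can be checked with plain Python. This sketch assumes the pattern is applied with re.search and without re.MULTILINE or re.DOTALL, so '^' only anchors at the start of the description and '.' stops at the first newline. The sample description is made up for illustration.

```python
import re

# Made-up changeset description: summary line, blank line, body.
desc = "revset: fix issue1337\n\nLonger explanation mentioning issue42."

# '^' anchors at the start of the string and '.*' cannot cross a newline,
# so the pattern can only match within the first line.
print(bool(re.search(r"^.*issue1337", desc)))  # True: on the first line
print(bool(re.search(r"^.*issue42", desc)))    # False: only in the body
print(bool(re.search(r"issue42", desc)))       # True: unanchored search sees the body
```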
Thanks for looking at that.  It's way less horrible than I thought it
would be.  I'm curious what Sean thinks, since he mentioned {firstline}
being put in as a substitute for a complex regex.  I'd be fine with
skipping the firstline=True param if this case is mentioned in the help
for desc().
I've missed that '{firstline}' proposal from Sean, can you point me at
it? (or summarize it ?)
Not a proposal, so much as historical knowledge about the template?

https://www.mercurial-scm.org/pipermail/mercurial-devel/2017-January/092099.html
Thanks a lot for looking into this!

Cheers,

_______________________________________________
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
