Re: [PATCH v2 5/6] grep: remove regflags from the public grep_opt API

2017-06-30 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason   writes:

> @@ -169,6 +167,24 @@ void grep_init(struct grep_opt *opt, const char *prefix)
>  
>  static void grep_set_pattern_type_option(enum grep_pattern_type 
> pattern_type, struct grep_opt *opt)
>  {
> + /*
> +  * When committing to the pattern type by setting the relevant
> +  * fields in grep_opt it's generally not necessary to zero out
> +  * the fields we're not choosing, since they won't have been
> +  * set by anything. The extended_regexp_option field is the
> +  * only exception to this.
> +  *
> +  * This is because in the process of parsing grep.patternType
> +  * & grep.extendedRegexp we set opt->pattern_type_option and
> +  * opt->extended_regexp_option, respectively. We then
> +  * internally use opt->extended_regexp_option to see if we're
> +  * compiling an ERE. It must be unset if that's not actually
> +  * the case.
> +  */
> + if (pattern_type != GREP_PATTERN_TYPE_ERE &&
> + opt->extended_regexp_option)
> + opt->extended_regexp_option = 0;

Good to have the reasoning in an in-code comment like the above.
But after reading these two paragraphs and then before reading the
three line code, a more natural embodiment in the code of the
commentary that came to my mind was

if (pattern_type != GREP_PATTERN_TYPE_ERE)
opt->extended_regexp_option = 0;

The end-result is the same as yours, of course, but I somehow found
it match the reasoning better.

Now, I wonder if this can further be tweaked to

opt->extended_regexp_option = (pattern_type == GREP_PATTERN_TYPE_ERE);

which might lead us in a direction to really unify the two related
fields extended_regexp_option and pattern_type_option.

Even if that were a good longer term direction to go in, it is
outside the scope of this step, of course.  I am merely bringing it
up as an conversation item ;-).



[PATCH v2 5/6] grep: remove regflags from the public grep_opt API

2017-06-29 Thread Ævar Arnfjörð Bjarmason
Refactor calls to the grep machinery to always pass opt.ignore_case &
opt.extended_regexp_option instead of setting the equivalent regflags
bits.

The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log:
make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
really just plastering over the code smell which this change fixes.

The reason for adding the extensive commentary here is that I
discovered some subtle complexity in implementing this that really
should be called out explicitly to future readers.

Before this change we'd rely on the difference between
`extended_regexp_option` and `regflags` to serve as a membrane between
our preliminary parsing of grep.extendedRegexp and grep.patternType,
and what we decided to do internally.

Now that those two are the same thing, it's necessary to unset
`extended_regexp_option` just before we commit in cases where both of
those config variables are set. See 84befcd0a4 ("grep: add a
grep.patternType configuration setting", 2012-08-03) for the code and
documentation related to that.

The explanation of why the if/else branches in
grep_commit_pattern_type() are ordered the way they are exists in that
commit message, but I think it's worth calling this subtlety out
explicitly with a comment for future readers.

Even though grep_commit_pattern_type() is the only caller of
grep_set_pattern_type_option() it's simpler to reset the
extended_regexp_option flag in the latter, since 2/3 branches in the
former would otherwise need to reset it, this way we can do it in one
place.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 builtin/grep.c |  2 --
 grep.c | 43 ++-
 grep.h |  1 -
 revision.c |  2 --
 4 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index f61a9d938b..b682966439 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1169,8 +1169,6 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
 
if (!opt.pattern_list)
die(_("no pattern given."));
-   if (!opt.fixed && opt.ignore_case)
-   opt.regflags |= REG_ICASE;
 
/*
 * We have to find "--" in a separate pass, because its presence
diff --git a/grep.c b/grep.c
index 7fcdaa0753..11a86548d6 100644
--- a/grep.c
+++ b/grep.c
@@ -35,7 +35,6 @@ void init_grep_defaults(void)
memset(opt, 0, sizeof(*opt));
opt->relative = 1;
opt->pathname = 1;
-   opt->regflags = REG_NEWLINE;
opt->max_depth = -1;
opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
color_set(opt->color_context, "");
@@ -153,7 +152,6 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->linenum = def->linenum;
opt->max_depth = def->max_depth;
opt->pathname = def->pathname;
-   opt->regflags = def->regflags;
opt->relative = def->relative;
opt->output = def->output;
 
@@ -169,6 +167,24 @@ void grep_init(struct grep_opt *opt, const char *prefix)
 
 static void grep_set_pattern_type_option(enum grep_pattern_type pattern_type, 
struct grep_opt *opt)
 {
+   /*
+* When committing to the pattern type by setting the relevant
+* fields in grep_opt it's generally not necessary to zero out
+* the fields we're not choosing, since they won't have been
+* set by anything. The extended_regexp_option field is the
+* only exception to this.
+*
+* This is because in the process of parsing grep.patternType
+* & grep.extendedRegexp we set opt->pattern_type_option and
+* opt->extended_regexp_option, respectively. We then
+* internally use opt->extended_regexp_option to see if we're
+* compiling an ERE. It must be unset if that's not actually
+* the case.
+*/
+   if (pattern_type != GREP_PATTERN_TYPE_ERE &&
+   opt->extended_regexp_option)
+   opt->extended_regexp_option = 0;
+
switch (pattern_type) {
case GREP_PATTERN_TYPE_UNSPECIFIED:
/* fall through */
@@ -177,7 +193,7 @@ static void grep_set_pattern_type_option(enum 
grep_pattern_type pattern_type, st
break;
 
case GREP_PATTERN_TYPE_ERE:
-   opt->regflags |= REG_EXTENDED;
+   opt->extended_regexp_option = 1;
break;
 
case GREP_PATTERN_TYPE_FIXED:
@@ -207,6 +223,11 @@ void grep_commit_pattern_type(enum grep_pattern_type 
pattern_type, struct grep_o
else if (opt->pattern_type_option != GREP_PATTERN_TYPE_UNSPECIFIED)
grep_set_pattern_type_option(opt->pattern_type_option, opt);
else if (opt->extended_regexp_option)
+   /*
+* This branch *must* happen after setting from the
+* opt->pattern_type_option above, we don't want
+* grep.extendedRegexp to override grep.patternType!
+*/