Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-18 Thread René Scharfe

Am 18.03.2014 09:02, schrieb Johannes Sixt:

Cc René; do you have any comments regarding grep --function-context?

Am 3/18/2014 6:24, schrieb Jeff King:

On Fri, Mar 14, 2014 at 07:56:46AM +0100, Johannes Sixt wrote:


Consider this code:

   void above()
   {}
   static int Y;
   static int A;
   int bar()
   {
 return X;
   }
   void below()
   {}


Thanks, this example is very helpful.


When you 'git grep --function-context X', then you get this output with
the current pattern, you proposal, and my proposal (file name etc omitted
for brevity):

   int bar()
   {
 return X;
   }


Right, that makes sense to me.


When you 'git grep --function-context Y', what do you want to see? With
the current pattern, and with your pattern that forbids semicolon we get:

   void above()
   {}
   static int Y;
   static int A;

and with my simple pattern, which allows semicolon, we get merely

   static int Y;

because the line itself is a hunk header (and we do not look back any
further) and the next line is as well. That is not exactly "function
context", and that is what I'm a bit worried about.


In global context there is no "function context", of course, so the 
latter makes sense.


"grep --function-context" is about useful context and its implementation 
piggy-backs on the hunk header definitions.  If those are useful then 
the grep output should be fine as well.  IAW: No worries, go ahead. :)


However, I only use the defaults heuristic (which shows just the Y-line 
as well) and don't know C++, so I my opinion on this matter isn't worth 
that much.


René

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-18 Thread Johannes Sixt
Cc René; do you have any comments regarding grep --function-context?

Am 3/18/2014 6:24, schrieb Jeff King:
> On Fri, Mar 14, 2014 at 07:56:46AM +0100, Johannes Sixt wrote:
> 
>> Consider this code:
>>
>>   void above()
>>   {}
>>   static int Y;
>>   static int A;
>>   int bar()
>>   {
>> return X;
>>   }
>>   void below()
>>   {}
> 
> Thanks, this example is very helpful.
> 
>> When you 'git grep --function-context X', then you get this output with
>> the current pattern, you proposal, and my proposal (file name etc omitted
>> for brevity):
>>
>>   int bar()
>>   {
>> return X;
>>   }
> 
> Right, that makes sense to me.
> 
>> When you 'git grep --function-context Y', what do you want to see? With
>> the current pattern, and with your pattern that forbids semicolon we get:
>>
>>   void above()
>>   {}
>>   static int Y;
>>   static int A;
>>
>> and with my simple pattern, which allows semicolon, we get merely
>>
>>   static int Y;
>>
>> because the line itself is a hunk header (and we do not look back any
>> further) and the next line is as well. That is not exactly "function
>> context", and that is what I'm a bit worried about.
> 
> Hmm. To be honest, I do not see yours as all that bad. Is "above()" or
> "A" actually interesting here? I'm not sure that they are. But then I do
> not use --function-context myself.
> 
> I guess it violates the "show things that are vaguely nearby, rather
> than a container" view of context that we discussed earlier. But somehow
> that seems less important to me with "--function-context".
> 
> So I dunno. I kind of like your version.

Then I'll polish my patch series (it also rewrites the test framework) and
post it.

-- Hannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-17 Thread Jeff King
On Fri, Mar 14, 2014 at 07:56:46AM +0100, Johannes Sixt wrote:

> Consider this code:
> 
>   void above()
>   {}
>   static int Y;
>   static int A;
>   int bar()
>   {
> return X;
>   }
>   void below()
>   {}

Thanks, this example is very helpful.

> When you 'git grep --function-context X', then you get this output with
> the current pattern, you proposal, and my proposal (file name etc omitted
> for brevity):
> 
>   int bar()
>   {
> return X;
>   }

Right, that makes sense to me.

> When you 'git grep --function-context Y', what do you want to see? With
> the current pattern, and with your pattern that forbids semicolon we get:
> 
>   void above()
>   {}
>   static int Y;
>   static int A;
> 
> and with my simple pattern, which allows semicolon, we get merely
> 
>   static int Y;
> 
> because the line itself is a hunk header (and we do not look back any
> further) and the next line is as well. That is not exactly "function
> context", and that is what I'm a bit worried about.

Hmm. To be honest, I do not see yours as all that bad. Is "above()" or
"A" actually interesting here? I'm not sure that they are. But then I do
not use --function-context myself.

I guess it violates the "show things that are vaguely nearby, rather
than a container" view of context that we discussed earlier. But somehow
that seems less important to me with "--function-context".

So I dunno. I kind of like your version.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-13 Thread Johannes Sixt
Am 3/14/2014 4:54, schrieb Jeff King:
> On Fri, Mar 07, 2014 at 08:23:05AM +0100, Johannes Sixt wrote:
> 
>> No, I meant lines like
>>
>> static double var;
>>-static int old;
>>+static int new;
>>
>> The motivation is to show hints where in a file the change is located:
>> Anything that could be of significance for the author should be picked up.
> 
> I see. Coupled with what you said below:
> 
>> As I said, my motivation is not so much to find a "container", but rather
>> a clue to help locate a change while reading the patch text. I can speak
>> for myself, but I have no idea what is more important for the majority.
> 
> your proposal makes a lot more sense to me, and I think that is really
> at the center of our discussion. I do not have a gut feeling for which
> is "more right" (i.e., "container" versus "context").
> 
> But given that most of the cases we are discussing are ones where the
> "diff -p" default regex gets it right (or at least better than) compared
> to the C regex, I am tempted to say that we should be erring in the
> direction of simplicity, and just finding interesting lines without
> worrying about containers (i.e., it argues for your patch).

We are in agreement here. So, let's base a decision on the implications on
git grep.

> ... but I'm not sure if I understand all
> of the implications for "git grep". Can you give some concrete examples?

Consider this code:

  void above()
  {}
  static int Y;
  static int A;
  int bar()
  {
return X;
  }
  void below()
  {}

When you 'git grep --function-context X', then you get this output with
the current pattern, you proposal, and my proposal (file name etc omitted
for brevity):

  int bar()
  {
return X;
  }

because we start the context at the last hunk header pattern above *or
including the matching line* (and write it in the output), and we stop at
the next hunk header pattern below the match (and do not write it in the
output).

When you 'git grep --function-context Y', what do you want to see? With
the current pattern, and with your pattern that forbids semicolon we get:

  void above()
  {}
  static int Y;
  static int A;

and with my simple pattern, which allows semicolon, we get merely

  static int Y;

because the line itself is a hunk header (and we do not look back any
further) and the next line is as well. That is not exactly "function
context", and that is what I'm a bit worried about.

-- Hannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-13 Thread Jeff King
On Fri, Mar 07, 2014 at 08:23:05AM +0100, Johannes Sixt wrote:

> No, I meant lines like
> 
> static double var;
>-static int old;
>+static int new;
> 
> The motivation is to show hints where in a file the change is located:
> Anything that could be of significance for the author should be picked up.

I see. Coupled with what you said below:

> As I said, my motivation is not so much to find a "container", but rather
> a clue to help locate a change while reading the patch text. I can speak
> for myself, but I have no idea what is more important for the majority.

your proposal makes a lot more sense to me, and I think that is really
at the center of our discussion. I do not have a gut feeling for which
is "more right" (i.e., "container" versus "context").

But given that most of the cases we are discussing are ones where the
"diff -p" default regex gets it right (or at least better than) compared
to the C regex, I am tempted to say that we should be erring in the
direction of simplicity, and just finding interesting lines without
worrying about containers (i.e., it argues for your patch).

> > Makes sense. I noticed your fix is to look for end-of-line or comments
> > afterwards.  Would it be simpler to just check for a non-colon, like:
> > 
> >   !^[ \t]*[A-Za-z_][A-Za-z_0-9]*:($|[^:])
> 
> I want to match [[:space:]] after the label's colon, because I have lot's
> of C++ files with CRLF, and I need to match the CR. Your more liberal
> pattern would fit as well, although it would pick up a bit field as in
> 
>struct foo {
>   unsigned
> flag: 1;
>-old
>+new

Thanks, I was having trouble thinking of another good use of a colon,
but bitfields are what I was missing. Your pattern is probably better
here.

So I am leaning towards your patch, but I'm not sure if I understand all
of the implications for "git grep". Can you give some concrete examples?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-06 Thread Johannes Sixt
Am 3/6/2014 22:28, schrieb Jeff King:
> On Wed, Mar 05, 2014 at 08:58:26AM +0100, Johannes Sixt wrote:
>> The pattern I chose also catches variable definition, not just
>> functions. That is what I need, but it hurts grep --function-context
>> That's the reason I didn't forward the patch, yet.
> 
> If by variable definition you mean:
> 
>struct foo bar = {
>   -   old
>   +   new
>};
> 
> I'd think that would be covered by the existing "struct|class|enum".
> Though I think we'd want to also allow keywords in front of it, like
> "static". I suspect the original was more meant to find:
> 
>struct foo {
>   -old
>   +new
>};

No, I meant lines like

static double var;
   -static int old;
   +static int new;

The motivation is to show hints where in a file the change is located:
Anything that could be of significance for the author should be picked up.

But that does not necessarily help grep --function-context. For example,
when there is a longish block of global variable definitions and there is
a match in the middle, then --function-context would provide no context
because the line itself would be regarded as the beginning of a
"function", i.e., the context, and the next line (which also matches the
pattern) would be the beginning of the *next* function, and would not be
in the context anymore.

> 
>> The parts of the pattern have the following flaws:
>>
>> - The first part matches an identifier followed immediately by a colon and
>>   arbitrary text and is intended to reject goto labels and C++ access
>>   specifiers (public, private, protected). But this pattern also rejects
>>   C++ constructs, which look like this:
>>
>> MyClass::MyClass()
>> MyClass::~MyClass()
>> MyClass::Item MyClass::Find(...
> 
> Makes sense. I noticed your fix is to look for end-of-line or comments
> afterwards.  Would it be simpler to just check for a non-colon, like:
> 
>   !^[ \t]*[A-Za-z_][A-Za-z_0-9]*:($|[^:])

I want to match [[:space:]] after the label's colon, because I have lot's
of C++ files with CRLF, and I need to match the CR. Your more liberal
pattern would fit as well, although it would pick up a bit field as in

   struct foo {
  unsigned
flag: 1;
   -old
   +new

I would not mind ignoring this case ("garbage in, garbage out" ;-).

>> - The second part matches an identifier followed by a list of qualified
>>   names (i.e. identifiers separated by the C++ scope operator '::')
>> [...]
> 
> A tried to keep the "looks like a function definition" bit in mine, and
> yours loosens this quite a bit more. I think that may be OK. That is, I
> do not think there is any reason for somebody to do:
> 
> void foo() {
> call_to_bar();
>-old
>+new
> }
> 
> That is, nobody would put a function _call_ without indentation. If
> something has alphanumerics at the left-most column, then it is probably
> interesting no matter what.
> 
>> - The third part of the pattern finally matches compound definitions. But
>>   it forgets about unions and namespaces, and also skips single-line
>>   definitions
>>
>> struct random_iterator_tag {};
>>
>>   because no semicolon can occur on the line.
> 
> I don't see how that is an interesting line. The point is to find a
> block that is surrounding the changes, but that is not surrounding
> the lines below.

I more often than not want to have an answer to the question "where?", not
"wherein?" Then anything that helps locate a hunk is useful.

(The particular example, an empty struct, looks strange for C programmers,
of course, but it's a common idiom in C++ when it comes to template
meta-programming.)

>> Notice that all interesting anchor points begin with an identifier or
>> keyword. But since there is a large variety of syntactical constructs after
>> the first "word", the simplest is to require only this word and accept
>> everything else. Therefore, this boils down to a line that begins with a
>> letter or underscore (optionally preceded by the C++ scope operator '::'
>> to accept functions returning a type anchored at the global namespace).
>> Replace the second and third part by a single pattern that picks such a
>> line.
> 
> Yeah, this bit makes sense to me.
> 
> Both yours and mine will find the first line here in things like:
> 
>void foo(void);
>   -void bar(void);
>   +void bar(int arg);
> 
> but I think that is OK. There _isn't_ any interesting surrounding
> context here. The current code will sometimes come up with an empty
> funcline (which is good), but it may just as easily come up with a
> totally bogus funcline in a case like:
> 
>void unrelated(void)
>{
>}
> 
>void foo(void);
>   -void bar(void);
>   +void bar(int arg);
> 
> So trying to be very restrictive and say "that doesn't look like a
> function" does not really buy us anything (and it creates tons of false
> negatives, as you documented, because C++ syntax has all kinds of crazy
> stuff).
> 
> _If_ the backwards search learned to

Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-06 Thread Jeff King
On Wed, Mar 05, 2014 at 08:58:26AM +0100, Johannes Sixt wrote:

> Here is a patch that I'm carrying around since... a while.
> What do you think?
> 
> The pattern I chose also catches variable definition, not just
> functions. That is what I need, but it hurts grep --function-context
> That's the reason I didn't forward the patch, yet.

If by variable definition you mean:

   struct foo bar = {
  -   old
  +   new
   };

I'd think that would be covered by the existing "struct|class|enum".
Though I think we'd want to also allow keywords in front of it, like
"static". I suspect the original was more meant to find:

   struct foo {
  -old
  +new
   };

> The parts of the pattern have the following flaws:
> 
> - The first part matches an identifier followed immediately by a colon and
>   arbitrary text and is intended to reject goto labels and C++ access
>   specifiers (public, private, protected). But this pattern also rejects
>   C++ constructs, which look like this:
> 
> MyClass::MyClass()
> MyClass::~MyClass()
> MyClass::Item MyClass::Find(...

Makes sense. I noticed your fix is to look for end-of-line or comments
afterwards.  Would it be simpler to just check for a non-colon, like:

  !^[ \t]*[A-Za-z_][A-Za-z_0-9]*:($|[^:])

> - The second part matches an identifier followed by a list of qualified
>   names (i.e. identifiers separated by the C++ scope operator '::')
> [...]

A tried to keep the "looks like a function definition" bit in mine, and
yours loosens this quite a bit more. I think that may be OK. That is, I
do not think there is any reason for somebody to do:

void foo() {
call_to_bar();
   -old
   +new
}

That is, nobody would put a function _call_ without indentation. If
something has alphanumerics at the left-most column, then it is probably
interesting no matter what.

> - The third part of the pattern finally matches compound definitions. But
>   it forgets about unions and namespaces, and also skips single-line
>   definitions
> 
> struct random_iterator_tag {};
> 
>   because no semicolon can occur on the line.

I don't see how that is an interesting line. The point is to find a
block that is surrounding the changes, but that is not surrounding
the lines below.

> Notice that all interesting anchor points begin with an identifier or
> keyword. But since there is a large variety of syntactical constructs after
> the first "word", the simplest is to require only this word and accept
> everything else. Therefore, this boils down to a line that begins with a
> letter or underscore (optionally preceded by the C++ scope operator '::'
> to accept functions returning a type anchored at the global namespace).
> Replace the second and third part by a single pattern that picks such a
> line.

Yeah, this bit makes sense to me.

Both yours and mine will find the first line here in things like:

   void foo(void);
  -void bar(void);
  +void bar(int arg);

but I think that is OK. There _isn't_ any interesting surrounding
context here. The current code will sometimes come up with an empty
funcline (which is good), but it may just as easily come up with a
totally bogus funcline in a case like:

   void unrelated(void)
   {
   }

   void foo(void);
  -void bar(void);
  +void bar(int arg);

So trying to be very restrictive and say "that doesn't look like a
function" does not really buy us anything (and it creates tons of false
negatives, as you documented, because C++ syntax has all kinds of crazy
stuff).

_If_ the backwards search learned to terminate (e.g., seeing the "^}"
line and saying "well, we can't be inside a function"), then such
negative lines might be useful for coming up with an empty funcname
rather than the bogus "void foo(void);". But we do not do that
currently, and I do not think it is that useful (the funcname above is
arguably just as or more useful than an empty one).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-05 Thread Junio C Hamano
Jeff King  writes:

> types, we simply look for an identifier at the start of the
> line that contains a "(", meaning it is either a function
> definition or a function call, and then not containing ";"
> which would indicate it is a call or declaration.

It is not worth worrying about:

foo(arg,
another);

that is not indented, so I think that simplicity is good.

> For example, for top-level changes
> outside functions, we might find:
>
>   N_("some text that is long"
>
> that is part of:
>
>   const char *foo =
>   N_("some text that is long"
>   "and spans multiple lines");

Unfortunate, but cannot be avoided.

>
> Before this change, we would skip past it (using the cpp regex, that is;
> the default one tends to find the same line) and either report nothing,
> or whatever random function was before us. So it's a behavior change,
> but the existing behavior is really no better.

True.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] diff: simplify cpp funcname regex

2014-03-04 Thread Johannes Sixt
Am 3/5/2014 1:36, schrieb Jeff King:
> The current funcname matcher for C files requires one or
> more words before the function name, like:
> 
>   static int foo(int arg)
>   {
> 
> However, some coding styles look like this:
> 
>   static int
>   foo(int arg)
>   {
> 
> and we do not match, even though the default regex would.
> 
> This patch simplifies the regex significantly. Rather than
> trying to handle all the variants of keywords and return
> types, we simply look for an identifier at the start of the
> line that contains a "(", meaning it is either a function
> definition or a function call, and then not containing ";"
> which would indicate it is a call or declaration.

Here is a patch that I'm carrying around since... a while.
What do you think?

The pattern I chose also catches variable definition, not just
functions. That is what I need, but it hurts grep --function-context
That's the reason I didn't forward the patch, yet.

--- 8< ---
From: Johannes Sixt 
Date: Tue, 25 Sep 2012 14:08:02 +0200
Subject: [PATCH] userdiff: have 'cpp' hunk header pattern catch more C++ anchor 
points

The hunk header pattern 'cpp' is intended for C and C++ source code, but
it is actually not very useful for the latter, and even hurts some
use-cases for the former.

The parts of the pattern have the following flaws:

- The first part matches an identifier followed immediately by a colon and
  arbitrary text and is intended to reject goto labels and C++ access
  specifiers (public, private, protected). But this pattern also rejects
  C++ constructs, which look like this:

MyClass::MyClass()
MyClass::~MyClass()
MyClass::Item MyClass::Find(...

- The second part matches an identifier followed by a list of qualified
  names (i.e. identifiers separated by the C++ scope operator '::')
  separated by space or '*' followed by an opening parenthesis (with space
  between the tokens). It matches function declarations like

struct item* get_head(...
int Outer::Inner::Func(...

  Since the pattern requires at least two identifiers, GNU-style function
  definitions are ignored:

void
func(...

  Moreover, since the pattern does not allow punctuation other than '*',
  the following C++ constructs are not recognized:

  . template definitions:
  template int func(T arg)

  . functions returning references:
  const string& get_message()

  . functions returning templated types:
  vector foo()

  . operator definitions:
  Value operator+(Value l, Value r)

- The third part of the pattern finally matches compound definitions. But
  it forgets about unions and namespaces, and also skips single-line
  definitions

struct random_iterator_tag {};

  because no semicolon can occur on the line.

Change the first pattern to require a colon at the end of the line (except
for trailing space and comments), so that it does not reject constructor
or destructor definitions.

Notice that all interesting anchor points begin with an identifier or
keyword. But since there is a large variety of syntactical constructs after
the first "word", the simplest is to require only this word and accept
everything else. Therefore, this boils down to a line that begins with a
letter or underscore (optionally preceded by the C++ scope operator '::'
to accept functions returning a type anchored at the global namespace).
Replace the second and third part by a single pattern that picks such a
line.

This has the following desirable consequence:

- All constructs mentioned above are recognized.

and the following likely desirable consequences:

- Definitions of global variables and typedefs are recognized:

int num_entries = 0;
extern const char* help_text;
typedef basic_string wstring;

- Commonly used marco-ized boilerplate code is recognized:

BEGIN_MESSAGE_MAP(CCanvas,CWnd)
Q_DECLARE_METATYPE(MyStruct)
PATTERNS("tex",...)

  (The last one is from this very patch.)

but also the following possibly undesirable consequence:

- When a label is not on a line by itself (except for a comment) it is no
  longer rejected, but can appear as a hunk header if it occurs at the
  beginning of a line:

next:;

IMO, the benefits of the change outweigh the (possible) regressions by a
large margin.

Signed-off-by: Johannes Sixt 
---
 userdiff.c | 8 +++-
 13 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/userdiff.c b/userdiff.c
index ed958ef..49b2094 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -125,11 +125,9 @@ PATTERNS("tex", 
"^(((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
 "[a-zA-Z@]+|.|[a-zA-Z0-9\x80-\xff]+"),
 PATTERNS("cpp",
 /* Jump targets or access declarations */
-"!^[ \t]*[A-Za-z_][A-Za-z_0-9]*:.*$\n"
-/* C/++ functions/methods at top level */
-"^([A-Za-z_][A-Za-z_0-9]*([ \t*]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ 
\t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
-/* compound type at top level *

[RFC/PATCH] diff: simplify cpp funcname regex

2014-03-04 Thread Jeff King
The current funcname matcher for C files requires one or
more words before the function name, like:

  static int foo(int arg)
  {

However, some coding styles look like this:

  static int
  foo(int arg)
  {

and we do not match, even though the default regex would.

This patch simplifies the regex significantly. Rather than
trying to handle all the variants of keywords and return
types, we simply look for an identifier at the start of the
line that contains a "(", meaning it is either a function
definition or a function call, and then not containing ";"
which would indicate it is a call or declaration.

This can be fooled by something like:

if (foo)

but it is unlikely to find that at the very start of a line
with no identation (and without having a complete set of
keywords, we cannot distinguish that from a function called
"if" taking "foo" anyway).

We had no tests at all for .c files, so this attempts to add
a few obvious cases (including the one we are fixing here).

Signed-off-by: Jeff King 
---
I tried accommodating this one case in the current regex, but it just
kept getting more complicated and unreadable. Maybe I am being naive to
think that this much simpler version can work. We don't have a lot of
tests or a known-good data set to check.

I did try "git log -1000 -p" before and after on git.git, diff'd the
results and manually inspected. The results were a mix of strict
improvement (mostly instances of the style I was trying to fix here) and
cases where there is no good answer. For example, for top-level changes
outside functions, we might find:

  N_("some text that is long"

that is part of:

  const char *foo =
  N_("some text that is long"
  "and spans multiple lines");

Before this change, we would skip past it (using the cpp regex, that is;
the default one tends to find the same line) and either report nothing,
or whatever random function was before us. So it's a behavior change,
but the existing behavior is really no better.

 t/t4018-diff-funcname.sh | 36 
 userdiff.c   |  2 +-
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh
index 38a092a..1e80b15 100755
--- a/t/t4018-diff-funcname.sh
+++ b/t/t4018-diff-funcname.sh
@@ -93,6 +93,29 @@ sed -e '
s/song;/song();/
 ' Beer-correct.perl
 
+cat >Beer.c <<\EOF
+static int foo(void)
+{
+label:
+   int x = old;
+}
+
+struct foo; /* catch failure below */
+static int
+gnu(int arg)
+{
+   int x = old;
+}
+
+struct foo; /* catch failure below */
+int multiline(int arg,
+ char *arg2)
+{
+   int x = old;
+}
+EOF
+sed s/old/new/ Beer-correct.c
+
 test_expect_funcname () {
lang=${2-java}
test_expect_code 1 git diff --no-index -U1 \
@@ -127,6 +150,7 @@ test_expect_success 'set up .gitattributes declaring 
drivers to test' '
cat >.gitattributes <<-\EOF
*.java diff=java
*.perl diff=perl
+   *.c diff=cpp
EOF
 '
 
@@ -158,6 +182,18 @@ test_expect_success 'perl pattern is not distracted by 
forward declaration' '
test_expect_funcname "package Beer;\$" perl
 '
 
+test_expect_success 'c pattern skips labels' '
+   test_expect_funcname "static int foo(void)" c
+'
+
+test_expect_success 'c pattern matches GNU-style functions' '
+   test_expect_funcname "gnu(int arg)\$" c
+'
+
+test_expect_success 'c pattern matches multiline functions' '
+   test_expect_funcname "int multiline(int arg,\$" c
+'
+
 test_expect_success 'custom pattern' '
test_config diff.java.funcname "!static
 !String
diff --git a/userdiff.c b/userdiff.c
index 10b61ec..b9d52b7 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -127,7 +127,7 @@
 /* Jump targets or access declarations */
 "!^[ \t]*[A-Za-z_][A-Za-z_0-9]*:.*$\n"
 /* C/++ functions/methods at top level */
-"^([A-Za-z_][A-Za-z_0-9]*([ \t*]+[A-Za-z_][A-Za-z_0-9]*([ \t]*::[ 
\t]*[^[:space:]]+)?){1,}[ \t]*\\([^;]*)$\n"
+"^([A-Za-z_].*\\([^;]*)$\n"
 /* compound type at top level */
 "^((struct|class|enum)[^;]*)$",
 /* -- */
-- 
1.8.5.2.500.g8060133
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html