Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-10-02 Thread René Scharfe
Am 15.09.2016 um 23:39 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
>>> René Scharfe  writes:
>>>
 Take this for example:

 -  strbuf_addf(>obuf, _("(bad commit)\n"));
 +  strbuf_addstr(>obuf, _("(bad commit)\n"));

 If there's a language that uses percent signs instead of parens or as
 regular letters, then they need to be escaped in the translated string
 before, but not after the patch.  As I wrote: silly.
>>>
>>> Ahh, OK, so "This use of addf only has format part and nothing else,
>>> hence the format part can be taken as-is" which is the Coccinelle rule
>>> used to produce this patch is incomplete and always needs manual
>>> inspection, in case the format part wanted to give a literal % in
>>> the output.  E.g. it is a bug to convert this
>>>
>>> strbuf_addf(, _("this is 100%% wrong!"));
>>>
>>> to
>>>
>>> strbuf_addstr(, _("this is 100%% wrong!"));
>>
>> Right.  Such strings seem to be quite rare in practice, though. 
>>
>>> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
>>> have some comment to warn the person who builds *.patch file to look
>>> for % in E2, or something?
>>
>> Something like this?
> 
> Yup, with something like that I would understdood where that
> puzzling question came from.

Here's something better than a comment:

-- >8 --
Subject: [PATCH] coccicheck: make transformation for strbuf_addf(sb, "...") 
more precise

We can replace strbuf_addf() calls that just add a simple string with
calls to strbuf_addstr() to make the intent clearer.  We need to be
careful if that string contains printf format specifications like %%,
though, as a simple replacement would change the output.

Add checks to the semantic patch to make sure we only perform the
transformation if the second argument is a string constant (possibly
translated) that doesn't contain any percent signs.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/strbuf.cocci | 29 ++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 1e24298..63995f2 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -1,8 +1,31 @@
+@ strbuf_addf_with_format_only @
+expression E;
+constant fmt;
 @@
-expression E1, E2;
+  strbuf_addf(E,
+(
+  fmt
+|
+  _(fmt)
+)
+  );
+
+@ script:python @
+fmt << strbuf_addf_with_format_only.fmt;
 @@
-- strbuf_addf(E1, E2);
-+ strbuf_addstr(E1, E2);
+cocci.include_match("%" not in fmt)
+
+@ extends strbuf_addf_with_format_only @
+@@
+- strbuf_addf
++ strbuf_addstr
+  (E,
+(
+  fmt
+|
+  _(fmt)
+)
+  );
 
 @@
 expression E1, E2;
-- 
2.10.0


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread brian m. carlson
On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:
> Replace uses of strbuf_addf() for adding strings with more lightweight
> strbuf_addstr() calls.  This makes the intent clearer and avoids
> potential issues with printf format specifiers.
> 
> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
> this patch covers eleven more.
> 
> A semantic patch for Coccinelle is included for easier checking for
> new cases that might be introduced in the future.

I think all three of these patches look good.  I'm glad to see us
getting better use out of Coccinelle.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
René Scharfe  writes:

> Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
>> René Scharfe  writes:
>> 
>>> Take this for example:
>>>
>>> -   strbuf_addf(>obuf, _("(bad commit)\n"));
>>> +   strbuf_addstr(>obuf, _("(bad commit)\n"));
>>>
>>> If there's a language that uses percent signs instead of parens or as
>>> regular letters, then they need to be escaped in the translated string
>>> before, but not after the patch.  As I wrote: silly.
>> 
>> Ahh, OK, so "This use of addf only has format part and nothing else,
>> hence the format part can be taken as-is" which is the Coccinelle rule
>> used to produce this patch is incomplete and always needs manual
>> inspection, in case the format part wanted to give a literal % in
>> the output.  E.g. it is a bug to convert this
>> 
>>  strbuf_addf(, _("this is 100%% wrong!"));
>> 
>> to
>> 
>>  strbuf_addstr(, _("this is 100%% wrong!"));
>
> Right.  Such strings seem to be quite rare in practice, though. 
>
>> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
>> have some comment to warn the person who builds *.patch file to look
>> for % in E2, or something?
>
> Something like this?

Yup, with something like that I would understdood where that
puzzling question came from.

Thanks.

>
> ---
>  contrib/coccinelle/strbuf.cocci | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index 7932d48..3f535ca 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -1,3 +1,5 @@
> +// Careful, this is not fully equivalent: "%" is no longer treated
> +// specially.  Check for "%%", "%m" etc. in the format string (E2).
>  @@
>  expression E1, E2;
>  @@


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe
Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Take this for example:
>>
>> -strbuf_addf(>obuf, _("(bad commit)\n"));
>> +strbuf_addstr(>obuf, _("(bad commit)\n"));
>>
>> If there's a language that uses percent signs instead of parens or as
>> regular letters, then they need to be escaped in the translated string
>> before, but not after the patch.  As I wrote: silly.
> 
> Ahh, OK, so "This use of addf only has format part and nothing else,
> hence the format part can be taken as-is" which is the Coccinelle rule
> used to produce this patch is incomplete and always needs manual
> inspection, in case the format part wanted to give a literal % in
> the output.  E.g. it is a bug to convert this
> 
>   strbuf_addf(, _("this is 100%% wrong!"));
> 
> to
> 
>   strbuf_addstr(, _("this is 100%% wrong!"));

Right.  Such strings seem to be quite rare in practice, though. 

> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
> have some comment to warn the person who builds *.patch file to look
> for % in E2, or something?

Something like this?

---
 contrib/coccinelle/strbuf.cocci | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 7932d48..3f535ca 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -1,3 +1,5 @@
+// Careful, this is not fully equivalent: "%" is no longer treated
+// specially.  Check for "%%", "%m" etc. in the format string (E2).
 @@
 expression E1, E2;
 @@
-- 
2.10.0





Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
René Scharfe  writes:

> Take this for example:
>
> - strbuf_addf(>obuf, _("(bad commit)\n"));
> + strbuf_addstr(>obuf, _("(bad commit)\n"));
>
> If there's a language that uses percent signs instead of parens or as
> regular letters, then they need to be escaped in the translated string
> before, but not after the patch.  As I wrote: silly.

Ahh, OK, so "This use of addf only has format part and nothing else,
hence the format part can be taken as-is" which is the Coccinelle rule
used to produce this patch is incomplete and always needs manual
inspection, in case the format part wanted to give a literal % in
the output.  E.g. it is a bug to convert this

strbuf_addf(, _("this is 100%% wrong!"));

to

strbuf_addstr(, _("this is 100%% wrong!"));

Thanks for clarification.  Perhaps the strbuf.cocci rule file can
have some comment to warn the person who builds *.patch file to look
for % in E2, or something?



Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe

Am 15.09.2016 um 21:38 schrieb Jeff King:

On Thu, Sep 15, 2016 at 12:25:43PM -0700, Junio C Hamano wrote:


Silly question: Is there a natural language that uses percent signs
as letters or e.g. instead of commas? :)


I don't know, but if they do, they'd better get used to escaping them.
:)


I do not know either, but I am curious where that question comes
from.  I stared at this patch for a few minutes but couldn't guess.


My initial thought is that the next step after picking this low-hanging
fruit would be to find cases where the strings do not contain "%", and
thus we do not have to care about formatting. But a case like:

  strbuf_addf(, "this does not have any percents!", foo);

is simply broken (albeit in a way that we ignore foo, so it's just ugly
code, not a real bug).

So I dunno. I too am curious.


Take this for example:

-   strbuf_addf(>obuf, _("(bad commit)\n"));
+   strbuf_addstr(>obuf, _("(bad commit)\n"));

If there's a language that uses percent signs instead of parens or as 
regular letters, then they need to be escaped in the translated string 
before, but not after the patch.  As I wrote: silly.


René


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 12:25:43PM -0700, Junio C Hamano wrote:

> >> Silly question: Is there a natural language that uses percent signs
> >> as letters or e.g. instead of commas? :)
> >
> > I don't know, but if they do, they'd better get used to escaping them.
> > :)
> 
> I do not know either, but I am curious where that question comes
> from.  I stared at this patch for a few minutes but couldn't guess.

My initial thought is that the next step after picking this low-hanging
fruit would be to find cases where the strings do not contain "%", and
thus we do not have to care about formatting. But a case like:

  strbuf_addf(, "this does not have any percents!", foo);

is simply broken (albeit in a way that we ignore foo, so it's just ugly
code, not a real bug).

So I dunno. I too am curious.

-Peff


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:
>
>> Replace uses of strbuf_addf() for adding strings with more lightweight
>> strbuf_addstr() calls.  This makes the intent clearer and avoids
>> potential issues with printf format specifiers.
>> 
>> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
>> this patch covers eleven more.
>
> Great, these all look obviously correct.

Yes.

>> Silly question: Is there a natural language that uses percent signs
>> as letters or e.g. instead of commas? :)
>
> I don't know, but if they do, they'd better get used to escaping them.
> :)

I do not know either, but I am curious where that question comes
from.  I stared at this patch for a few minutes but couldn't guess.


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:

> Replace uses of strbuf_addf() for adding strings with more lightweight
> strbuf_addstr() calls.  This makes the intent clearer and avoids
> potential issues with printf format specifiers.
> 
> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
> this patch covers eleven more.

Great, these all look obviously correct.

> A semantic patch for Coccinelle is included for easier checking for
> new cases that might be introduced in the future.

I think there was some discussion in brian's object_id patches about
whether we wanted to carry Coccinelle transformations in the tree, but I
don't remember the outcome. I don't have an opinion myself.

> Silly question: Is there a natural language that uses percent signs
> as letters or e.g. instead of commas? :)

I don't know, but if they do, they'd better get used to escaping them.
:)

-Peff