Re: [PATCH 1/3] add QSORT

2016-10-05 Thread Kevin Bracey

On 04/10/2016 23:31, René Scharfe wrote:



So let's summarize; here's the effect of a raw qsort(3) call:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  is skipped
0 >0  no   qsort  is skipped
1  0  no   qsort  is skipped (bad!) **
1 >0  yes  qsort  is skipped **

Right - row 3 may not be a bug from the point of view of your internals, 
but it means you violate the API of qsort.Therefore a fix is required.



With the micro-optimization removed (nmemb > 0) the matrix gets simpler:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   noop   is executed
0 >0  no   qsort  is skipped
1  0  no   noop   is executed
1 >0  yes  qsort  is skipped **

And with your NULL check (array != NULL) we'd get:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  reuses check result
0 >0  no   qsort  reuses check result
1  0  no   noop   reuses check result
1 >0  yes  noop   reuses check result

Did I get it right?  AFAICS all variants (except raw qsort) are safe 
-- no useful NULL checks are removed, and buggy code should be noticed 
by segfaults in code accessing the sorted array.

I think your tables are correct.

But I disagree that you could ever call invoking the "" lines safe. 
Unless you have documentation on what limit GCC (and your other 
compilers) are prepared to put on the undefined behaviour of violating 
that "non-null" constraint.


Up to now dereferencing a null pointer has been implicitly (or 
explicitly?) defined as simply generating SIGSEGV. And that has 
naturally extended into NULL passed to library implementations. But 
that's no longer true - it seems bets are somewhat off.


But, as long as you are confident you never invoke that line without a 
program bug - ie an API precondition of your own QSORT is that NULL is 
legal iff nmemb is zero, then I guess it's fine. Behaviour is defined, 
as long as you don't violate your internal preconditions.


Kevin




Re: [PATCH 1/3] add QSORT

2016-10-04 Thread René Scharfe

Am 04.10.2016 um 07:28 schrieb Kevin Bracey:

On 04/10/2016 01:00, René Scharfe wrote:

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault
is to be expected in such cases, and thus NULL checks can be removed
safely.


Possibly true in practice.

But technically wrong by the C standard - behaviour is undefined if the
qsort pointer is invalid. You can't formally expect the defined
behaviour of a segfault when sending NULL into qsort. (Hell, maybe the
qsort has its own NULL check and silently returns!


A qsort(3) implementation that doesn't segfault is inconvenient, but 
still safe.  I'm more concerned about NULL checks being removed from our 
code.



So if it's not a program error for array to be NULL and nmemb to be zero
in your code, and you want a diagnostic for array=NULL, nmemb non-zero,
I think you should put that diagnostic into sane_qsort as an assert or
something, not rely on qsort's undefined behaviour being a segfault.

sane_qsort(blah)
{
 if (nmemb >= 1) {
 assert(array);
 qsort(array, nmemb, ...);
 }
}

Can't invoke undefined behaviour from NULL without triggering the
assert. (Could still have other invalid pointers, of course).


We could do that, but I think it's not necessary.  We'd get a segfault 
when accessing the sorted array anyway.  (If we don't look at the data 
after sorting then we can get rid of the sorting step altogether.)



Usually I am on the side of "no NULL checks", as I make the assumption
that we will get a segfault as soon as NULL pointers are used, and those
are generally easy to diagnose. But seeing a compiler invoking this sort
of new trickery due to invoking undefined behaviour is making me more
nervous about doing so...


I was shocked a bit myself when I learned about this, but let's not 
panic. :)



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays
with NULL wouldn't segfault anymore, and thus become harder to
identify as the programming errors they are.

Well, you get the print. Although I guess you're worrying about the
second if being real code, not a debugging check.


Yes, but the optimization is valid: If nmemb > 0 then array can only be 
NULL if we have a bug, and then we'd get a segfault eventually.  So such 
checks can be removed safely.



I must say, this is quite a courageous new optimisation from GCC. It
strikes me as finding a language lawyer loophole that seems to have been
intended for something else (mapping library functions directly onto
CISCy CPU intrinsics), and using it to invent a whole new optimisation
that seems more likely to trigger bugs than optimise any significant
amount of code in a desirable way.


Yeah, and the bugs triggered are quite obscure in this case.  But having 
richer type information and thus restricting the range of possible 
values for variables *can* enable useful optimizations.



Doubly weird as there's no (standard) language support for this. I don't
know how you'd define "my_qsort" that triggered the same optimisations.


The nonnull attribute is a GCC extension, but it's also supported by clang:

  http://clang.llvm.org/docs/AttributeReference.html#nonnull-gnu-nonnull

I don't know if other compilers support it as well, or if there are 
efforts underway to standardize it.



I've seen similar
library-knowledge-without-any-way-to-reproduce-in-user-code
optimisations like "malloc returns a new pointer that doesn't alias with
anything existing" (and no way to reproduce the optimisation with
my_malloc_wrapper). But those seemed to have a clear performance
benefit, without any obvious traps. Doubtful about this one.


Still we have to deal with it..

So let's summarize; here's the effect of a raw qsort(3) call:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  is skipped
0 >0  no   qsort  is skipped
1  0  no   qsort  is skipped (bad!)
1 >0  yes  qsort  is skipped

Here's what the current implementation (nmemb > 1) does:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   noop   is executed
0  1  no   noop   is executed
0 >1  no   qsort  is skipped
 

Re: [PATCH 1/3] add QSORT

2016-10-04 Thread Kevin Bracey

On 04/10/2016 01:00, René Scharfe wrote:

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault 
is to be expected in such cases, and thus NULL checks can be removed 
safely.



Possibly true in practice.

But technically wrong by the C standard - behaviour is undefined if the 
qsort pointer is invalid. You can't formally expect the defined 
behaviour of a segfault when sending NULL into qsort. (Hell, maybe the 
qsort has its own NULL check and silently returns! cf printf - some 
printfs will segfault when passed NULL, some print "(null)"). I've 
worked on systems that don't fault reads to NULL, only writes, so those 
might not segfault there, if NULL appeared sorted...


And obviously there's the language lawyer favourite possibility of the 
call causing nasal flying monkeys or whatever.


So if it's not a program error for array to be NULL and nmemb to be zero 
in your code, and you want a diagnostic for array=NULL, nmemb non-zero, 
I think you should put that diagnostic into sane_qsort as an assert or 
something, not rely on qsort's undefined behaviour being a segfault.


sane_qsort(blah)
{
 if (nmemb >= 1) {
 assert(array);
 qsort(array, nmemb, ...);
 }
}

Can't invoke undefined behaviour from NULL without triggering the 
assert. (Could still have other invalid pointers, of course).


Usually I am on the side of "no NULL checks", as I make the assumption 
that we will get a segfault as soon as NULL pointers are used, and those 
are generally easy to diagnose. But seeing a compiler invoking this sort 
of new trickery due to invoking undefined behaviour is making me more 
nervous about doing so...



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays 
with NULL wouldn't segfault anymore, and thus become harder to 
identify as the programming errors they are.
Well, you get the print. Although I guess you're worrying about the 
second if being real code, not a debugging check.


I must say, this is quite a courageous new optimisation from GCC. It 
strikes me as finding a language lawyer loophole that seems to have been 
intended for something else (mapping library functions directly onto 
CISCy CPU intrinsics), and using it to invent a whole new optimisation 
that seems more likely to trigger bugs than optimise any significant 
amount of code in a desirable way.


Doubly weird as there's no (standard) language support for this. I don't 
know how you'd define "my_qsort" that triggered the same optimisations.


I've seen similar 
library-knowledge-without-any-way-to-reproduce-in-user-code 
optimisations like "malloc returns a new pointer that doesn't alias with 
anything existing" (and no way to reproduce the optimisation with 
my_malloc_wrapper). But those seemed to have a clear performance 
benefit, without any obvious traps. Doubtful about this one.


Kevin



Re: [PATCH 1/3] add QSORT

2016-10-03 Thread René Scharfe

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault is 
to be expected in such cases, and thus NULL checks can be removed safely.



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays 
with NULL wouldn't segfault anymore, and thus become harder to identify 
as the programming errors they are.


The intention is to support NULL pointers only for empty arrays (in 
addition to valid pointers).  That we also support NULL pointers for 
arrays with a single member might be considered to be the result of a 
premature optimization, but it should be safe -- the compiler won't 
remove checks unexpectedly.


Does that make sense (it's getting late here, so my logic might already 
be resting..)?


René


Re: [PATCH 1/3] add QSORT

2016-10-03 Thread Kevin Bracey

On 01/10/2016 19:19, René Scharfe wrote:

It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

This optimization is dangerous when combined with the convention of
using a NULL pointer for empty arrays.  Diagnosing an affected NULL
check is probably quite hard -- it's right there in the code after all
and not all compilers remove it.


Hang on, hang on. This is either a compiler bug, or you're wrong on your 
assumption about the specification of qsort.


Either way, the extra layer of indirection is not proper protection. The 
unwanted compiler optimisation you're inadvertently triggering could 
still be triggered through the inline.


Now, looking at the C standard, this isn't actually clear to me. The 
standard says that if you call qsort with nmemb zero, the pointer still 
has to be "valid". Not totally clear to me if NULL is valid, by their 
definition in C99 7.1.4. Googling hasn't given me a concrete answer.


The compiler seems to think that NULL wouldn't be valid, so because 
you've called qsort on it, you've invoked undefined behaviour if it's 
NULL, so it's free to elide the NULL check.


Kevin



Re: [PATCH 1/3] add QSORT

2016-10-03 Thread Kevin Bracey

On 01/10/2016 19:19, René Scharfe wrote:


It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

Ah, second attempt. Originally misread the original code, and didn't 
understand what it was doing.


I get it now.

A nasty trap I hadn't been aware of - I was under the impression NULL + 
zero length was generally legal, but the C standard does indeed not give 
you a specific out for NULL to library functions in that case.


As such, NULL checks can still be elided even with your change. If you 
effectively change your example to:


if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing 
that in the compiler explorer - it effectively turns that into "else 
if".  To make that check really work, you have to do:


if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.

Kevin



Re: [PATCH 1/3] add QSORT

2016-10-01 Thread René Scharfe
Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> 3. builtin/show-branch.c does this:
> 
> qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
>   compare_ref_name);
> 
> where ref_name[] is a file-scope global:
> 
> static char *ref_name[MAX_REVS + 1];
> 
> and top and bottom are plain integers.  The sizeof() does not take
> the size of *base, so it is understandable that this does not get
> automatically converted.
> 
> It seems that some calls to this function _could_ send the same top
> and bottom, asking for 0 element array to be sorted, by the way.

It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

This optimization is dangerous when combined with the convention of
using a NULL pointer for empty arrays.  Diagnosing an affected NULL
check is probably quite hard -- it's right there in the code after all
and not all compilers remove it.

builtin/show-branch.c never passes NULL, so it's not affected by that
hazard.  We can (and should, IMHO) still use QSORT there for
consistency and convenience, though:

-- >8 --
Subject: [PATCH] show-branch: use QSORT

Shorten the code by using QSORT instead of calling qsort(3) directly,
as the former determines the element size automatically and checks if
there are at least two elements to sort already.

Signed-off-by: Rene Scharfe 
---
 builtin/show-branch.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 623ca56..974f340 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -353,8 +353,7 @@ static int compare_ref_name(const void *a_, const void *b_)
 
 static void sort_ref_range(int bottom, int top)
 {
-   qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
- compare_ref_name);
+   QSORT(ref_name + bottom, top - bottom, compare_ref_name);
 }
 
 static int append_ref(const char *refname, const struct object_id *oid,
@@ -540,8 +539,7 @@ static void append_one_rev(const char *av)
if (saved_matches == ref_name_cnt &&
ref_name_cnt < MAX_REVS)
error(_("no matching refs with %s"), av);
-   if (saved_matches + 1 < ref_name_cnt)
-   sort_ref_range(saved_matches, ref_name_cnt);
+   sort_ref_range(saved_matches, ref_name_cnt);
return;
}
die("bad sha1 reference %s", av);
-- 
2.10.0




Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 01:21 schrieb René Scharfe:
> Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
>> René Scharfe  writes:
>>
>>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>>> size of the array elements and supports the convention of initializing
>>> empty arrays with a NULL pointer, which we use in some places.
>>>
>>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>>> an element count of zero -- and allows the compiler to optimize away any
>>> following NULL checks.  Using the macro avoids such surprises.
>>>
>>> Add a semantic patch as well to demonstrate the macro's usage and to
>>> automate the transformation of trivial cases.
>>>
>>> Signed-off-by: Rene Scharfe 
>>> ---
>>>  contrib/coccinelle/qsort.cocci | 19 +++
>>>  git-compat-util.h  |  8 
>>>  2 files changed, 27 insertions(+)
>>>  create mode 100644 contrib/coccinelle/qsort.cocci
>>
>> The direct calls to qsort(3) that this series leaves behind are
>> interesting.
>>
>> 1. builtin/index-pack.c has this:
>>
>>  if (1 < opts->anomaly_nr)
>>  qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
>> cmp_uint32);
>>
>> where opts->anomaly is coming from pack.h:
>>
>> struct pack_idx_option {
>> unsigned flags;
>> ...
>> int anomaly_alloc, anomaly_nr;
>> uint32_t *anomaly;
>> };
>>
>> I cannot quite see how the automated conversion misses it?  It's not
>> like base and nmemb are type-restricted in the rule (they are both
>> just "expression"s).
>>
>> 2. builtin/shortlog.c has this:
>>
>>  qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
>>log->summary ? compare_by_counter : compare_by_list);
>>
>> where log->list is coming from shortlog.h:
>>
>> struct shortlog {
>> struct string_list list;
>> };
>>
>> and string-list.h says:
>>
>> struct string_list {
>> struct string_list_item *items;
>> unsigned int nr, alloc;
>> ...
>> };
>>
>> which seems to be a good candidate for this rule:
>>
>> type T;
>> T *base;
>> expression nmemb, compar;
>> @@
>> - qsort(base, nmemb, sizeof(T), compar);
>> + QSORT(base, nmemb, compar);
>>
>> if we take "T == struct string_list_item".
> 
> Transformations for these two are generated if we pass --all-includes
> to spatch.  So let's do that.

And here's the result:

-- >8 --
Subject: [PATCH] use QSORT, part 2

Convert two more qsort(3) calls to QSORT to reduce code size and for
better safety and consistency.

Signed-off-by: Rene Scharfe 
---
Squashable.

 builtin/index-pack.c | 3 +--
 builtin/shortlog.c   | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 7657d0a..0a27bab 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1531,8 +1531,7 @@ static void read_v2_anomalous_offsets(struct packed_git 
*p,
opts->anomaly[opts->anomaly_nr++] = ntohl(idx2[off * 2 + 1]);
}
 
-   if (1 < opts->anomaly_nr)
-   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
cmp_uint32);
+   QSORT(opts->anomaly, opts->anomaly_nr, cmp_uint32);
 }
 
 static void read_idx_option(struct pack_idx_option *opts, const char 
*pack_name)
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 25fa8a6..ba0e115 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -308,7 +308,7 @@ void shortlog_output(struct shortlog *log)
struct strbuf sb = STRBUF_INIT;
 
if (log->sort_by_number)
-   qsort(log->list.items, log->list.nr, sizeof(struct 
string_list_item),
+   QSORT(log->list.items, log->list.nr,
  log->summary ? compare_by_counter : compare_by_list);
for (i = 0; i < log->list.nr; i++) {
const struct string_list_item *item = >list.items[i];
-- 
2.10.0




Re: [PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Am 30.09.2016 um 00:36 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
>> size of the array elements and supports the convention of initializing
>> empty arrays with a NULL pointer, which we use in some places.
>>
>> Calling qsort(3) directly with a NULL pointer is undefined -- even with
>> an element count of zero -- and allows the compiler to optimize away any
>> following NULL checks.  Using the macro avoids such surprises.
>>
>> Add a semantic patch as well to demonstrate the macro's usage and to
>> automate the transformation of trivial cases.
>>
>> Signed-off-by: Rene Scharfe 
>> ---
>>  contrib/coccinelle/qsort.cocci | 19 +++
>>  git-compat-util.h  |  8 
>>  2 files changed, 27 insertions(+)
>>  create mode 100644 contrib/coccinelle/qsort.cocci
> 
> The direct calls to qsort(3) that this series leaves behind are
> interesting.
> 
> 1. builtin/index-pack.c has this:
> 
>   if (1 < opts->anomaly_nr)
>   qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
> cmp_uint32);
> 
> where opts->anomaly is coming from pack.h:
> 
> struct pack_idx_option {
> unsigned flags;
> ...
> int anomaly_alloc, anomaly_nr;
> uint32_t *anomaly;
> };
> 
> I cannot quite see how the automated conversion misses it?  It's not
> like base and nmemb are type-restricted in the rule (they are both
> just "expression"s).
> 
> 2. builtin/shortlog.c has this:
> 
>   qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
> log->summary ? compare_by_counter : compare_by_list);
> 
> where log->list is coming from shortlog.h:
> 
> struct shortlog {
> struct string_list list;
> };
> 
> and string-list.h says:
> 
> struct string_list {
> struct string_list_item *items;
> unsigned int nr, alloc;
> ...
> };
> 
> which seems to be a good candidate for this rule:
> 
> type T;
> T *base;
> expression nmemb, compar;
> @@
> - qsort(base, nmemb, sizeof(T), compar);
> + QSORT(base, nmemb, compar);
> 
> if we take "T == struct string_list_item".

Transformations for these two are generated if we pass --all-includes
to spatch.  So let's do that.

-- >8 --
Subject: [PATCH] coccicheck: use --all-includes by default

Add a make variable, SPATCH_FLAGS, for specifying flags for spatch, and
set it to --all-includes by default.  This option lets it consider
header files which would otherwise be ignored.  That's important for
some rules that rely on type information.  It doubles the duration of
coccicheck, however.

Signed-off-by: Rene Scharfe 
---
 Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 1aad150..d15bf8d 100644
--- a/Makefile
+++ b/Makefile
@@ -467,6 +467,7 @@ SPATCH = spatch
 export TCL_PATH TCLTK_PATH
 
 SPARSE_FLAGS =
+SPATCH_FLAGS = --all-includes
 
 
 
@@ -2314,7 +2315,7 @@ C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
 %.cocci.patch: %.cocci $(C_SOURCES)
@echo '' SPATCH $<; \
for f in $(C_SOURCES); do \
-   $(SPATCH) --sp-file $< $$f; \
+   $(SPATCH) --sp-file $< $$f $(SPATCH_FLAGS); \
done >$@ 2>$@.log; \
if test -s $@; \
then \
-- 
2.10.0



Re: [PATCH 1/3] add QSORT

2016-09-29 Thread Junio C Hamano
René Scharfe  writes:

> Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
> size of the array elements and supports the convention of initializing
> empty arrays with a NULL pointer, which we use in some places.
>
> Calling qsort(3) directly with a NULL pointer is undefined -- even with
> an element count of zero -- and allows the compiler to optimize away any
> following NULL checks.  Using the macro avoids such surprises.
>
> Add a semantic patch as well to demonstrate the macro's usage and to
> automate the transformation of trivial cases.
>
> Signed-off-by: Rene Scharfe 
> ---
>  contrib/coccinelle/qsort.cocci | 19 +++
>  git-compat-util.h  |  8 
>  2 files changed, 27 insertions(+)
>  create mode 100644 contrib/coccinelle/qsort.cocci

The direct calls to qsort(3) that this series leaves behind are
interesting.

1. builtin/index-pack.c has this:

if (1 < opts->anomaly_nr)
qsort(opts->anomaly, opts->anomaly_nr, sizeof(uint32_t), 
cmp_uint32);

where opts->anomaly is coming from pack.h:

struct pack_idx_option {
unsigned flags;
...
int anomaly_alloc, anomaly_nr;
uint32_t *anomaly;
};

I cannot quite see how the automated conversion misses it?  It's not
like base and nmemb are type-restricted in the rule (they are both
just "expression"s).

2. builtin/shortlog.c has this:

qsort(log->list.items, log->list.nr, sizeof(struct string_list_item),
  log->summary ? compare_by_counter : compare_by_list);

where log->list is coming from shortlog.h:

struct shortlog {
struct string_list list;
};

and string-list.h says:

struct string_list {
struct string_list_item *items;
unsigned int nr, alloc;
...
};

which seems to be a good candidate for this rule:

type T;
T *base;
expression nmemb, compar;
@@
- qsort(base, nmemb, sizeof(T), compar);
+ QSORT(base, nmemb, compar);

if we take "T == struct string_list_item".

3. builtin/show-branch.c does this:

qsort(ref_name + bottom, top - bottom, sizeof(ref_name[0]),
  compare_ref_name);

where ref_name[] is a file-scope global:

static char *ref_name[MAX_REVS + 1];

and top and bottom are plain integers.  The sizeof() does not take
the size of *base, so it is understandable that this does not get
automatically converted.

It seems that some calls to this function _could_ send the same top
and bottom, asking for 0 element array to be sorted, by the way.

Thanks for an amusing read.



[PATCH 1/3] add QSORT

2016-09-29 Thread René Scharfe
Add the macro QSORT, a convenient wrapper for qsort(3) that infers the
size of the array elements and supports the convention of initializing
empty arrays with a NULL pointer, which we use in some places.

Calling qsort(3) directly with a NULL pointer is undefined -- even with
an element count of zero -- and allows the compiler to optimize away any
following NULL checks.  Using the macro avoids such surprises.

Add a semantic patch as well to demonstrate the macro's usage and to
automate the transformation of trivial cases.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/qsort.cocci | 19 +++
 git-compat-util.h  |  8 
 2 files changed, 27 insertions(+)
 create mode 100644 contrib/coccinelle/qsort.cocci

diff --git a/contrib/coccinelle/qsort.cocci b/contrib/coccinelle/qsort.cocci
new file mode 100644
index 000..a094e7c
--- /dev/null
+++ b/contrib/coccinelle/qsort.cocci
@@ -0,0 +1,19 @@
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(*base), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+expression base, nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(base[0]), compar);
++ QSORT(base, nmemb, compar);
+
+@@
+type T;
+T *base;
+expression nmemb, compar;
+@@
+- qsort(base, nmemb, sizeof(T), compar);
++ QSORT(base, nmemb, compar);
diff --git a/git-compat-util.h b/git-compat-util.h
index 8aab0c3..d7ed137 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -977,6 +977,14 @@ void git_qsort(void *base, size_t nmemb, size_t size,
 #define qsort git_qsort
 #endif
 
+#define QSORT(base, n, compar) sane_qsort((base), (n), sizeof(*(base)), compar)
+static void inline sane_qsort(void *base, size_t nmemb, size_t size,
+ int(*compar)(const void *, const void *))
+{
+   if (nmemb > 1)
+   qsort(base, nmemb, size, compar);
+}
+
 #ifndef REG_STARTEND
 #error "Git requires REG_STARTEND support. Compile with NO_REGEX=NeedsStartEnd"
 #endif
-- 
2.10.0