Re: preferring ptrdiff_t to size_t

2019-01-05 Thread Bruno Haible
Paul Eggert wrote:
> Using signed types is better nowadays than using unsigned types, since 
> many platforms now check for signed integer overflow and this can catch many 
> bugs, some of them security-relevant, whereas unsigned arithmetic is well 
> defined to wrap around with no overflow check (something that can be quite 
> dangerous when doing size calculations). So, for reliability and security 
> reasons, C programs should now prefer ptrdiff_t to size_t when dealing with 
> object sizes.
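
The wraparound hazard described in the quoted text can be sketched as follows. This is only an illustration: the function names alloc_size_unsigned and alloc_size_signed are hypothetical and not part of gnulib, and the signed variant uses an explicit range check instead of relying on a trapping platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned size calculation: the product silently wraps around
   modulo SIZE_MAX + 1, so a huge request can look like a tiny one.  */
static size_t
alloc_size_unsigned (size_t n, size_t elem_size)
{
  return n * elem_size;
}

/* Signed size calculation: reject negative inputs and any product
   that would exceed PTRDIFF_MAX *before* multiplying, so the signed
   multiplication itself never overflows.  Returns false on failure.  */
static bool
alloc_size_signed (ptrdiff_t n, ptrdiff_t elem_size, ptrdiff_t *result)
{
  if (n < 0 || elem_size < 0)
    return false;
  if (elem_size != 0 && n > PTRDIFF_MAX / elem_size)
    return false;
  *result = n * elem_size;
  assert (*result >= 0);
  return true;
}
```

With the unsigned version, a request for SIZE_MAX / 2 + 1 elements of size 2 yields a size of 0; the signed version reports the failure to the caller instead.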

In the thread that starts at
http://lists.gnu.org/archive/html/bug-gnulib/2017-06/msg9.html
I suggest using a typedef, rather than ptrdiff_t directly, for values that
are known to be in the range 0..PTRDIFF_MAX.

Bruno




Re: preferring ptrdiff_t to size_t for object counts

2017-06-07 Thread Bruno Haible
Hi Paul,

> The name I'm currently 
> thinking of is 'in_t', short for "index type". That's an 
> easy-to-remember name (the type is like 'int', but possibly wider).

Fine with me.

It doesn't collide: Only very few packages use this identifier 'in_t', and
only in isolated places.

> One other advantage of having our own signed type is that we can 
> guarantee that it's at least as wide as int (something that is not true 
> for ptrdiff_t). That way, some of my current code that says 'MIN 
> (INT_MAX, PTRDIFF_MAX)' can be simplified to the more-natural INT_MAX. 
> This is helpful for traditional interfaces that use int counters.

Indeed. (Although portability to Windows 3.1 is no longer a focus of gnulib
or of GNU programs.)
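
A minimal sketch of how such a type could be defined so that it is never narrower than int. The definition of in_t shown here, the macro IN_MAX, and the helper clamp_to_int are assumptions made for illustration; the thread does not give an actual definition.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical definition: use whichever of int and ptrdiff_t is
   wider, so that in_t can always hold every nonnegative int.  */
#if PTRDIFF_MAX < INT_MAX
typedef int in_t;
# define IN_MAX INT_MAX
#else
typedef ptrdiff_t in_t;
# define IN_MAX PTRDIFF_MAX
#endif

/* The guarantee discussed above, checked at compile time.  */
static_assert (INT_MAX <= IN_MAX, "in_t must be at least as wide as int");

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Clamp a nonnegative count for a traditional interface that takes
   an int counter.  Because INT_MAX <= IN_MAX always holds, the bound
   is simply INT_MAX rather than MIN (INT_MAX, IN_MAX).  */
static int
clamp_to_int (in_t n)
{
  return n < INT_MAX ? (int) n : INT_MAX;
}
```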

Bruno




Re: preferring ptrdiff_t to size_t for object counts

2017-06-07 Thread Paul Eggert

On 06/07/2017 02:53 PM, Bruno Haible wrote:

> I don't really mind the name of the type - as
> long as it's a typedef.


I've been leaning towards a name that doesn't start with 'w', since the 
type is not specific to the walloc module family. The name I'm currently 
thinking of is 'in_t', short for "index type". That's an 
easy-to-remember name (the type is like 'int', but possibly wider).


One other advantage of having our own signed type is that we can 
guarantee that it's at least as wide as int (something that is not true 
for ptrdiff_t). That way, some of my current code that says 'MIN 
(INT_MAX, PTRDIFF_MAX)' can be simplified to the more-natural INT_MAX. 
This is helpful for traditional interfaces that use int counters.





Re: preferring ptrdiff_t to size_t for object counts

2017-06-07 Thread Bruno Haible
I wrote:
>   typedef ptrdiff_t wsize_t;

'wsize_t' or 'wcount_t'. I don't really mind the name of the type - as
long as it's a typedef.

Bruno




Re: preferring ptrdiff_t to size_t for object counts

2017-06-05 Thread Bruno Haible
Hi Paul,

I'd like to understand how much better this "ptrdiff_t world" is.

> This has the advantage that signed integer overflow can be detected 
> automatically on some platforms

You mean "-fsanitize=undefined", right?

Does this also catch the following situations?

  a) Pointer subtraction. ISO C11 § J.2 says:
     "The behavior is undefined in the following circumstances: ...
      The result of subtracting two pointers is not representable in an object
      of type ptrdiff_t (6.5.6)."

  b) When assigning a 'size_t' value > PTRDIFF_MAX to a 'ptrdiff_t' variable,
     is that undefined behaviour? Is that caught by "-fsanitize=undefined"?
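
Whatever the answer to b) turns out to be, the conversion can be guarded explicitly so that the code does not depend on it. A minimal sketch, assuming a hypothetical helper named size_to_ptrdiff (not an existing gnulib function):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Convert N to ptrdiff_t only if it fits in the range 0..PTRDIFF_MAX.
   Returns true and stores the value in *OUT on success; returns false
   and leaves *OUT untouched otherwise.  */
static bool
size_to_ptrdiff (size_t n, ptrdiff_t *out)
{
  assert (out != NULL);
  if (n > (size_t) PTRDIFF_MAX)
    return false;
  *out = (ptrdiff_t) n;
  return true;
}
```

The range check happens in the unsigned domain, so no out-of-range value is ever converted to the signed type.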

Bruno




Re: preferring ptrdiff_t to size_t for object counts

2017-06-05 Thread Bruno Haible
Hi Paul,

> GNU Emacs has long been using signed types (typically ptrdiff_t) to count
> objects. This has the advantage that signed integer overflow can be detected
> automatically on some platforms (unfortunately, size_t arithmetic silently
> wraps around).

I have one objection, but a big one: The direct use of ptrdiff_t.

Reasons:

1) Like you, I spend time reviewing code other people have written. In these
   code reviews, it is important to know whether a variable is known to always
   be >= 0 or not.

   For example, when we have
     int n = ...;
     for (int i = 0; i < n; i++) ...
   I always have to spend brain cycles on the question "what if n < 0?
   Does the code still achieve its goal in this case?"

   Whereas if the type clearly states the intent to store only values >= 0,
   there is no issue; no extra brain cycles required.

2) Standards change, and the considerations behind 'walloc' may also change.

   Do you want, 5 or 10 years from now, to go through hundreds of uses of
   'ptrdiff_t' and separate those uses with values >= 0 from those with values
   that can be negative? I certainly don't want to.

3) GCC has range types for Ada. I would hope that someday it also has range
   types for C or C++. Then, it would be very useful to express the fact that
   the values are in the range [0..PTRDIFF_MAX], so that GCC can use it for
   optimization.

4) For static analysis tools (gnulib now uses coverity in particular), I can
   imagine that an unsigned type is easier to work with than a signed type
   (i.e. that the tool can make more inferences and therefore detect more bugs
   when using unsigned types).
   To this end, it would be useful to use an unsigned type for those counters
   and object sizes, *just* for the static analysis tool.

To fix all of these issues, I suggest using a typedef'ed type instead. For
example:

  typedef ptrdiff_t wsize_t;

And then use wsize_t everywhere.

This solves problems 1), 2), 3), and 4) (the last one through an #ifdef'ed
definition of wsize_t).
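
The #ifdef'ed definition could look roughly like this. It is a sketch only: the macro name WSIZE_FOR_STATIC_ANALYSIS and the constant WSIZE_MAX are hypothetical, and count_nonzero is just an example of a function written against the typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* WSIZE_FOR_STATIC_ANALYSIS is a hypothetical macro that a build for
   a static analysis tool would define.  In that build the type is
   unsigned; in normal builds it is signed.  Either way the intended
   value range is 0..PTRDIFF_MAX.  */
#ifdef WSIZE_FOR_STATIC_ANALYSIS
typedef size_t wsize_t;
#else
typedef ptrdiff_t wsize_t;
#endif
#define WSIZE_MAX PTRDIFF_MAX

/* Count the nonzero elements of A[0..N-1].  The parameter type tells
   the reviewer that N is never meant to be negative.  */
static wsize_t
count_nonzero (const int *a, wsize_t n)
{
  wsize_t count = 0;
  for (wsize_t i = 0; i < n; i++)
    if (a[i] != 0)
      count++;
  return count;
}
```

Code that only ever mentions wsize_t compiles unchanged under both definitions, which is what makes a later change of the underlying type cheap (point 2).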

Yes, it means that people reading the code will have to memorize one more type
identifier. But it is to their benefit: they will know the values are >= 0
(see point 1).

Bruno