Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-22 Thread Jan Beulich
>>> On 21.02.18 at 16:34,  wrote:
> On Mon, Feb 19, 2018 at 8:44 AM, Jan Beulich  wrote:
>> +When signedness matters, qualify plain char, short, int, long, and
>> +long long with "signed" or "unsigned".  Signedness is specifically
>> +considered to matter when the valid value range of a variable covers
>> +only non-negative values.  The prime example of such is a variable used
>> +to index an array (negative array indexes, while they may occur, are
>> +rather rare).
> 
> This paragraph is a bit confusing.  You say "when signedeness
> matters", which I would interpret as saying, "When it matters whether
> you use signed or unsigned values".  But is there ever a situation
> where it doesn't matter whether we use signed or unsigned?

Yes, if the value range is limited. That's why many people use (and
are taught to use) plain int and long for things like loop variables,
even if only positive values will be iterated over. Hence
- "signed" and "unsigned" should be used to tell apart cases where
  their omission was simply because of not following the rule of
- "unsigned" should be used when the valid value range is non-
  negative

Perhaps I should distinguish "matters for correctness" from "matters
for quality of generated code". It is the former case where I'd
prefer "signed" to be made explicit.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-21 Thread George Dunlap
On 02/21/2018 03:55 PM, Andrew Cooper wrote:
> On 19/02/18 13:30, Jan Beulich wrote:
> On 19.02.18 at 14:12,  wrote:
>>> On 19/02/18 08:44, Jan Beulich wrote:
 --- a/CODING_STYLE
 +++ b/CODING_STYLE
 @@ -88,6 +88,26 @@ Braces should be omitted for blocks with
  if ( condition )
  single_statement();
  
 +Types
 +-
 +
 +Use basic C types and C standard mandated typedef-s where possible (and
 +with preference in this order).  This in particular means to avoid u8,
 +u16, etc despite those types continuing to exist in our code base.
 +Fixed width types should only be used when a fixed width quantity is
 +meant (which for example may be a value read from or to be written to a
 +register).
 +
 +When signedness matters, qualify plain char, short, int, long, and
 +long long with "signed" or "unsigned".  Signedness is specifically
 +considered to matter when the valid value range of a variable covers
 +only non-negative values.  The prime example of such is a variable used
 +to index an array (negative array indexes, while they may occur, are
 +rather rare).
>>> As is evident from the other threads on the subject, I am very
>>> definitely -1 for littering our codebase with signed in cases like this.
>> Some context for those not having followed the earlier discussion:
>> There being quite a number of cases in the code base where plain
>> int or long are used when no negative values are ever expected
>> (or even possible) to be held by the respective variables, I would
>> prefer if we made explicit when signedness of a variable matters.
>> This then also eliminates signedness concerns for plain char and bit
>> fields (for both of these one already needs to explicitly add "signed"
>> when negative values are intended to be held by the variable/field,
>> at least if we don't want to tie ourselves to compiler specific
>> behavior).
>>
>>> IMO they do nothing but harm readibility.
>> How does making something explicit harm readability?
> 
> Using 'signed' when it is unnecessary strictly adds to code volume.
> 
> Expecting the use of signed in contexts where it is not necessary is
> going to add an extra round trip to every review.  Even if the core
> developers get used to using it, casual submitters are not going to use
> it, because it is very a-typical.  This constitutes a net reduction in
> review bandwidth.

This second bit is the biggest one for me: Adding this change will take
a significant amount of effort, and cause a significant amount of
annoyance to contributors, as well as increase the review workload (by
having to review an otherwise perfectly good patch a second time).

I also agree that it doesn't add anything (using a non-unsigned version
already implies 'I think this should be signed'), and I also agree that
we're almost certainly not going to achieve consistency; so in my
estimation the extra cost is clearly not worth it.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-21 Thread Andrew Cooper
On 19/02/18 13:30, Jan Beulich wrote:
 On 19.02.18 at 14:12,  wrote:
>> On 19/02/18 08:44, Jan Beulich wrote:
>>> --- a/CODING_STYLE
>>> +++ b/CODING_STYLE
>>> @@ -88,6 +88,26 @@ Braces should be omitted for blocks with
>>>  if ( condition )
>>>  single_statement();
>>>  
>>> +Types
>>> +-
>>> +
>>> +Use basic C types and C standard mandated typedef-s where possible (and
>>> +with preference in this order).  This in particular means to avoid u8,
>>> +u16, etc despite those types continuing to exist in our code base.
>>> +Fixed width types should only be used when a fixed width quantity is
>>> +meant (which for example may be a value read from or to be written to a
>>> +register).
>>> +
>>> +When signedness matters, qualify plain char, short, int, long, and
>>> +long long with "signed" or "unsigned".  Signedness is specifically
>>> +considered to matter when the valid value range of a variable covers
>>> +only non-negative values.  The prime example of such is a variable used
>>> +to index an array (negative array indexes, while they may occur, are
>>> +rather rare).
>> As is evident from the other threads on the subject, I am very
>> definitely -1 for littering our codebase with signed in cases like this.
> Some context for those not having followed the earlier discussion:
> There being quite a number of cases in the code base where plain
> int or long are used when no negative values are ever expected
> (or even possible) to be held by the respective variables, I would
> prefer if we made explicit when signedness of a variable matters.
> This then also eliminates signedness concerns for plain char and bit
> fields (for both of these one already needs to explicitly add "signed"
> when negative values are intended to be held by the variable/field,
> at least if we don't want to tie ourselves to compiler specific
> behavior).
>
>> IMO they do nothing but harm readibility.
> How does making something explicit harm readability?

Using 'signed' when it is unnecessary strictly adds to code volume.

Expecting the use of signed in contexts where it is not necessary is
going to add an extra round trip to every review.  Even if the core
developers get used to using it, casual submitters are not going to use
it, because it is very a-typical.  This constitutes a net reduction in
review bandwidth.

We are never ever going to get consistent use of signed even under your
plan, due to the code we inherit from other projects.  This at best will
leave its use inconsistent across the codebase, and therefore add to the
development confusion we already have WRT indentation style etc.

As Coverity has demonstrated repeatedly, there is often a difference
between what people write, and what they intend to write (and this
includes yourself and myself).  Seeing a patch with an explicit "signed"
doesn't aid review, because the code still needs just as much second
guessing.  The only thing that matters is whether the reviewer can work
out if the code is correct, and that comes not from the written type,
but the context in which it is used.


I agree that submitted code should be using unsigned integers by
default, because is it is usually the correct behaviour, and the
resulting codegen is typically better.

However, I don't agree with your argument that explicit use of signed
for integers which are intended to have negative values will make
writing or reviewing code easier.  Quite the converse;  I'm confident
that it would be worse in several regards for the project as a whole.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-21 Thread George Dunlap
On Mon, Feb 19, 2018 at 8:44 AM, Jan Beulich  wrote:
> Signed-off-by: Jan Beulich 
>
> --- a/CODING_STYLE
> +++ b/CODING_STYLE
> @@ -88,6 +88,26 @@ Braces should be omitted for blocks with
>  if ( condition )
>  single_statement();
>
> +Types
> +-
> +
> +Use basic C types and C standard mandated typedef-s where possible (and
> +with preference in this order).  This in particular means to avoid u8,
> +u16, etc
[snip]
> +Fixed width types should only be used when a fixed width quantity is
> +meant (which for example may be a value read from or to be written to a
> +register).

+1

> despite those types continuing to exist in our code base.

If CODING_STYLE doesn't already have a generic disclaimer like this,
we should make one.  I'm sure there are and will be lots of places
where the code doesn't match CODING_STYLE but should; we only need to
apologize for that once.

> +When signedness matters, qualify plain char, short, int, long, and
> +long long with "signed" or "unsigned".  Signedness is specifically
> +considered to matter when the valid value range of a variable covers
> +only non-negative values.  The prime example of such is a variable used
> +to index an array (negative array indexes, while they may occur, are
> +rather rare).

This paragraph is a bit confusing.  You say "when signedeness
matters", which I would interpret as saying, "When it matters whether
you use signed or unsigned values".  But is there ever a situation
where it doesn't matter whether we use signed or unsigned?  And the
second sentence says 'signedness' (which means 'being signed or
unsigned'), but then describe a situation where 'unsigned' is
appropriate.

FWIW I agree with Andy, that adding 'signed' should be avoided; 'long'
is signed already, 'signed long' is redundant.

Also, a brief motivation for why it's important to use unsigned when
we don't expect negative numbers would be helpful.  Otherwise it seems
a bit arbitrary.

I'd write this paragraph this way:

"Unsigned types should be used anywhere we can rule out negative
values.  Common examples include variables used to index an array,
loop counters, [other common examples]. [Why this helps.]"

 -George

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Jan Beulich
>>> On 19.02.18 at 14:39,  wrote:
> Jan Beulich writes ("Re: [PATCH RFC] CODING_STYLE: document intended usage of 
> types"):
>> Types to be used for addresses - from a really generic pov -
>> depend on the architecture. Iirc there are some where a signed
>> type is the more natural representation, while on x86 and ARM
>> we'd certainly use "unsigned long". Since guests may be of
>> different bitness, specifying what type to use for their addresses
>> would go too far anyway imo.
> 
> If the underlying C type depends on the architecture, then the code
> should use a suitable typedef.  In generic code this means that the
> code is portable and correct; in arch-specific code it means it's
> consistent with the generic code.

Well, for the specific example there is uintptr_t, but the Linux
world appears to dislike it (and we're sort of following suit). Hence
for the foreseeable future it'll continue to be unsigned long.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Ian Jackson
Jan Beulich writes ("Re: [PATCH RFC] CODING_STYLE: document intended usage of 
types"):
> Types to be used for addresses - from a really generic pov -
> depend on the architecture. Iirc there are some where a signed
> type is the more natural representation, while on x86 and ARM
> we'd certainly use "unsigned long". Since guests may be of
> different bitness, specifying what type to use for their addresses
> would go too far anyway imo.

If the underlying C type depends on the architecture, then the code
should use a suitable typedef.  In generic code this means that the
code is portable and correct; in arch-specific code it means it's
consistent with the generic code.

But that is directly contrary to the advice in your proposed
CODING_STYLE patch.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Jan Beulich
>>> On 19.02.18 at 14:12,  wrote:
> On 19/02/18 08:44, Jan Beulich wrote:
>> --- a/CODING_STYLE
>> +++ b/CODING_STYLE
>> @@ -88,6 +88,26 @@ Braces should be omitted for blocks with
>>  if ( condition )
>>  single_statement();
>>  
>> +Types
>> +-
>> +
>> +Use basic C types and C standard mandated typedef-s where possible (and
>> +with preference in this order).  This in particular means to avoid u8,
>> +u16, etc despite those types continuing to exist in our code base.
>> +Fixed width types should only be used when a fixed width quantity is
>> +meant (which for example may be a value read from or to be written to a
>> +register).
>> +
>> +When signedness matters, qualify plain char, short, int, long, and
>> +long long with "signed" or "unsigned".  Signedness is specifically
>> +considered to matter when the valid value range of a variable covers
>> +only non-negative values.  The prime example of such is a variable used
>> +to index an array (negative array indexes, while they may occur, are
>> +rather rare).
> 
> As is evident from the other threads on the subject, I am very
> definitely -1 for littering our codebase with signed in cases like this.

Some context for those not having followed the earlier discussion:
There being quite a number of cases in the code base where plain
int or long are used when no negative values are ever expected
(or even possible) to be held by the respective variables, I would
prefer if we made explicit when signedness of a variable matters.
This then also eliminates signedness concerns for plain char and bit
fields (for both of these one already needs to explicitly add "signed"
when negative values are intended to be held by the variable/field,
at least if we don't want to tie ourselves to compiler specific
behavior).

> IMO they do nothing but harm readibility.

How does making something explicit harm readability?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Andrew Cooper
On 19/02/18 08:44, Jan Beulich wrote:
> Signed-off-by: Jan Beulich 
>
> --- a/CODING_STYLE
> +++ b/CODING_STYLE
> @@ -88,6 +88,26 @@ Braces should be omitted for blocks with
>  if ( condition )
>  single_statement();
>  
> +Types
> +-
> +
> +Use basic C types and C standard mandated typedef-s where possible (and
> +with preference in this order).  This in particular means to avoid u8,
> +u16, etc despite those types continuing to exist in our code base.
> +Fixed width types should only be used when a fixed width quantity is
> +meant (which for example may be a value read from or to be written to a
> +register).
> +
> +When signedness matters, qualify plain char, short, int, long, and
> +long long with "signed" or "unsigned".  Signedness is specifically
> +considered to matter when the valid value range of a variable covers
> +only non-negative values.  The prime example of such is a variable used
> +to index an array (negative array indexes, while they may occur, are
> +rather rare).

As is evident from the other threads on the subject, I am very
definitely -1 for littering our codebase with signed in cases like this.

IMO they do nothing but harm readibility.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Jan Beulich
>>> On 19.02.18 at 12:46,  wrote:
> Jan Beulich writes ("[PATCH RFC] CODING_STYLE: document intended usage of 
> types"):
>> +Types
>> +-
>> +
>> +Use basic C types and C standard mandated typedef-s where possible (and
>> +with preference in this order).  This in particular means to avoid u8,
>> +u16, etc despite those types continuing to exist in our code base.
>> +Fixed width types should only be used when a fixed width quantity is
>> +meant (which for example may be a value read from or to be written to a
>> +register).
> 
> This should have more practical advice, in the form of examples.  In
> particular, what types should be used for guest and host addreses in
> various contexts ?

Types to be used for addresses - from a really generic pov -
depend on the architecture. Iirc there are some where a signed
type is the more natural representation, while on x86 and ARM
we'd certainly use "unsigned long". Since guests may be of
different bitness, specifying what type to use for their addresses
would go too far anyway imo.

> Also, what assumptions should be made about the sizes of standard
> types ?  Should "short" be used when the value is known to fit in 16
> bits ?

I think we should discourage (but not forbid) the use of short
and (except for strings) char. I think we can (and I also think we
actively do) assume int to have at least 32 bits, and long to be
as wide as a pointer. All of these are not coding style aspects,
just like ...

> In this context it would probably also be worth mentioning that the
> programmer may assume that the machine is two's-complement and has
> 8-bit bytes.

... any of this isn't, either. Hence I don't think this belongs here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Ian Jackson
Jan Beulich writes ("[PATCH RFC] CODING_STYLE: document intended usage of 
types"):
> +Types
> +-
> +
> +Use basic C types and C standard mandated typedef-s where possible (and
> +with preference in this order).  This in particular means to avoid u8,
> +u16, etc despite those types continuing to exist in our code base.
> +Fixed width types should only be used when a fixed width quantity is
> +meant (which for example may be a value read from or to be written to a
> +register).

This should have more practical advice, in the form of examples.  In
particular, what types should be used for guest and host addreses in
various contexts ?

Also, what assumptions should be made about the sizes of standard
types ?  Should "short" be used when the value is known to fit in 16
bits ?

In this context it would probably also be worth mentioning that the
programmer may assume that the machine is two's-complement and has
8-bit bytes.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH RFC] CODING_STYLE: document intended usage of types

2018-02-19 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/CODING_STYLE
+++ b/CODING_STYLE
@@ -88,6 +88,26 @@ Braces should be omitted for blocks with
 if ( condition )
 single_statement();
 
+Types
+-
+
+Use basic C types and C standard mandated typedef-s where possible (and
+with preference in this order).  This in particular means to avoid u8,
+u16, etc despite those types continuing to exist in our code base.
+Fixed width types should only be used when a fixed width quantity is
+meant (which for example may be a value read from or to be written to a
+register).
+
+When signedness matters, qualify plain char, short, int, long, and
+long long with "signed" or "unsigned".  Signedness is specifically
+considered to matter when the valid value range of a variable covers
+only non-negative values.  The prime example of such is a variable used
+to index an array (negative array indexes, while they may occur, are
+rather rare).
+
+Especially with pointer types, whenever the pointed to object is not
+(supposed to be) modified, qualify the pointed to type with "const".
+
 Comments
 
 




___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel