Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Martin Sebor

On 07/16/2018 02:19 AM, Richard Sandiford wrote:

Aldy Hernandez  writes:

On Thu, Jul 12, 2018 at 1:41 PM Jonathan Wakely  wrote:


On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:

+Only use non-constant references in the following situations:
+
+
+
+when they are necessary to conform to a standard interface, such as
+the first argument to a non-member operator+=


And the return value of such operators (which also applies to member
operators, which is the more conventional way to write compound
assignment operators).


+in a return value, when providing access to something that is known
+to be nonnull
+
+
+
+In other situations the convention is to use pointers instead.
+
+
+
+HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
+HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid


I understand the objection to using references for out parameters (an
alternative to pointers is to return a struct with the wide int result
and the overflow flag), but ...


Putting everything in a new struct is just going through more hoops to
avoid a common language idiom that everyone in C++ is used to.




+int *elt = array[i];  // OK
+int elt = array[i];   // Please avoid


... this seems unnecessary. If the function is so long that the fact
elt is a reference can easily get lost, the problem is the length of
the function, not the use of a reference.


Agreed.

Richard (Sandiford), this really looks like going out of your way to
enforce a personal style issue across the entire code base.  It's sad
that we try to come up with new ways to make it even harder for new
developers to contribute to GCC.


It's not enforcing a personal style issue so much as trying to stop
the source base from becoming even more inconsistent, admittedly in one
particular area only.

Before the switch to C++, GCC was one of the most consistent source
bases I'd seen (more so even than LLVM IMO, having worked on both).
It's clearly less so now.  The problem isn't just legacy C-era code vs.
new C++ code, but different styles being used for the C++ code.

(It was interesting that Martin complained earlier in the thread about
how inconsistent GCC was.  I imagine that's the impression that new
developers would have now, rather than being glad that they can use
references to return things.)


Just to be clear: mine wasn't meant to be a complaint so much
as an observation.  With a code base as large as GCC and with
so many people contributing to it, I wouldn't expect too much
more consistency unless it was enforced by tools.

I don't have a problem with adopting a new convention per se.
What I find a poor use of time and effort is relying on code
review as the only enforcement of a consistent coding style.
It gets especially frustrating when the same piece of code
that is acceptable to one reviewer is not good enough for
another only because it doesn't follow some convention that
GCC isn't consistent with to begin with (which to me seems
like it's more than those GCC is consistent with).

Martin

PS Here are some examples of documented GCC coding conventions
that I have noticed are either very inconsistent or virtually
ignored ignored (this isn't a survey):

  *  logical not!xvs  ! x  thousands of violations
  *  cast   (foo) x   vs  (foo)x   ditto
  *  C++-style casts vs C-stylemostly ignored
  *  explicit ctorsignored
  *  in class member definitions


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Richard Biener
On Mon, Jul 16, 2018 at 11:48 AM Aldy Hernandez  wrote:
>
>
>
> On 07/16/2018 04:19 AM, Richard Sandiford wrote:
> > Aldy Hernandez  writes:
> >> On Thu, Jul 12, 2018 at 1:41 PM Jonathan Wakely  
> >> wrote:
> >>>
> >>> On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
>  +Only use non-constant references in the following situations:
>  +
>  +
>  +
>  +when they are necessary to conform to a standard interface, such as
>  +the first argument to a non-member operator+=
> >>>
> >>> And the return value of such operators (which also applies to member
> >>> operators, which is the more conventional way to write compound
> >>> assignment operators).
> >>>
>  +in a return value, when providing access to something that is known
>  +to be nonnull
>  +
>  +
>  +
>  +In other situations the convention is to use pointers instead.
>  +
>  +
>  +
>  +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
>  +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
> >>>
> >>> I understand the objection to using references for out parameters (an
> >>> alternative to pointers is to return a struct with the wide int result
> >>> and the overflow flag), but ...
> >>
> >> Putting everything in a new struct is just going through more hoops to
> >> avoid a common language idiom that everyone in C++ is used to.
> >>
> >>>
>  +int *elt = array[i];  // OK
>  +int elt = array[i];   // Please avoid
> >>>
> >>> ... this seems unnecessary. If the function is so long that the fact
> >>> elt is a reference can easily get lost, the problem is the length of
> >>> the function, not the use of a reference.
> >>
> >> Agreed.
> >>
> >> Richard (Sandiford), this really looks like going out of your way to
> >> enforce a personal style issue across the entire code base.  It's sad
> >> that we try to come up with new ways to make it even harder for new
> >> developers to contribute to GCC.
> >
> > It's not enforcing a personal style issue so much as trying to stop
> > the source base from becoming even more inconsistent, admittedly in one
> > particular area only.
> >
> > Before the switch to C++, GCC was one of the most consistent source
> > bases I'd seen (more so even than LLVM IMO, having worked on both).
> > It's clearly less so now.  The problem isn't just legacy C-era code vs.
> > new C++ code, but different styles being used for the C++ code.
>
> I certainly appreciate your concern, and the time you have taken to
> amend the guidelines.  However, perhaps we should get consensus here, as
> this is an issue that affects us all.
>
> >
> > (It was interesting that Martin complained earlier in the thread about
> > how inconsistent GCC was.  I imagine that's the impression that new
> > developers would have now, rather than being glad that they can use
> > references to return things.)
> >
> > In the case of pointers vs. reference parameters for returning values,
> > we have a clear precendent from the C days and very few existing C++
> > functions that deviate from it.  So if we were going to pick one,
> > it seemed obvious which we should pick.  Perhaps there's an argument
> > that eventually everything will move over to the new style due to
> > natural churn, but I think another half-transition is far more likely.
>
> Heh.  I would definitely prefer moving to the new style ;-).
>
> >
> > But weight of opinion definitely seems to be against adding that rule
> > to the standards, so I'll drop the patch.
>
> Again, I have a preference for non constant references if it makes the
> code easier to read, but I don't feel so strongly as to fight about it,
> especially if there is consensus going the other way.
>
> I welcome more discussion on this, and will gladly accept what the
> majority agrees on.

The only thing I would insist on is consistency within a group of
API functions or even within related APIs (poly-int vs. wide-int for example).

Richard.

>
> Aldy


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Aldy Hernandez




On 07/16/2018 04:19 AM, Richard Sandiford wrote:

Aldy Hernandez  writes:

On Thu, Jul 12, 2018 at 1:41 PM Jonathan Wakely  wrote:


On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:

+Only use non-constant references in the following situations:
+
+
+
+when they are necessary to conform to a standard interface, such as
+the first argument to a non-member operator+=


And the return value of such operators (which also applies to member
operators, which is the more conventional way to write compound
assignment operators).


+in a return value, when providing access to something that is known
+to be nonnull
+
+
+
+In other situations the convention is to use pointers instead.
+
+
+
+HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
+HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid


I understand the objection to using references for out parameters (an
alternative to pointers is to return a struct with the wide int result
and the overflow flag), but ...


Putting everything in a new struct is just going through more hoops to
avoid a common language idiom that everyone in C++ is used to.




+int *elt = array[i];  // OK
+int elt = array[i];   // Please avoid


... this seems unnecessary. If the function is so long that the fact
elt is a reference can easily get lost, the problem is the length of
the function, not the use of a reference.


Agreed.

Richard (Sandiford), this really looks like going out of your way to
enforce a personal style issue across the entire code base.  It's sad
that we try to come up with new ways to make it even harder for new
developers to contribute to GCC.


It's not enforcing a personal style issue so much as trying to stop
the source base from becoming even more inconsistent, admittedly in one
particular area only.

Before the switch to C++, GCC was one of the most consistent source
bases I'd seen (more so even than LLVM IMO, having worked on both).
It's clearly less so now.  The problem isn't just legacy C-era code vs.
new C++ code, but different styles being used for the C++ code.


I certainly appreciate your concern, and the time you have taken to 
amend the guidelines.  However, perhaps we should get consensus here, as 
this is an issue that affects us all.




(It was interesting that Martin complained earlier in the thread about
how inconsistent GCC was.  I imagine that's the impression that new
developers would have now, rather than being glad that they can use
references to return things.)

In the case of pointers vs. reference parameters for returning values,
we have a clear precendent from the C days and very few existing C++
functions that deviate from it.  So if we were going to pick one,
it seemed obvious which we should pick.  Perhaps there's an argument
that eventually everything will move over to the new style due to
natural churn, but I think another half-transition is far more likely.


Heh.  I would definitely prefer moving to the new style ;-).



But weight of opinion definitely seems to be against adding that rule
to the standards, so I'll drop the patch.


Again, I have a preference for non constant references if it makes the 
code easier to read, but I don't feel so strongly as to fight about it, 
especially if there is consensus going the other way.


I welcome more discussion on this, and will gladly accept what the 
majority agrees on.


Aldy


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Richard Sandiford
Aldy Hernandez  writes:
> On Thu, Jul 12, 2018 at 1:41 PM Jonathan Wakely  wrote:
>>
>> On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
>> > +Only use non-constant references in the following situations:
>> > +
>> > +
>> > +
>> > +when they are necessary to conform to a standard interface, such as
>> > +the first argument to a non-member operator+=
>>
>> And the return value of such operators (which also applies to member
>> operators, which is the more conventional way to write compound
>> assignment operators).
>>
>> > +in a return value, when providing access to something that is known
>> > +to be nonnull
>> > +
>> > +
>> > +
>> > +In other situations the convention is to use pointers instead.
>> > +
>> > +
>> > +
>> > +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
>> > +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
>>
>> I understand the objection to using references for out parameters (an
>> alternative to pointers is to return a struct with the wide int result
>> and the overflow flag), but ...
>
> Putting everything in a new struct is just going through more hoops to
> avoid a common language idiom that everyone in C++ is used to.
>
>>
>> > +int *elt = array[i];  // OK
>> > +int elt = array[i];   // Please avoid
>>
>> ... this seems unnecessary. If the function is so long that the fact
>> elt is a reference can easily get lost, the problem is the length of
>> the function, not the use of a reference.
>
> Agreed.
>
> Richard (Sandiford), this really looks like going out of your way to
> enforce a personal style issue across the entire code base.  It's sad
> that we try to come up with new ways to make it even harder for new
> developers to contribute to GCC.

It's not enforcing a personal style issue so much as trying to stop
the source base from becoming even more inconsistent, admittedly in one
particular area only.

Before the switch to C++, GCC was one of the most consistent source
bases I'd seen (more so even than LLVM IMO, having worked on both).
It's clearly less so now.  The problem isn't just legacy C-era code vs.
new C++ code, but different styles being used for the C++ code.

(It was interesting that Martin complained earlier in the thread about
how inconsistent GCC was.  I imagine that's the impression that new
developers would have now, rather than being glad that they can use
references to return things.)

In the case of pointers vs. reference parameters for returning values,
we have a clear precendent from the C days and very few existing C++
functions that deviate from it.  So if we were going to pick one,
it seemed obvious which we should pick.  Perhaps there's an argument
that eventually everything will move over to the new style due to
natural churn, but I think another half-transition is far more likely.

But weight of opinion definitely seems to be against adding that rule
to the standards, so I'll drop the patch.

Thanks,
Richard


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Aldy Hernandez
On Thu, Jul 12, 2018 at 8:02 PM Pedro Alves  wrote:
>
> On 07/12/2018 05:17 PM, Richard Sandiford wrote:
> > Pedro Alves  writes:
>
> >> (an
> >>> alternative to pointers is to return a struct with the wide int result
> >>> and the overflow flag),
> >>
> >> +1.  I've been pushing GDB in that direction whenever possible.
> >
> > I agree that can sometimes be better.  I guess it depends on the
> > context.  If a function returns a bool and some other data that has no
> > obvious "failure" value, it can be easier to write chained conditions if
> > the function returns the bool and provides the other data via an output
> > parameter.
>
> I agree it depends on context, though your example sounds like a
> case for std::optional.  (We have gdb::optional in gdb, since
> std::optional is C++17 and gdb requires C++11.  LLVM has a similar
> type, I believe.)
>
> >
> >>> but ...
> >>>
>  +int *elt = array[i];  // OK
>  +int elt = array[i];   // Please avoid
> >>>
> >>> ... this seems unnecessary. If the function is so long that the fact
> >>> elt is a reference can easily get lost, the problem is the length of
> >>> the function, not the use of a reference.
> >>>
> >>
> >> +1000.  This seems really unnecessary.  References have the advantage
> >> of implicit never-null semantics, for instance.
> >
> > The nonnullness is there either way in the above example though.
> >
> > I don't feel as strongly about the non-const-reference variables, but for:
> >
> >  int  = array[i];
> >  ...
> >  elt = 1;
> >
> > it can be easy to misread that "elt = 1" is changing more than just
> > a local variable.
>
> I think that might be just the case for people who haven't used
> references before, and once you get some exposure, the effect
> goes away.  See examples below.

Agreed.

>
> >
> > I take Jonathan's point that it shouldn't be much of a problem if
> > functions are a reasonable size, but we've not tended to be very
> > good at keeping function sizes down.  I guess part of the problem
> > is that even functions that start off small tend to grow over time.
> >
> > We have been better at enforcing more specific rules like the ones
> > in the patch.  And that's easier to do because a patch either adds
> > references or it doesn't.  It's harder to force (or remember to force)
> > someone to split up a function if they're adding 5 lines to one that is
> > already 20 lines long.  Then for the next person it's 25 lines long, etc.
> >
>
> > Also, the "int *elt" approach copes with cases in which "elt" might
> > have to be redirected later, so we're going to need to use it in some
> > cases.
>
> Sure, if you need to reseat, certainly use a pointer.  That actually
> highlights an advantage of references -- the fact that you are sure
> nothing could reseat it.  With a local pointer, you need to be mindful
> of that.  "Does this loop change the pointer?"  Etc.  If the semantics
> of the function call for having a variable that is not meant to be
> reseated, why not allow expressing it with a reference.
> "Write what you mean."  Of course, just like all code, there's going
> to be a judgment call.
>
> A place where references frequently appear is in C++11 range-for loops.
> Like, random example from gdb's codebase:
>
>  for (const symbol_search  : symbols)
>
> GCC doesn't yet require C++11, but it should be a matter
> of time, one would hope.
>
> Other places references appear often is in coming up with
> an alias to avoid repeating long expressions, like
> for example (again from gdb):
>
>   auto  = objfile->per_bfd->demangled_hash_languages;
>   auto it = std::lower_bound (vec.begin (), vec.end (),
>   MSYMBOL_LANGUAGE (sym));
>   if (it == vec.end () || *it != MSYMBOL_LANGUAGE (sym))
> vec.insert (it, MSYMBOL_LANGUAGE (sym));
>
> I don't think the fact that "vec" is a reference here confuses
> anyone.
>
> That was literally a random sample (grepped for "&[a-z]* = ") so
> I ended up picking one with C++11 "auto", but here's another
> random example, spelling out the type name, similarly using
> a reference as a shorthand alias:
>
>   osdata_item  = osdata->items.back ();
>
>   item.columns.emplace_back (std::move (data->property_name),
>  std::string (body_text));
>
> > It's then a question of whether "int " is useful enough that
> > it's worth accepting it too, and having a mixture of styles in the codebase.
> I don't see it as a mixture of styles, as I don't see
> pointers and references the exact same thing, but rather
> see references as another tool in the box.

Agreed, agreed, agreed :).

Seriously folks, I really think we should concentrate on cleaning up
GCC, and making it more readable, not nitpicking over style issues.
Look at our wide int implementation for instance-- it adheres
perfectly to our coding guidelines, and yet it's a mess mess to read
(*).

(*) I'm not singling out wide int because Richard worked on it.  I
believe it 

Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-16 Thread Aldy Hernandez
On Thu, Jul 12, 2018 at 1:41 PM Jonathan Wakely  wrote:
>
> On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
> > +Only use non-constant references in the following situations:
> > +
> > +
> > +
> > +when they are necessary to conform to a standard interface, such as
> > +the first argument to a non-member operator+=
>
> And the return value of such operators (which also applies to member
> operators, which is the more conventional way to write compound
> assignment operators).
>
> > +in a return value, when providing access to something that is known
> > +to be nonnull
> > +
> > +
> > +
> > +In other situations the convention is to use pointers instead.
> > +
> > +
> > +
> > +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
> > +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
>
> I understand the objection to using references for out parameters (an
> alternative to pointers is to return a struct with the wide int result
> and the overflow flag), but ...

Putting everything in a new struct is just going through more hoops to
avoid a common language idiom that everyone in C++ is used to.

>
> > +int *elt = array[i];  // OK
> > +int elt = array[i];   // Please avoid
>
> ... this seems unnecessary. If the function is so long that the fact
> elt is a reference can easily get lost, the problem is the length of
> the function, not the use of a reference.

Agreed.

Richard (Sandiford), this really looks like going out of your way to
enforce a personal style issue across the entire code base.  It's sad
that we try to come up with new ways to make it even harder for new
developers to contribute to GCC.

Aldy


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Martin Sebor

On 07/12/2018 04:41 AM, Richard Sandiford wrote:

Following on from:

https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00603.html

this patch is an RFC to mention references in the C++ coding conventions.
It allows const references anywhere they're useful but only allows
non-constant references in contexts that conform to a standard
interface, or when returning a reference to something that is known
to be nonnull.

It looks like there are still very few public functions that use
non-constant references.  If the patch is OK, I'll volunteer to
change them.

Thanks,
Richard


Index: htdocs/codingconventions.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
retrieving revision 1.81
diff -u -p -r1.81 codingconventions.html
--- htdocs/codingconventions.html   2 Jun 2018 21:16:09 -   1.81
+++ htdocs/codingconventions.html   12 Jul 2018 10:05:12 -
@@ -57,6 +57,7 @@ the conventions separately from any othe
 Overloading Functions
 Overloading Operators
 Default Arguments
+References
 Inlining Functions
 Templates
 Namespaces
@@ -996,6 +997,44 @@ Virtual functions should not have defaul
 


+References
+
+
+Const references can be used freely.  They are a useful way of reducing
+unnecessary copying or avoiding an impossible copy.
+
+
+
+Only use non-constant references in the following situations:
+
+
+
+when they are necessary to conform to a standard interface, such as
+the first argument to a non-member operator+=
+in a return value, when providing access to something that is known
+to be nonnull
+
+
+
+In other situations the convention is to use pointers instead.
+


As always, there are pluses and minuses.  You mention the pluses
and I don't disagree with them, though perhaps not strongly enough
to feel that it justifies making it a hard and fast rule in
a codebase as large (and as inconsistent) as GCC.

Others have already mentioned some of the downsides but let me
add my own $0.02.  I'm not sure that they rise to the level of
overwhelming the advantages but they don't make it a slam dunk
either.

Pass by pointer means that the function (or its reader) needs
to worry about the argument being null.  It's often on my mind
when I don't see it documented.  This could be mitigated to
an extent by using the nonnull attribute to help us avoid
(both visually and via -Wnonnull-compare) pointless tests for
null in the body of the function.  But it still means I have
to check the declaration to confirm that.  Pass by reference
doesn't suffer from these problems.

I agree that outside of function arguments, pointers are best
avoided where possible in favor of references (ideally const
references where applicable).  It makes it clear that a const
reference implies the object is immutable.  Someday GCC might
even be able to take advantage of it for optimization.

All in all, I'm ambivalent about the value of the convention.
I suspect it will be another hoop for new contributors (and
some of the rest of us) to jump through that will not lead
to significant improvements.  I think there are other, more
interesting, opportunities to improve things.

Martin

PS Rather than documenting a convention and leaving it to
others to remember to enforce, I would suggest to implement
a warning to help us achieve consistency and avoid wasting
time going back and forth about it in code reviews.


+
+
+HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
+HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
+
+int *elt = array[i];  // OK
+int elt = array[i];   // Please avoid
+
+
+
+There are two main reasons for this restriction.  The first is to avoid
+an inconsistency with the large body of older (pre-C++) code that uses
+the pointer convention.  The second is to maintain a visual distinction
+between the arguments that a function does modify and the arguments that
+it doesn't.
+
+
 Inlining Functions

 





Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Jonathan Wakely
On Thu, 12 Jul 2018 at 19:01, Pedro Alves  wrote:
>
> On 07/12/2018 05:17 PM, Richard Sandiford wrote:
> > Pedro Alves  writes:
>
> >> (an
> >>> alternative to pointers is to return a struct with the wide int result
> >>> and the overflow flag),
> >>
> >> +1.  I've been pushing GDB in that direction whenever possible.
> >
> > I agree that can sometimes be better.  I guess it depends on the
> > context.  If a function returns a bool and some other data that has no
> > obvious "failure" value, it can be easier to write chained conditions if
> > the function returns the bool and provides the other data via an output
> > parameter.
>
> I agree it depends on context, though your example sounds like a
> case for std::optional.  (We have gdb::optional in gdb, since
> std::optional is C++17 and gdb requires C++11.  LLVM has a similar
> type, I believe.)

Yes I agree.

optional foo ();
void bar (const T&);

if (optional val = foo ())
  bar (*val);

Or expected which is a union of T and an error flag (and a
discriminator value to tell you which union member is active). That
also allows chaining in a similar way. Forgetting to test whether the
optional/expected holds a value before trying to use that value would
be an ICE.


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Pedro Alves
On 07/12/2018 05:17 PM, Richard Sandiford wrote:
> Pedro Alves  writes:

>> (an
>>> alternative to pointers is to return a struct with the wide int result
>>> and the overflow flag),
>>
>> +1.  I've been pushing GDB in that direction whenever possible.
> 
> I agree that can sometimes be better.  I guess it depends on the
> context.  If a function returns a bool and some other data that has no
> obvious "failure" value, it can be easier to write chained conditions if
> the function returns the bool and provides the other data via an output
> parameter.

I agree it depends on context, though your example sounds like a
case for std::optional.  (We have gdb::optional in gdb, since
std::optional is C++17 and gdb requires C++11.  LLVM has a similar
type, I believe.)

> 
>>> but ...
>>>
 +int *elt = array[i];  // OK
 +int elt = array[i];   // Please avoid
>>>
>>> ... this seems unnecessary. If the function is so long that the fact
>>> elt is a reference can easily get lost, the problem is the length of
>>> the function, not the use of a reference.
>>>
>>
>> +1000.  This seems really unnecessary.  References have the advantage
>> of implicit never-null semantics, for instance.
> 
> The nonnullness is there either way in the above example though.
> 
> I don't feel as strongly about the non-const-reference variables, but for:
> 
>  int  = array[i];
>  ...
>  elt = 1;
> 
> it can be easy to misread that "elt = 1" is changing more than just
> a local variable.

I think that might be just the case for people who haven't used
references before, and once you get some exposure, the effect
goes away.  See examples below.

> 
> I take Jonathan's point that it shouldn't be much of a problem if
> functions are a reasonable size, but we've not tended to be very
> good at keeping function sizes down.  I guess part of the problem
> is that even functions that start off small tend to grow over time.
> 
> We have been better at enforcing more specific rules like the ones
> in the patch.  And that's easier to do because a patch either adds
> references or it doesn't.  It's harder to force (or remember to force)
> someone to split up a function if they're adding 5 lines to one that is
> already 20 lines long.  Then for the next person it's 25 lines long, etc.
> 

> Also, the "int *elt" approach copes with cases in which "elt" might
> have to be redirected later, so we're going to need to use it in some
> cases.  

Sure, if you need to reseat, certainly use a pointer.  That actually
highlights an advantage of references -- the fact that you are sure
nothing could reseat it.  With a local pointer, you need to be mindful
of that.  "Does this loop change the pointer?"  Etc.  If the semantics
of the function call for having a variable that is not meant to be
reseated, why not allow expressing it with a reference.
"Write what you mean."  Of course, just like all code, there's going
to be a judgment call.

A place where references frequently appear is in C++11 range-for loops.
Like, random example from gdb's codebase:

 for (const symbol_search  : symbols)

GCC doesn't yet require C++11, but it should be a matter
of time, one would hope.

Other places references appear often is in coming up with
an alias to avoid repeating long expressions, like
for example (again from gdb):

  auto  = objfile->per_bfd->demangled_hash_languages;
  auto it = std::lower_bound (vec.begin (), vec.end (),
  MSYMBOL_LANGUAGE (sym));
  if (it == vec.end () || *it != MSYMBOL_LANGUAGE (sym))
vec.insert (it, MSYMBOL_LANGUAGE (sym));

I don't think the fact that "vec" is a reference here confuses
anyone.

That was literally a random sample (grepped for "&[a-z]* = ") so
I ended up picking one with C++11 "auto", but here's another
random example, spelling out the type name, similarly using
a reference as a shorthand alias:

  osdata_item  = osdata->items.back ();

  item.columns.emplace_back (std::move (data->property_name),
 std::string (body_text));

> It's then a question of whether "int " is useful enough that
> it's worth accepting it too, and having a mixture of styles in the codebase.
I don't see it as a mixture of styles, as I don't see
pointers and references the exact same thing, but rather
see references as another tool in the box.

Thanks,
Pedro Alves


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Richard Sandiford
Pedro Alves  writes:
> On 07/12/2018 12:40 PM, Jonathan Wakely wrote:
>> On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
>>> +Only use non-constant references in the following situations:
>>> +
>>> +
>>> +
>>> +when they are necessary to conform to a standard interface, such as
>>> +the first argument to a non-member operator+=
>> 
>> And the return value of such operators (which also applies to member
>> operators, which is the more conventional way to write compound
>> assignment operators).
>> 
>>> +in a return value, when providing access to something that is known
>>> +to be nonnull
>>> +
>>> +
>>> +
>>> +In other situations the convention is to use pointers instead.
>>> +
>>> +
>>> +
>>> +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
>>> +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
>> 
>> I understand the objection to using references for out parameters 
>
> FWIW, GDB's conventions (which import GCC's coding conventions) already
> suggest avoiding non-const reference output parameters, so this won't
> affect us.
>
> But, FWIW, personally, while I used to be all for avoiding non-const
> reference output parameters, I no longer am, at least so strongly,
> nowadays.
>
> The reason being that the visual distinction can be easily lost with
> pointers too anyway:
>
>  // the usual argument is that using pointers for output parameters shows
>  // clearly that bar is going to be modified:
>  function (foo, );
>
>  // but well, we often works with pointers, and if "bar" is a pointer,
>  // we don't get any visual clue anyway either:
>  function (foo, bar);

True, and I'm not claiming that code using pointers always makes
this obvious.  But like you say, both approaches lack a visual clue
in this case.  The pointer approach does at least mean we get an
error if "bar" isn't a pointer, whereas with references it would
compile if "bar" is a local object or a reference.

So with the pointer approach, there is at least an explicit statement of
intent somewhere in the calling code, even if it isn't at the call site
itself.

>  // which suggests that what really helps is seeing the output
>  // variable nearby, suggesting to define it right before the
>  // function call that fills it in, and I would go as far
>  // as saying that clearer symbol names help even more.  For e.g.:
>  B bar_out;
>  fill_bar (foo, bar_out);
>
> (an
>> alternative to pointers is to return a struct with the wide int result
>> and the overflow flag),
>
> +1.  I've been pushing GDB in that direction whenever possible.

I agree that can sometimes be better.  I guess it depends on the
context.  If a function returns a bool and some other data that has no
obvious "failure" value, it can be easier to write chained conditions if
the function returns the bool and provides the other data via an output
parameter.

>> but ...
>> 
>>> +int *elt = array[i];  // OK
>>> +int elt = array[i];   // Please avoid
>> 
>> ... this seems unnecessary. If the function is so long that the fact
>> elt is a reference can easily get lost, the problem is the length of
>> the function, not the use of a reference.
>> 
>
> +1000.  This seems really unnecessary.  References have the advantage
> of implicit never-null semantics, for instance.

The nonnullness is there either way in the above example though.

I don't feel as strongly about the non-const-reference variables, but for:

 int  = array[i];
 ...
 elt = 1;

it can be easy to misread that "elt = 1" is changing more than just
a local variable.

I take Jonathan's point that it shouldn't be much of a problem if
functions are a reasonable size, but we've not tended to be very
good at keeping function sizes down.  I guess part of the problem
is that even functions that start off small tend to grow over time.

We have been better at enforcing more specific rules like the ones
in the patch.  And that's easier to do because a patch either adds
references or it doesn't.  It's harder to force (or remember to force)
someone to split up a function if they're adding 5 lines to one that is
already 20 lines long.  Then for the next person it's 25 lines long, etc.

Also, the "int *elt" approach copes with cases in which "elt" might
have to be redirected later, so we're going to need to use it in some
cases.  It's then a question of whether "int " is useful enough that
it's worth accepting it too, and having a mixture of styles in the codebase.

Thanks,
Richard


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Pedro Alves
On 07/12/2018 12:40 PM, Jonathan Wakely wrote:
> On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
>> +Only use non-constant references in the following situations:
>> +
>> +
>> +
>> +when they are necessary to conform to a standard interface, such as
>> +the first argument to a non-member operator+=
> 
> And the return value of such operators (which also applies to member
> operators, which is the more conventional way to write compound
> assignment operators).
> 
>> +in a return value, when providing access to something that is known
>> +to be nonnull
>> +
>> +
>> +
>> +In other situations the convention is to use pointers instead.
>> +
>> +
>> +
>> +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
>> +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
> 
> I understand the objection to using references for out parameters 

FWIW, GDB's conventions (which import GCC's coding conventions) already
suggest avoiding non-const reference output parameters, so this won't
affect us.

But, FWIW, personally, while I used to be all for avoiding non-const
reference output parameters, I no longer am, at least so strongly,
nowadays.

The reason being that the visual distinction can be easily lost with
pointers too anyway:

 // the usual argument is that using pointers for output parameters shows
 // clearly that bar is going to be modified:
 function (foo, );

 // but well, we often works with pointers, and if "bar" is a pointer,
 // we don't get any visual clue anyway either:
 function (foo, bar);

 // which suggests that what really helps is seeing the output
 // variable nearby, suggesting to define it right before the
 // function call that fills it in, and I would go as far
 // as saying that clearer symbol names help even more.  For e.g.:
 B bar_out;
 fill_bar (foo, bar_out);

(an
> alternative to pointers is to return a struct with the wide int result
> and the overflow flag),

+1.  I've been pushing GDB in that direction whenever possible.

> but ...
> 
>> +int *elt = array[i];  // OK
>> +int elt = array[i];   // Please avoid
> 
> ... this seems unnecessary. If the function is so long that the fact
> elt is a reference can easily get lost, the problem is the length of
> the function, not the use of a reference.
> 

+1000.  This seems really unnecessary.  References have the advantage
of implicit never-null semantics, for instance.

Pedro Alves


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Jonathan Wakely
On Thu, 12 Jul 2018 at 11:41, Richard Sandiford wrote:
> +Only use non-constant references in the following situations:
> +
> +
> +
> +when they are necessary to conform to a standard interface, such as
> +the first argument to a non-member operator+=

And the return value of such operators (which also applies to member
operators, which is the more conventional way to write compound
assignment operators).

> +in a return value, when providing access to something that is known
> +to be nonnull
> +
> +
> +
> +In other situations the convention is to use pointers instead.
> +
> +
> +
> +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
> +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid

I understand the objection to using references for out parameters (an
alternative to pointers is to return a struct with the wide int result
and the overflow flag), but ...

> +int *elt = array[i];  // OK
> +int elt = array[i];   // Please avoid

... this seems unnecessary. If the function is so long that the fact
elt is a reference can easily get lost, the problem is the length of
the function, not the use of a reference.


Re: [RFC] Update coding conventions to restrict use of non-const references

2018-07-12 Thread Richard Biener
On Thu, Jul 12, 2018 at 12:41 PM Richard Sandiford
 wrote:
>
> Following on from:
>
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00603.html
>
> this patch is an RFC to mention references in the C++ coding conventions.
> It allows const references anywhere they're useful but only allows
> non-constant references in contexts that conform to a standard
> interface, or when returning a reference to something that is known
> to be nonnull.
>
> It looks like there are still very few public functions that use
> non-constant references.  If the patch is OK, I'll volunteer to
> change them.

OK.

Richard.

> Thanks,
> Richard
>
>
> Index: htdocs/codingconventions.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v
> retrieving revision 1.81
> diff -u -p -r1.81 codingconventions.html
> --- htdocs/codingconventions.html   2 Jun 2018 21:16:09 -   1.81
> +++ htdocs/codingconventions.html   12 Jul 2018 10:05:12 -
> @@ -57,6 +57,7 @@ the conventions separately from any othe
>  Overloading Functions
>  Overloading Operators
>  Default Arguments
> +References
>  Inlining Functions
>  Templates
>  Namespaces
> @@ -996,6 +997,44 @@ Virtual functions should not have defaul
>  
>
>
> +References
> +
> +
> +Const references can be used freely.  They are a useful way of reducing
> +unnecessary copying or avoiding an impossible copy.
> +
> +
> +
> +Only use non-constant references in the following situations:
> +
> +
> +
> +when they are necessary to conform to a standard interface, such as
> +the first argument to a non-member operator+=
> +in a return value, when providing access to something that is known
> +to be nonnull
> +
> +
> +
> +In other situations the convention is to use pointers instead.
> +
> +
> +
> +HOST_WIDE_INT do_arith (..., bool *overflow);   // OK
> +HOST_WIDE_INT do_arith (..., bool overflow);   // Please avoid
> +
> +int *elt = array[i];  // OK
> +int elt = array[i];   // Please avoid
> +
> +
> +
> +There are two main reasons for this restriction.  The first is to avoid
> +an inconsistency with the large body of older (pre-C++) code that uses
> +the pointer convention.  The second is to maintain a visual distinction
> +between the arguments that a function does modify and the arguments that
> +it doesn't.
> +
> +
>  Inlining Functions
>
>