[ 
https://issues.apache.org/jira/browse/MESOS-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501097#comment-14501097
 ] 

Michael Park edited comment on MESOS-2629 at 4/18/15 4:59 AM:
--------------------------------------------------------------

As suggested in [r33271|https://reviews.apache.org/r/33271], I'll try capture 
the full context and explanation regarding this topic.

h3. Brief History

Function parameters and local variables are similar in many ways e.g. both 
allocated on the stack, have the scope, a function argument is analogous to a 
variable initializer. The old K&R C syntax screams at the similarities as well:

{code}
void F(x) int x; { int y; }
{code}

Binding a temporary object to a const-ref *function parameter* makes sense 
since the temporary is guaranteed to stay alive until the function is fully 
evaluated. e.g. In the expression: {{F(create_temporary());}}, the result of 
{{create_temporary()}} is guaranteed to stay alive until the {{F}} is fully 
evaluated.

The feature to allow local variables to bind to a temporary was originally 
introduced to keep consistent with the above behavior since local variables are 
very similar to function parameters.

h3. Basic Rule for Lifetime of a Temporary Object

The basic rule for the lifetime of a temporary object is that it is destroyed 
at the end of the full expression involving the temporary object. I'll go into 
more depth in this post, but no one should have to remember further than this 
basic rule.

h3. Standardese

Let's get to the nitty gritty. The following is quoted from 
[N4296|http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf].

{quote}
(4) There are two contexts in which temporaries are destroyed at a different 
point than the end of the full-expression.
The first context is when a default constructor is called to initialize an 
element of an array. If the constructor has one or more default arguments, the 
destruction of every temporary created in a default argument is sequenced 
before the construction of the next array element, if any.
{quote}

This one's not all that interesting, let's skip it.

{quote}
(5) The second context is when a reference is bound to a temporary.116 The 
temporary to which the reference is bound or the temporary that is the complete 
object of a subobject to which the reference is bound persists for the lifetime 
of the reference except:
(5.1) — A temporary object bound to a reference parameter in a function call 
(5.2.2) persists until the completion of the full-expression containing the 
call.
(5.2) — The lifetime of a temporary bound to the returned value in a function 
return statement (6.6.3) is not extended; the temporary is destroyed at the end 
of the full-expression in the return statement.
(5.3) — A temporary bound to a reference in a new-initializer (5.3.4) persists 
until the completion of the full-expression containing the new-initializer. 
[Example:
{code}
        struct S { int mi; const std::pair<int,int>& mp; };
        S a { 1, {2,3} };
        S* p = new S{ 1, {2,3} }; // Creates dangling reference
{code}
    — end example ] [ Note: This may introduce a dangling reference, and 
implementations are encouraged to issue a warning in such a case. — end note ]
{quote}

So (5) tells us that when a reference is bound to a temporary, in general, the 
temporary's lifetime is extended to the reference's lifetime except the 3 
special cases listed as (5.1), (5.2) and (5.3). (5.1) allows temporaries to 
stick around for the full expression if it's bound to a reference parameter in 
a function call, which simply brings us back to the *Basic Rule* above. (5.2) 
says you can't return a temporary and bind it to a reference return type. i.e. 
we can't write code like the following and expect the temporary's lifetime to 
be extended.
{code} const Obj& createObj() { return Obj(); } {code}
(5.3): Well, there's an example for it. Basically, we can't extend the lifetime 
of a temporary bound to a reference in the heap.

h3. Pitfall

The major issue is that after looking at the code that (5) enables us to write:

{code}
const std::string& name = std::string("mpark");
{code}

*It misleads us to think that we can bind temporaries to a const-ref variable, 
then proceed to pass the const-ref variable around and it will keep the 
temporary alive*. This is the source of all the crazy. Here's probably the 
simplest example that breaks the above valid code.

{code}
const std::string& identity(const std::string& s) { return s; }
const std::string& name = identity(std::string("mpark"));
{code}

{{identity}} is a function that simply takes a string and returns it. Logically 
the code is no different, but {{name}} is now invalid. Why? Remember (5.2) 
which brings us back to the *Basic Rule*? {{std::string("mpark")}} is bound to 
a reference parameter of {{identity}}!

Recall that a member function is no different than a regular function except 
that it has an implicit object parameter as its first parameter. Therefore, 
(5.2) applies again when we call a member function on a temporary object which 
brings us back to the *Basic Rule* again. This is what we get hit by most often 
as illustrated by the *Mesos Case* section in the *Description*.

h3. Why Disallow the Simple Case?

There are a couple of reasons to disallow even the simple case. The main one is 
what is described in the *Reasoning* section in the *Description*, which is 
that it's very easy to turn the simple case into the complex case which causes 
hard-to-find bugs. The other is that it buys us absolutely nothing. 
Performance-wise, it's no better than what RVO and NRVO can do. In terms of 
"what people are used to", capturing temporaries by const-ref is not and has 
never been standard practice in the C++ community. There's no worries in 
regards to surprising new-comers.

h3. Bonus I: Reasoning Through the Rules

Reasoning about what the compiler has to do to implement this feature helps to 
clarify some of the rules and behaviors that may seem crazy. When the compiler 
sees a local const-ref variable directly bound to a temporary object, it 
creates a hidden variable in the same scope as the local variable and binds the 
temporary object to it. Essentially turning the following code

{code}
const std::string& name = std::string("mpark");
{code}

into

{code}
std::string hidden = std::string("mpark");
const std::string& name = hidden;
{code}

* This is how temporary object's lifetime is "extended" to the lifetime of the 
const-ref variable.
* This is why const-ref is no better in performance than RVO/NRVO, because it 
needs to essentially perform RVO/NRVO anyway.
* This is why (5.3) exists. Since we can't perform RVO/NRVO into heap memory, 
we can't extend the lifetime of temporaries bound to the reference members on 
the heap.
* In Herb's 
[GotW|http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/]
 article, the Guru question is

{quote}When the reference goes out of scope, which destructor gets 
called?{quote}

The answer is:
{quote}
You can take a const Base& to a Derived temporary and it will be destroyed 
without virtual dispatch on the destructor call.

This is nifty. Consider the following code:

{code}
// Example 3

Derived factory(); // construct a Derived object

void g() {
  const Base& b = factory(); // calls Derived::Derived here
  // … use b …
} // calls Derived::~Derived directly here — not Base::~Base + virtual dispatch!
{code}
{quote}

The compiler essentially turns the above code into:

{code}
void g() {
  Derived hidden = factory();
  const Base& b = hidden;
  // … use b …
}
{code}

So yes, it makes sense that {{Derived::~Derived}} would be invoked directly.

h3. Bonus II: What's "the most important *const*" referred to in the 
[GotW|http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/]?

{quote}
I once heard Andrei Alexandrescu give a talk on ScopeGuard (invented by Petru 
Marginean) where he used this C++ feature and called it "the most important 
*const* I ever wrote."
{quote}

{{ScopeGuard<Fn>}} (available in [Facebook 
Folly|https://github.com/facebook/folly/blob/master/folly/ScopeGuard.h]) is 
templated on the function object it holds. {{makeGuard(fn)}} is a factory 
function that returns an instance of {{ScopeGuard<Fn>}}. Since {{ScopeGuard}} 
was introduced before C\+\+11, they didn't have {{auto}} available to simply do 
{{auto guard = makeGuard(fn);}}. This lead to the following technique: (1) Make 
{{ScopeGuard<Fn>}} inherit from {{ScopeGuardBase}}, (2) Capture the temporary 
of type {{ScopeGuard<Fn>}} created by {{makeGuard(fn);}} by {{const 
ScopeGuardBase &}}. i.e. {{const ScopeGuardBase& guard = makeGuard(fn);}}. But 
with C++11's introduction of {{auto}}, this technique is now obsolete.


was (Author: mcypark):
As suggested in [r33271|https://reviews.apache.org/r/33271], I'll try capture 
the full context and explanation regarding this topic.

h3. Brief History

Function parameters and local variables are similar in many ways e.g. both 
allocated on the stack, have the scope, a function argument is analogous to a 
variable initializer. The old K&R C syntax screams at the similarities as well:

{code}
void F(x) int x; { int y; }
{code}

Binding a temporary object to a const-ref *function parameter* makes sense 
since the temporary is guaranteed to stay alive until the function is fully 
evaluated. e.g. In the expression: {{F(create_temporary());}}, the result of 
{{create_temporary()}} is guaranteed to stay alive until the {{F}} is fully 
evaluated.

The feature to allow local variables to bind to a temporary was originally 
introduced to keep consistent with the above behavior since local variables are 
very similar to function parameters.

h3. Basic Rule for Lifetime of a Temporary Object

The basic rule for the lifetime of a temporary object is that it is destroyed 
at the end of the full expression involving the temporary object. I'll go into 
more depth in this post, but no one should have to remember further than this 
basic rule.

h3. Standardese

Let's get to the nitty gritty. The following is quoted from 
[N4296|http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf].

{quote}
4 There are two contexts in which temporaries are destroyed at a different 
point than the end of the full-expression.
The first context is when a default constructor is called to initialize an 
element of an array. If the constructor has one or more default arguments, the 
destruction of every temporary created in a default argument is sequenced 
before the construction of the next array element, if any.
{quote}

This one's not all that interesting, let's skip it.

{quote}
5 The second context is when a reference is bound to a temporary.116 The 
temporary to which the reference is bound or the temporary that is the complete 
object of a subobject to which the reference is bound persists for the lifetime 
of the reference except:
    (5.1) — A temporary object bound to a reference parameter in a function 
call (5.2.2) persists until the completion of the full-expression containing 
the call.
    (5.2) — The lifetime of a temporary bound to the returned value in a 
function return statement (6.6.3) is not extended; the temporary is destroyed 
at the end of the full-expression in the return statement.
    (5.3) — A temporary bound to a reference in a new-initializer (5.3.4) 
persists until the completion of the full-expression containing the 
new-initializer. [Example:
{code}
        struct S { int mi; const std::pair<int,int>& mp; };
        S a { 1, {2,3} };
        S* p = new S{ 1, {2,3} }; // Creates dangling reference
{code}
    — end example ] [ Note: This may introduce a dangling reference, and 
implementations are encouraged to issue a warning in such a case. — end note ]
{quote}

So (5) tells us that when a reference is bound to a temporary, in general, the 
temporary's lifetime is extended to the reference's lifetime except the 3 
special cases listed as (5.1), (5.2) and (5.3). (5.1) allows temporaries to 
stick around for the full expression if it's bound to a reference parameter in 
a function call, which simply brings us back to the *Basic Rule* above. (5.2) 
says you can't return a temporary and bind it to a reference return type. i.e. 
we can't write code like the following and expect the temporary's lifetime to 
be extended.
{code} const Obj& createObj() { return Obj(); } {code}
(5.3): Well, there's an example for it. Basically, we can't extend the lifetime 
of a temporary bound to a reference in the heap.

h3. Pitfall

The major issue is that after looking at the code that (5) enables us to write:

{code}
const std::string& name = std::string("mpark");
{code}

*It misleads us to think that we can bind temporaries to a const-ref variable, 
then proceed to pass the const-ref variable around and it will keep the 
temporary alive*. This is the source of all the crazy. Here's probably the 
simplest example that breaks the above valid code.

{code}
const std::string& identity(const std::string& s) { return s; }
const std::string& name = identity(std::string("mpark"));
{code}

{{identity}} is a function that simply takes a string and returns it. Logically 
the code is no different, but {{name}} is now invalid. Why? Remember (5.2) 
which brings us back to the *Basic Rule*? {{std::string("mpark")}} is bound to 
a reference parameter of {{identity}}!

Recall that a member function is no different than a regular function except 
that it has an implicit object parameter as its first parameter. Therefore, 
(5.2) applies again when we call a member function on a temporary object which 
brings us back to the *Basic Rule* again. This is what we get hit by most often 
as illustrated by the *Mesos Case* section in the *Description*.

h3. Why Disallow the Simple Case?

There are a couple of reasons to disallow even the simple case. The main one is 
what is described in the *Reasoning* section in the *Description*, which is 
that it's very easy to turn the simple case into the complex case which causes 
hard-to-find bugs. The other is that it buys us absolutely nothing. 
Performance-wise, it's no better than what RVO and NRVO can do. In terms of 
"what people are used to", capturing temporaries by const-ref is not and has 
never been standard practice in the C++ community. There's no worries in 
regards to surprising new-comers.

h3. Bonus I: Reasoning Through the Rules

Reasoning about what the compiler has to do to implement this feature helps to 
clarify some of the rules and behaviors that may seem crazy. When the compiler 
sees a local const-ref variable directly bound to a temporary object, it 
creates a hidden variable in the same scope as the local variable and binds the 
temporary object to it. Essentially turning the following code

{code}
const std::string& name = std::string("mpark");
{code}

into

{code}
std::string hidden = std::string("mpark");
const std::string& name = hidden;
{code}

* This is how temporary object's lifetime is "extended" to the lifetime of the 
const-ref variable.
* This is why const-ref is no better in performance than RVO/NRVO, because it 
needs to essentially perform RVO/NRVO anyway.
* This is why (5.3) exists. Since we can't perform RVO/NRVO into heap memory, 
we can't extend the lifetime of temporaries bound to the reference members on 
the heap.
* In Herb's 
[GotW|http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/]
 article, the Guru question is

{quote}When the reference goes out of scope, which destructor gets 
called?{quote}

The answer is:
{quote}
You can take a const Base& to a Derived temporary and it will be destroyed 
without virtual dispatch on the destructor call.

This is nifty. Consider the following code:

{code}
// Example 3

Derived factory(); // construct a Derived object

void g() {
  const Base& b = factory(); // calls Derived::Derived here
  // … use b …
} // calls Derived::~Derived directly here — not Base::~Base + virtual dispatch!
{code}
{quote}

The compiler essentially turns the above code into:

{code}
void g() {
  Derived hidden = factory();
  const Base& b = hidden;
  // … use b …
}
{code}

So yes, it makes sense that {{Derived::~Derived}} would be invoked directly.

h3. Bonus II: What's "the most important *const*" referred to in the 
[GotW|http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/]?

{quote}
I once heard Andrei Alexandrescu give a talk on ScopeGuard (invented by Petru 
Marginean) where he used this C++ feature and called it "the most important 
*const* I ever wrote."
{quote}

{{ScopeGuard<Fn>}} (available in [Facebook 
Folly|https://github.com/facebook/folly/blob/master/folly/ScopeGuard.h]) is 
templated on the function object it holds. {{makeGuard(fn)}} is a factory 
function that returns an instance of {{ScopeGuard<Fn>}}. Since {{ScopeGuard}} 
was introduced before C\+\+11, they didn't have {{auto}} available to simply do 
{{auto guard = makeGuard(fn);}}. This lead to the following technique: (1) Make 
{{ScopeGuard<Fn>}} inherit from {{ScopeGuardBase}}, (2) Capture the temporary 
of type {{ScopeGuard<Fn>}} created by {{makeGuard(fn);}} by {{const 
ScopeGuardBase &}}. i.e. {{const ScopeGuardBase& guard = makeGuard(fn);}}. But 
with C++11's introduction of {{auto}}, this technique is now obsolete.

> Update style guide to disallow capture by reference of temporaries
> ------------------------------------------------------------------
>
>                 Key: MESOS-2629
>                 URL: https://issues.apache.org/jira/browse/MESOS-2629
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Joris Van Remoortere
>            Assignee: Joris Van Remoortere
>
> We modify the style guide to disallow constant references to temporaries as a 
> whole. This means disallowing both (1) and (2) below.
> h3. Background
> 1. Constant references to simple expression temporaries do extend the 
> lifetime of the temporary till end of function scope:
> * Temporary returned by function:
>   {code}
>   // See full example below.
>   T f(const char* s) { return T(s); }
>   {
>     const T& good = f("Ok");
>     // use of good is ok.
>   }
>   {code}
> * Temporary constructed as simple expression:
>   {code}
>   // See full example below.
>   {
>     const T& good = T("Ok");
>     // use of good is ok.
>   }
>   {code}
> 2. Constant references to expressions that result in a reference to a 
> temporary do not extend the lifetime of the temporary:
>   * Temporary returned by function:
>   {code}
>   // See full example below.
>   T f(const char* s) { return T(s); }
>   {
>     const T& bad = f("Bad!").Member();
>     // use of bad is invalid.
>   }
>   {code}
>   * Temporary constructed as simple expression:
>   {code}
>   // See full example below.
>   {
>     const T& bad = T("Bad!").Member();
>     // use of bad is invalid.
>   }
>   {code}
> h3. Mesos Case
>   - In Mesos we use Future<T> a lot. Many of our functions return Futures by 
> value:
>   {code}
>   class Socket {
>     Future<Socket> accept();
>     Future<size_t> recv(char* data, size_t size);
>     ...
>   }
>   {code}
>   - Sometimes we capture these Futures:
>   {code}
>   {
>     const Future<Socket>& accepted = socket.accept(); // Valid c++, propose 
> we disallow.
>   }
>   {code}
>   - Sometimes we chain these Futures:
>   {code}
>   {
>     socket.accept().then(lambda::bind(_accepted)); // Temporary will be valid 
> during 'then' expression evaluation.
>   }
>   {code}
>   - Sometimes we do both:
>   {code}
>   {
>     const Future<Socket>& accepted = 
> socket.accept().then(lambda::bind(_accepted)); // Dangerous! 'accepted' 
> lifetime will not be valid till end of scope. Disallow!
>   }
>   {code}
> h3. Reasoning
> - Although (1) is ok, and considered a 
> [feature|http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/],
>  (2) is extremely dangerous and leads to hard to track bugs.
> - If we explicitly allow (1), but disallow (2), then my worry is that someone 
> coming along to maintain the code later on may accidentally turn (1) into 
> (2), without recognizing the severity of this mistake. For example:
> {code}
> // Original code:
> const T& val = T();
> std::cout << val << std::endl;
> // New code:
> const T& val = T().removeWhiteSpace();
> std::cout << val << std::endl; // val could be corrupted since the destructor 
> has been invoked and T's memory freed.
> {code}
> - If we disallow both cases: it will be easier to catch these mistakes early 
> on in code reviews (and avoid these painful bugs), at the same cost of 
> introducing a new style guide rule.
> h3. Performance Implications
> - BenH suggests c++ developers are commonly taught to capture by constant 
> reference to hint to the compiler that the copy can be elided.
> - Modern compilers use a Data Flow Graph to make optimizations such as
>   - *In-place-construction*: leveraged by RVO and NRVO to construct the 
> object in place on the stack. Similar to "*Placement new*": 
> http://en.wikipedia.org/wiki/Placement_syntax
>   - *RVO* (Return Value Optimization): 
> http://en.wikipedia.org/wiki/Return_value_optimization
>   - *NRVO* (Named Return Value Optimization): 
> https://msdn.microsoft.com/en-us/library/ms364057%28v=vs.80%29.aspx
> - Since modern compilers perform these optimizations, we no longer need to 
> 'hint' to the compiler that the copies can be elided.
> h3. Example program
> {code}
> #include <stdio.h>
> class T {
> public:
>   T(const char* str) : Str(str) {
>     printf("+ T(%s)\n", Str);
>   }
>   ~T() {
>     printf("- T(%s)\n", Str);
>   }
>   const T& Member() const
>   {
>     return *this;
>   }
> private:
>   const char* Str;
> };
> T f(const char* s) { return T(s); }
> int main() {
>   const T& good = T("Ok");
>   const T& good_f = f("Ok function");
>   const T& bad = T("Bad!").Member();
>   const T& bad_f = T("Bad function!").Member();
>   printf("End of function scope...\n");
> }
> {code}
> Output:
> {code}
> + T(Ok)
> + T(Ok function)
> + T(Bad!)
> - T(Bad!)
> + T(Bad function!)
> - T(Bad function!)
> End of function scope...
> - T(Ok function)
> - T(Ok)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to