Re: [webkit-dev] Pointers and self-documenting code

2012-07-06 Thread Ryosuke Niwa
On Fri, Jul 6, 2012 at 2:26 PM, Joe Mason  wrote:

> > From: Filip Pizlo [fpi...@apple.com]
> > Sent: Friday, July 06, 2012 4:52 PM
> > To: Joe Mason
> > Cc: WebKit Development ‎[webkit-dev@lists.webkit.org]‎
> > Subject: Re: [webkit-dev] Pointers and self-documenting code
> >
> > It's not at all clear that this is correct.
> >
> > Is it correct for the font size to be zero if the frame doesn't exist?
>  Seems
> > dubious.
> >
> > Is it correct for the font size to be zero if the page doesn't exist?
>  Seems
> > dubious.
> >
> > Adding null checking only makes sense if you have a good story for what
> the
> > code ought to do if the pointer in question is null.
>
> Well, yes, but that's not the point of the example.  I thought about
> adding a comment, "or some suitable default," but I figured that would be
> pedantic.
>
> Regardless of whether my example default was a good one, it's clear that
> dereferencing null and crashing is NOT the correct thing to do.
>
> > This is a questionable policy.  Often object properties have transient
> > nullness.  They may be null at time T1 and non-null at time T2, and the
> caller
> >  may know that the property is non-null from context (the value of some
> other
> > field, the fact that it's performed some action that forces the field to
> > become non-null, etc).
>
> In this case the caller should specifically assert that the property is
> not null, to let the person reading the code know that it's a possibility
> that's been considered and it's believed to be impossible.
>
> > Thus statically enforcing the non-nullness of fields is likely to just
> make
> > the code more complicated.  And it doesn't buy anything.
>
> It buys a lot!  It buys clarity. It makes the code less fragile. It's a
> lower barrier to entry, because less knowledge is needed to understand the
> code.  In some cases it provides a compile-time correctness guarantee,
> which is a good thing - not in all cases, but sometimes is better than
> never.
>
> There are three categories of accessor:
>
> 1. Those that can never return null
> 2. Those that can potentially return null, and in circumstances complex
> enough that developers can't usefully keep track while writing algorithms
> 3. Those that can potentially return null, but only in circumstances that
> are well defined and easy to reason about (the "transient nullness" you
> mention)
>
> For type 1, I argue that we should return references to indicate this
> statically.  I don't see how this makes the code more complicated.
>

This change seems valuable. I've seen a lot of code & patches where people
did something along the line of:

Editor* editor= m_frame->editor();
if (!editor)
return;

Note that Editor is a member variable of Frame and we just return &m_editor
in editor().

- Ryosuke
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Pointers and self-documenting code

2012-07-06 Thread Joe Mason
> From: Filip Pizlo [fpi...@apple.com]
> Sent: Friday, July 06, 2012 4:52 PM
> To: Joe Mason
> Cc: WebKit Development ‎[webkit-dev@lists.webkit.org]‎
> Subject: Re: [webkit-dev] Pointers and self-documenting code
> 
> It's not at all clear that this is correct.
> 
> Is it correct for the font size to be zero if the frame doesn't exist?  Seems 
> dubious.
> 
> Is it correct for the font size to be zero if the page doesn't exist?  Seems 
> dubious.
> 
> Adding null checking only makes sense if you have a good story for what the 
> code ought to do if the pointer in question is null.

Well, yes, but that's not the point of the example.  I thought about adding a 
comment, "or some suitable default," but I figured that would be pedantic.

Regardless of whether my example default was a good one, it's clear that 
dereferencing null and crashing is NOT the correct thing to do.

> This is a questionable policy.  Often object properties have transient 
> nullness.  They may be null at time T1 and non-null at time T2, and the caller
>  may know that the property is non-null from context (the value of some other 
> field, the fact that it's performed some action that forces the field to 
> become non-null, etc).

In this case the caller should specifically assert that the property is not 
null, to let the person reading the code know that it's a possibility that's 
been considered and it's believed to be impossible.

> Thus statically enforcing the non-nullness of fields is likely to just make 
> the code more complicated.  And it doesn't buy anything.

It buys a lot!  It buys clarity. It makes the code less fragile. It's a lower 
barrier to entry, because less knowledge is needed to understand the code.  In 
some cases it provides a compile-time correctness guarantee, which is a good 
thing - not in all cases, but sometimes is better than never.

There are three categories of accessor:

1. Those that can never return null
2. Those that can potentially return null, and in circumstances complex enough 
that developers can't usefully keep track while writing algorithms
3. Those that can potentially return null, but only in circumstances that are 
well defined and easy to reason about (the "transient nullness" you mention)

For type 1, I argue that we should return references to indicate this 
statically.  I don't see how this makes the code more complicated.
For type 2, we need to always do null checks.  Anything else would be unsafe.
For type 3, we're in the exact situation we are now - the function signature 
will return a pointer, and the only way to know whether it needs to be checked 
for null or not is to read the code and reason about its usage.

I would argue that we're currently too stingy with null checks for type 3, 
because it's very hard to tell why it's safe to skip null checks by looking at 
small parts of the code, and it's better to err on the side of safety.  But 
that's a more subtle discussion, and I don't want to focus on that part.  I'm 
more interested in the low-hanging fruit of type 1 vs type 2 right now.

Joe
-
This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is prohibited. If you have received this transmission in error, 
please immediately reply to the sender and delete this information from your 
system. Use, dissemination, distribution, or reproduction of this transmission 
by unintended recipients is not authorized and may be unlawful.
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Pointers and self-documenting code

2012-07-06 Thread Filip Pizlo

On Jul 6, 2012, at 1:35 PM, Joe Mason wrote:

> As has been noted in a recent thread, WebKit strives to make the code clear 
> and understandable rather than have embedded comments that can become 
> obsolete.  One common practice that I find quite opaque is for classes to 
> have functions returning pointers which can never return 0, especially when 
> such functions are often chained.
> 
> For example, is this call safe?
> 
> int fontSize = 
> node->document()->frame()->page()->settings()->minimumFontSize();
> 
> (Yes, I could call document()->settings() directly, but I want to illustrate 
> all the different styles of writing accessors.)
> 
> No it is not:
> 
> Node::document() returns m_document, with an assert documenting that it never 
> returns 0
> Document::frame() has a comment stating that it CAN return 0
> Frame::page() CAN return 0 - I think.  It just returns "m_page" with no 
> comments - is this the same situation as Node::document(), so m_page is never 
> 0, but nobody bothered to assert?  I deduce that m_page can be 0 because the 
> detachFromPage() function which sets m_page to 0 is defined next in the file 
> and I happened to look down and notice it.
> Page::settings() cannot return 0, since it returns m_settings.get() and 
> m_settings is an OwnPtr, so I know its lifetime matches that of the Page.
> 
> So the correct way to write this is:
> 
> int fontSize = 0;
> Frame* frame = node->document()->frame();
> if (frame) {
>Page* page = frame->page();
>if (page)
>fontSize = page->settings()->minimumFontSize();
> }

It's not at all clear that this is correct.

Is it correct for the font size to be zero if the frame doesn't exist?  Seems 
dubious.

Is it correct for the font size to be zero if the page doesn't exist?  Seems 
dubious.

Adding null checking only makes sense if you have a good story for what the 
code ought to do if the pointer in question is null.

> 
> But the only way to know this is to read the code for each accessor!  And 
> forget about memorizing which is which: Document::settings() can return 0, 
> but Page::settings() can't, so whether "settings()" needs to be checked or 
> not depends on the context.
> 
> Is there a reason we don't return a reference from functions that are 
> guaranteed to be non-zero?  Then the above would become:
> 
> int fontSize = node->document().frame()->page()->settings().minimumFontSize();
> 
> And the error is easy to spot: all the dereferences without checking for 0 
> are dangerous.  And the compiler will enforce this.

This is a questionable policy.  Often object properties have transient 
nullness.  They may be null at time T1 and non-null at time T2, and the caller 
may know that the property is non-null from context (the value of some other 
field, the fact that it's performed some action that forces the field to become 
non-null, etc).

Thus statically enforcing the non-nullness of fields is likely to just make the 
code more complicated.  And it doesn't buy anything.

-F


> 
> I find the common use of things like "document()->frame()" without checking 
> the return value of document() is a big impediment to understanding the code, 
> because I'm never sure if I can take it as gospel or if someone has been 
> sloppy and forgotten to check return values.  And in turn it leads to 
> sloppiness, since it's easy to forget which accessors do require null checks 
> and write an invalid chain such as the one above.
> 
> Joe
> 
> -
> This transmission (including any attachments) may contain confidential 
> information, privileged material (including material protected by the 
> solicitor-client or other applicable privileges), or constitute non-public 
> information. Any use of this information by anyone other than the intended 
> recipient is prohibited. If you have received this transmission in error, 
> please immediately reply to the sender and delete this information from your 
> system. Use, dissemination, distribution, or reproduction of this 
> transmission by unintended recipients is not authorized and may be unlawful.
> ___
> webkit-dev mailing list
> webkit-dev@lists.webkit.org
> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] Pointers and self-documenting code

2012-07-06 Thread Ryosuke Niwa
On Fri, Jul 6, 2012 at 1:35 PM, Joe Mason  wrote:

> As has been noted in a recent thread, WebKit strives to make the code
> clear and understandable rather than have embedded comments that can become
> obsolete.  One common practice that I find quite opaque is for classes to
> have functions returning pointers which can never return 0, especially when
> such functions are often chained.
>
> For example, is this call safe?
>
> int fontSize =
> node->document()->frame()->page()->settings()->minimumFontSize();
>
> (Yes, I could call document()->settings() directly, but I want to
> illustrate all the different styles of writing accessors.)
>
> No it is not:
>
> Node::document() returns m_document, with an assert documenting that it
> never returns 0
> Document::frame() has a comment stating that it CAN return 0
> Frame::page() CAN return 0 - I think.  It just returns "m_page" with no
> comments - is this the same situation as Node::document(), so m_page is
> never 0, but nobody bothered to assert?  I deduce that m_page can be 0
> because the detachFromPage() function which sets m_page to 0 is defined
> next in the file and I happened to look down and notice it.
> Page::settings() cannot return 0, since it returns m_settings.get() and
> m_settings is an OwnPtr, so I know its lifetime matches that of the Page.
>
> So the correct way to write this is:
>
> int fontSize = 0;
> Frame* frame = node->document()->frame();
> if (frame) {
> Page* page = frame->page();
> if (page)
> fontSize = page->settings()->minimumFontSize();
> }
>
> But the only way to know this is to read the code for each accessor!  And
> forget about memorizing which is which: Document::settings() can return 0,
> but Page::settings() can't, so whether "settings()" needs to be checked or
> not depends on the context.
>
> Is there a reason we don't return a reference from functions that are
> guaranteed to be non-zero?  Then the above would become:
>
> int fontSize =
> node->document().frame()->page()->settings().minimumFontSize();
>
> And the error is easy to spot: all the dereferences without checking for 0
> are dangerous.  And the compiler will enforce this.
>

That sounds like a good idea. I like it. I'd like to hear opinions of more
"senior" contributors though.

- Ryosuke
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


[webkit-dev] Pointers and self-documenting code

2012-07-06 Thread Joe Mason
As has been noted in a recent thread, WebKit strives to make the code clear and 
understandable rather than have embedded comments that can become obsolete.  
One common practice that I find quite opaque is for classes to have functions 
returning pointers which can never return 0, especially when such functions are 
often chained.

For example, is this call safe?

int fontSize = node->document()->frame()->page()->settings()->minimumFontSize();

(Yes, I could call document()->settings() directly, but I want to illustrate 
all the different styles of writing accessors.)

No it is not:

Node::document() returns m_document, with an assert documenting that it never 
returns 0
Document::frame() has a comment stating that it CAN return 0
Frame::page() CAN return 0 - I think.  It just returns "m_page" with no 
comments - is this the same situation as Node::document(), so m_page is never 
0, but nobody bothered to assert?  I deduce that m_page can be 0 because the 
detachFromPage() function which sets m_page to 0 is defined next in the file 
and I happened to look down and notice it.
Page::settings() cannot return 0, since it returns m_settings.get() and 
m_settings is an OwnPtr, so I know its lifetime matches that of the Page.

So the correct way to write this is:

int fontSize = 0;
Frame* frame = node->document()->frame();
if (frame) {
Page* page = frame->page();
if (page)
fontSize = page->settings()->minimumFontSize();
}

But the only way to know this is to read the code for each accessor!  And 
forget about memorizing which is which: Document::settings() can return 0, but 
Page::settings() can't, so whether "settings()" needs to be checked or not 
depends on the context.

Is there a reason we don't return a reference from functions that are 
guaranteed to be non-zero?  Then the above would become:

int fontSize = node->document().frame()->page()->settings().minimumFontSize();

And the error is easy to spot: all the dereferences without checking for 0 are 
dangerous.  And the compiler will enforce this.

I find the common use of things like "document()->frame()" without checking the 
return value of document() is a big impediment to understanding the code, 
because I'm never sure if I can take it as gospel or if someone has been sloppy 
and forgotten to check return values.  And in turn it leads to sloppiness, 
since it's easy to forget which accessors do require null checks and write an 
invalid chain such as the one above.

Joe

-
This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is prohibited. If you have received this transmission in error, 
please immediately reply to the sender and delete this information from your 
system. Use, dissemination, distribution, or reproduction of this transmission 
by unintended recipients is not authorized and may be unlawful.
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev