Re: [webkit-dev] Pointers and self-documenting code
On Fri, Jul 6, 2012 at 2:26 PM, Joe Mason wrote: > > From: Filip Pizlo [fpi...@apple.com] > > Sent: Friday, July 06, 2012 4:52 PM > > To: Joe Mason > > Cc: WebKit Development [webkit-dev@lists.webkit.org] > > Subject: Re: [webkit-dev] Pointers and self-documenting code > > > > It's not at all clear that this is correct. > > > > Is it correct for the font size to be zero if the frame doesn't exist? > Seems > > dubious. > > > > Is it correct for the font size to be zero if the page doesn't exist? > Seems > > dubious. > > > > Adding null checking only makes sense if you have a good story for what > the > > code ought to do if the pointer in question is null. > > Well, yes, but that's not the point of the example. I thought about > adding a comment, "or some suitable default," but I figured that would be > pedantic. > > Regardless of whether my example default was a good one, it's clear that > dereferencing null and crashing is NOT the correct thing to do. > > > This is a questionable policy. Often object properties have transient > > nullness. They may be null at time T1 and non-null at time T2, and the > caller > > may know that the property is non-null from context (the value of some > other > > field, the fact that it's performed some action that forces the field to > > become non-null, etc). > > In this case the caller should specifically assert that the property is > not null, to let the person reading the code know that it's a possibility > that's been considered and it's believed to be impossible. > > > Thus statically enforcing the non-nullness of fields is likely to just > make > > the code more complicated. And it doesn't buy anything. > > It buys a lot! It buys clarity. It makes the code less fragile. It's a > lower barrier to entry, because less knowledge is needed to understand the > code. In some cases it provides a compile-time correctness guarantee, > which is a good thing - not in all cases, but sometimes is better than > never. > > There are three categories of accessor: > > 1. Those that can never return null > 2. Those that can potentially return null, and in circumstances complex > enough that developers can't usefully keep track while writing algorithms > 3. Those that can potentially return null, but only in circumstances that > are well defined and easy to reason about (the "transient nullness" you > mention) > > For type 1, I argue that we should return references to indicate this > statically. I don't see how this makes the code more complicated. > This change seems valuable. I've seen a lot of code & patches where people did something along the line of: Editor* editor= m_frame->editor(); if (!editor) return; Note that Editor is a member variable of Frame and we just return &m_editor in editor(). - Ryosuke ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] Pointers and self-documenting code
> From: Filip Pizlo [fpi...@apple.com] > Sent: Friday, July 06, 2012 4:52 PM > To: Joe Mason > Cc: WebKit Development [webkit-dev@lists.webkit.org] > Subject: Re: [webkit-dev] Pointers and self-documenting code > > It's not at all clear that this is correct. > > Is it correct for the font size to be zero if the frame doesn't exist? Seems > dubious. > > Is it correct for the font size to be zero if the page doesn't exist? Seems > dubious. > > Adding null checking only makes sense if you have a good story for what the > code ought to do if the pointer in question is null. Well, yes, but that's not the point of the example. I thought about adding a comment, "or some suitable default," but I figured that would be pedantic. Regardless of whether my example default was a good one, it's clear that dereferencing null and crashing is NOT the correct thing to do. > This is a questionable policy. Often object properties have transient > nullness. They may be null at time T1 and non-null at time T2, and the caller > may know that the property is non-null from context (the value of some other > field, the fact that it's performed some action that forces the field to > become non-null, etc). In this case the caller should specifically assert that the property is not null, to let the person reading the code know that it's a possibility that's been considered and it's believed to be impossible. > Thus statically enforcing the non-nullness of fields is likely to just make > the code more complicated. And it doesn't buy anything. It buys a lot! It buys clarity. It makes the code less fragile. It's a lower barrier to entry, because less knowledge is needed to understand the code. In some cases it provides a compile-time correctness guarantee, which is a good thing - not in all cases, but sometimes is better than never. There are three categories of accessor: 1. Those that can never return null 2. Those that can potentially return null, and in circumstances complex enough that developers can't usefully keep track while writing algorithms 3. Those that can potentially return null, but only in circumstances that are well defined and easy to reason about (the "transient nullness" you mention) For type 1, I argue that we should return references to indicate this statically. I don't see how this makes the code more complicated. For type 2, we need to always do null checks. Anything else would be unsafe. For type 3, we're in the exact situation we are now - the function signature will return a pointer, and the only way to know whether it needs to be checked for null or not is to read the code and reason about its usage. I would argue that we're currently too stingy with null checks for type 3, because it's very hard to tell why it's safe to skip null checks by looking at small parts of the code, and it's better to err on the side of safety. But that's a more subtle discussion, and I don't want to focus on that part. I'm more interested in the low-hanging fruit of type 1 vs type 2 right now. Joe - This transmission (including any attachments) may contain confidential information, privileged material (including material protected by the solicitor-client or other applicable privileges), or constitute non-public information. Any use of this information by anyone other than the intended recipient is prohibited. If you have received this transmission in error, please immediately reply to the sender and delete this information from your system. Use, dissemination, distribution, or reproduction of this transmission by unintended recipients is not authorized and may be unlawful. ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] Pointers and self-documenting code
On Jul 6, 2012, at 1:35 PM, Joe Mason wrote: > As has been noted in a recent thread, WebKit strives to make the code clear > and understandable rather than have embedded comments that can become > obsolete. One common practice that I find quite opaque is for classes to > have functions returning pointers which can never return 0, especially when > such functions are often chained. > > For example, is this call safe? > > int fontSize = > node->document()->frame()->page()->settings()->minimumFontSize(); > > (Yes, I could call document()->settings() directly, but I want to illustrate > all the different styles of writing accessors.) > > No it is not: > > Node::document() returns m_document, with an assert documenting that it never > returns 0 > Document::frame() has a comment stating that it CAN return 0 > Frame::page() CAN return 0 - I think. It just returns "m_page" with no > comments - is this the same situation as Node::document(), so m_page is never > 0, but nobody bothered to assert? I deduce that m_page can be 0 because the > detachFromPage() function which sets m_page to 0 is defined next in the file > and I happened to look down and notice it. > Page::settings() cannot return 0, since it returns m_settings.get() and > m_settings is an OwnPtr, so I know its lifetime matches that of the Page. > > So the correct way to write this is: > > int fontSize = 0; > Frame* frame = node->document()->frame(); > if (frame) { >Page* page = frame->page(); >if (page) >fontSize = page->settings()->minimumFontSize(); > } It's not at all clear that this is correct. Is it correct for the font size to be zero if the frame doesn't exist? Seems dubious. Is it correct for the font size to be zero if the page doesn't exist? Seems dubious. Adding null checking only makes sense if you have a good story for what the code ought to do if the pointer in question is null. > > But the only way to know this is to read the code for each accessor! And > forget about memorizing which is which: Document::settings() can return 0, > but Page::settings() can't, so whether "settings()" needs to be checked or > not depends on the context. > > Is there a reason we don't return a reference from functions that are > guaranteed to be non-zero? Then the above would become: > > int fontSize = node->document().frame()->page()->settings().minimumFontSize(); > > And the error is easy to spot: all the dereferences without checking for 0 > are dangerous. And the compiler will enforce this. This is a questionable policy. Often object properties have transient nullness. They may be null at time T1 and non-null at time T2, and the caller may know that the property is non-null from context (the value of some other field, the fact that it's performed some action that forces the field to become non-null, etc). Thus statically enforcing the non-nullness of fields is likely to just make the code more complicated. And it doesn't buy anything. -F > > I find the common use of things like "document()->frame()" without checking > the return value of document() is a big impediment to understanding the code, > because I'm never sure if I can take it as gospel or if someone has been > sloppy and forgotten to check return values. And in turn it leads to > sloppiness, since it's easy to forget which accessors do require null checks > and write an invalid chain such as the one above. > > Joe > > - > This transmission (including any attachments) may contain confidential > information, privileged material (including material protected by the > solicitor-client or other applicable privileges), or constitute non-public > information. Any use of this information by anyone other than the intended > recipient is prohibited. If you have received this transmission in error, > please immediately reply to the sender and delete this information from your > system. Use, dissemination, distribution, or reproduction of this > transmission by unintended recipients is not authorized and may be unlawful. > ___ > webkit-dev mailing list > webkit-dev@lists.webkit.org > http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] Pointers and self-documenting code
On Fri, Jul 6, 2012 at 1:35 PM, Joe Mason wrote: > As has been noted in a recent thread, WebKit strives to make the code > clear and understandable rather than have embedded comments that can become > obsolete. One common practice that I find quite opaque is for classes to > have functions returning pointers which can never return 0, especially when > such functions are often chained. > > For example, is this call safe? > > int fontSize = > node->document()->frame()->page()->settings()->minimumFontSize(); > > (Yes, I could call document()->settings() directly, but I want to > illustrate all the different styles of writing accessors.) > > No it is not: > > Node::document() returns m_document, with an assert documenting that it > never returns 0 > Document::frame() has a comment stating that it CAN return 0 > Frame::page() CAN return 0 - I think. It just returns "m_page" with no > comments - is this the same situation as Node::document(), so m_page is > never 0, but nobody bothered to assert? I deduce that m_page can be 0 > because the detachFromPage() function which sets m_page to 0 is defined > next in the file and I happened to look down and notice it. > Page::settings() cannot return 0, since it returns m_settings.get() and > m_settings is an OwnPtr, so I know its lifetime matches that of the Page. > > So the correct way to write this is: > > int fontSize = 0; > Frame* frame = node->document()->frame(); > if (frame) { > Page* page = frame->page(); > if (page) > fontSize = page->settings()->minimumFontSize(); > } > > But the only way to know this is to read the code for each accessor! And > forget about memorizing which is which: Document::settings() can return 0, > but Page::settings() can't, so whether "settings()" needs to be checked or > not depends on the context. > > Is there a reason we don't return a reference from functions that are > guaranteed to be non-zero? Then the above would become: > > int fontSize = > node->document().frame()->page()->settings().minimumFontSize(); > > And the error is easy to spot: all the dereferences without checking for 0 > are dangerous. And the compiler will enforce this. > That sounds like a good idea. I like it. I'd like to hear opinions of more "senior" contributors though. - Ryosuke ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
[webkit-dev] Pointers and self-documenting code
As has been noted in a recent thread, WebKit strives to make the code clear and understandable rather than have embedded comments that can become obsolete. One common practice that I find quite opaque is for classes to have functions returning pointers which can never return 0, especially when such functions are often chained. For example, is this call safe? int fontSize = node->document()->frame()->page()->settings()->minimumFontSize(); (Yes, I could call document()->settings() directly, but I want to illustrate all the different styles of writing accessors.) No it is not: Node::document() returns m_document, with an assert documenting that it never returns 0 Document::frame() has a comment stating that it CAN return 0 Frame::page() CAN return 0 - I think. It just returns "m_page" with no comments - is this the same situation as Node::document(), so m_page is never 0, but nobody bothered to assert? I deduce that m_page can be 0 because the detachFromPage() function which sets m_page to 0 is defined next in the file and I happened to look down and notice it. Page::settings() cannot return 0, since it returns m_settings.get() and m_settings is an OwnPtr, so I know its lifetime matches that of the Page. So the correct way to write this is: int fontSize = 0; Frame* frame = node->document()->frame(); if (frame) { Page* page = frame->page(); if (page) fontSize = page->settings()->minimumFontSize(); } But the only way to know this is to read the code for each accessor! And forget about memorizing which is which: Document::settings() can return 0, but Page::settings() can't, so whether "settings()" needs to be checked or not depends on the context. Is there a reason we don't return a reference from functions that are guaranteed to be non-zero? Then the above would become: int fontSize = node->document().frame()->page()->settings().minimumFontSize(); And the error is easy to spot: all the dereferences without checking for 0 are dangerous. And the compiler will enforce this. I find the common use of things like "document()->frame()" without checking the return value of document() is a big impediment to understanding the code, because I'm never sure if I can take it as gospel or if someone has been sloppy and forgotten to check return values. And in turn it leads to sloppiness, since it's easy to forget which accessors do require null checks and write an invalid chain such as the one above. Joe - This transmission (including any attachments) may contain confidential information, privileged material (including material protected by the solicitor-client or other applicable privileges), or constitute non-public information. Any use of this information by anyone other than the intended recipient is prohibited. If you have received this transmission in error, please immediately reply to the sender and delete this information from your system. Use, dissemination, distribution, or reproduction of this transmission by unintended recipients is not authorized and may be unlawful. ___ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev