Re: [cfe-commits] [PATCH] Implementation quantity limits to prevent crashes

Douglas Gregor Fri, 07 Oct 2011 12:07:41 -0700

On Sep 28, 2011, at 6:46 PM, Aaron Ballman wrote:

> On Fri, Sep 23, 2011 at 3:23 PM, Douglas Gregor <[email protected]> wrote:
>> 
>> On Sep 23, 2011, at 1:22 PM, Aaron Ballman wrote:
>> 
>>> On Fri, Sep 23, 2011 at 1:35 PM, Douglas Gregor <[email protected]> wrote:
>>>> 
>>>> On Sep 21, 2011, at 8:45 PM, Aaron Ballman wrote:
>>>> 
>>>>> On Wed, Sep 21, 2011 at 8:11 PM, Eli Friedman <[email protected]> 
>>>>> wrote:
>>>>>>> Also, the diagnostic wording was taken from MSVC, but could likely be
>>>>>>> improved.  Suggestions welcome.
>>>>>> 
>>>>>> I would say something more along the lines of "parser recusion limit
>>>>>> reached"; using the term "stack overflow" makes it sound like a bug in
>>>>>> the compiler.
>>>>> 
>>>>> Here is a revised patch with the new diagnostic wording.
>>>> 
>>>> Minor stuff: there's a typo "recusion" in the diagnostic wording, and the 
>>>> new test case hasn't been updated for the new wording.
>>> 
>>> Good catches!
>>> 
>>>> I like this general direction, but I do wonder about adding another RAII 
>>>> object for this. You've added the RAII object in two places, but aren't 
>>>> there many other similar places? Can we be more methodical, so we can be 
>>>> sure that we don't lose the benefits of this checking with later 
>>>> maintenance?
>>> 
>>> The intention was to see if folks liked the direction, and then add it
>>> to more places.  Why add it everywhere under the sun if people think
>>> it is the wrong way to solve the problem?
>>> 
>>>> I have one idea: what if we expanded the mandate of this RAII object a 
>>>> bit, so that it was responsible for both balancing the delimiters 
>>>> (parentheses, square brackets, braces, etc.)? In other words, we introduce 
>>>> the RAII object to consume the opening paren/bracket/brace and also to 
>>>> find the matching ')':
>>> 
>>> When I started out, I had intended this to have a bit more to do with
>>> [implimits].  But what I ended up with is really is about depth of
>>> recursion in the parser.  Without further accessors, it's not useful
>>> for iterative limit checking (such as the number of constants in an
>>> enumeration).  But do we care?
>> 
>> Probably not. We should be able to handle 'lots' of enumerators. Recursion 
>> is really the killer for us, since we have limited stack space (and Clang 
>> uses a lot of it).
>> 
>>>>  BalancedDelimiterTracket Parens(*this, tok::l_paren); // tries to consume 
>>>> the opening '('
>>>>  if (!Parens) // checks whether an error occurred in consuming the opening 
>>>> ')'
>>>>    return error;
>>>> 
>>>>  // parse whatever is in the parens
>>>> 
>>>>  if (Parens.close()) { // tries to consume the closing ')'
>>>>    // couldn't find closing ')'
>>>>  }
>>>> 
>>>> The idea here would be to encapsulate all of the behavior that now uses 
>>>> MatchRHSPunctuation into an RAII object, and have that also check for 
>>>> implementation limits. There are three benefits here:
>>>> 
>>>>  (1) It becomes clearer where we're matching delimited sets, so the 
>>>> structure of the parser improves in general.
>>>>  (2) Implementation limits checking comes "free" when we use the right 
>>>> abstraction, so it won't be forgotten or lost.
>>>>  (3) These RAII objects, in aggregate, outline the source as we're parsing 
>>>> it. We should be able to use that information to improve parser recovery.
>>>> 
>>>> What do you think?
>>> 
>>> I think I'll play around with the idea and get back to you.  :-)  On
>>> its face, it sounds like a clean way to go about it, so I'll see if it
>>> works in practice as well.  Thank you for the idea!
>> 
>> Great! I'm looking forward to your next patch.
> 
> Going off Douglas' suggestions, I've created a new patch for parsing
> balanced delimiters.
> 
> There is a Parser::DelimiterTracker class which is responsible for
> tracking the "depth" any individual delimiter is at.  The parser owns
> an instance of this class.  There is a
> Parser::BalancedDelimiterTracker class which is used by the parsing
> methods to navigate the balanced delimiters.  It works in conjunction
> with the Parser::DelimiterTracker so that we can track recursion
> limits.  The class comes with three methods: consumeOpen,
> expectAndConsume and consumeClose.
> 
> I've refactored all occurrences of MatchRHSPunctuation to now use a
> BalancedDelimiterTracker object instead.  Due to this, I've removed
> the MatchRHSPunctuation function entirely.


Excellent.

> I've tested this patch against the test suite and all tests continue
> to behave as expected.  I've also checked to make sure this doesn't
> regress performance, and it hasn't.
> 
> This patch does still address bug 10332, and contains a test case for
> it as well.


A few comments below.

+  class DelimiterTracker {
+  private:
+    friend class Parser;
+
+    unsigned P, B, S, L, LLL, T;

Some more-descriptive variable names would be useful here :)

+    BalancedDelimiterTracker(Parser& p, tok::TokenKind k);
+    ~BalancedDelimiterTracker();
+
+    bool consumeOpen(SourceLocation *open) const;
+    bool expectAndConsume(SourceLocation *open, 
+                           unsigned DiagID, 
+                           const char *Msg = "", 
+                           tok::TokenKind SkipToTok = tok::unknown) const;
+    bool consumeClose(const SourceLocation& open, SourceLocation *close) const;
+  };

I'd prefer that we keep the open/close source locations within 
BalancedDelimiterTracker, which would simplify the interface a bit, since 
*every* user of BalancedDelimiterTracker needs to track this information anyway.

Otherwise, this looks great. Thanks for working on it!

        - Doug
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [cfe-commits] [PATCH] Implementation quantity limits to prevent crashes

Reply via email to