Re: [PATCH] c++: error message for dependent template members [PR70417]

Anthony Sharp via Gcc-patches Fri, 17 Sep 2021 15:18:19 -0700

Re-adding gcc-patches@gcc.gnu.org.

---------- Forwarded message ---------
From: Anthony Sharp <anthonyshar...@gmail.com>
Date: Fri, 17 Sept 2021 at 23:11
Subject: Re: [PATCH] c++: error message for dependent template members [PR70417]
To: Jason Merrill <ja...@redhat.com>



Hi Jason! Apologies for the delay.

> This is basically core issue 1835, http://wg21.link/cwg1835

> This was changed for C++23 by the paper "Declarations and where to find
> them", http://wg21.link/p1787

Interesting, I was not aware of that. I was very vaguely aware that a
template-id in a class member access expression could be found by
ordinary lookup (very bottom of here
https://en.cppreference.com/w/cpp/language/dependent_name), but it's
interesting to see it is deeper than I realised.

> But in either case, whether create<U> is in a dependent scope depends on
> how we resolve impl::, we don't need to remember further back in the
> expression.  So your dependent_expression_p parameter seems like the
> wrong approach.  Note that when we're looking up the name after ->, the
> type of the object expression is in parser->context->object_type.

That's true. I think my thinking was that since it already got figured
out in cp_parser_postfix_dot_deref_expression, which is where . and ->
access expressions come from, I thought I might as well pass it
through, since it seemed to work. But looking again, you're right,
it's not really worth the hassle; might as well just call
dependent_scope_p again.

> The cases you fixed in symbol-summary.h are indeed mistakes, but not
> ill-formed, so giving an error on them is wrong.  For example, here is a
> well-formed program that is rejected with your patch:

> template <class T, int N> void f(T t) { t.m<N>(0); }
> struct A { int m; } a;
> int main() { f<A,42>(a); }

I suppose there was always going to be edge-cases when doing it the
way I've done. But yes, it can be worked-around by making it a warning
instead. Interestingly Clang doesn't trip up on that example, so I
guess they must be examining it some other way (e.g. at instantiation
time) - but that approach perhaps misses out on the slight performance
improvement this seems to bring.

> Now that we're writing C++, I'd prefer to avoid this kind of pattern in
> favor of RAII, such as saved_token_sentinel.  If this is still relevant
> after addressing the above comments.

Sorry, it's the junior developer in me showing! So this confused me at
first. After having mucked around a bit I tried using
saved_token_sentinel but didn't see any benefit since it doesn't
rollback on going out of scope, and I'll always want to rollback. I
can call rollback directly, but then I might as well save and restore
myself. So what I did was use it but also modify it slightly to
rollback by default on going out of scope (in my mind that makes more
sense, since if something goes wrong you wouldn't normally want to
commit anything that happened [edit 1: unless committing was part of
the whole sanity checking thing] [edit 2: well I guess you could also
argue that since this is a parser afterall, we like to KEEP things
sometimes]). But anyways, I made this configurable; it now has three
modes - roll-back, commit or do nothing. Let me know if you think
that's not the way to go.

> This code doesn't handle skipping matched ()/{}/[] in the
> template-argument-list.  You probably want to involve
> cp_parser_skip_to_end_of_template_parameter_list somehow.

Good point. It required some refactoring, but I have used it. Also,
just putting it out there, this line from
cp_parser_skip_to_end_of_template_parameter_list makes zero sense to
me (why throw an error OR immediately return?), but I have worked
around it, since it seems to break without it:

> /* Are we ready, yet?  If not, issue error message.  */
> if (cp_parser_require (parser, CPP_GREATER, RT_GREATER))
>   return false;

Last thing - I initially made a mistake. I put something like:

(next_token->type == CPP_NAME
 && MAYBE_CLASS_TYPE_P (parser->scope)
 && !constructor_name_p (cp_expr (next_token->u.value,
                                                          next_token->location),
                                           parser->scope))

Instead of:

!(next_token->type == CPP_NAME
  && MAYBE_CLASS_TYPE_P (parser->scope)
  && constructor_name_p (cp_expr (next_token->u.value,
                                                          next_token->location),
                                           parser->scope))

This meant a lot of things were being excluded that weren't supposed
to be. Oops! Changing this opened up a whole new can of worms, so I
had to make some changes to the main logic, but it just a little bit
in the end.

Regtested everything again and all seems fine. Bootstraps fine. Patch
attached. Let me know if it needs anything else.

Thanks for the help,
Anthony



On Fri, 27 Aug 2021 at 22:33, Jason Merrill <ja...@redhat.com> wrote:
>
> On 8/20/21 12:56 PM, Anthony Sharp via Gcc-patches wrote:
> > Hi, hope everyone is well. I have a patch here for issue 70417
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC
> > noob, and this is probably the hardest thing I have ever coded in my
> > life, so please forgive any initial mistakes!
> >
> > TLDR
> >
> > This patch introduces a helpful error message when the user attempts
> > to put a template-id after a member access token (., -> or ::) in a
> > dependent context without having put the "template" keyword after the
> > access token.
>
> Great, thanks!
>
> > CONTEXT
> >
> > In C++, when referencing a class member (using ., -> or ::) in a
> > dependent context, if that member is a template, the template keyword
> > is required after the access token. For example:
> >
> > template<class T> void MyDependentTemplate(T t)
> > {
> >    t.DoSomething<T>(); /* Error - t is dependent. "<" is treated as a
> > less-than sign because DoSomething is assumed to be a non-template
> > identifier. */
> >    t.template DoSomething<T>(); // Good.
> > }
> >
> > PATCH EXPLANATION
> >
> > In order to throw an appropriate error message we need to identify
> > and/or create a point in the compiler where the following conditions
> > are all simultaneously satisfied:
> >
> > A) a class member access token (., ->, ::)
> > B) a dependent context
> > C) the thing being accessed is a template-id
> > D) No template keyword
> >
> > A, B and D are relatively straightforward and I won't go into detail
> > about how that was achieved. The real challenge is C - figuring out
> > whether a name is a template-id.
> >
> > Usually, we would look up the name and use that to determine whether
> > the name is a template or not. But, we cannot do that. We are in a
> > dependent context, so lookup cannot (in theory) find anything at the
> > point the access expression is parsed.
> >
> > We (maybe) could defer checking until the template is actually
> > instantiated. But this is not the approach I have gone for, since this
> > to me sounded unnecessarily awkward and clunky. In my mind this also
> > seems like a syntax error, and IMO it seems more natural that syntax
> > errors should get caught as soon as they are parsed.
> >
> > So, the solution I went for was to figure out whether the name was a
> > template by looking at the surrounding tokens. The key to this lies in
> > being able to differentiate between the start and end of a template
> > parameter list (< and >) and greater-than and less-than tokens (and
> > perhaps also things like << or >>). If the final > (if we find one at
> > all) does indeed mark the end of a class or function template, then it
> > will be followed by something like (, ::, or even just ;. On the other
> > hand, a greater-than operator would be followed by a name.
> >
> > PERFORMANCE IMPACT
> >
> > My concern with this approach was that checking this would actually
> > slow the compiler down. In the end it seemed the opposite was true. By
> > parsing the tokens to determine whether the name looks like a
> > template-id, we can cut out a lot of the template-checking code that
> > gets run trying to find a template when there is none, making compile
> > time generally faster. That being said, I'm not sure if the
> > improvement will be that substantial, so I did not feel it necessary
> > to introduce the template checking method everywhere where we could
> > possibly have run into a template.
> >
> > I ran an ABAB test with the following code repeated enough times to
> > fill up about 10,000 lines:
> >
> > ai = 1;
> > bi = 2;
> > ci = 3;
> > c.x = 4;
> > (&c)->x = 5;
> > mytemplateclass<Y>::y = 6;
> > c.template class_func<Y>();
> > (&c)->template class_func<Y>();
> > mytemplateclass<Y>::template class_func_static<Y>();
> > c2.class_func<myclass>();
> > (&c2)->class_func<myclass>();
> > myclass::class_func_static<myclass>();
> > ai = 6;
> > bi = 5;
> > ci = 4;
> > c.x = 3;
> > (&c)->x = 2;
> > mytemplateclass<Y>::y = 1;
> > c.template class_func<Y>();
> > (&c)->template class_func<Y>();
> > mytemplateclass<Y>::template class_func_static<Y>();
> > c2.class_func<myclass>();
> > (&c2)->class_func<myclass>();
> > myclass::class_func_static<myclass>();
> >
> > It basically just contains a mixture of class access expressions plus
> > some regular assignments for good measure. The compiler was compiled
> > without any optimisations (and so slower than usual). In hindsight,
> > perhaps this test case assumes the worst-ish-case scenario since it
> > contains a lot of templates; most of the benefits of this patch occur
> > when parsing non-templates.
> >
> > The compiler (clean) and the compiler with the patch applied (patched)
> > compiled the above code 20 times each in ABAB fashion. On average, the
> > clean compiler took 2.26295 seconds and the patched compiler took
> > 2.25145 seconds (about 0.508% faster). Whether this improvement
> > remains or disappears when the compiler is built with optimisations
> > turned on I haven't tested. My main concern was just making sure it
> > didn't get any slower.
> >
> > AWKWARD TEST CASES
> >
> > You will see from the patch that it required the updating of a few
> > testcases (as well as one place within the compiler). I believe this
> > is because up until now, the compiler allowed the omission of the
> > template keyword in some places it shouldn't have. Take decltype34.C
> > for example:
> >
> > struct impl
> > {
> >    template <class T> static T create();
> > };
> >
> > template<class T, class U, class =
> > decltype(impl::create<T>()->impl::create<U>())>
> > struct tester{};
> >
> > GCC currently accepts this code. My patch causes it to be rejected.
>
> > For what it's worth, Clang also rejects this code:
> >
> > 1824786093/source.cpp:6:70: error: use 'template' keyword to treat
> > 'create' as a dependent template name
> > template<class T, class U, class =
> > decltype(impl::create<T>()->impl::create<U>())>
> >
> >                                        ^ template
> >
> > I think Clang is correct since from what I understand it should be
> > written as impl::create<T>()->impl::template create<U>(). Here,
> > create<T>() is dependent, so it makes sense that we would need
> > "template" before the scope operator. Then again, I could well be
> > wrong. The rules around dependent template names are pretty crazy.
>
> This is basically core issue 1835, http://wg21.link/cwg1835
>
> This was changed for C++23 by the paper "Declarations and where to find
> them", http://wg21.link/p1787
>
> Previously, e.g. in C++20,
> https://timsong-cpp.github.io/cppwp/n4861/basic.lookup.classref#4 said
> that when we see ->impl, since we can't look up the name in the
> (dependent) object type, we do unqualified lookup and find ::impl, so
> create<U> is not in a dependent scope, and the testcase is fine.
>
> The 1835 change clarifies that we only do unqualified lookup of the
> second impl if the object type is non-dependent, so the second create is
> in a dependent scope, and we need the ::template.
>
> But in either case, whether create<U> is in a dependent scope depends on
> how we resolve impl::, we don't need to remember further back in the
> expression.  So your dependent_expression_p parameter seems like the
> wrong approach.  Note that when we're looking up the name after ->, the
> type of the object expression is in parser->context->object_type.
>
> 1835 also makes the following testcase valid, which we don't currently
> accept, but clang does:
>
> template <class T> void f(T t) { t.foo::bar(); }
> struct foo { void bar(); };
> int main() { f(foo()); }
>
> For C++20 and down, I want to start accepting this testcase, but also
> still accept decltype34.C to avoid breaking existing code.  But that's a
> separate issue; I don't think your patch should affect decltype34.
>
> The cases you fixed in symbol-summary.h are indeed mistakes, but not
> ill-formed, so giving an error on them is wrong.  For example, here is a
> well-formed program that is rejected with your patch:
>
> template <class T, int N> void f(T t) { t.m<N>(0); }
> struct A { int m; } a;
> int main() { f<A,42>(a); }
>
> Comparing the result of < by > is very unlikely to be what the user
> actually meant, but it is potentially valid.  So we should accept f with
> a warning about the dubious expression.  People can silence the warning
> if they really mean this by parenthesizing the LHS of the < operator.
>
> Another valid program, using ::
>
> int x;
> template <class T, int N> void f(T t) { t.m<N>::x; }
> struct A { int m; } a;
> int main() { f<A,42>(a); }
>
> Now to reviewing the actual patch:
>
> > +   yet).  Return 1 if it does look like a template-id.  Return 2
> > +   if not.  */
>
> Let's use -1 for definitely not.
>
> > +  /* Could be a ~ referencing the destructor of a class template.
> > +     Temporarily eat it up so it's not in our way.  */
> > +  if (next_token->type == CPP_COMPL)
> > +    {
> > +      cp_lexer_save_tokens (parser->lexer);
> > +      cp_lexer_consume_token (parser->lexer);
> > +      next_token = cp_lexer_peek_token (parser->lexer);
> > +      saved = true;
> > +    }
>
> There's no ambiguity in the case of a destructor, as you note in your
> comments in cp_parser_id_expression.  How about returning early if we see ~?
>
> > +  /* Find offset of final ">".  */
> > +  for (; depth > 0; i++)
> > +    {
> > +      switch (cp_lexer_peek_nth_token (parser->lexer, i)->type)
> > +      {
> > +        case CPP_LESS:
> > +          depth++;
> > +          break;
> > +        case CPP_GREATER:
> > +          depth--;
> > +          break;
> > +        case CPP_RSHIFT:
> > +          depth-=2;
> > +          break;
> > +        case CPP_EOF:
> > +        case CPP_SEMICOLON:
> > +          goto no;
> > +        default:
> > +          break;
> > +      }
> > +    }
>
> This code doesn't handle skipping matched ()/{}/[] in the
> template-argument-list.  You probably want to involve
> cp_parser_skip_to_end_of_template_parameter_list somehow.
>
> > +no:
> > +  if (saved)
> > +    cp_lexer_rollback_tokens (parser->lexer);
> > +  return 2;
> > +
> > +yes:
> > +  if (saved)
> > +    cp_lexer_rollback_tokens (parser->lexer);
> > +  return 1;
>
> Now that we're writing C++, I'd prefer to avoid this kind of pattern in
> favor of RAII, such as saved_token_sentinel.  If this is still relevant
> after addressing the above comments.
>
> > +   If DEPENDENT_EXPRESSION_P is true, this is a dependent id-expression
> > +   that is not the current instantiation.  */
>
> As mentioned above, let's not add this parameter.
>
> > +      error_at (next_token->location,
> > +             "expected \"template\" keyword before dependent template "
> > +             "name");
>
> As mentioned above, this should be a warning.
>
> > -     /* If it worked, we're done.  */
> > -     if (cp_parser_parse_definitely (parser))
> > -       return id;
> > +  if (looks_like_template != 2)
> > +    {
> > +      /* We don't know yet whether or not this will be a
> > +      template-id.  */
>
> The indentation here is off.
>
> > +                       int looks_like_template)
>
> Let's give these parms a default argument of 0.  And call them
> looks_like_template_id to be clearer.
>
> > -  /* Avoid performing name lookup if there is no possibility of
> > -     finding a template-id.  */
> > -  if ((token->type != CPP_NAME && token->keyword != RID_OPERATOR)
> > -      || (token->type == CPP_NAME
> > -       && !cp_parser_nth_token_starts_template_argument_list_p
> > -            (parser, 2)))
> > +  /* Only if we have not already checked whether this looks like a
> > +     template.  */
> > +  if (looks_like_template == 0)
> >      {
> > -      cp_parser_error (parser, "expected template-id");
> > -      return error_mark_node;
> > +      /* Avoid performing name lookup if there is no possibility of
> > +      finding a template-id.  */
>
> You could add the looks_like_template_id check to the existing condition
> so we don't need to reindent the comment or body.
>
> > +++ b/gcc/symbol-summary.h
> > @@ -191,7 +191,7 @@ public:
> >    template<typename Arg, bool (*f)(const T &, Arg)>
> >    void traverse (Arg a) const
> >    {
> > -    m_map.traverse <f> (a);
> > +    m_map.template traverse <f> (a);
>
> I've gone ahead and applied this fix as a separate patch.
>
> Jason
>

Re: [PATCH] c++: error message for dependent template members [PR70417]

Reply via email to