And also re-attaching the patch!

On Fri, 17 Sept 2021 at 23:17, Anthony Sharp <anthonyshar...@gmail.com> wrote:
>
> Re-adding gcc-patches@gcc.gnu.org.
>
> ---------- Forwarded message ---------
> From: Anthony Sharp <anthonyshar...@gmail.com>
> Date: Fri, 17 Sept 2021 at 23:11
> Subject: Re: [PATCH] c++: error message for dependent template members 
> [PR70417]
> To: Jason Merrill <ja...@redhat.com>
>
>
> Hi Jason! Apologies for the delay.
>
> > This is basically core issue 1835, http://wg21.link/cwg1835
>
> > This was changed for C++23 by the paper "Declarations and where to find
> > them", http://wg21.link/p1787
>
> Interesting, I was not aware of that. I was very vaguely aware that a
> template-id in a class member access expression could be found by
> ordinary lookup (very bottom of here
> https://en.cppreference.com/w/cpp/language/dependent_name), but it's
> interesting to see it is deeper than I realised.
>
> > But in either case, whether create<U> is in a dependent scope depends on
> > how we resolve impl::, we don't need to remember further back in the
> > expression.  So your dependent_expression_p parameter seems like the
> > wrong approach.  Note that when we're looking up the name after ->, the
> > type of the object expression is in parser->context->object_type.
>
> That's true. I think my thinking was that since it already got figured
> out in cp_parser_postfix_dot_deref_expression, which is where . and ->
> access expressions come from, I thought I might as well pass it
> through, since it seemed to work. But looking again, you're right,
> it's not really worth the hassle; might as well just call
> dependent_scope_p again.
>
> > The cases you fixed in symbol-summary.h are indeed mistakes, but not
> > ill-formed, so giving an error on them is wrong.  For example, here is a
> > well-formed program that is rejected with your patch:
>
> > template <class T, int N> void f(T t) { t.m<N>(0); }
> > struct A { int m; } a;
> > int main() { f<A,42>(a); }
>
> I suppose there was always going to be edge-cases when doing it the
> way I've done. But yes, it can be worked-around by making it a warning
> instead. Interestingly Clang doesn't trip up on that example, so I
> guess they must be examining it some other way (e.g. at instantiation
> time) - but that approach perhaps misses out on the slight performance
> improvement this seems to bring.
>
> > Now that we're writing C++, I'd prefer to avoid this kind of pattern in
> > favor of RAII, such as saved_token_sentinel.  If this is still relevant
> > after addressing the above comments.
>
> Sorry, it's the junior developer in me showing! So this confused me at
> first. After having mucked around a bit I tried using
> saved_token_sentinel but didn't see any benefit since it doesn't
> rollback on going out of scope, and I'll always want to rollback. I
> can call rollback directly, but then I might as well save and restore
> myself. So what I did was use it but also modify it slightly to
> rollback by default on going out of scope (in my mind that makes more
> sense, since if something goes wrong you wouldn't normally want to
> commit anything that happened [edit 1: unless committing was part of
> the whole sanity checking thing] [edit 2: well I guess you could also
> argue that since this is a parser afterall, we like to KEEP things
> sometimes]). But anyways, I made this configurable; it now has three
> modes - roll-back, commit or do nothing. Let me know if you think
> that's not the way to go.
>
> > This code doesn't handle skipping matched ()/{}/[] in the
> > template-argument-list.  You probably want to involve
> > cp_parser_skip_to_end_of_template_parameter_list somehow.
>
> Good point. It required some refactoring, but I have used it. Also,
> just putting it out there, this line from
> cp_parser_skip_to_end_of_template_parameter_list makes zero sense to
> me (why throw an error OR immediately return?), but I have worked
> around it, since it seems to break without it:
>
> > /* Are we ready, yet?  If not, issue error message.  */
> > if (cp_parser_require (parser, CPP_GREATER, RT_GREATER))
> >   return false;
>
> Last thing - I initially made a mistake. I put something like:
>
> (next_token->type == CPP_NAME
>  && MAYBE_CLASS_TYPE_P (parser->scope)
>  && !constructor_name_p (cp_expr (next_token->u.value,
>                                                           
> next_token->location),
>                                            parser->scope))
>
> Instead of:
>
> !(next_token->type == CPP_NAME
>   && MAYBE_CLASS_TYPE_P (parser->scope)
>   && constructor_name_p (cp_expr (next_token->u.value,
>                                                           
> next_token->location),
>                                            parser->scope))
>
> This meant a lot of things were being excluded that weren't supposed
> to be. Oops! Changing this opened up a whole new can of worms, so I
> had to make some changes to the main logic, but it just a little bit
> in the end.
>
> Regtested everything again and all seems fine. Bootstraps fine. Patch
> attached. Let me know if it needs anything else.
>
> Thanks for the help,
> Anthony
>
>
>
> On Fri, 27 Aug 2021 at 22:33, Jason Merrill <ja...@redhat.com> wrote:
> >
> > On 8/20/21 12:56 PM, Anthony Sharp via Gcc-patches wrote:
> > > Hi, hope everyone is well. I have a patch here for issue 70417
> > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC
> > > noob, and this is probably the hardest thing I have ever coded in my
> > > life, so please forgive any initial mistakes!
> > >
> > > TLDR
> > >
> > > This patch introduces a helpful error message when the user attempts
> > > to put a template-id after a member access token (., -> or ::) in a
> > > dependent context without having put the "template" keyword after the
> > > access token.
> >
> > Great, thanks!
> >
> > > CONTEXT
> > >
> > > In C++, when referencing a class member (using ., -> or ::) in a
> > > dependent context, if that member is a template, the template keyword
> > > is required after the access token. For example:
> > >
> > > template<class T> void MyDependentTemplate(T t)
> > > {
> > >    t.DoSomething<T>(); /* Error - t is dependent. "<" is treated as a
> > > less-than sign because DoSomething is assumed to be a non-template
> > > identifier. */
> > >    t.template DoSomething<T>(); // Good.
> > > }
> > >
> > > PATCH EXPLANATION
> > >
> > > In order to throw an appropriate error message we need to identify
> > > and/or create a point in the compiler where the following conditions
> > > are all simultaneously satisfied:
> > >
> > > A) a class member access token (., ->, ::)
> > > B) a dependent context
> > > C) the thing being accessed is a template-id
> > > D) No template keyword
> > >
> > > A, B and D are relatively straightforward and I won't go into detail
> > > about how that was achieved. The real challenge is C - figuring out
> > > whether a name is a template-id.
> > >
> > > Usually, we would look up the name and use that to determine whether
> > > the name is a template or not. But, we cannot do that. We are in a
> > > dependent context, so lookup cannot (in theory) find anything at the
> > > point the access expression is parsed.
> > >
> > > We (maybe) could defer checking until the template is actually
> > > instantiated. But this is not the approach I have gone for, since this
> > > to me sounded unnecessarily awkward and clunky. In my mind this also
> > > seems like a syntax error, and IMO it seems more natural that syntax
> > > errors should get caught as soon as they are parsed.
> > >
> > > So, the solution I went for was to figure out whether the name was a
> > > template by looking at the surrounding tokens. The key to this lies in
> > > being able to differentiate between the start and end of a template
> > > parameter list (< and >) and greater-than and less-than tokens (and
> > > perhaps also things like << or >>). If the final > (if we find one at
> > > all) does indeed mark the end of a class or function template, then it
> > > will be followed by something like (, ::, or even just ;. On the other
> > > hand, a greater-than operator would be followed by a name.
> > >
> > > PERFORMANCE IMPACT
> > >
> > > My concern with this approach was that checking this would actually
> > > slow the compiler down. In the end it seemed the opposite was true. By
> > > parsing the tokens to determine whether the name looks like a
> > > template-id, we can cut out a lot of the template-checking code that
> > > gets run trying to find a template when there is none, making compile
> > > time generally faster. That being said, I'm not sure if the
> > > improvement will be that substantial, so I did not feel it necessary
> > > to introduce the template checking method everywhere where we could
> > > possibly have run into a template.
> > >
> > > I ran an ABAB test with the following code repeated enough times to
> > > fill up about 10,000 lines:
> > >
> > > ai = 1;
> > > bi = 2;
> > > ci = 3;
> > > c.x = 4;
> > > (&c)->x = 5;
> > > mytemplateclass<Y>::y = 6;
> > > c.template class_func<Y>();
> > > (&c)->template class_func<Y>();
> > > mytemplateclass<Y>::template class_func_static<Y>();
> > > c2.class_func<myclass>();
> > > (&c2)->class_func<myclass>();
> > > myclass::class_func_static<myclass>();
> > > ai = 6;
> > > bi = 5;
> > > ci = 4;
> > > c.x = 3;
> > > (&c)->x = 2;
> > > mytemplateclass<Y>::y = 1;
> > > c.template class_func<Y>();
> > > (&c)->template class_func<Y>();
> > > mytemplateclass<Y>::template class_func_static<Y>();
> > > c2.class_func<myclass>();
> > > (&c2)->class_func<myclass>();
> > > myclass::class_func_static<myclass>();
> > >
> > > It basically just contains a mixture of class access expressions plus
> > > some regular assignments for good measure. The compiler was compiled
> > > without any optimisations (and so slower than usual). In hindsight,
> > > perhaps this test case assumes the worst-ish-case scenario since it
> > > contains a lot of templates; most of the benefits of this patch occur
> > > when parsing non-templates.
> > >
> > > The compiler (clean) and the compiler with the patch applied (patched)
> > > compiled the above code 20 times each in ABAB fashion. On average, the
> > > clean compiler took 2.26295 seconds and the patched compiler took
> > > 2.25145 seconds (about 0.508% faster). Whether this improvement
> > > remains or disappears when the compiler is built with optimisations
> > > turned on I haven't tested. My main concern was just making sure it
> > > didn't get any slower.
> > >
> > > AWKWARD TEST CASES
> > >
> > > You will see from the patch that it required the updating of a few
> > > testcases (as well as one place within the compiler). I believe this
> > > is because up until now, the compiler allowed the omission of the
> > > template keyword in some places it shouldn't have. Take decltype34.C
> > > for example:
> > >
> > > struct impl
> > > {
> > >    template <class T> static T create();
> > > };
> > >
> > > template<class T, class U, class =
> > > decltype(impl::create<T>()->impl::create<U>())>
> > > struct tester{};
> > >
> > > GCC currently accepts this code. My patch causes it to be rejected.
> >
> > > For what it's worth, Clang also rejects this code:
> > >
> > > 1824786093/source.cpp:6:70: error: use 'template' keyword to treat
> > > 'create' as a dependent template name
> > > template<class T, class U, class =
> > > decltype(impl::create<T>()->impl::create<U>())>
> > >
> > >                                        ^ template
> > >
> > > I think Clang is correct since from what I understand it should be
> > > written as impl::create<T>()->impl::template create<U>(). Here,
> > > create<T>() is dependent, so it makes sense that we would need
> > > "template" before the scope operator. Then again, I could well be
> > > wrong. The rules around dependent template names are pretty crazy.
> >
> > This is basically core issue 1835, http://wg21.link/cwg1835
> >
> > This was changed for C++23 by the paper "Declarations and where to find
> > them", http://wg21.link/p1787
> >
> > Previously, e.g. in C++20,
> > https://timsong-cpp.github.io/cppwp/n4861/basic.lookup.classref#4 said
> > that when we see ->impl, since we can't look up the name in the
> > (dependent) object type, we do unqualified lookup and find ::impl, so
> > create<U> is not in a dependent scope, and the testcase is fine.
> >
> > The 1835 change clarifies that we only do unqualified lookup of the
> > second impl if the object type is non-dependent, so the second create is
> > in a dependent scope, and we need the ::template.
> >
> > But in either case, whether create<U> is in a dependent scope depends on
> > how we resolve impl::, we don't need to remember further back in the
> > expression.  So your dependent_expression_p parameter seems like the
> > wrong approach.  Note that when we're looking up the name after ->, the
> > type of the object expression is in parser->context->object_type.
> >
> > 1835 also makes the following testcase valid, which we don't currently
> > accept, but clang does:
> >
> > template <class T> void f(T t) { t.foo::bar(); }
> > struct foo { void bar(); };
> > int main() { f(foo()); }
> >
> > For C++20 and down, I want to start accepting this testcase, but also
> > still accept decltype34.C to avoid breaking existing code.  But that's a
> > separate issue; I don't think your patch should affect decltype34.
> >
> > The cases you fixed in symbol-summary.h are indeed mistakes, but not
> > ill-formed, so giving an error on them is wrong.  For example, here is a
> > well-formed program that is rejected with your patch:
> >
> > template <class T, int N> void f(T t) { t.m<N>(0); }
> > struct A { int m; } a;
> > int main() { f<A,42>(a); }
> >
> > Comparing the result of < by > is very unlikely to be what the user
> > actually meant, but it is potentially valid.  So we should accept f with
> > a warning about the dubious expression.  People can silence the warning
> > if they really mean this by parenthesizing the LHS of the < operator.
> >
> > Another valid program, using ::
> >
> > int x;
> > template <class T, int N> void f(T t) { t.m<N>::x; }
> > struct A { int m; } a;
> > int main() { f<A,42>(a); }
> >
> > Now to reviewing the actual patch:
> >
> > > +   yet).  Return 1 if it does look like a template-id.  Return 2
> > > +   if not.  */
> >
> > Let's use -1 for definitely not.
> >
> > > +  /* Could be a ~ referencing the destructor of a class template.
> > > +     Temporarily eat it up so it's not in our way.  */
> > > +  if (next_token->type == CPP_COMPL)
> > > +    {
> > > +      cp_lexer_save_tokens (parser->lexer);
> > > +      cp_lexer_consume_token (parser->lexer);
> > > +      next_token = cp_lexer_peek_token (parser->lexer);
> > > +      saved = true;
> > > +    }
> >
> > There's no ambiguity in the case of a destructor, as you note in your
> > comments in cp_parser_id_expression.  How about returning early if we see ~?
> >
> > > +  /* Find offset of final ">".  */
> > > +  for (; depth > 0; i++)
> > > +    {
> > > +      switch (cp_lexer_peek_nth_token (parser->lexer, i)->type)
> > > +      {
> > > +        case CPP_LESS:
> > > +          depth++;
> > > +          break;
> > > +        case CPP_GREATER:
> > > +          depth--;
> > > +          break;
> > > +        case CPP_RSHIFT:
> > > +          depth-=2;
> > > +          break;
> > > +        case CPP_EOF:
> > > +        case CPP_SEMICOLON:
> > > +          goto no;
> > > +        default:
> > > +          break;
> > > +      }
> > > +    }
> >
> > This code doesn't handle skipping matched ()/{}/[] in the
> > template-argument-list.  You probably want to involve
> > cp_parser_skip_to_end_of_template_parameter_list somehow.
> >
> > > +no:
> > > +  if (saved)
> > > +    cp_lexer_rollback_tokens (parser->lexer);
> > > +  return 2;
> > > +
> > > +yes:
> > > +  if (saved)
> > > +    cp_lexer_rollback_tokens (parser->lexer);
> > > +  return 1;
> >
> > Now that we're writing C++, I'd prefer to avoid this kind of pattern in
> > favor of RAII, such as saved_token_sentinel.  If this is still relevant
> > after addressing the above comments.
> >
> > > +   If DEPENDENT_EXPRESSION_P is true, this is a dependent id-expression
> > > +   that is not the current instantiation.  */
> >
> > As mentioned above, let's not add this parameter.
> >
> > > +      error_at (next_token->location,
> > > +             "expected \"template\" keyword before dependent template "
> > > +             "name");
> >
> > As mentioned above, this should be a warning.
> >
> > > -     /* If it worked, we're done.  */
> > > -     if (cp_parser_parse_definitely (parser))
> > > -       return id;
> > > +  if (looks_like_template != 2)
> > > +    {
> > > +      /* We don't know yet whether or not this will be a
> > > +      template-id.  */
> >
> > The indentation here is off.
> >
> > > +                       int looks_like_template)
> >
> > Let's give these parms a default argument of 0.  And call them
> > looks_like_template_id to be clearer.
> >
> > > -  /* Avoid performing name lookup if there is no possibility of
> > > -     finding a template-id.  */
> > > -  if ((token->type != CPP_NAME && token->keyword != RID_OPERATOR)
> > > -      || (token->type == CPP_NAME
> > > -       && !cp_parser_nth_token_starts_template_argument_list_p
> > > -            (parser, 2)))
> > > +  /* Only if we have not already checked whether this looks like a
> > > +     template.  */
> > > +  if (looks_like_template == 0)
> > >      {
> > > -      cp_parser_error (parser, "expected template-id");
> > > -      return error_mark_node;
> > > +      /* Avoid performing name lookup if there is no possibility of
> > > +      finding a template-id.  */
> >
> > You could add the looks_like_template_id check to the existing condition
> > so we don't need to reindent the comment or body.
> >
> > > +++ b/gcc/symbol-summary.h
> > > @@ -191,7 +191,7 @@ public:
> > >    template<typename Arg, bool (*f)(const T &, Arg)>
> > >    void traverse (Arg a) const
> > >    {
> > > -    m_map.traverse <f> (a);
> > > +    m_map.template traverse <f> (a);
> >
> > I've gone ahead and applied this fix as a separate patch.
> >
> > Jason
> >

Attachment: pr70417_5.patch
Description: Binary data



Reply via email to