On Sep 17, 2012, at 4:50 PM, Richard Smith <[email protected]> wrote:

> On Mon, Sep 17, 2012 at 4:04 PM, James Dennett <[email protected]> wrote:
> On Mon, Sep 17, 2012 at 3:04 PM, Argyrios Kyrtzidis <[email protected]> 
> wrote:
> >
> > On Sep 17, 2012, at 2:38 PM, James Dennett <[email protected]> wrote:
> >
> >> On Mon, Sep 17, 2012 at 2:22 PM, Argyrios Kyrtzidis <[email protected]> 
> >> wrote:
> >>> On Sep 17, 2012, at 12:44 PM, James Dennett <[email protected]> wrote:
> >>>
> >>>> On Mon, Sep 10, 2012 at 7:17 PM, Argyrios Kyrtzidis <[email protected]> 
> >>>> wrote:
> >>>>> Author: akirtzidis
> >>>>> Date: Mon Sep 10 21:17:21 2012
> >>>>> New Revision: 163588
> >>>>>
> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=163588&view=rev
> >>>>> Log:
> >>>>> [libclang] Fix getting a cursor inside an angled #include directive.
> >>>>>
> >>>>> Fixed by pointing the end location of the preprocessed entity for the 
> >>>>> #include
> >>>>> at the closing '>', instead of the start of '<'.
> >>>>>
> >>>>> rdar://11113134
> >>>>>
> >>>>> Modified:
> >>>>>   cfe/trunk/lib/Lex/PPDirectives.cpp
> >>>>>   cfe/trunk/test/Index/c-index-getCursor-pp.c
> >>>>>
> >>>>> Modified: cfe/trunk/lib/Lex/PPDirectives.cpp
> >>>>> URL: 
> >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/PPDirectives.cpp?rev=163588&r1=163587&r2=163588&view=diff
> >>>>> ==============================================================================
> >>>>> --- cfe/trunk/lib/Lex/PPDirectives.cpp (original)
> >>>>> +++ cfe/trunk/lib/Lex/PPDirectives.cpp Mon Sep 10 21:17:21 2012
> >>>>> @@ -1296,6 +1296,9 @@
> >>>>>  case tok::string_literal:
> >>>>>    Filename = getSpelling(FilenameTok, FilenameBuffer);
> >>>>>    End = FilenameTok.getLocation();
> >>>>> +    // For an angled include, point the end location at the closing 
> >>>>> '>'.
> >>>>> +    if (FilenameTok.is(tok::angle_string_literal))
> >>>>> +      End = End.getLocWithOffset(Filename.size()-1);
> >>>>>    CharEnd = End.getLocWithOffset(Filename.size());
> >>>>>    break;
> >>>>>
> >>>>>
> >>>>> Modified: cfe/trunk/test/Index/c-index-getCursor-pp.c
> >>>>> URL: 
> >>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Index/c-index-getCursor-pp.c?rev=163588&r1=163587&r2=163588&view=diff
> >>>>> ==============================================================================
> >>>>> --- cfe/trunk/test/Index/c-index-getCursor-pp.c (original)
> >>>>> +++ cfe/trunk/test/Index/c-index-getCursor-pp.c Mon Sep 10 21:17:21 2012
> >>>>> @@ -15,6 +15,8 @@
> >>>>>
> >>>>> const char *fname = __FILE__;
> >>>>>
> >>>>> +#include <a.h>
> >>>>> +
> >>>>> // RUN: c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck 
> >>>>> -check-prefix=CHECK-1 %s
> >>>>> // CHECK-1: macro definition=OBSCURE
> >>>>> // RUN: c-index-test -cursor-at=%s:2:14 -I%S/Inputs %s | FileCheck 
> >>>>> -check-prefix=CHECK-2 %s
> >>>>> @@ -31,6 +33,8 @@
> >>>>> // CHECK-7: macro expansion=B:12:9
> >>>>> // RUN: c-index-test -cursor-at=%s:16:25 -I%S/Inputs %s | FileCheck 
> >>>>> -check-prefix=CHECK-8 %s
> >>>>> // CHECK-8: macro expansion=__FILE__
> >>>>> +// RUN: c-index-test -cursor-at=%s:18:12 -I%S/Inputs %s | FileCheck 
> >>>>> -check-prefix=CHECK-9 %s
> >>>>> +// CHECK-9: inclusion directive=a.h
> >>>>>
> >>>>> // Same tests, but with "editing" optimizations
> >>>>> // RUN: env CINDEXTEST_EDITING=1 c-index-test -cursor-at=%s:1:11 
> >>>>> -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s
> >>>>
> >>>> This change broke clients that implement
> >>>> PPCallbacks::InclusionDirective, which now receive the location of the
> >>>> closing '>' rather than the location of the token that they're
> >>>> expecting based on the spec:
> >>>> /// \param EndLoc The location of the last token within the
> >>>> inclusion
> >>>> /// directive.
> >>>> The last (preprocessing) token is the whole of <filename>, of type
> >>>> tok::angle_string_literal.
> >>>>
> >>>> Could we consider reverting this change and finding an alternative way
> >>>> to fix libclang without breaking/changing the lower-level interface?
> >>>
> >>> Hi James,
> >>>
> >>> This is not a high-level libclang issue.
> >>
> >> Apologies, I was going by the change description "[libclang] Fix
> >> getting a cursor inside an angled #include directive.", and that the
> >> only tests I saw included were for libclang.  (That does suggests that
> >> we're missing tests for this in the appropriate place.)
> >
> > You're right, a preprocessing record unit test is also needed.
> >
> >>
> >>> The preprocessing record needs to know the '>' location and depends on 
> >>> the PPCallbacks::InclusionDirective callback. Previously this callback 
> >>> did not provide a way to reliably get at that location.
> >>
> >> I'm not sure about that -- it provided a way to get there, but it was
> >> necessary to skip over the filename.
> >
> > If the include filename was formed via a macro expansion then you had an 
> > EndLoc pointing at '>'. So this was the situation:
> >
> > No macro expansions involved: EndLoc points at '<'
> > With macro expansion: EndLoc points at '>'.
> >
> > The receiver had no easy way to know which case it was. With the change it 
> > always pointed at '>'.
> 
> Checking which character is at the EndLoc isn't so hard, but I take your 
> point.
> 
> >> With your change a similar trick
> >> is required for code wanting to get to the filename, but only for
> >> #include <filename>, not for #include "filename".  (Why are the two
> >> cases different?  Currently the callback returns the location of the
> >> opening quote for the #include "name" case, and the closing one for
> >> #include <name>.  The asymmetry seems strange.)
> >
> > There's an asymmetry due to how lexing would behave if you pointed at the 
> > beginning of the filename, for example:
> >
> > #include "filename"
> > If you point at the opening quote and start lexing you will get the 
> > tok::string_literal token, the same that the preprocessor saw. Then you can 
> > get at the character range of the whole filename input by checking the 
> > token size.
> >
> > #include <filename>
> > if you point at the opening '<' and start lexing you will get a tok::less 
> > token, you need to get into the special "parsing include directive" mode to 
> > receive a tok::angle_string_literal token.
> > So just pointing at '<' will not allow you to easily lex and get the 
> > character range of the filename input.
> 
> I was assuming that you'd lex correctly, which means being in the
> right mode to lex an include directive in this case.  I don't think it
> matters that if you lex incorrectly that you'll get the wrong results.
> 
> >>> I believe we still need to modify the bahavior of the InclusionDirective 
> >>> callback and focus the discussion there.
> >>>
> >>> How about reverting the EndLoc adjustment but have the InclusionDirective 
> >>> also pass the FilenameTok token so that receivers of the callback can 
> >>> have more information for the inclusion directive (then the preprocessing 
> >>> record can do the adjustment itself).
> >>
> >> That sounds good to me -- though I'm not sure if we need to pass
> >> EndLoc if we pass the FilenameTok (possibly for cases where the
> >> filename comes from macro expansion? I'm not all that familiar with
> >> this code).
> >
> > Yes, the EndLoc is necessary because if there are macro expansions involved 
> > then the FilenameTok token will be a tok::less for '<'.
> 
> This is going to be "fun" to document thoroughly enough for clients to
> be written based on the spec alone -- but then the quirky lexing rules
> for this context are what they are, and Clang can't (and shouldn't)
> completely hide them.
> 
> In terms of getting something which is easy to explain, how about we pass 
> FilenameTok, as described above, and we also pass the location of the 
> terminating > or " *character* as EndLoc? Would that cover all the use cases?

Not sure about pointing at the character, pointing at '>' is pointing at 
something that can be lexed raw as a token, while pointing at the ending quote 
is not. How about a minor change to your suggestion and pass the end location 
of the directive character range (just after the > or "); this is already 
calculated as the CharEnd variable.

> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to