On Mon, Sep 17, 2012 at 3:04 PM, Argyrios Kyrtzidis <[email protected]> wrote:
>
> On Sep 17, 2012, at 2:38 PM, James Dennett <[email protected]> wrote:
>
>> On Mon, Sep 17, 2012 at 2:22 PM, Argyrios Kyrtzidis <[email protected]> 
>> wrote:
>>> On Sep 17, 2012, at 12:44 PM, James Dennett <[email protected]> wrote:
>>>
>>>> On Mon, Sep 10, 2012 at 7:17 PM, Argyrios Kyrtzidis <[email protected]> 
>>>> wrote:
>>>>> Author: akirtzidis
>>>>> Date: Mon Sep 10 21:17:21 2012
>>>>> New Revision: 163588
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=163588&view=rev
>>>>> Log:
>>>>> [libclang] Fix getting a cursor inside an angled #include directive.
>>>>>
>>>>> Fixed by pointing the end location of the preprocessed entity for the 
>>>>> #include
>>>>> at the closing '>', instead of the start of '<'.
>>>>>
>>>>> rdar://11113134
>>>>>
>>>>> Modified:
>>>>>   cfe/trunk/lib/Lex/PPDirectives.cpp
>>>>>   cfe/trunk/test/Index/c-index-getCursor-pp.c
>>>>>
>>>>> Modified: cfe/trunk/lib/Lex/PPDirectives.cpp
>>>>> URL: 
>>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/PPDirectives.cpp?rev=163588&r1=163587&r2=163588&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/lib/Lex/PPDirectives.cpp (original)
>>>>> +++ cfe/trunk/lib/Lex/PPDirectives.cpp Mon Sep 10 21:17:21 2012
>>>>> @@ -1296,6 +1296,9 @@
>>>>>  case tok::string_literal:
>>>>>    Filename = getSpelling(FilenameTok, FilenameBuffer);
>>>>>    End = FilenameTok.getLocation();
>>>>> +    // For an angled include, point the end location at the closing '>'.
>>>>> +    if (FilenameTok.is(tok::angle_string_literal))
>>>>> +      End = End.getLocWithOffset(Filename.size()-1);
>>>>>    CharEnd = End.getLocWithOffset(Filename.size());
>>>>>    break;
>>>>>
>>>>>
>>>>> Modified: cfe/trunk/test/Index/c-index-getCursor-pp.c
>>>>> URL: 
>>>>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Index/c-index-getCursor-pp.c?rev=163588&r1=163587&r2=163588&view=diff
>>>>> ==============================================================================
>>>>> --- cfe/trunk/test/Index/c-index-getCursor-pp.c (original)
>>>>> +++ cfe/trunk/test/Index/c-index-getCursor-pp.c Mon Sep 10 21:17:21 2012
>>>>> @@ -15,6 +15,8 @@
>>>>>
>>>>> const char *fname = __FILE__;
>>>>>
>>>>> +#include <a.h>
>>>>> +
>>>>> // RUN: c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck 
>>>>> -check-prefix=CHECK-1 %s
>>>>> // CHECK-1: macro definition=OBSCURE
>>>>> // RUN: c-index-test -cursor-at=%s:2:14 -I%S/Inputs %s | FileCheck 
>>>>> -check-prefix=CHECK-2 %s
>>>>> @@ -31,6 +33,8 @@
>>>>> // CHECK-7: macro expansion=B:12:9
>>>>> // RUN: c-index-test -cursor-at=%s:16:25 -I%S/Inputs %s | FileCheck 
>>>>> -check-prefix=CHECK-8 %s
>>>>> // CHECK-8: macro expansion=__FILE__
>>>>> +// RUN: c-index-test -cursor-at=%s:18:12 -I%S/Inputs %s | FileCheck 
>>>>> -check-prefix=CHECK-9 %s
>>>>> +// CHECK-9: inclusion directive=a.h
>>>>>
>>>>> // Same tests, but with "editing" optimizations
>>>>> // RUN: env CINDEXTEST_EDITING=1 c-index-test -cursor-at=%s:1:11 
>>>>> -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s
>>>>
>>>> This change broke clients that implement
>>>> PPCallbacks::InclusionDirective, which now receive the location of the
>>>> closing '>' rather than the location of the token that they're
>>>> expecting based on the spec:
>>>> /// \param EndLoc The location of the last token within the
>>>> inclusion
>>>> /// directive.
>>>> The last (preprocessing) token is the whole of <filename>, of type
>>>> tok::angle_string_literal.
>>>>
>>>> Could we consider reverting this change and finding an alternative way
>>>> to fix libclang without breaking/changing the lower-level interface?
>>>
>>> Hi James,
>>>
>>> This is not a high-level libclang issue.
>>
>> Apologies, I was going by the change description "[libclang] Fix
>> getting a cursor inside an angled #include directive.", and that the
>> only tests I saw included were for libclang.  (That does suggests that
>> we're missing tests for this in the appropriate place.)
>
> You're right, a preprocessing record unit test is also needed.
>
>>
>>> The preprocessing record needs to know the '>' location and depends on the 
>>> PPCallbacks::InclusionDirective callback. Previously this callback did not 
>>> provide a way to reliably get at that location.
>>
>> I'm not sure about that -- it provided a way to get there, but it was
>> necessary to skip over the filename.
>
> If the include filename was formed via a macro expansion then you had an 
> EndLoc pointing at '>'. So this was the situation:
>
> No macro expansions involved: EndLoc points at '<'
> With macro expansion: EndLoc points at '>'.
>
> The receiver had no easy way to know which case it was. With the change it 
> always pointed at '>'.

Checking which character is at the EndLoc isn't so hard, but I take your point.

>> With your change a similar trick
>> is required for code wanting to get to the filename, but only for
>> #include <filename>, not for #include "filename".  (Why are the two
>> cases different?  Currently the callback returns the location of the
>> opening quote for the #include "name" case, and the closing one for
>> #include <name>.  The asymmetry seems strange.)
>
> There's an asymmetry due to how lexing would behave if you pointed at the 
> beginning of the filename, for example:
>
> #include "filename"
> If you point at the opening quote and start lexing you will get the 
> tok::string_literal token, the same that the preprocessor saw. Then you can 
> get at the character range of the whole filename input by checking the token 
> size.
>
> #include <filename>
> if you point at the opening '<' and start lexing you will get a tok::less 
> token, you need to get into the special "parsing include directive" mode to 
> receive a tok::angle_string_literal token.
> So just pointing at '<' will not allow you to easily lex and get the 
> character range of the filename input.

I was assuming that you'd lex correctly, which means being in the
right mode to lex an include directive in this case.  I don't think it
matters that if you lex incorrectly that you'll get the wrong results.

>>> I believe we still need to modify the bahavior of the InclusionDirective 
>>> callback and focus the discussion there.
>>>
>>> How about reverting the EndLoc adjustment but have the InclusionDirective 
>>> also pass the FilenameTok token so that receivers of the callback can have 
>>> more information for the inclusion directive (then the preprocessing record 
>>> can do the adjustment itself).
>>
>> That sounds good to me -- though I'm not sure if we need to pass
>> EndLoc if we pass the FilenameTok (possibly for cases where the
>> filename comes from macro expansion? I'm not all that familiar with
>> this code).
>
> Yes, the EndLoc is necessary because if there are macro expansions involved 
> then the FilenameTok token will be a tok::less for '<'.

This is going to be "fun" to document thoroughly enough for clients to
be written based on the spec alone -- but then the quirky lexing rules
for this context are what they are, and Clang can't (and shouldn't)
completely hide them.

-- James

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to