FWIW, the original patch caused PR13880. On Tue, Sep 18, 2012 at 1:50 PM, Argyrios Kyrtzidis <[email protected]>wrote:
> On Sep 18, 2012, at 1:35 PM, Richard Smith <[email protected]> wrote: > > On Tue, Sep 18, 2012 at 10:23 AM, Argyrios Kyrtzidis > <[email protected]>wrote: > >> On Sep 17, 2012, at 4:50 PM, Richard Smith <[email protected]> wrote: >> >> On Mon, Sep 17, 2012 at 4:04 PM, James Dennett <[email protected]>wrote: >> >>> On Mon, Sep 17, 2012 at 3:04 PM, Argyrios Kyrtzidis <[email protected]> >>> wrote: >>> > >>> > On Sep 17, 2012, at 2:38 PM, James Dennett <[email protected]> >>> wrote: >>> > >>> >> On Mon, Sep 17, 2012 at 2:22 PM, Argyrios Kyrtzidis < >>> [email protected]> wrote: >>> >>> On Sep 17, 2012, at 12:44 PM, James Dennett <[email protected]> >>> wrote: >>> >>> >>> >>>> On Mon, Sep 10, 2012 at 7:17 PM, Argyrios Kyrtzidis < >>> [email protected]> wrote: >>> >>>>> Author: akirtzidis >>> >>>>> Date: Mon Sep 10 21:17:21 2012 >>> >>>>> New Revision: 163588 >>> >>>>> >>> >>>>> URL: http://llvm.org/viewvc/llvm-project?rev=163588&view=rev >>> >>>>> Log: >>> >>>>> [libclang] Fix getting a cursor inside an angled #include >>> directive. >>> >>>>> >>> >>>>> Fixed by pointing the end location of the preprocessed entity for >>> the #include >>> >>>>> at the closing '>', instead of the start of '<'. >>> >>>>> >>> >>>>> rdar://11113134 >>> >>>>> >>> >>>>> Modified: >>> >>>>> cfe/trunk/lib/Lex/PPDirectives.cpp >>> >>>>> cfe/trunk/test/Index/c-index-getCursor-pp.c >>> >>>>> >>> >>>>> Modified: cfe/trunk/lib/Lex/PPDirectives.cpp >>> >>>>> URL: >>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/PPDirectives.cpp?rev=163588&r1=163587&r2=163588&view=diff >>> >>>>> >>> ============================================================================== >>> >>>>> --- cfe/trunk/lib/Lex/PPDirectives.cpp (original) >>> >>>>> +++ cfe/trunk/lib/Lex/PPDirectives.cpp Mon Sep 10 21:17:21 2012 >>> >>>>> @@ -1296,6 +1296,9 @@ >>> >>>>> case tok::string_literal: >>> >>>>> Filename = getSpelling(FilenameTok, FilenameBuffer); >>> >>>>> End = FilenameTok.getLocation(); >>> >>>>> + // For an angled include, point the end location at the >>> closing '>'. >>> >>>>> + if (FilenameTok.is <http://filenametok.is/> >>> (tok::angle_string_literal)) >>> >>>>> + End = End.getLocWithOffset(Filename.size()-1); >>> >>>>> CharEnd = End.getLocWithOffset(Filename.size()); >>> >>>>> break; >>> >>>>> >>> >>>>> >>> >>>>> Modified: cfe/trunk/test/Index/c-index-getCursor-pp.c >>> >>>>> URL: >>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Index/c-index-getCursor-pp.c?rev=163588&r1=163587&r2=163588&view=diff >>> >>>>> >>> ============================================================================== >>> >>>>> --- cfe/trunk/test/Index/c-index-getCursor-pp.c (original) >>> >>>>> +++ cfe/trunk/test/Index/c-index-getCursor-pp.c Mon Sep 10 >>> 21:17:21 2012 >>> >>>>> @@ -15,6 +15,8 @@ >>> >>>>> >>> >>>>> const char *fname = __FILE__; >>> >>>>> >>> >>>>> +#include <a.h> >>> >>>>> + >>> >>>>> // RUN: c-index-test -cursor-at=%s:1:11 -I%S/Inputs %s | FileCheck >>> -check-prefix=CHECK-1 %s >>> >>>>> // CHECK-1: macro definition=OBSCURE >>> >>>>> // RUN: c-index-test -cursor-at=%s:2:14 -I%S/Inputs %s | FileCheck >>> -check-prefix=CHECK-2 %s >>> >>>>> @@ -31,6 +33,8 @@ >>> >>>>> // CHECK-7: macro expansion=B:12:9 >>> >>>>> // RUN: c-index-test -cursor-at=%s:16:25 -I%S/Inputs %s | >>> FileCheck -check-prefix=CHECK-8 %s >>> >>>>> // CHECK-8: macro expansion=__FILE__ >>> >>>>> +// RUN: c-index-test -cursor-at=%s:18:12 -I%S/Inputs %s | >>> FileCheck -check-prefix=CHECK-9 %s >>> >>>>> +// CHECK-9: inclusion directive=a.h >>> >>>>> >>> >>>>> // Same tests, but with "editing" optimizations >>> >>>>> // RUN: env CINDEXTEST_EDITING=1 c-index-test -cursor-at=%s:1:11 >>> -I%S/Inputs %s | FileCheck -check-prefix=CHECK-1 %s >>> >>>> >>> >>>> This change broke clients that implement >>> >>>> PPCallbacks::InclusionDirective, which now receive the location of >>> the >>> >>>> closing '>' rather than the location of the token that they're >>> >>>> expecting based on the spec: >>> >>>> /// \param EndLoc The location of the last token within the >>> >>>> inclusion >>> >>>> /// directive. >>> >>>> The last (preprocessing) token is the whole of <filename>, of type >>> >>>> tok::angle_string_literal. >>> >>>> >>> >>>> Could we consider reverting this change and finding an alternative >>> way >>> >>>> to fix libclang without breaking/changing the lower-level interface? >>> >>> >>> >>> Hi James, >>> >>> >>> >>> This is not a high-level libclang issue. >>> >> >>> >> Apologies, I was going by the change description "[libclang] Fix >>> >> getting a cursor inside an angled #include directive.", and that the >>> >> only tests I saw included were for libclang. (That does suggests that >>> >> we're missing tests for this in the appropriate place.) >>> > >>> > You're right, a preprocessing record unit test is also needed. >>> > >>> >> >>> >>> The preprocessing record needs to know the '>' location and depends >>> on the PPCallbacks::InclusionDirective callback. Previously this callback >>> did not provide a way to reliably get at that location. >>> >> >>> >> I'm not sure about that -- it provided a way to get there, but it was >>> >> necessary to skip over the filename. >>> > >>> > If the include filename was formed via a macro expansion then you had >>> an EndLoc pointing at '>'. So this was the situation: >>> > >>> > No macro expansions involved: EndLoc points at '<' >>> > With macro expansion: EndLoc points at '>'. >>> > >>> > The receiver had no easy way to know which case it was. With the >>> change it always pointed at '>'. >>> >>> Checking which character is at the EndLoc isn't so hard, but I take your >>> point. >>> >>> >> With your change a similar trick >>> >> is required for code wanting to get to the filename, but only for >>> >> #include <filename>, not for #include "filename". (Why are the two >>> >> cases different? Currently the callback returns the location of the >>> >> opening quote for the #include "name" case, and the closing one for >>> >> #include <name>. The asymmetry seems strange.) >>> > >>> > There's an asymmetry due to how lexing would behave if you pointed at >>> the beginning of the filename, for example: >>> > >>> > #include "filename" >>> > If you point at the opening quote and start lexing you will get the >>> tok::string_literal token, the same that the preprocessor saw. Then you can >>> get at the character range of the whole filename input by checking the >>> token size. >>> > >>> > #include <filename> >>> > if you point at the opening '<' and start lexing you will get a >>> tok::less token, you need to get into the special "parsing include >>> directive" mode to receive a tok::angle_string_literal token. >>> > So just pointing at '<' will not allow you to easily lex and get the >>> character range of the filename input. >>> >>> I was assuming that you'd lex correctly, which means being in the >>> right mode to lex an include directive in this case. I don't think it >>> matters that if you lex incorrectly that you'll get the wrong results. >>> >>> >>> I believe we still need to modify the bahavior of the >>> InclusionDirective callback and focus the discussion there. >>> >>> >>> >>> How about reverting the EndLoc adjustment but have the >>> InclusionDirective also pass the FilenameTok token so that receivers of the >>> callback can have more information for the inclusion directive (then the >>> preprocessing record can do the adjustment itself). >>> >> >>> >> That sounds good to me -- though I'm not sure if we need to pass >>> >> EndLoc if we pass the FilenameTok (possibly for cases where the >>> >> filename comes from macro expansion? I'm not all that familiar with >>> >> this code). >>> > >>> > Yes, the EndLoc is necessary because if there are macro expansions >>> involved then the FilenameTok token will be a tok::less for '<'. >>> >>> This is going to be "fun" to document thoroughly enough for clients to >>> be written based on the spec alone -- but then the quirky lexing rules >>> for this context are what they are, and Clang can't (and shouldn't) >>> completely hide them. >> >> >> In terms of getting something which is easy to explain, how about we pass >> FilenameTok, as described above, and we also pass the location of the >> terminating > or " *character* as EndLoc? Would that cover all the use >> cases? >> >> >> Not sure about pointing at the character, pointing at '>' is pointing at >> something that can be lexed raw as a token, while pointing at the ending >> quote is not. >> > > You can't lex the ending character as a token in the angle_string_literal > case either. For instance, we accept (with a warning): > > #include <iostream>> > > If you try to lex the '>' as a token, you'll get '>>'. But I don't see why > it would be useful to lex from that position, so I don't think this matters. > > > This is what will happen if you want to convert a token range for the > filename or directive to a character range; Lexer::getLocForEndOfToken will > do a raw lex, it won't lex in the context of an #include directive. > > > >> How about a minor change to your suggestion and pass the end location of >> the directive character range (just after the > or "); this is already >> calculated as the CharEnd variable. >> > > Yes, I think that's an improvement. I had incorrectly recalled that a > CharRange included both the start and end character. Since it actually > excludes its end character (unlike a TokenRange!), this seems like the more > useful option. On reflection, I think my preferred option would be to ditch > FilenameTok and pass a CharSourceRange for the included entity to the > InclusionCallback. > > > Yes, passing a character range will help alleviate this ugliness. I'll > modify it like this if there are no objections. >
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
