On Wed, Feb 5, 2014 at 2:36 PM, Harald van Dijk <[email protected]> wrote:
> On 04/02/14 22:24, Richard Smith wrote: > > On Tue, Feb 4, 2014 at 1:13 PM, Harald van Dijk <[email protected] > > <mailto:[email protected]>> wrote: > > > > On 04/02/14 20:25, Justin Bogner wrote: > > > Harald van Dijk <[email protected] <mailto:[email protected]>> > > writes: > > >> Attached are my updated patches intended to fix various whitespace > > >> issues in clang, modified to take Justin Bogner's comments into > > account. > > >> They are intended to ensure that whitespace is not inappropriately > > >> removed just because a macro or macro argument expansion is > > empty, and > > >> does get removed during token pasting. > > > > > > I've committed the first four patches for you: r200785 through > > r200788. > > > > Thanks! > > > > >> I have moved the handling of invalid token pastes from > > >> TokenLexer::ExpandFunctionArguments to TokenLexer::Lex, so that > > it works > > >> for both object- and function-like macros, and both when ##'s > > operands > > >> use macro parameters and when they don't. > > > > > > Given that the tests needed to be changed and the behaviour clearly > > > hasn't followed the comment in a while, I'm not entirely convinced > > this > > > is the right thing to do. Could the comment simply be wrong? Are > > people > > > relying on one behaviour or the other for invalid token pastes in > > > assembler-with-cpp? > > > > > > Basically, is there a way to objectively say one of these > > behaviours is > > > better than the other? > > > > Having looked closer, the current behaviour is inconsistent in a way > > that cannot really be explained to someone not familiar with a few > > implementation details. Given > > > > #define foo(x) (. ## x . ## y) > > > > foo(y) actually expands to (.y . y) in assembler-with-cpp mode, with > or > > without my first four patches. That first result is what the comment > > refers to; could the fact that the second result is different be an > > oversight? There does not seem to be a reason for this difference, at > > least. Surprisingly though, this is exactly what GCC does too. > > > > My last patch would have turned this into (.y .y), removing the > second > > space. That makes sense to me. (. y . y) could also be a perfectly > > sensible result. > > > > > > I think (.y .y) is probably the best answer here (consistently remove > > whitespace on both sides of ## wherever it appears) -- this also > > probably better matches what MSVC does (where token-paste can result in > > multiple tokens, and seems to act as if the tokens were merely abutted > > with no adjacent whitespace). This would also behave more consistently > > when processing languages with a different lexical structure from C -- > > in a language where an identifier can start with a period, (.y .y) seems > > to unambiguously be the right answer. > > That is a good point about MSVC. There is actually one test case > affected by this change, using -fms-extensions, where the behaviour did > not match that of MSVC, and does now, at least for the compiler that > comes with Visual Studio 2013. > > > I suspect, but do not actually know, that nobody really uses . ## foo > > unless foo is a macro argument, because when foo is not a macro > > argument, there is no point in using ## in the first place. It would > > explain why this went unnoticed for so long. And I can imagine a few > > cases where this would be useful: assembler directives that have > subtly > > different syntax, depending on the assembler used. x y would be > > insufficient if x is ., if y is size, and if .size is a directive, > but . > > size is a syntax error. ## would work here. > > > > Even if clang's behaviour should change, though, my patch does not > have > > adequate testing for these special cases, so shouldn't be applied as > is. > > Should I work on better tests, or do you think it is more likely that > > the behaviour should remain unchanged and get decent testing? > > > > > > There's a risk of breaking someone's assumption by making a change here > > (especially since we'd be introducing a difference from GCC) but I think > > the new behavior is much more defensible, and there's probably no way to > > find out without trying it. > > All right. I have now taken a closer look at the test cases where the > behaviour changes, and noticed that for blargh ## $foo -> blargh $foo it > would not be right to simply change the test case to blargh$foo, as that > misses the point of the test. > Right, I see, it's just making sure that -fdollars-in-identifiers allows the paste to work. I like your fix to the test here. > There was also already a test for . ## foo, but it only tested the case > where foo is a macro argument. I have extended that test to also check > what happens when foo is not a macro argument. > Can you also add a test for the case where the . comes from a macro argument? May as well be thorough =) > How does the attached patch look? I've re-tested it on sources of today, > but on top of my patch in the "r200787 - Fix whitespace handling in > empty macro expansions" thread (after your okay for that). Checking whether the previous token was a ## worries me a little. What about cases like this: #define foo(x, y) x####y foo(, z) This expands to "## z" under our current left-to-right pasting strategy, but I think your patch drops the space. Amusingly, "##z" appears to be the right answer under a right-to-left pasting strategy, so maybe that's OK? Both GCC and EDG expand this to just "z"; the standard is not spectacularly clear here, but I think we're right, because a ## produced by pasting a placemarker token onto a ## is not a token in the replacement list (that token is already gone). If you want to say that we don't care about this case, I could live with that =)
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
