On Aug 13, 2012, at 2:37 AM, Eli Friedman wrote: > On Mon, Aug 13, 2012 at 2:24 AM, Enea Zaffanella <[email protected]> > wrote: >> On 08/10/2012 11:36 PM, Chad Rosier wrote: >>> Author: mcrosier >>> Date: Fri Aug 10 16:36:25 2012 >>> New Revision: 161703 >>> >>> URL: http://llvm.org/viewvc/llvm-project?rev=161703&view=rev >>> Log: >>> [ms-inline asm] Avoid extra allocations by making this an array of >>> StringRefs. >> [...] >> >> >> Hello. >> >> Since you are working on the ms-inline asm statements, >> we would like to propose a change to the AST representation. >> >> Currently, consecutive MS asm statements are merged into a single >> MSAsmStmt node. >> >> This choice negatively affects source code fidelity. >> For instance, there is a single source location for just the first of >> the possibly many consecutive __asm keywords and there is no faithful >> tracking of braces. >> Moreover, it seems to also complicate later code trying to reconstruct >> the spacing and line-breaking around the asm tokens (namely, the use of >> array LineEnds and of static function needSpaceAsmToken). >> >> As things are now, when parsing the following: >> >> __asm { >> mov eax, dword ptr [0] >> mov edx, dword ptr [4] >> } >> >> we obtain the string >> "mov eax, dword ptr[0]mov edx, dword ptr[4]" >> with no line break. >> >> >> Wouldn't thing be more simple if each __asm keyword is mapped onto its >> own MSAsmStmt node? >> (Side note: afawct, this is what happens for non-MS AsmStmt nodes.) >> >> From the source code fidelity perspective, we would get a source >> location for each __asm keyword and we will be able to properly tell if >> it is using braces or not. >> Moreover, the logic for producing the asm string would be simpler too: >> there will be no need to use an array LineEnds, because you can query >> each asm token with isAtStartOfLine(). >> As a matter of fact, token method hasLeadingSpace() could replace static >> function needSpaceAsmToken() to add a white space if there was one in >> the original sources. >> >> For the sample above, we would thus obtain the string >> "mov eax, dword ptr [0]\nmov edx, dword ptr [4]" >> with both the line break and the extra space before the "[". > > Semantically, there's only one statement; CodeGen, for example, must > produce a single LLVM IR inline asm. A series of __asm lines and a > single __asm{} block have the same meaning; they're basically just > spelled differently in the source code. I think it would > substantially complicate the semantic analysis and CodeGen to be > forced to handle a series of AST nodes as if they were one.
I strongly agree with Eli. > Granted, if the current AST nodes don't provide enough source location > information, we should do something, but I don't think splitting them > is the right answer. Again, I agree. The ms-style inline assembly is still in the very early stages of development and we do intend on improving the support better source location tracking. Patches are welcome.. Chad > > -Eli _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
