Re: ARM inline assembly usage in Linux kernel
On 20 February 2014 12:59, Ramana Radhakrishnan wrote:
> It's not really because GAS supports it, but there exists a large body
> of code out there which uses inline assembler with pre-UAL syntax. I'm
> not sure people will appreciate a blanket break in one version of the
> toolchain and especially when people could quite easily mix and match
> between compiler versions and binutils versions.

Hi Ramana,

I agree, I didn't mean it was GAS' fault.

> Before anything else the compiler needs to be fixed and there are some
> corner cases to deal with build attributes especially for Thumb1 in
> the assembler before we can start thinking about deprecating
> pre-UAL syntax.

Absolutely. But there needs to be an interest in the GNU community to
drive these changes forward. In LLVM we're very much pro-UAL and it took
us quite a lot of convincing to support pre-UAL syntax in the *parser
only*, but we'll never generate it ourselves. Everything we generate is
(or should be) UAL.

> It may be of
> interest for 4.9 + 1 = (4.10 /5.0) in GCC and the next binutils
> revision.

If people are really interested, I can start the ball rolling in the
binutils list.

> Adding the warning by default to GAS is just part of the solution.

It'll only be the second step, yes, with the first one being to fix
the remaining ugly bugs. There will be many more...

cheers,
--renato
Re: ARM inline assembly usage in Linux kernel
On Wed, Feb 19, 2014 at 11:26 PM, Renato Golin wrote:
> On 19 February 2014 23:19, Andrew Pinski wrote:
>> With the unified assembly format, you should not need those
>> .arm/.thumb and in fact emitting them can make things even worse.
>
> If only we could get rid of all pre-UAL inline assembly on the planet... :)
> That has been the only reason why we added support for those in our
> assembler, because GAS supports them and people still use (or have
> legacy code they won't change).

It's not really because GAS supports it, but there exists a large body
of code out there which uses inline assembler with pre-UAL syntax. I'm
not sure people will appreciate a blanket break in one version of the
toolchain, especially when people could quite easily mix and match
between compiler versions and binutils versions. Granted the benefits of
moving to UAL.

Before anything else the compiler needs to be fixed, and there are some
corner cases to deal with build attributes, especially for Thumb1, in
the assembler before we can start thinking about deprecating pre-UAL
syntax. Currently we only put out UAL syntax for Thumb2 integer
instructions and Neon/Advanced SIMD instructions. Switching ARM state to
UAL is trivial, VFP a little bit more work and Thumb1 a bit harder as
you may need a more up to date GAS with some fixes. We also need a
command line switch (and maybe a pragma) in GCC to put out a .syntax
divided at the entry to and exit from an inline assembler block to allow
folks to transition their inline assembler code, all of which as you can
imagine is not rocket science but needs diligent rework. It may be of
interest for 4.9 + 1 = (4.10 /5.0) in GCC and the next binutils
revision.

Ripping out pre-UAL support from GAS is a different story and will take
quite a few more years. Empirical evidence shows that it took us quite a
few years to get rid of FPA support in the compiler and I don't think
it's fully gone from the assembler. We'll remain stuck with pre-UAL
syntax in the GNU Tools world for quite a while IMNSHO.

Adding the warning by default to GAS is just part of the solution.

regards
Ramana

>
> cheers,
> --renato
Re: ARM inline assembly usage in Linux kernel
On 19/02/14 23:19, Andrew Pinski wrote:
> On Wed, Feb 19, 2014 at 3:17 PM, Renato Golin wrote:
>> On 19 February 2014 11:58, Richard Sandiford wrote:
>>> I agree that having an unrecognised asm shouldn't be a hard error until
>>> assembly time though. Saleem, is the problem that this is being rejected
>>> earlier?
>>
>> Hi Andrew, Richard,
>>
>> Thanks for your reviews! We agree that we should actually just ignore
>> the contents until object emission.
>>
>> Just for context, one of the reasons why we enabled inline assembly
>> checks is for some obscure cases when the snippet changes the
>> instruction set (arm -> thumb) and the rest of the function becomes
>> garbage. Our initial implementation was to always emit .arm/.thumb
>> after *any* inline assembly, which would become a nop in the worst
>> case. But since we had easy access to the assembler, we thought: "why
>> not?".
>
> With the unified assembly format, you should not need those
> .arm/.thumb and in fact emitting them can make things even worse.
>

Nonsense. If an inline assembly statement changed the state and didn't
put it back again, then all hell could break loose afterwards, including
getting bogus error messages out of the assembler that would appear to
the user as bugs in the compiler.

Also not all instructions have duals in the other instruction set (eg.
ORN in thumb has no dual in ARM and RSB has no dual in thumb).

Furthermore, GCC has to understand some things about inline assembly in
order to get literal pool placement (and in Thumb1 branch ranges)
correct. It has to assume that an inline assembly block generates no
more than 4 bytes of code per statement in the assembly (so .size ) is
certainly going to cause problems.

Inline assembly can't be an entirely opaque blob.

R.

> Thanks,
> Andrew Pinski
>
>
>>
>> The idea is now to try to parse the snippet for cases like .arm/.thumb
>> but only emit a warning IFF -Wbad-inline-asm (or whatever) is set (and
>> not to make it on by default), otherwise, ignore. We're hoping our
>> assembler will be able to cope with the multiple levels of indirection
>> automagically. ;)
>>
>> Thanks again!
>> --renato
>
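The size assumption described in the message above (at most 4 bytes of code per assembly statement) can be sketched as a small helper. This is only an illustration, not GCC's actual code: the function name estimate_asm_bytes and the constant BYTES_PER_STATEMENT are made up for this example, and it assumes statements are separated by newlines or ';'.

```c
#include <stddef.h>

// Assumption taken from the message above: each statement in an inline
// asm block is taken to assemble to at most 4 bytes.
#define BYTES_PER_STATEMENT 4

// Hypothetical helper (not GCC code): estimate the size of an inline asm
// template by counting statements, where a newline or ';' ends a statement.
size_t estimate_asm_bytes(const char *tmpl)
{
    size_t statements = 1;  // text without separators counts as one statement

    for (const char *p = tmpl; *p != '\0'; p++) {
        if (*p == '\n' || *p == ';')
            statements++;
    }
    return statements * BYTES_PER_STATEMENT;
}
```

A single directive that reserves a lot of room, for example ".space 1024", is still counted as 4 bytes by such an estimate, which is the kind of mismatch that can upset literal pool placement and branch range calculations.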
Re: ARM inline assembly usage in Linux kernel
On 20 February 2014 10:11, Ramana Radhakrishnan wrote:
> The current behaviour is that when the compiler generates code for
> Thumb1 and Thumb2 we switch back to the appropriate state after inline
> assembler is emitted. We don't switch back to ARM state on the (fairly
> robust) assumption that most inline assembler is written in ARM state.

We went one step further (possibly unnecessarily) and we check what's
the current state before going into inline asm and always emit the
correct code directive afterwards.

We're changing it back from the bad decision to validate inline assembly
(my fault!) in -S mode.

> In any case when users are switching ARM and Thumb states, they need
> to be careful anyway to make sure that the *machine* is going to get
> back to the *correct* state and having a screen full of possibly
> meaningless compile time errors may not be the most productive.

Maybe it'd be better to have fixed the error reporting in the first
place. ;)

> .arm / .thumb directives should not assemble to any instruction least
> of all nop. You mean ignored here :).

Yes. ;)

cheers,
--renato
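The "record the state, then emit the matching directive afterwards" approach from the message above might look roughly like this. It is a sketch only and not LLVM or GCC code: the enum isa_state and the function emit_inline_asm are invented for illustration, and the output goes to a caller-supplied buffer instead of a real assembly stream.

```c
#include <stdio.h>

// Instruction set the compiler was generating before the asm block.
enum isa_state { ISA_ARM, ISA_THUMB };

// Hypothetical emitter: write the user's inline asm text, then the
// directive that puts the assembler back into the state in effect before
// the block. Returns the snprintf result (characters that would be written).
int emit_inline_asm(char *out, size_t outsz, const char *body,
                    enum isa_state before)
{
    // .arm and .thumb only select the assembler's encoding mode;
    // they do not assemble to any instruction.
    const char *restore = (before == ISA_THUMB) ? ".thumb" : ".arm";

    return snprintf(out, outsz, "%s\n\t%s\n", body, restore);
}
```

The directive only tells the assembler which encoding to use for what follows. Getting the processor itself back into the right state remains the responsibility of whoever wrote the inline assembly.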
Re: ARM inline assembly usage in Linux kernel
On Wed, Feb 19, 2014 at 11:19 PM, Andrew Pinski wrote:
> On Wed, Feb 19, 2014 at 3:17 PM, Renato Golin wrote:
>> On 19 February 2014 11:58, Richard Sandiford wrote:
>>> I agree that having an unrecognised asm shouldn't be a hard error until
>>> assembly time though. Saleem, is the problem that this is being rejected
>>> earlier?
>>
>> Hi Andrew, Richard,
>>
>> Thanks for your reviews! We agree that we should actually just ignore
>> the contents until object emission.
>>
>> Just for context, one of the reasons why we enabled inline assembly
>> checks is for some obscure cases when the snippet changes the
>> instruction set (arm -> thumb) and the rest of the function becomes
>> garbage.

The current behaviour is that when the compiler generates code for
Thumb1 and Thumb2 we switch back to the appropriate state after inline
assembler is emitted. We don't switch back to ARM state on the (fairly
robust) assumption that most inline assembler is written in ARM state.

In any case when users are switching ARM and Thumb states, they need to
be careful anyway to make sure that the *machine* is going to get back
to the *correct* state and having a screen full of possibly meaningless
compile time errors may not be the most productive.

FTR this is to be the motivation behind such a change based on a
conversation with rearnsha.

>> Our initial implementation was to always emit .arm/.thumb
>> after *any* inline assembly, which would become a nop in the worst
>> case. But since we had easy access to the assembler, we thought: "why
>> not?".

.arm / .thumb directives should not assemble to any instruction least of
all nop. You mean ignored here :).

>
> With the unified assembly format, you should not need those
> .arm/.thumb and in fact emitting them can make things even worse.

Why? Care to explain when and how it is worse?

UAL makes no reference to the actual assembler directives required,
which is (assembler) implementation dependent. It is purely a grammar
for the instructions in the assembly language and doesn't attempt to
standardize assembler directives, which would have evolved differently
over time and different assemblers. How do you otherwise tell the
assembler whether to assemble for ARM state or Thumb instructions?

regards
Ramana

>
> Thanks,
> Andrew Pinski
>
>
>>
>> The idea is now to try to parse the snippet for cases like .arm/.thumb
>> but only emit a warning IFF -Wbad-inline-asm (or whatever) is set (and
>> not to make it on by default), otherwise, ignore. We're hoping our
>> assembler will be able to cope with the multiple levels of indirection
>> automagically. ;)
>>
>> Thanks again!
>> --renato
Re: ARM inline assembly usage in Linux kernel
On 19 February 2014 23:19, Andrew Pinski wrote:
> With the unified assembly format, you should not need those
> .arm/.thumb and in fact emitting them can make things even worse.

If only we could get rid of all pre-UAL inline assembly on the planet... :)

That has been the only reason why we added support for those in our
assembler, because GAS supports them and people still use (or have
legacy code they won't change).

If the binutils folks (and you guys) are happy to start seriously
phasing out pre-UAL support, I'd be more than happy to do so on our end.

Do you think I should start that conversation on the binutils list?
Maybe a new serious compulsory warning, to start?

cheers,
--renato
Re: ARM inline assembly usage in Linux kernel
On Wed, Feb 19, 2014 at 3:17 PM, Renato Golin wrote:
> On 19 February 2014 11:58, Richard Sandiford wrote:
>> I agree that having an unrecognised asm shouldn't be a hard error until
>> assembly time though. Saleem, is the problem that this is being rejected
>> earlier?
>
> Hi Andrew, Richard,
>
> Thanks for your reviews! We agree that we should actually just ignore
> the contents until object emission.
>
> Just for context, one of the reasons why we enabled inline assembly
> checks is for some obscure cases when the snippet changes the
> instruction set (arm -> thumb) and the rest of the function becomes
> garbage. Our initial implementation was to always emit .arm/.thumb
> after *any* inline assembly, which would become a nop in the worst
> case. But since we had easy access to the assembler, we thought: "why
> not?".

With the unified assembly format, you should not need those .arm/.thumb
and in fact emitting them can make things even worse.

Thanks,
Andrew Pinski

>
> The idea is now to try to parse the snippet for cases like .arm/.thumb
> but only emit a warning IFF -Wbad-inline-asm (or whatever) is set (and
> not to make it on by default), otherwise, ignore. We're hoping our
> assembler will be able to cope with the multiple levels of indirection
> automagically. ;)
>
> Thanks again!
> --renato
Re: ARM inline assembly usage in Linux kernel
On 19 February 2014 11:58, Richard Sandiford wrote:
> I agree that having an unrecognised asm shouldn't be a hard error until
> assembly time though. Saleem, is the problem that this is being rejected
> earlier?

Hi Andrew, Richard,

Thanks for your reviews! We agree that we should actually just ignore
the contents until object emission.

Just for context, one of the reasons why we enabled inline assembly
checks is for some obscure cases when the snippet changes the
instruction set (arm -> thumb) and the rest of the function becomes
garbage. Our initial implementation was to always emit .arm/.thumb
after *any* inline assembly, which would become a nop in the worst
case. But since we had easy access to the assembler, we thought: "why
not?".

The idea is now to try to parse the snippet for cases like .arm/.thumb
but only emit a warning IFF -Wbad-inline-asm (or whatever) is set (and
not to make it on by default), otherwise, ignore. We're hoping our
assembler will be able to cope with the multiple levels of indirection
automagically. ;)

Thanks again!
--renato
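The idea of scanning a snippet for .arm/.thumb without rejecting anything else could be sketched as below. This is an assumption-laden illustration, not the LLVM implementation: the names is_directive and asm_switches_isa are hypothetical, statements are assumed to be separated by newlines or ';', and a directive that follows a label on the same line is not detected.

```c
#include <ctype.h>
#include <string.h>

// Return 1 if p starts with the directive `name` as a whole word, so that
// ".thumb" does not match ".thumb_func".
static int is_directive(const char *p, const char *name)
{
    size_t n = strlen(name);
    unsigned char next;

    if (strncmp(p, name, n) != 0)
        return 0;
    next = (unsigned char)p[n];
    return !(isalnum(next) || next == '_');
}

// Hypothetical check: does any statement in the asm text begin with
// .arm or .thumb? Everything else in the text is ignored, not validated.
int asm_switches_isa(const char *text)
{
    const char *p = text;

    while (*p != '\0') {
        // Skip indentation at the start of the statement.
        while (*p == ' ' || *p == '\t')
            p++;
        if (is_directive(p, ".arm") || is_directive(p, ".thumb"))
            return 1;
        // Move to the character after the next statement separator.
        while (*p != '\0' && *p != '\n' && *p != ';')
            p++;
        if (*p != '\0')
            p++;
    }
    return 0;
}
```

A caller would then decide whether to warn, for instance only when an opt-in flag along the lines of the -Wbad-inline-asm placeholder from the message is set, and otherwise pass the text through untouched.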
Re: ARM inline assembly usage in Linux kernel
Andrew Pinski writes:
> On Tue, Feb 18, 2014 at 6:56 PM, Saleem Abdulrasool wrote:
>> Hello.
>>
>> I am sending this at the behest of Renato. I have been working on the ARM
>> integrated assembler in LLVM and came across an interesting item in the Linux
>> kernel.
>>
>> I am wondering if this is an unstated covenant between the kernel and GCC or
>> simply a clever use of an unintended/undefined behaviour.
>>
>> The Linux kernel uses the *compiler* as a fancy preprocessor to generate a
>> specially crafted assembly file. This file is then post-processed via sed to
>> generate a header containing constants which is shared across assembly and C
>> sources.
>>
>> In order to clarify the question, I am selecting a particular example and
>> pulling out the relevant bits of the source code below.
>>
>> #define DEFINE(sym, val) asm volatile("\n->" #sym " %0 " #val : : "i" (val))
>>
>> #define __NR_PAGEFLAGS 22
>>
>> void definitions(void) {
>>     DEFINE(NR_PAGEFLAGS, __NR_PAGEFLAGS);
>> }
>>
>> This is then assembled to generate the following:
>>
>> ->NR_PAGEFLAGS #22 __NR_PAGEFLAGS
>>
>> This will later be post-processed to generate:
>>
>> #define NR_PAGEFLAGS 22 /* __NR_PAGEFLAGS */
>>
>> Using the inline assembler to evaluate (constant) expressions into
>> constant
>> values and then emit that using a special identifier (->) is a fairly clever
>> trick. This leads to my question: is this just use of an unintentional
>> "feature" or something that was worked out between the two projects?

If the output is being post-processed by sed then maybe you could put a
comment character at the beginning of the line and sed it out?

But I tend to agree with Andrew that for -S output the compiler should
be prepared to accept asm strings that it can't parse, even if the
integrated assembler thinks it understands every instruction.

> I don't see why this is a bad use of the inline-asm. GCC does not
> know and is not supposed to know what the string inside the inline-asm
> is going to be. In fact if you have a newer assembler than the
> compiler, you could use instructions that GCC does not even know
> about.

Yeah, FWIW, I agree this is a valid use of inline asm. The use of
volatile in a reachable part of definitions() means that the asm (and
thus the asm string) must be kept if definitions() is kept. I doubt the
idea was agreed with GCC developers because no GCC changes were needed
to use inline asm this way.

> This is the purpose of inline-asm. I think it was a bad
> design decision on LLVM/clang's part that it would check the assembly
> code up front.

Being able to parse it is a useful feature. E.g. it means you can get
an accurate byte length for the asm, which is something that we
otherwise have to guess by multiplying the number of lines by a constant
factor. (And that's wrong for MIPS assembly macros, unless you use a
very conservative constant factor.)

I agree that having an unrecognised asm shouldn't be a hard error until
assembly time though. Saleem, is the problem that this is being rejected
earlier?

Thanks,
Richard
Re: ARM inline assembly usage in Linux kernel
On Tue, Feb 18, 2014 at 6:56 PM, Saleem Abdulrasool wrote:
> Hello.
>
> I am sending this at the behest of Renato. I have been working on the ARM
> integrated assembler in LLVM and came across an interesting item in the Linux
> kernel.
>
> I am wondering if this is an unstated covenant between the kernel and GCC or
> simply a clever use of an unintended/undefined behaviour.
>
> The Linux kernel uses the *compiler* as a fancy preprocessor to generate a
> specially crafted assembly file. This file is then post-processed via sed to
> generate a header containing constants which is shared across assembly and C
> sources.
>
> In order to clarify the question, I am selecting a particular example and
> pulling out the relevant bits of the source code below.
>
> #define DEFINE(sym, val) asm volatile("\n->" #sym " %0 " #val : : "i" (val))
>
> #define __NR_PAGEFLAGS 22
>
> void definitions(void) {
>     DEFINE(NR_PAGEFLAGS, __NR_PAGEFLAGS);
> }
>
> This is then assembled to generate the following:
>
> ->NR_PAGEFLAGS #22 __NR_PAGEFLAGS
>
> This will later be post-processed to generate:
>
> #define NR_PAGEFLAGS 22 /* __NR_PAGEFLAGS */
>
> Using the inline assembler to evaluate (constant) expressions into constant
> values and then emit that using a special identifier (->) is a fairly clever
> trick. This leads to my question: is this just use of an unintentional
> "feature" or something that was worked out between the two projects?
>
> Please explicitly CC me on any response as I am not subscribed to this mailing
> list.

I don't see why this is a bad use of the inline-asm. GCC does not know
and is not supposed to know what the string inside the inline-asm is
going to be. In fact if you have a newer assembler than the compiler,
you could use instructions that GCC does not even know about. This is
the purpose of inline-asm. I think it was a bad design decision on
LLVM/clang's part that it would check the assembly code up front.

Thanks,
Andrew Pinski

>
> Thanks.
>
> --
> Saleem Abdulrasool
> compnerd (at) compnerd (dot) org
>
ARM inline assembly usage in Linux kernel
Hello.

I am sending this at the behest of Renato. I have been working on the ARM
integrated assembler in LLVM and came across an interesting item in the Linux
kernel.

I am wondering if this is an unstated covenant between the kernel and GCC or
simply a clever use of an unintended/undefined behaviour.

The Linux kernel uses the *compiler* as a fancy preprocessor to generate a
specially crafted assembly file. This file is then post-processed via sed to
generate a header containing constants which is shared across assembly and C
sources.

In order to clarify the question, I am selecting a particular example and
pulling out the relevant bits of the source code below.

#define DEFINE(sym, val) asm volatile("\n->" #sym " %0 " #val : : "i" (val))

#define __NR_PAGEFLAGS 22

void definitions(void) {
    DEFINE(NR_PAGEFLAGS, __NR_PAGEFLAGS);
}

This is then assembled to generate the following:

->NR_PAGEFLAGS #22 __NR_PAGEFLAGS

This will later be post-processed to generate:

#define NR_PAGEFLAGS 22 /* __NR_PAGEFLAGS */

Using the inline assembler to evaluate (constant) expressions into constant
values and then emit that using a special identifier (->) is a fairly clever
trick. This leads to my question: is this just use of an unintentional
"feature" or something that was worked out between the two projects?

Please explicitly CC me on any response as I am not subscribed to this mailing
list.

Thanks.

--
Saleem Abdulrasool
compnerd (at) compnerd (dot) org
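For readers who want to see the post-processing step spelled out, here is a minimal sketch in C of what the sed pass does to one marker line. It is not the kernel's actual script: the function name marker_to_define and the buffer sizes are made up for this illustration, and stripping a leading '$' as well as '#' is an assumption about how other targets print immediates.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not kernel code): turn one "->" marker line from the
// compiler's assembly output into a C preprocessor definition.
// Returns 1 on success, 0 if the line is not a marker line or the output
// buffer is too small.
int marker_to_define(const char *line, char *out, size_t outsz)
{
    char sym[64], val[64], comment[128];
    const char *v;
    int n;

    // Skip leading whitespace, then require the "->" marker.
    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, "->", 2) != 0)
        return 0;
    line += 2;

    // Expect three fields: symbol, value, and the original expression text.
    if (sscanf(line, "%63s %63s %127[^\n]", sym, val, comment) != 3)
        return 0;

    // ARM prints the immediate as "#22"; drop the prefix character.
    v = val;
    if (*v == '#' || *v == '$')
        v++;

    n = snprintf(out, outsz, "#define %s %s /* %s */", sym, v, comment);
    return n > 0 && (size_t)n < outsz;
}
```

Fed the line "->NR_PAGEFLAGS #22 __NR_PAGEFLAGS" from the example above, this produces the #define shown in the message. Lines that do not start with the marker are rejected, which mirrors the way the marker lets the post-processing step pick its lines out of ordinary compiler output.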