Re: linux says it is a bug
> What are volatile instructions? Can you give us an example?

Check volatile_insn_p. AFAIK there are two classes of volatile instructions:

* volatile asm
* unspec volatiles (target-specific instructions for e.g. protecting
  function prologues)

-Y
Re: linux says it is a bug
>> Asms without outputs are automatically volatile. So there ought be zero change
>> with and without the explicit use of the __volatile__ keyword.
>
> That’s what the documentation says but it wasn’t actually true
> as of a couple of releases ago, as I recall.

Looks like 2005:

$ git annotate gcc/c/c-typeck.c
...
89552023( bonzini 2005-10-05 12:17:16 + 9073)   /* asm statements without outputs, including simple ones, are treated
89552023( bonzini 2005-10-05 12:17:16 + 9074)      as volatile.  */
89552023( bonzini 2005-10-05 12:17:16 + 9075)   ASM_INPUT_P (args) = simple;
89552023( bonzini 2005-10-05 12:17:16 + 9076)   ASM_VOLATILE_P (args) = (noutputs == 0);

-Y
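For readers who want to see the two spellings side by side, here is a minimal sketch, assuming a GCC-compatible compiler that accepts extended asm. The function names (barrier_implicit, barrier_explicit, store_twice, self_check) are made up for illustration; only the asm statements themselves come from the thread.

```c
#include <assert.h>

static int shared;

/* Extended asm with no output operands.  According to the GCC
   documentation and the c-typeck.c lines quoted above, this is
   treated as volatile even without the keyword. */
static inline void barrier_implicit(void)
{
    __asm__ ("" : : : "memory");
}

/* The same statement with the keyword spelled out. */
static inline void barrier_explicit(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Hypothetical helper: two stores, each followed by a compiler barrier. */
int store_twice(int first, int second)
{
    shared = first;
    barrier_implicit();
    shared = second;
    barrier_explicit();
    return shared;
}

/* Tiny self-check; it only verifies the visible result, not the
   instruction ordering the compiler chose. */
void self_check(void)
{
    assert(store_twice(1, 2) == 2);
}
```

Whether the two forms really generate identical code on a given release is exactly what the thread is debating, so comparing the -S output of both is the way to check.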
RE: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Richard Sandiford writes: > Matthew Fortune writes: > >> Matthew Fortune writes: > >> >> > Sorry, forgot about that. In that case maybe program headers > >> >> > would be best, like you say. I.e. we could use a combination of > >> >> > GNU attributes and a new program header, with the program header > >> >> > hopefully being more general than for just this case. I suppose > >> >> > this comes back to the thread from binutils@ last year about how > >> >> > to manage the dwindling number of free flags: > >> >> > > >> >> > https://www.sourceware.org/ml/binutils/2013-09/msg00039.html > >> >> > to https://www.sourceware.org/ml/binutils/2013-09/msg00099.html > >> >> > > >> > > >> > There are a couple of issues to resolve in order to use gnu > >> > attributes to record FP requirements at the module level. As it > >> > currently stands gnu attributes are controlled via the > >> > .gnu_attribute directive and these are emitted explicitly by the > >> > compiler. I think it is important that a more meaningful directive > >> > is available but it will need to interact nicely with the > .gnu_attribute as well. > >> > > >> > The first problem is that there will be new ways to influence > >> > whether a gnu attribute is emitted or not. i.e. the command line > >> > options -mfp32, -mfpxx, -mfp64 will infer the relevant attribute > >> > Tag_GNU_MIPS_ABI_FP and if the .module directive is present then > >> > that would override it. Will there be any problems with a new ways > >> > to > >> generate a gnu attribute? > >> > >> I think we should just give an error if any .gnu_attributes are > >> inconsistent with the module-level setting (whether that comes from > >> .module or command-line flags). > > > > I would need to account for the -msoft-float and -msingle-float > > command line options to calculate module-level setting in order to do > > this, which is fine. 
There is however no way to infer the no-float ABI > > from command line options as it is not passed through from the GCC > > driver. This would mean the no-float ABI would always conflict with > > the module level setting. > > -mno-float the GCC option doesn't really select a different ABI. > It does leave the FP attribute as being the default 0/dont-care, but > that's just like it would be when compiling most hand-written assembly > code, including code written before -mno-float or .gnu_attribute was > invented. > > > I suspect the only answer is to make an exception and allow a > > .gnu_attribute 4,0 to take precedence over a command line option (but > > not a .module option). This seems a little convoluted in the end. > > I don't think we should ever override an explicit .gnu_attribute. > The most we can do is report a contradiction. > > >> > The second problem is that in order to support relaxing a mode > >> > requirement then any up-front directive/command line option that > >> > sets a specific fp32/fp64 requirement needs to be updated to fpxx. > >> > With gnu attributes this would mean updating an existing > >> > Tag_GNU_MIPS_ABI_FP setting to be modeless. > >> > >> Not sure what you mean here, sorry. > > > > At the end of a unit we will know whether an FP32 or FP64 ABI can be > > relaxed to FPXX. This will happen if no floating point code has been > > emitted that uses odd numbered registers. All I was checking is that > > it is going to be acceptable to alter the FP ABI attribute even if it > > was set using the .gnu_attribute directive. I know I 'can' do it in > > the code as I have it working already just checking that it is OK. I > > suppose this case is going to be quite rare (hand written assembly > > code that includes a .gnu_attribute 4,1 which is mode safe) but I'd > > like to catch as many cases as possible and relax the ABI. > > Yeah, I don't think we should do any relaxation like that. 
If a > specific attribute value is chosen we should honour it even if it > doesn't seem necessary. If -mfp32, -mfp64, .module fp=32 or .module > fp=64 is used we should honour it even if -mfpxx or .module fp=xx seems > OK. Are you're OK with automatically selecting fpxx if no -mfp option, no .module and no .gnu_attribute exists? Such code would currently end up as FP ABI Any even if FP code was present, I don't suppose anything would get worse if this existing behaviour simply continued though. For this to be sufficient we would be making the assumption that there is almost no hand written code that has a .gnu_attribute directive nor build system that explicitly uses the -mfp32 option. I think this will certainly hold true for assembly modules that have no FP code in them but presumably anyone who has hand-written FP code will have used the .gnu_attribute. We will lose the ability for such modules to quietly transition to fpxx but I have no data to say how many of such modules exist or may exist in the future. I still need to check if there are any hand written FP modules in the standard libraries. I am concerned about this aspect but
Re: [RFC][PATCH 0/5] arch: atomic rework
On 3 March 2014 20:44, Torvald Riegel wrote: > On Sun, 2014-03-02 at 04:05 -0600, Peter Sewell wrote: >> On 1 March 2014 08:03, Paul E. McKenney wrote: >> > On Sat, Mar 01, 2014 at 04:06:34AM -0600, Peter Sewell wrote: >> >> Hi Paul, >> >> >> >> On 28 February 2014 18:50, Paul E. McKenney >> >> wrote: >> >> > On Thu, Feb 27, 2014 at 12:53:12PM -0800, Paul E. McKenney wrote: >> >> >> On Thu, Feb 27, 2014 at 11:47:08AM -0800, Linus Torvalds wrote: >> >> >> > On Thu, Feb 27, 2014 at 11:06 AM, Paul E. McKenney >> >> >> > wrote: >> >> >> > > >> >> >> > > 3. The comparison was against another RCU-protected pointer, >> >> >> > > where that other pointer was properly fetched using one >> >> >> > > of the RCU primitives. Here it doesn't matter which >> >> >> > > pointer >> >> >> > > you use. At least as long as the rcu_assign_pointer() for >> >> >> > > that other pointer happened after the last update to the >> >> >> > > pointed-to structure. >> >> >> > > >> >> >> > > I am a bit nervous about #3. Any thoughts on it? >> >> >> > >> >> >> > I think that it might be worth pointing out as an example, and saying >> >> >> > that code like >> >> >> > >> >> >> >p = atomic_read(consume); >> >> >> >X; >> >> >> >q = atomic_read(consume); >> >> >> >Y; >> >> >> >if (p == q) >> >> >> > data = p->val; >> >> >> > >> >> >> > then the access of "p->val" is constrained to be data-dependent on >> >> >> > *either* p or q, but you can't really tell which, since the compiler >> >> >> > can decide that the values are interchangeable. >> >> >> > >> >> >> > I cannot for the life of me come up with a situation where this would >> >> >> > matter, though. If "X" contains a fence, then that fence will be a >> >> >> > stronger ordering than anything the consume through "p" would >> >> >> > guarantee anyway. 
And if "X" does *not* contain a fence, then the >> >> >> > atomic reads of p and q are unordered *anyway*, so then whether the >> >> >> > ordering to the access through "p" is through p or q is kind of >> >> >> > irrelevant. No? >> >> >> >> >> >> I can make a contrived litmus test for it, but you are right, the only >> >> >> time you can see it happen is when X has no barriers, in which case >> >> >> you don't have any ordering anyway -- both the compiler and the CPU can >> >> >> reorder the loads into p and q, and the read from p->val can, as you >> >> >> say, >> >> >> come from either pointer. >> >> >> >> >> >> For whatever it is worth, hear is the litmus test: >> >> >> >> >> >> T1: p = kmalloc(...); >> >> >> if (p == NULL) >> >> >> deal_with_it(); >> >> >> p->a = 42; /* Each field in its own cache line. */ >> >> >> p->b = 43; >> >> >> p->c = 44; >> >> >> atomic_store_explicit(&gp1, p, memory_order_release); >> >> >> p->b = 143; >> >> >> p->c = 144; >> >> >> atomic_store_explicit(&gp2, p, memory_order_release); >> >> >> >> >> >> T2: p = atomic_load_explicit(&gp2, memory_order_consume); >> >> >> r1 = p->b; /* Guaranteed to get 143. */ >> >> >> q = atomic_load_explicit(&gp1, memory_order_consume); >> >> >> if (p == q) { >> >> >> /* The compiler decides that q->c is same as p->c. */ >> >> >> r2 = p->c; /* Could get 44 on weakly order system. */ >> >> >> } >> >> >> >> >> >> The loads from gp1 and gp2 are, as you say, unordered, so you get what >> >> >> you get. >> >> >> >> >> >> And publishing a structure via one RCU-protected pointer, updating it, >> >> >> then publishing it via another pointer seems to me to be asking for >> >> >> trouble anyway. If you really want to do something like that and still >> >> >> see consistency across all the fields in the structure, please put a >> >> >> lock >> >> >> in the structure and use it to guard updates and accesses to those >> >> >> fields. 
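A compilable C11 rendering of the litmus test above may help, offered only as a sketch. It assumes malloc() in place of kmalloc(), an assert() in place of deal_with_it(), and a made-up struct foo; the function names t1 and t2 are likewise illustrative. Note that if the two functions really ran concurrently, the plain accesses to b and c would be a data race in strict C11 terms, so this mirrors the shape of the litmus test rather than providing a conforming program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct foo { int a, b, c; };          /* hypothetical structure */

static _Atomic(struct foo *) gp1, gp2;

/* T1: publish via gp1, update, then publish again via gp2. */
void t1(void)
{
    struct foo *p = malloc(sizeof(*p));

    assert(p != NULL);                /* stands in for deal_with_it() */
    p->a = 42;
    p->b = 43;
    p->c = 44;
    atomic_store_explicit(&gp1, p, memory_order_release);
    p->b = 143;
    p->c = 144;
    atomic_store_explicit(&gp2, p, memory_order_release);
}

/* T2: results are returned through r1 and r2; *r2 stays -1 if p != q. */
void t2(int *r1, int *r2)
{
    struct foo *p = atomic_load_explicit(&gp2, memory_order_consume);
    struct foo *q;

    *r1 = p->b;                       /* ordered after the store to gp2 */
    q = atomic_load_explicit(&gp1, memory_order_consume);
    *r2 = -1;
    if (p == q)
        *r2 = p->c;                   /* compiler may treat this as q->c */
}
```

Calling t1() and then t2() from a single thread yields the "new" values; the interesting outcome discussed above needs two threads on a weakly ordered machine.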
>> >> > >> >> > And here is a patch documenting the restrictions for the current Linux >> >> > kernel. The rules change a bit due to rcu_dereference() acting a bit >> >> > differently than atomic_load_explicit(&p, memory_order_consume). >> >> > >> >> > Thoughts? >> >> >> >> That might serve as informal documentation for linux kernel >> >> programmers about the bounds on the optimisations that you expect >> >> compilers to do for common-case RCU code - and I guess that's what you >> >> intend it to be for. But I don't see how one can make it precise >> >> enough to serve as a language definition, so that compiler people >> >> could confidently say "yes, we respect that", which I guess is what >> >> you really need. As a useful criterion, we should aim for something >> >> precise enough that in a verified-compiler context you can >> >> mathematically prove that the compiler will satisfy it (even though >> >> that won't happen anytime soon for GCC), and that analysis tool >> >> authors can actually know what they're working with. All this stuff >> >> about "yo
Re: [RFC][PATCH 0/5] arch: atomic rework
On Tue, Mar 04, 2014 at 11:00:32AM -0800, Paul E. McKenney wrote: > On Mon, Mar 03, 2014 at 09:46:19PM +0100, Torvald Riegel wrote: > > xagsmtp2.20140303204700.3...@vmsdvma.vnet.ibm.com > > X-Xagent-Gateway: vmsdvma.vnet.ibm.com (XAGSMTP2 at VMSDVMA) > > > > On Mon, 2014-03-03 at 11:20 -0800, Paul E. McKenney wrote: > > > On Mon, Mar 03, 2014 at 07:55:08PM +0100, Torvald Riegel wrote: > > > > xagsmtp2.20140303190831.9...@uk1vsc.vnet.ibm.com > > > > X-Xagent-Gateway: uk1vsc.vnet.ibm.com (XAGSMTP2 at UK1VSC) > > > > > > > > On Fri, 2014-02-28 at 16:50 -0800, Paul E. McKenney wrote: > > > > > +oDo not use the results from the boolean "&&" and "||" when > > > > > + dereferencing. For example, the following (rather improbable) > > > > > + code is buggy: > > > > > + > > > > > + int a[2]; > > > > > + int index; > > > > > + int force_zero_index = 1; > > > > > + > > > > > + ... > > > > > + > > > > > + r1 = rcu_dereference(i1) > > > > > + r2 = a[r1 && force_zero_index]; /* BUGGY!!! */ > > > > > + > > > > > + The reason this is buggy is that "&&" and "||" are often > > > > > compiled > > > > > + using branches. While weak-memory machines such as ARM or > > > > > PowerPC > > > > > + do order stores after such branches, they can speculate loads, > > > > > + which can result in misordering bugs. > > > > > + > > > > > +oDo not use the results from relational operators ("==", "!=", > > > > > + ">", ">=", "<", or "<=") when dereferencing. For example, > > > > > + the following (quite strange) code is buggy: > > > > > + > > > > > + int a[2]; > > > > > + int index; > > > > > + int flip_index = 0; > > > > > + > > > > > + ... > > > > > + > > > > > + r1 = rcu_dereference(i1) > > > > > + r2 = a[r1 != flip_index]; /* BUGGY!!! */ > > > > > + > > > > > + As before, the reason this is buggy is that relational operators > > > > > + are often compiled using branches. 
And as before, although > > > > > + weak-memory machines such as ARM or PowerPC do order stores > > > > > + after such branches, but can speculate loads, which can again > > > > > + result in misordering bugs. > > > > > > > > Those two would be allowed by the wording I have recently proposed, > > > > AFAICS. r1 != flip_index would result in two possible values (unless > > > > there are further constraints due to the type of r1 and the values that > > > > flip_index can have). > > > > > > And I am OK with the value_dep_preserving type providing more/better > > > guarantees than we get by default from current compilers. > > > > > > One question, though. Suppose that the code did not want a value > > > dependency to be tracked through a comparison operator. What does > > > the developer do in that case? (The reason I ask is that I have > > > not yet found a use case in the Linux kernel that expects a value > > > dependency to be tracked through a comparison.) > > > > Hmm. I suppose use an explicit cast to non-vdp before or after the > > comparison? > > That should work well assuming that things like "if", "while", and "?:" > conditions are happy to take a vdp. This assumes that p->a only returns > vdp if field "a" is declared vdp, otherwise we have vdps running wild > through the program. ;-) > > The other thing that can happen is that a vdp can get handed off to > another synchronization mechanism, for example, to reference counting: > > p = atomic_load_explicit(&gp, memory_order_consume); > if (do_something_with(p->a)) { > /* fast path protected by RCU. */ > return 0; > } > if (atomic_inc_not_zero(&p->refcnt) { > /* slow path protected by reference counting. */ > return do_something_else_with((struct foo *)p); /* CHANGE */ > } > /* Needed slow path, but raced with deletion. */ > return -EAGAIN; > > I am guessing that the cast ends the vdp. Is that the case? 
And here is a more elaborate example from the Linux kernel:

	struct md_rdev value_dep_preserving *rdev;  /* CHANGE */

	rdev = rcu_dereference(conf->mirrors[disk].rdev);
	if (r1_bio->bios[disk] == IO_BLOCKED
	    || rdev == NULL
	    || test_bit(Unmerged, &rdev->flags)
	    || test_bit(Faulty, &rdev->flags))
		continue;

The fact that the "rdev == NULL" returns vdp does not force the "||"
operators to be evaluated arithmetically because the entire function
is an "if" condition, correct?

							Thanx, Paul
Re: linux says it is a bug
On Mar 4, 2014, at 2:30 PM, Richard Henderson wrote:

> On 03/04/2014 01:23 AM, Richard Biener wrote:
>> Doesn't sound like a bug but a feature. We can move
>> asm ("" : : : "memory") around freely up to the next/previous
>> instruction involving memory.
>
> Asms without outputs are automatically volatile. So there ought be zero
> change with and without the explicit use of the __volatile__ keyword.

That’s what the documentation says but it wasn’t actually true as of a couple of releases ago, as I recall.

	paul
Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Matthew Fortune writes: >> Matthew Fortune writes: >> >> > Sorry, forgot about that. In that case maybe program headers would >> >> > be best, like you say. I.e. we could use a combination of GNU >> >> > attributes and a new program header, with the program header >> >> > hopefully being more general than for just this case. I suppose >> >> > this comes back to the thread from binutils@ last year about how to >> >> > manage the dwindling number of free flags: >> >> > >> >> > https://www.sourceware.org/ml/binutils/2013-09/msg00039.html >> >> > to https://www.sourceware.org/ml/binutils/2013-09/msg00099.html >> >> > >> > >> > There are a couple of issues to resolve in order to use gnu attributes >> > to record FP requirements at the module level. As it currently stands >> > gnu attributes are controlled via the .gnu_attribute directive and >> > these are emitted explicitly by the compiler. I think it is important >> > that a more meaningful directive is available but it will need to >> > interact nicely with the .gnu_attribute as well. >> > >> > The first problem is that there will be new ways to influence whether >> > a gnu attribute is emitted or not. i.e. the command line options >> > -mfp32, -mfpxx, -mfp64 will infer the relevant attribute >> > Tag_GNU_MIPS_ABI_FP and if the .module directive is present then that >> > would override it. Will there be any problems with a new ways to >> generate a gnu attribute? >> >> I think we should just give an error if any .gnu_attributes are >> inconsistent with the module-level setting (whether that comes from >> .module or command-line flags). > > I would need to account for the -msoft-float and -msingle-float command > line options to calculate module-level setting in order to do this, > which is fine. There is however no way to infer the no-float ABI from > command line options as it is not passed through from the GCC > driver. This would mean the no-float ABI would always conflict with the > module level setting. 
-mno-float the GCC option doesn't really select a different ABI. It does leave the FP attribute as being the default 0/dont-care, but that's just like it would be when compiling most hand-written assembly code, including code written before -mno-float or .gnu_attribute was invented. > I suspect the only answer is to make an exception > and allow a .gnu_attribute 4,0 to take precedence over a command line > option (but not a .module option). This seems a little convoluted in the > end. I don't think we should ever override an explicit .gnu_attribute. The most we can do is report a contradiction. >> > The second problem is that in order to support relaxing a mode >> > requirement then any up-front directive/command line option that sets >> > a specific fp32/fp64 requirement needs to be updated to fpxx. With gnu >> > attributes this would mean updating an existing Tag_GNU_MIPS_ABI_FP >> > setting to be modeless. >> >> Not sure what you mean here, sorry. > > At the end of a unit we will know whether an FP32 or FP64 ABI can be > relaxed to FPXX. This will happen if no floating point code has been > emitted that uses odd numbered registers. All I was checking is that it > is going to be acceptable to alter the FP ABI attribute even if it was > set using the .gnu_attribute directive. I know I 'can' do it in the code > as I have it working already just checking that it is OK. I suppose this > case is going to be quite rare (hand written assembly code that includes > a .gnu_attribute 4,1 which is mode safe) but I'd like to catch as many > cases as possible and relax the ABI. Yeah, I don't think we should do any relaxation like that. If a specific attribute value is chosen we should honour it even if it doesn't seem necessary. If -mfp32, -mfp64, .module fp=32 or .module fp=64 is used we should honour it even if -mfpxx or .module fp=xx seems OK. Thanks, Richard
RE: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
> Matthew Fortune writes: > >> > Sorry, forgot about that. In that case maybe program headers would > >> > be best, like you say. I.e. we could use a combination of GNU > >> > attributes and a new program header, with the program header > >> > hopefully being more general than for just this case. I suppose > >> > this comes back to the thread from binutils@ last year about how to > >> > manage the dwindling number of free flags: > >> > > >> > https://www.sourceware.org/ml/binutils/2013-09/msg00039.html > >> > to https://www.sourceware.org/ml/binutils/2013-09/msg00099.html > >> > > > > > There are a couple of issues to resolve in order to use gnu attributes > > to record FP requirements at the module level. As it currently stands > > gnu attributes are controlled via the .gnu_attribute directive and > > these are emitted explicitly by the compiler. I think it is important > > that a more meaningful directive is available but it will need to > > interact nicely with the .gnu_attribute as well. > > > > The first problem is that there will be new ways to influence whether > > a gnu attribute is emitted or not. i.e. the command line options > > -mfp32, -mfpxx, -mfp64 will infer the relevant attribute > > Tag_GNU_MIPS_ABI_FP and if the .module directive is present then that > > would override it. Will there be any problems with a new ways to > generate a gnu attribute? > > I think we should just give an error if any .gnu_attributes are > inconsistent with the module-level setting (whether that comes from > .module or command-line flags). I would need to account for the -msoft-float and -msingle-float command line options to calculate module-level setting in order to do this, which is fine. There is however no way to infer the no-float ABI from command line options as it is not passed through from the GCC driver. This would mean the no-float ABI would always conflict with the module level setting. 
I suspect the only answer is to make an exception and allow a .gnu_attribute 4,0 to take precedence over a command line option (but not a .module option). This seems a little convoluted in the end. The only other alternative is to just allow the .module fp=... options to act as human readable aliases for the .gnu_attribute options and take whatever comes last. > > The second problem is that in order to support relaxing a mode > > requirement then any up-front directive/command line option that sets > > a specific fp32/fp64 requirement needs to be updated to fpxx. With gnu > > attributes this would mean updating an existing Tag_GNU_MIPS_ABI_FP > > setting to be modeless. > > Not sure what you mean here, sorry. At the end of a unit we will know whether an FP32 or FP64 ABI can be relaxed to FPXX. This will happen if no floating point code has been emitted that uses odd numbered registers. All I was checking is that it is going to be acceptable to alter the FP ABI attribute even if it was set using the .gnu_attribute directive. I know I 'can' do it in the code as I have it working already just checking that it is OK. I suppose this case is going to be quite rare (hand written assembly code that includes a .gnu_attribute 4,1 which is mode safe) but I'd like to catch as many cases as possible and relax the ABI. Regards, Matthew
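As a rough illustration of the end-of-unit check described above, the following sketch tracks which FP registers were referenced and tests whether any odd-numbered one was used. The names note_fp_reg, can_relax_to_fpxx and ODD_FP_REGS are invented here, and the criterion is only the one stated in this message; a real assembler may well need to look at more than register numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n set means FP register $fn was referenced in this unit. */
static uint32_t fp_regs_used;

/* Odd-numbered registers occupy bits 1, 3, 5, ..., 31. */
#define ODD_FP_REGS UINT32_C(0xAAAAAAAA)

/* Record a reference to FP register regno (0..31). */
void note_fp_reg(unsigned regno)
{
    assert(regno < 32);
    fp_regs_used |= UINT32_C(1) << regno;
}

/* True if no odd-numbered FP register has been seen so far. */
bool can_relax_to_fpxx(void)
{
    return (fp_regs_used & ODD_FP_REGS) == 0;
}
```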
Re: linux says it is a bug
On 03/04/2014 01:23 AM, Richard Biener wrote:
> Doesn't sound like a bug but a feature. We can move
> asm ("" : : : "memory") around freely up to the next/previous
> instruction involving memory.

Asms without outputs are automatically volatile. So there ought be zero change
with and without the explicit use of the __volatile__ keyword.

r~
Re: [RFC][PATCH 0/5] arch: atomic rework
On Mon, Mar 03, 2014 at 09:46:19PM +0100, Torvald Riegel wrote:
> On Mon, 2014-03-03 at 11:20 -0800, Paul E. McKenney wrote:
> > On Mon, Mar 03, 2014 at 07:55:08PM +0100, Torvald Riegel wrote:
> > > On Fri, 2014-02-28 at 16:50 -0800, Paul E. McKenney wrote:
> > > > +o	Do not use the results from the boolean "&&" and "||" when
> > > > +	dereferencing.  For example, the following (rather improbable)
> > > > +	code is buggy:
> > > > +
> > > > +		int a[2];
> > > > +		int index;
> > > > +		int force_zero_index = 1;
> > > > +
> > > > +		...
> > > > +
> > > > +		r1 = rcu_dereference(i1);
> > > > +		r2 = a[r1 && force_zero_index];  /* BUGGY!!! */
> > > > +
> > > > +	The reason this is buggy is that "&&" and "||" are often compiled
> > > > +	using branches.  While weak-memory machines such as ARM or PowerPC
> > > > +	do order stores after such branches, they can speculate loads,
> > > > +	which can result in misordering bugs.
> > > > +
> > > > +o	Do not use the results from relational operators ("==", "!=",
> > > > +	">", ">=", "<", or "<=") when dereferencing.  For example,
> > > > +	the following (quite strange) code is buggy:
> > > > +
> > > > +		int a[2];
> > > > +		int index;
> > > > +		int flip_index = 0;
> > > > +
> > > > +		...
> > > > +
> > > > +		r1 = rcu_dereference(i1);
> > > > +		r2 = a[r1 != flip_index];  /* BUGGY!!! */
> > > > +
> > > > +	As before, the reason this is buggy is that relational operators
> > > > +	are often compiled using branches.  And as before, although
> > > > +	weak-memory machines such as ARM or PowerPC do order stores
> > > > +	after such branches, they can speculate loads, which can again
> > > > +	result in misordering bugs.
> > >
> > > Those two would be allowed by the wording I have recently proposed,
> > > AFAICS.  r1 != flip_index would result in two possible values (unless
> > > there are further constraints due to the type of r1 and the values that
> > > flip_index can have).
> >
> > And I am OK with the value_dep_preserving type providing more/better
> > guarantees than we get by default from current compilers.
> >
> > One question, though.  Suppose that the code did not want a value
> > dependency to be tracked through a comparison operator.  What does
> > the developer do in that case?  (The reason I ask is that I have
> > not yet found a use case in the Linux kernel that expects a value
> > dependency to be tracked through a comparison.)
>
> Hmm.  I suppose use an explicit cast to non-vdp before or after the
> comparison?

That should work well assuming that things like "if", "while", and "?:"
conditions are happy to take a vdp.  This assumes that p->a only returns
vdp if field "a" is declared vdp, otherwise we have vdps running wild
through the program.  ;-)

The other thing that can happen is that a vdp can get handed off to
another synchronization mechanism, for example, to reference counting:

	p = atomic_load_explicit(&gp, memory_order_consume);
	if (do_something_with(p->a)) {
		/* fast path protected by RCU. */
		return 0;
	}
	if (atomic_inc_not_zero(&p->refcnt)) {
		/* slow path protected by reference counting. */
		return do_something_else_with((struct foo *)p);  /* CHANGE */
	}
	/* Needed slow path, but raced with deletion. */
	return -EAGAIN;

I am guessing that the cast ends the vdp.  Is that the case?

							Thanx, Paul
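For readers outside the kernel, the reference-count handoff relies on an "increment unless zero" primitive. The following is a userspace sketch of that idea using C11 atomics, not the kernel's implementation: the name inc_not_zero is made up, and the acq_rel ordering on success is an assumption chosen to approximate the kernel's stronger guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Increment *refcnt unless it is zero.  Returns true if the count was
 * incremented, false if the object was already on its way to deletion.
 */
bool inc_not_zero(atomic_int *refcnt)
{
    int old = atomic_load_explicit(refcnt, memory_order_relaxed);

    assert(old >= 0);                 /* a negative count would be a bug */
    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak_explicit(refcnt, &old, old + 1,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}
```

Once this returns true, the caller holds its own reference, which is why the pointer no longer needs to carry a value dependency past that point.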
Re: [RFC][ARM] Naming for new switch to check for mixed hardfloat/softfloat compat
Thomas Preudhomme writes: >> I think the ability to detect the case of generating ABI agnostic >> code would be useful for other architectures too. > > I do not know the other architecture to know if that is the case but > according to what you said for MIPS it seems to be the case. Right now I > implemented it completely in the backend but that can be done in middle > end if several architecture would benefit from it. Yeah, that'd be great. The checking that MIPS's -mno-float should do (but doesn't do) would be a superset of what you need, since the MIPS case would include internal uses of floats. But it would definitely make sense to share the bits that can be shared and (for example) to get consistent error messages for them. > I did not do it because I thought that other architecture might care > more about different kind of ABI like the calling convention for long > long on 32 bit architectures or structure or bitfield. I did not know > if the calling convention for float parameter mattered to other > architecture. Also in the case of ARM a structure of up to 4 > floats/double would be passed via float registers so that would be > counted as a float but maybe not for MIPS. So a common switch for > several architecture might be strange if each interpret it in a > different way. How about warning for all float types (as the float option) and all vector types (as a separate option)? I'm not sure there's as much value in warning specifically about "hardware" types since that can always change in future. E.g. a while ago the only MIPS vector of interest was V2SF, but then Loongson added some integer ones, and now MSA is adding 128-bit vector types. There could be wider types in future, as happened for 512-bit AVX. >> MIPS does have an option for something similar to this which is >> -mno-float but it does not really do what you are aiming for here. 
>> The >> -mno-float option marks a module as float ABI agnostic but actually >> performs code gen for a soft-float ABI. It is up to the programmer to >> avoid floating point in function signatures. Perhaps this option >> would >> be useful to support the enforced compatible ABI but being able to >> relax the ABI is better still as it would require no effort from the >> end user. I'm planning on proposing these kind of changes for MIPS in >> the near future. > > Yeah that's different to no-float here since the body of functions can > use float arithmetic and even function interface as long as they are not > public. I am interesting in knowing what exactly is your use case, what > is the difference for the calling convention with regards to float on > MIPS architecture. Maybe it's just a matter to choose the right name for > the switch such as -mabi-agnostic or something? -mno-float as it stands today is really just -msoft-float with some floating-point support removed from the library to save space. One of the important examples is that the floating-point printf and scanf formats are not supported, so printf and scanf do not pull in large parts of the software floating-point library. But the restrictions that apply to -mno-float should make it link-compatible with -mhard-float too, as for your use case. Thanks, Richard
Re: [RFC] Introducing MIPS O32 ABI Extension for FR0 and FR1 Interlinking
Matthew Fortune writes: >> > Sorry, forgot about that. In that case maybe program headers would be >> > best, like you say. I.e. we could use a combination of GNU attributes >> > and a new program header, with the program header hopefully being more >> > general than for just this case. I suppose this comes back to the >> > thread from binutils@ last year about how to manage the dwindling >> > number of free flags: >> > >> > https://www.sourceware.org/ml/binutils/2013-09/msg00039.html >> > to https://www.sourceware.org/ml/binutils/2013-09/msg00099.html >> > > > There are a couple of issues to resolve in order to use gnu attributes > to record FP requirements at the module level. As it currently stands > gnu attributes are controlled via the .gnu_attribute directive and these > are emitted explicitly by the compiler. I think it is important that a > more meaningful directive is available but it will need to interact > nicely with the .gnu_attribute as well. > > The first problem is that there will be new ways to influence whether a > gnu attribute is emitted or not. i.e. the command line options -mfp32, > -mfpxx, -mfp64 will infer the relevant attribute Tag_GNU_MIPS_ABI_FP and > if the .module directive is present then that would override it. Will > there be any problems with a new ways to generate a gnu attribute? I think we should just give an error if any .gnu_attributes are inconsistent with the module-level setting (whether that comes from .module or command-line flags). > The second problem is that in order to support relaxing a mode > requirement then any up-front directive/command line option that sets a > specific fp32/fp64 requirement needs to be updated to fpxx. With gnu > attributes this would mean updating an existing Tag_GNU_MIPS_ABI_FP > setting to be modeless. Not sure what you mean here, sorry. Thanks, Richard
RE: [RFC][ARM] Naming for new switch to check for mixed hardfloat/softfloat compat
On Wed, 5 Mar 2014, Thomas Preudhomme wrote: > Right. It's actually quite simple. As soon as you meet a function which passes > or returns a float then you can mark the whole module as not agnostic and fall > back to the usual behavior. If you arrive at the end of a compilation unit > without encountering such a case then you are agnostic. You could mark each > function individually but I don't think it's necessary to go that far. If you > want only some functions to be compatible you could put them in a separate > file. My current patch goes a bit beyond this in that it only cares about the > public interface. Calls within one module can use whatever calling convention > they want; it does not change the ABI. If the function is only declared and not called or defined (in a system header etc.), of course you don't want that to affect the ABI (even in the case of an inline function in a system header, unless an out-of-line call is generated to it). But a call to a function not defined in that unit does affect the ABI compatibility, if the call involves affected types. Some libgcc functions on ARM have ABIs that depend on which AAPCS variant is in use - that is, libcalls, not just explicitly defined or called functions, can affect the ABI compatibility. But the RTABI functions don't - if you allow for that, then you increase the number of cases that end up compatible with both ABI variants. On ARM, variadic functions use only the base AAPCS and so don't affect compatibility even if they have floating-point (or vector) arguments. (This is something that's different on some other architectures with similar issues.) -- Joseph S. Myers jos...@codesourcery.com
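To make the cases in the message above concrete, here is a small C sketch. It is only an illustration under assumptions: the function names (`scale`, `scale_percent`, `count_above`, `half`) are invented for this example, and the comments restate what the thread says about ARM rather than what any particular compiler version checks. Library calls that the compiler may generate for the float arithmetic are a separate concern, as the message notes.

```c
#include <stdarg.h>

/* Internal helper: static, so it is not part of the public interface.
   Under the approach described in the thread, its float parameters
   would not count against the module. */
static float scale(float x, float factor)
{
    return x * factor;
}

/* Public function with only integer types in its signature. The float
   arithmetic in the body does not change how callers pass arguments. */
int scale_percent(int value, int percent)
{
    return (int)scale((float)value, (float)percent / 100.0f);
}

/* Public variadic function. Per the message above, variadic functions
   on ARM use only the base AAPCS, so the double arguments here would
   not affect compatibility. Returns how many of the 'count' trailing
   doubles exceed 'threshold'. */
int count_above(double threshold, int count, ...)
{
    va_list ap;
    int hits = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        if (va_arg(ap, double) > threshold)
            hits++;
    }
    va_end(ap);
    return hits;
}

/* Contrast: a public, non-variadic function that passes and returns a
   float. This is the kind of signature that would mark the module as
   not agnostic. */
float half(float x)
{
    return x / 2.0f;
}
```

A module containing only the first three functions would be a candidate for the "compatible with both" marking discussed in the thread; adding `half` to its public interface would not be.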
Re: [RFC] Meta-description for tree and gimple folding
On Mon, 3 Mar 2014, Richard Biener wrote: How do I restrict some subexpression to have a single use? This kind of restrictions come via the valueize() hook - simply valueize to NULL_TREE to make the match fail (for example SSA_NAME_OCCURS_IN_ABNORMAL_PHI could be made fail that way). Shouldn't that single-use property depend more on the transformation and less on where it is called from? a+b-b -> a is always going to be a good idea (well, register pressure aside), even if a+b is used in many other places. But if you are using a*b elsewhere, turning a*b+c into FMA doesn't make so much sense. Well, we can always call has_single_use on some @i if it is an SSA_NAME. If I write a COND_EXPR matcher, could it generate code for phiopt as well? Not sure, what do you have in mind specifically? fold-const.c has the equivalent of: (define_match_and_simplify abs (COND_EXPR (LT_EXPR @0 zero_p) (NEGATE_EXPR @0) @0) (ABS_EXPR @0)) (it would help to be able to write LT_EXPR|LE_EXPR, maybe even to try automatically simplify(!a)?c:b for a?b:c) which works well on trees, but requires more complicated code in phiopt (same for min/max). How do you handle a transformation that currently tries to recursively fold something else and does the main transformation only if that simplified? And doesn't do the other folding (because it's not in the IL literally?)? Similar to the cst without overflow case, by writing custom C code and allowing that to signal failure. I am not sure if it will be easy to write code that works for generic and gimple. I'll see... -- Marc Glisse
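For readers less familiar with tree codes, the source-level shapes behind the examples in this message can be sketched in C as follows. This is an illustration only: the function names are invented here, and whether a given compiler version actually performs each rewrite is not something this sketch asserts.

```c
/* Source shape of the quoted pattern
   (COND_EXPR (LT_EXPR @0 zero_p) (NEGATE_EXPR @0) @0),
   which the rule would simplify to ABS_EXPR. */
int abs_via_cond(int x)
{
    return x < 0 ? -x : x;
}

/* a + b - b -> a. Unsigned arithmetic is used so the example has no
   signed-overflow concerns; the intermediate sum may have other uses
   without making the simplification less attractive. */
unsigned add_then_sub(unsigned a, unsigned b)
{
    unsigned sum = a + b;
    return sum - b;
}

/* The single-use concern: the product a*b is also stored through
   product_out, so fusing a*b+c into an FMA would not remove the
   multiply. A has_single_use-style check on the product would
   reject the fusion here. */
double product_and_sum(double a, double b, double c, double *product_out)
{
    double product = a * b;
    *product_out = product;
    return product + c;
}
```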
RE: [RFC][ARM] Naming for new switch to check for mixed hardfloat/softfloat compat
On 2014-03-04 19:14, Matthew Fortune wrote:
> Hi Thomas,
Hi Matthew,
> Do you particularly need a switch for this? You could view this as simply relaxing the ABI requirements of a module, a switch would only serve to enforce the need for a compatible ABI and error if not. If you build something for a soft-float ABI and never actually trigger any of the soft-float specific parts of the ABI then you could safely mark the module as no-float ABI (same for hard-float).
I was maybe not clear enough, but my patch already has this logic to detect whether any float is used at all and emit an ELF attribute accordingly. However, someone who wants to be compatible with both softfloat and hardfloat would need to look at the output to see whether that was achieved. Such a switch would allow people to actually ensure that they managed to create a float-agnostic ABI.
> A simple check would be if floating point types are used in parameters or returns but actually it could be more refined still if none of the arguments or returns would be passed differently between the ABI types. The problem with relaxing the ABI is that you only know whether it can be relaxed at the end of compiling all functions, I am currently doing some work for MIPS where the assembler will be calculating overall requirements based on an initial setting and then analysis of the code in the module. To relax a floating point ABI I would expect to emit an ABI attribute at the head of a file, which is either soft or hard float, but then each function would get an attribute to say if it ended up as a compatible ABI. If all global functions say compatible then the module can be relaxed to be a compatible FP ABI.
Right. It's actually quite simple. As soon as you meet a function which passes or returns a float then you can mark the whole module as not agnostic and fall back to the usual behavior. If you arrive at the end of a compilation unit without encountering such a case then you are agnostic.
You could mark each function individually but I don't think it's necessary to go that far. If you want only some functions to be compatible you could put them in a separate file. My current patch goes a bit beyond this in that it only cares about the public interface. Calls within one module can use whatever calling convention they want; it does not change the ABI.
> I think the ability to detect the case of generating ABI agnostic code would be useful for other architectures too.
I do not know the other architectures well enough to know if that is the case, but according to what you said for MIPS it seems to be. Right now I have implemented it completely in the backend, but that could be done in the middle end if several architectures would benefit from it. I did not do it because I thought that other architectures might care more about different kinds of ABI differences, like the calling convention for long long on 32-bit architectures, or structures or bitfields. I did not know if the calling convention for float parameters mattered to other architectures. Also, in the case of ARM a structure of up to 4 floats/doubles would be passed via float registers, so that would be counted as a float, but maybe not for MIPS. So a common switch for several architectures might be strange if each interprets it in a different way.
> MIPS does have an option for something similar to this which is -mno-float but it does not really do what you are aiming for here. The -mno-float option marks a module as float ABI agnostic but actually performs code gen for a soft-float ABI. It is up to the programmer to avoid floating point in function signatures. Perhaps this option would be useful to support the enforced compatible ABI but being able to relax the ABI is better still as it would require no effort from the end user. I'm planning on proposing these kind of changes for MIPS in the near future.
Yeah, that's different from no-float here, since the body of a function can use float arithmetic, and so can function interfaces as long as they are not public. I am interested in knowing what exactly your use case is, and what the difference in calling convention with regard to float is on the MIPS architecture. Maybe it's just a matter of choosing the right name for the switch, such as -mabi-agnostic or something? Anyway, thanks for your comment on this issue. It is good to know that there might be some other uses than just ARM.
> Regards,
> Matthew
Best regards, Thomas
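The module-level rule described in this message (one public function with a float in its signature makes the whole module non-agnostic; non-public functions are ignored) can be sketched as below. This is a minimal sketch, not the patch itself: `struct func_summary` and `module_is_float_agnostic` are hypothetical names invented for illustration and are not GCC data structures or APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function summary collected while compiling a unit. */
struct func_summary {
    const char *name;
    bool is_public;          /* visible outside the compilation unit */
    bool float_in_signature; /* passes or returns a float-class value */
};

/* Returns true if no public function passes or returns a float, i.e.
   the unit could be marked compatible with both float ABIs. */
bool module_is_float_agnostic(const struct func_summary *funcs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Only the public interface matters; a single offender is
           enough to fall back to the usual ABI marking. */
        if (funcs[i].is_public && funcs[i].float_in_signature)
            return false;
    }
    return true;
}
```

The answer is only known after the last function has been seen, which is why the thread discusses emitting the final attribute at the end of the unit or letting the assembler compute it.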
Re: linux says it is a bug
On Tue, Mar 04, 2014 at 12:08:19PM +0100, Richard Biener wrote: > On Tue, Mar 4, 2014 at 10:33 AM, Hannes Frederic Sowa > wrote: > > On Tue, Mar 04, 2014 at 09:26:31AM +, Andrew Haley wrote: > >> On 03/04/2014 09:24 AM, Hannes Frederic Sowa wrote: > >> >> > So the bug was probably fixed more than 15 years ago. > >> > Probably :) > >> > > >> > But the __volatile__ should do no harm and shouldn't influence code > >> > generation in any way, no? > >> > >> Of course it will: it's a barrier. > > > > Sure. My question was about the volatile marker. asm("":::"memory") should > > act > > as the barrier alone. > > __asm__("":::"memory") > > is a memory barrier > > volatile __asm__("":::"memory") > > is a memory barrier and a barrier for other volatile instructions. Hi Andrew, What are volatile instructions? Can you give us an example? -- Regards lin zuojian
Re: [RFC] Meta-description for tree and gimple folding
On Mon, 3 Mar 2014, Kai Tietz wrote: > 2014-03-03 12:33 GMT+01:00 Richard Biener : > > On Fri, 28 Feb 2014, Kai Tietz wrote: > > > >> Hmm, this all reminds me about the approach Andrew Pinski and I came > >> up with two years ago. > > > > You are talking about the gimple folding interface? Yes, but it's > > more similar to what I proposed before that. > > Well, this interface was for rtl, gimple, and tree AFAIR. > > > >> So I doubt that we want to keep fold-const patterns similar to gimple > >> (forward-prop) ones. > >> Wouldn't it make more sense to move fold-const patterns completely > >> into gimple, and having a facility in FE to ask gimple to *pre*-fold a > >> given tree to see if a constant-expression can be achieved? > > > > That was proposed by somebody, yes. The FE would hand off an > > expression to 1) the gimplifier to gimplify it, then 2) to the > > gimple folder to simplify it. Not sure if that's a good design > > but yes, it mimics the awkward thing we do now (genericize for > > folding in fold_stmt), just the other way around - and it makes > > it very costly. > > Right, if we would redo step one, and two each time we visit same > statement again, then of course we would produce pretty high load. > By hashing this *pre*-computed gimple-expression I think the load of > such an approach would lower pretty much. Of course it is true that > we move gimplification-costs to FE. Nevertheless the average costs > should be in general the same as we have them now. > > > > Having a single meta-description of simplifications makes it > > possible to have the best of both worlds (no need to GENERICIZE > > back from GIMPLE and no need to GIMPLIFY from GENERIC) with > > a single point of maintenance. > > True. I am fully agreeing to the positive effects of a single > meta-description for this area. For sure it is worth to avoid the > re-doing of the same folding for GENERIC/GIMPLE again and again.
> > > [the possibility to use offline verification tools for the > > transforms comes to my mind as well] > This is actually a pretty interesting idea. As it would allow us to > do testing for this area without side-effects by high-level passes, > target-properties, etc > > > If you think the .md style pattern definitions are too limiting > > can you think of something more powerful without removing the possibility > > of implementing the matching with a state machine to make it O(1)? > > Well, I am not opposed to the use of .md style pattern definitions at > all. I see just some weaknesses on the current tree-folding > mechanism. > AST folder tries to fold into statements by recursing into them. This > causes pretty high load in different areas. Like stack-growth, > unnecessary repetition of operations on inner-statements, and complex > patterns for expression associative/distributive/commutative rules for > current operation-code. Actually it doesn't recurse - it avoids recursion by requiring each sub-pattern to be already folded. > I am thinking about a model where we use just for the > base-fold-operations the .md-style pattern definitions. On top of this > model we set a layer implementing associative/commutative/distributive > properties for statements in an optimized form. Sure, that would be a pass using the match-and-simplify infrastructure. In fact the infrastructure alone only provides enough to do the bare folding. > By this we can do two different things with lower load. On the one hand we > can do "virtual" folding and avoid tree-rebuilding without need. On > the other hand we can use same pass for "normalize" tree structure of > expression-chains. > Additionally we can get rid of the then pretty useless reassociation > pass, which is IMHO only necessary because of the gimple folder's inability to do > that. Well ... you still need a pass that re-associates (or distributes or applies whatever properties) to feed the match-and-simplify infrastructure with proper input.
So no, reassoc won't go away and isn't useless. Richard.
Re: GSoC question
Thanks Maxim, There is a tentative plan to merge the concepts branch into trunk, and that would probably be at the end of the summer or fall, so that might fit nicely. It probably wouldn't hurt to have the students apply, regardless of the final decisions. Writing proposals is good experience. I'll send over the mentor request. Andrew On Mon, Mar 3, 2014 at 11:46 PM, Maxim Kuvyrkov wrote: > On Saturday, March 1, 2014, Andrew Sutton wrote: >> >> Hi all, >> >> I'm not sure if this is the right place to post, but I had a question >> regarding GSoC. I'm working on the C++ concepts branch (slowly), and >> had some students express interest in contributing to its development >> over the course of the summer. I was hoping I could act as a mentor or >> co-mentor for one or both. >> >> I asked Jason Merrill off list and he indicated that this would be a >> good idea. I wanted to float the idea with the GSoC admin/co-admin >> before sending a "make me a mentor request". >> >> Thoughts? >> >> Andrew Sutton > > > Andrew, > > You are a GCC developer and that is all that's needed to be a potential > mentor. Please send the request on the GSoC website. > > Regarding the potential student projects, we ideally want them incorporated > into GCC mainline. This is a soft requirement/desire, so not 100% required, > but any and all effort to put student's work into mainline by the end of > GSoC would be highly appreciated. > > Thank you, > > -- > Maxim Kuvyrkov
Re: linux says it is a bug
On Tue, Mar 4, 2014 at 1:01 PM, Hans-Peter Nilsson wrote: > On Tue, 4 Mar 2014, Yury Gribov wrote: >> Richard wrote: >> > volatile __asm__("":::"memory") >> > >> > is a memory barrier and a barrier for other volatile instructions. >> >> AFAIK asm without output arguments is implicitly marked as volatile. So it >> may >> not be needed in barrier() at all. > > Yes, exactly. Had it at some time been needed, that'd be a bug. > (I have a faint recollection of that, faint enough to be a false > memory.) Ah, indeed. > brgds, H-P
Re: linux says it is a bug
On Tue, 4 Mar 2014, Yury Gribov wrote: > Richard wrote: > > volatile __asm__("":::"memory") > > > > is a memory barrier and a barrier for other volatile instructions. > > AFAIK asm without output arguments is implicitly marked as volatile. So it may > not be needed in barrier() at all. Yes, exactly. Had it at some time been needed, that'd be a bug. (I have a faint recollection of that, faint enough to be a false memory.) brgds, H-P
Re: linux says it is a bug
Richard wrote: > volatile __asm__("":::"memory") > > is a memory barrier and a barrier for other volatile instructions. AFAIK asm without output arguments is implicitly marked as volatile. So it may not be needed in barrier() at all. -Y
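The point about implicit volatility can be illustrated with the C sketch below. It assumes a GCC-compatible compiler; the function names are invented for this example. Note that GCC's grammar places the qualifier after the asm keyword, so the form quoted in the thread is spelled `__asm__ __volatile__` here.

```c
/* Explicitly volatile: the spelling used by the kernel's barrier(). */
static inline void barrier_explicit(void)
{
    __asm__ __volatile__("" : : : "memory");
}

/* No output operands, so GCC treats this asm as volatile as well
   (the behavior the thread describes). */
static inline void barrier_implicit(void)
{
    __asm__("" : : : "memory");
}

/* Contrast: this asm HAS an output operand and is not marked volatile,
   so the compiler is allowed to drop it when the result is unused.
   It emits no instructions and returns its argument unchanged. */
static inline int pass_through(int v)
{
    __asm__("" : "+r"(v));
    return v;
}

/* Increments *counter twice with a barrier in between and returns the
   final value. Both barrier spellings are expected to behave alike. */
int bump_twice(int *counter)
{
    *counter += 1;
    barrier_implicit();
    *counter += 1;
    barrier_explicit();
    return *counter;
}
```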
Re: linux says it is a bug
On Tue, Mar 04, 2014 at 12:08:19PM +0100, Richard Biener wrote: > On Tue, Mar 4, 2014 at 10:33 AM, Hannes Frederic Sowa > wrote: > > On Tue, Mar 04, 2014 at 09:26:31AM +, Andrew Haley wrote: > >> On 03/04/2014 09:24 AM, Hannes Frederic Sowa wrote: > >> >> > So the bug was probably fixed more than 15 years ago. > >> > Probably :) > >> > > >> > But the __volatile__ should do no harm and shouldn't influence code > >> > generation in any way, no? > >> > >> Of course it will: it's a barrier. > > > > Sure. My question was about the volatile marker. asm("":::"memory") should > > act > > as the barrier alone. > > __asm__("":::"memory") > > is a memory barrier > > volatile __asm__("":::"memory") > > is a memory barrier and a barrier for other volatile instructions. > > Nothing more, nothing less. > > Neither is an "optimization barrier" or "compiler barrier" or whatever > name you invent. Being a bit more familiar with the kernel code I often think about a memory barrier as a "fencing" instruction. Thanks for the clarification, Hannes
RE: [RFC][ARM] Naming for new switch to check for mixed hardfloat/softfloat compat
Hi Thomas, Do you particularly need a switch for this? You could view this as simply relaxing the ABI requirements of a module, a switch would only serve to enforce the need for a compatible ABI and error if not. If you build something for a soft-float ABI and never actually trigger any of the soft-float specific parts of the ABI then you could safely mark the module as no-float ABI (same for hard-float). A simple check would be if floating point types are used in parameters or returns but actually it could be more refined still if none of the arguments or returns would be passed differently between the ABI types. The problem with relaxing the ABI is that you only know whether it can be relaxed at the end of compiling all functions, I am currently doing some work for MIPS where the assembler will be calculating overall requirements based on an initial setting and then analysis of the code in the module. To relax a floating point ABI I would expect to emit an ABI attribute at the head of a file, which is either soft or hard float, but then each function would get an attribute to say if it ended up as a compatible ABI. If all global functions say compatible then the module can be relaxed to be a compatible FP ABI. I think the ability to detect the case of generating ABI agnostic code would be useful for other architectures too. MIPS does have an option for something similar to this which is -mno-float but it does not really do what you are aiming for here. The -mno-float option marks a module as float ABI agnostic but actually performs code gen for a soft-float ABI. It is up to the programmer to avoid floating point in function signatures. Perhaps this option would be useful to support the enforced compatible ABI but being able to relax the ABI is better still as it would require no effort from the end user. I'm planning on proposing these kind of changes for MIPS in the near future. 
Regards, Matthew > -----Original Message----- > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of > Thomas Preudhomme > Sent: 04 March 2014 07:02 > To: gcc@gcc.gnu.org > Subject: [RFC][ARM] Naming for new switch to check for mixed > hardfloat/softfloat compat > > [Please CC me as I'm not subscribed to this list] > > Hi there, > > I'm currently working on adding a switch to check whether public > functions involve float parameters or return values. Such a check would > be useful for people trying to write code that is compatible with both > base standard (softfloat) and standard variant (hardfloat) ARM calling > conventions. I also intend to set the ELF attribute Tag_ABI_VFP_args to > value 3 (code compatible with both ABIs) so this check would make it possible to > make sure that value is set. > > I initially thought about reusing -mfloat-abi with the value none for > that purpose since it would somehow define a new ABI where no float can > be used. However, it would then not be possible to forbid float in > the public interface while using VFP instructions for float arithmetic > (softfp) because this switch conflates the float ABI with the use of a > floating point unit for float arithmetic. Also, gcc passes -mfloat-abi > down to the assembler and that would mean teaching the assembler about > -mfloat-abi=none as well. > > I thus think that a new switch would be better and am asking for your > opinion about it as I would like this functionality to be incorporated into the gcc > codebase. > > Best regards, > > Thomas Preud'homme
Re: linux says it is a bug
On Tue, Mar 4, 2014 at 10:33 AM, Hannes Frederic Sowa wrote: > On Tue, Mar 04, 2014 at 09:26:31AM +, Andrew Haley wrote: >> On 03/04/2014 09:24 AM, Hannes Frederic Sowa wrote: >> >> > So the bug was probably fixed more than 15 years ago. >> > Probably :) >> > >> > But the __volatile__ should do no harm and shouldn't influence code >> > generation in any way, no? >> >> Of course it will: it's a barrier. > > Sure. My question was about the volatile marker. asm("":::"memory") should act > as the barrier alone. __asm__("":::"memory") is a memory barrier volatile __asm__("":::"memory") is a memory barrier and a barrier for other volatile instructions. Nothing more, nothing less. Neither is an "optimization barrier" or "compiler barrier" or whatever name you invent. Richard.
Re: linux says it is a bug
On Tue, Mar 04, 2014 at 09:26:31AM +, Andrew Haley wrote: > On 03/04/2014 09:24 AM, Hannes Frederic Sowa wrote: > >> > So the bug was probably fixed more than 15 years ago. > > Probably :) > > > > But the __volatile__ should do no harm and shouldn't influence code > > generation in any way, no? > > Of course it will: it's a barrier. Sure. My question was about the volatile marker. asm("":::"memory") should act as the barrier alone.
Re: linux says it is a bug
On 03/04/2014 09:24 AM, Hannes Frederic Sowa wrote: >> > So the bug was probably fixed more than 15 years ago. > Probably :) > > But the __volatile__ should do no harm and shouldn't influence code > generation in any way, no? Of course it will: it's a barrier. Andrew.
Re: linux says it is a bug
On Tue, Mar 04, 2014 at 09:19:40AM +, Jonathan Wakely wrote: > On 4 March 2014 09:17, Hannes Frederic Sowa > wrote: > > On Tue, Mar 04, 2014 at 10:10:21AM +0100, Richard Biener wrote: > >> On Tue, Mar 4, 2014 at 7:40 AM, lin zuojian wrote: > >> > Hi, > >> > in include/linux/compiler-gcc.h : > >> > > >> > /* Optimization barrier */ > >> > /* The "volatile" is due to gcc bugs */ > >> > #define barrier() __asm__ __volatile__("": : :"memory") > >> > > >> > The comment of Linux says this is a gcc bug. But will any sane compiler > >> > disable optimization without the "volatile" keyword? > >> > >> Depends what they call an "optimization barrier". A plain > >> __asm__ ("" : : : "memory") is a memory barrier. Adding volatile > >> to the asm makes it a barrier for every other volatile instruction, > >> nothing more. > > > > This is meant to be a compiler barrier not a memory barrier and got > > added by David Miller because of a problem in gcc-2.7.2: > > > > | Add __volatile__ to barrier() definition, I convinced Linus > > | to eat this patch. The problem is that with gcc-2.7.2 derived > > | compilers the instruction scheduler can move it around due to > > | a bug. This bug appears on sparc64/SMP with our old compiler > > | in that it miscompiles the beginning of exit.c:release() causing > > | lockups if the race is hit in the SMP specific code there. I > > | believe sparc32 gcc-2.7.2 sees this bug too, but I'm not too sure > > | (Anton showed me something similar once). > > > > So the bug was probably fixed more than 15 years ago. Probably :) But the __volatile__ should do no harm and shouldn't influence code generation in any way, no?
Re: linux says it is a bug
On Tue, Mar 4, 2014 at 10:19 AM, Jonathan Wakely wrote: > On 4 March 2014 09:17, Hannes Frederic Sowa > wrote: >> On Tue, Mar 04, 2014 at 10:10:21AM +0100, Richard Biener wrote: >>> On Tue, Mar 4, 2014 at 7:40 AM, lin zuojian wrote: >>> > Hi, >>> > in include/linux/compiler-gcc.h : >>> > >>> > /* Optimization barrier */ >>> > /* The "volatile" is due to gcc bugs */ >>> > #define barrier() __asm__ __volatile__("": : :"memory") >>> > >>> > The comment of Linux says this is a gcc bug. But will any sane compiler >>> > disable optimization without the "volatile" keyword? >>> >>> Depends what they call an "optimization barrier". A plain >>> __asm__ ("" : : : "memory") is a memory barrier. Adding volatile >>> to the asm makes it a barrier for every other volatile instruction, >>> nothing more. >> >> This is meant to be a compiler barrier not a memory barrier and got >> added by David Miller because of a problem in gcc-2.7.2: >> >> | Add __volatile__ to barrier() definition, I convinced Linus >> | to eat this patch. The problem is that with gcc-2.7.2 derived >> | compilers the instruction scheduler can move it around due to >> | a bug. This bug appears on sparc64/SMP with our old compiler >> | in that it miscompiles the beginning of exit.c:release() causing >> | lockups if the race is hit in the SMP specific code there. I >> | believe sparc32 gcc-2.7.2 sees this bug too, but I'm not too sure >> | (Anton showed me something similar once). > > > > So the bug was probably fixed more than 15 years ago. Doesn't sound like a bug but a feature. We can move asm ("" : : : "memory") around freely up to the next/previous instruction involving memory. Richard.
Re: linux says it is a bug
On 4 March 2014 09:17, Hannes Frederic Sowa wrote: > On Tue, Mar 04, 2014 at 10:10:21AM +0100, Richard Biener wrote: >> On Tue, Mar 4, 2014 at 7:40 AM, lin zuojian wrote: >> > Hi, >> > in include/linux/compiler-gcc.h : >> > >> > /* Optimization barrier */ >> > /* The "volatile" is due to gcc bugs */ >> > #define barrier() __asm__ __volatile__("": : :"memory") >> > >> > The comment of Linux says this is a gcc bug. But will any sane compiler >> > disable optimization without the "volatile" keyword? >> >> Depends what they call an "optimization barrier". A plain >> __asm__ ("" : : : "memory") is a memory barrier. Adding volatile >> to the asm makes it a barrier for every other volatile instruction, >> nothing more. > > This is meant to be a compiler barrier not a memory barrier and got > added by David Miller because of a problem in gcc-2.7.2: > > | Add __volatile__ to barrier() definition, I convinced Linus > | to eat this patch. The problem is that with gcc-2.7.2 derived > | compilers the instruction scheduler can move it around due to > | a bug. This bug appears on sparc64/SMP with our old compiler > | in that it miscompiles the beginning of exit.c:release() causing > | lockups if the race is hit in the SMP specific code there. I > | believe sparc32 gcc-2.7.2 sees this bug too, but I'm not too sure > | (Anton showed me something similar once). So the bug was probably fixed more than 15 years ago.
Re: linux says it is a bug
On Tue, Mar 04, 2014 at 10:10:21AM +0100, Richard Biener wrote: > On Tue, Mar 4, 2014 at 7:40 AM, lin zuojian wrote: > > Hi, > > in include/linux/compiler-gcc.h : > > > > /* Optimization barrier */ > > /* The "volatile" is due to gcc bugs */ > > #define barrier() __asm__ __volatile__("": : :"memory") > > > > The comment of Linux says this is a gcc bug. But will any sane compiler > > disable optimization without the "volatile" keyword? > > Depends what they call an "optimization barrier". A plain > __asm__ ("" : : : "memory") is a memory barrier. Adding volatile > to the asm makes it a barrier for every other volatile instruction, > nothing more. This is meant to be a compiler barrier not a memory barrier and got added by David Miller because of a problem in gcc-2.7.2: | Add __volatile__ to barrier() definition, I convinced Linus | to eat this patch. The problem is that with gcc-2.7.2 derived | compilers the instruction scheduler can move it around due to | a bug. This bug appears on sparc64/SMP with our old compiler | in that it miscompiles the beginning of exit.c:release() causing | lockups if the race is hit in the SMP specific code there. I | believe sparc32 gcc-2.7.2 sees this bug too, but I'm not too sure | (Anton showed me something similar once). Greetings, Hannes
Re: linux says it is a bug
On Tue, Mar 4, 2014 at 7:40 AM, lin zuojian wrote: > Hi, > in include/linux/compiler-gcc.h : > > /* Optimization barrier */ > /* The "volatile" is due to gcc bugs */ > #define barrier() __asm__ __volatile__("": : :"memory") > > The comment of Linux says this is a gcc bug. But will any sane compiler > disable optimization without the "volatile" keyword? Depends what they call an "optimization barrier". A plain __asm__ ("" : : : "memory") is a memory barrier. Adding volatile to the asm makes it a barrier for every other volatile instruction, nothing more. The term "optimization barrier" isn't well-defined. Richard.
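As a rough illustration of how the macro quoted above is typically used, here is a minimal C sketch assuming a GCC-compatible compiler. The macro text is taken from the thread; the variables `payload` and `ready` and the functions `publish` and `try_consume` are hypothetical names made up for this example. The asm template is empty, so no instruction is emitted; the "memory" clobber only constrains what the compiler may assume about memory around that point, and it is not a CPU fence.

```c
/* The macro as quoted from include/linux/compiler-gcc.h in the thread. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical shared state for this sketch. */
static int payload;
static int ready;

/* Store the payload, then set the flag. The "memory" clobber tells the
   compiler the asm may read or write any memory, so it cannot move the
   two stores across the barrier or keep 'payload' only in a register. */
void publish(int value)
{
    payload = value;
    barrier();
    ready = 1;
}

/* If the flag is set, copy the payload to *out and return 1;
   otherwise return 0. The barrier keeps the compiler from reusing a
   value of 'payload' that it loaded before checking the flag. */
int try_consume(int *out)
{
    if (!ready)
        return 0;
    barrier();
    *out = payload;
    return 1;
}
```

On a multiprocessor this sketch would still need whatever hardware ordering the target requires; it only shows the compiler-level effect discussed in the thread.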