Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-21 Thread Segher Boessenkool
At some point in the future, barrier() will be universally regarded as a hammer too big for most purposes. Whether or not removing it now You can't just remove it, it is needed in some places; you want to replace it in most places with a more fine-grained compiler barrier, I presume?

Re: RFC: drop support for gcc 4.0

2007-08-21 Thread Segher Boessenkool
How many people e.g. test -rc kernels compiled with gcc 3.2? Why would that matter? It either works or not. If it doesn't work, it can either be fixed, or support for that old compiler version can be removed. The only other policy than only remove support if things are badly broken would be

Re: RFC: drop support for gcc 4.0

2007-08-21 Thread Segher Boessenkool
Last time I tried a mips build, it would fail the compile unless I was using _exactly_ 3.4.4 (I didn't tried older versions, but did try 3.4.6, for ex.). If 3.4.4 works where 3.4.6 doesn't, you should report this as a bug; either here, or to the GCC team (but please be aware that the 3.4 series

Re: RFC: drop support for gcc 4.0

2007-08-21 Thread Segher Boessenkool
How many people e.g. test -rc kernels compiled with gcc 3.2? Why would that matter? It either works or not. If it doesn't work, it can either be fixed, or support for that old compiler version can be removed. One bug report kernel doesn't work / crash / ... when compiled with gcc 3.2, but

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
And no, RMW on MMIO isn't "problematic" at all, either. An RMW op is a read op, a modify op, and a write op, all rolled into one opcode. But three actual operations. Maybe for some CPUs, but not all. ARM for instance can't use the load exclusive and store exclusive instructions to MMIO

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
But there is no lparmap.o! lparmap.s is the generated file. Yeah, tell that to scripts/Makefile.lib: _c_flags = $(CFLAGS) $(EXTRA_CFLAGS) $(CFLAGS_$(basetarget).o) What would do what a person expects is $(CFLAGS_$(@F)), I think. Looks good to me. Sam? We wanted to set

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
In a way though, good thing, or mainline would have continued to be broken :) What do I need to do to see this bug, btw? A special config symbol perhaps? CONFIG_DEBUG_INFO=y and pass-g-to-assembler-under-config_debug_info.patch from -mm Ah, an -mm patch. Gotcha. The fix most likely won't

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
Such code generally doesn't care precisely when it gets the update, just that the update is atomic, and it doesn't loop forever. Yes, it _does_ care that it gets the update _at all_, and preferably as early as possible. Regardless, I'm convinced we just need to do it all in assembly. So do

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
Right. ROTFL... volatile actually breaks atomic_t instead of making it safe. x++ becomes a register load, increment and a register store. Without volatile we can increment the memory directly. It seems that volatile requires that the variable is loaded into a register first and then operated

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
In a way though, good thing, or mainline would have continued to be broken :) What do I need to do to see this bug, btw? A special config symbol perhaps? Segher - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
Hmm. Check the V=1 make output to see that the lparmap.c really got the -g0. The log says it didn't. Oops! It looks like the patch that got committed is the one that sets CFLAGS_lparmap.s, but really it needs to set CFLAGS_lparmap.o instead (go kbuild). Did I post the wrong one? (It's

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
Hmm. Check the V=1 make output to see that the lparmap.c really got the -g0. The log says it didn't. Oops! It looks like the patch that got committed is the one that sets CFLAGS_lparmap.s, but really it needs to set CFLAGS_lparmap.o instead (go kbuild). Did I post the wrong one? (It's

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
In a way though, good thing, or mainline would have continued to be broken :) What do I need to do to see this bug, btw? A special config symbol perhaps? Segher - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
Right. ROTFL... volatile actually breaks atomic_t instead of making it safe. x++ becomes a register load, increment and a register store. Without volatile we can increment the memory directly. It seems that volatile requires that the variable is loaded into a register first and then operated

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
Such code generally doesn't care precisely when it gets the update, just that the update is atomic, and it doesn't loop forever. Yes, it _does_ care that it gets the update _at all_, and preferably as early as possible. Regardless, I'm convinced we just need to do it all in assembly. So do

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
In a way though, good thing, or mainline would have continued to be broken :) What do I need to do to see this bug, btw? A special config symbol perhaps? CONFIG_DEBUG_INFO=y and pass-g-to-assembler-under-config_debug_info.patch from -mm Ah, an -mm patch. Gotcha. The fix most likely won't

Re: [POWERPC] Fix for assembler -g

2007-08-20 Thread Segher Boessenkool
But there is no lparmap.o! lparmap.s is the generated file. Yeah, tell that to scripts/Makefile.lib: _c_flags = $(CFLAGS) $(EXTRA_CFLAGS) $(CFLAGS_$(basetarget).o) What would do what a person expects is $(CFLAGS_$(@F)), I think. Looks good to me. Sam? We wanted to set

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-20 Thread Segher Boessenkool
And no, RMW on MMIO isn't problematic at all, either. An RMW op is a read op, a modify op, and a write op, all rolled into one opcode. But three actual operations. Maybe for some CPUs, but not all. ARM for instance can't use the load exclusive and store exclusive instructions to MMIO space.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
The documentation simply doesn't say "+m" is allowed. The code to allow it was added for the benefit of people who do not read the documentation. Documentation for "+m" might get added later if it is decided this [the code, not the documentation] is a sane thing to have (which isn't directly

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
The "asm volatile" implementation does have exactly the same reordering guarantees as the "volatile cast" thing, I don't think so. "asm volatile" creates a side effect. Yeah. Side effects aren't allowed to be reordered wrt sequence points. Yeah. This is exactly the same reason as why

Re: [PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile

2007-08-17 Thread Segher Boessenkool
Instead, use asm() like all other atomic operations already do. +static __inline__ long atomic64_read(const atomic_t *v) +static __inline__ void atomic64_set(atomic_t *v, long i) s/atomic_t/atomic64_t/ in both lines. I've edited my copy of the patch. Ouch, sorry about that. Segher

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
No it does not have any volatile semantics. atomic_dec() can be reordered at will by the compiler within the current basic unit if you do not add a barrier. "volatile" has nothing to do with reordering. If you're talking of "volatile" the type-qualifier keyword, then

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
atomic_dec() writes to memory, so it _does_ have "volatile semantics", implicitly, as long as the compiler cannot optimise the atomic variable away completely -- any store counts as a side effect. I don't think an atomic_dec() implemented as an inline "asm volatile" or one that uses a "forget"

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
#define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a)) [ This is exactly equivalent to using "+m" in the constraints, as recently explained on a GCC list somewhere, in response to the patch in my bitops series a few weeks back where I thought "+m" was bogus. ] [It wasn't

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
#define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a)) [ This is exactly equivalent to using "+m" in the constraints, as recently explained on a GCC list somewhere, in response to the patch in my bitops series a few weeks back where I thought "+m" was bogus. ] [It wasn't

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Here, I should obviously admit that the semantics of *(volatile int *)& aren't any neater or well-defined in the _language standard_ at all. The standard does say (verbatim) "precisely what constitutes as access to object of volatile-qualified type is implementation-defined", but GCC does help

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Now the second wording *IS* technically correct, but come on, it's 24 words long whereas the original one was 3 -- and hopefully anybody reading the shorter phrase *would* have known anyway what was meant, without having to be pedantic about it :-) Well you were talking pretty formal (and

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
In a reasonable world, gcc should just make that be (on x86) addl $1,i(%rip) on x86-64, which is indeed what it does without the volatile. But with the volatile, the compiler gets really nervous, and doesn't dare do it in one instruction, and thus generates crap like movl

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
(and yes, it is perfectly legitimate to want a non-volatile read for a data type that you also want to do atomic RMW operations on) ...which is undefined behaviour in C (and GCC) when that data is declared volatile, which is a good argument against implementing atomics that way in itself.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Of course, since *normal* accesses aren't necessarily limited wrt re-ordering, the question then becomes one of "with regard to *what* does it limit re-ordering?". A C compiler that re-orders two different volatile accesses that have a sequence point in between them is pretty clearly a buggy

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
atomic_dec() already has volatile behavior everywhere, so this is semantically okay, but this code (and any like it) should be calling cpu_relax() each iteration through the loop, unless there's a compelling reason not to. I'll allow that for some hardware drivers (possibly this one) such a

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
atomic_dec() already has volatile behavior everywhere, so this is semantically okay, but this code (and any like it) should be calling cpu_relax() each iteration through the loop, unless there's a compelling reason not to. I'll allow that for some hardware drivers (possibly this one) such a

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Of course, since *normal* accesses aren't necessarily limited wrt re-ordering, the question then becomes one of with regard to *what* does it limit re-ordering?. A C compiler that re-orders two different volatile accesses that have a sequence point in between them is pretty clearly a buggy

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
(and yes, it is perfectly legitimate to want a non-volatile read for a data type that you also want to do atomic RMW operations on) ...which is undefined behaviour in C (and GCC) when that data is declared volatile, which is a good argument against implementing atomics that way in itself.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
In a reasonable world, gcc should just make that be (on x86) addl $1,i(%rip) on x86-64, which is indeed what it does without the volatile. But with the volatile, the compiler gets really nervous, and doesn't dare do it in one instruction, and thus generates crap like movl

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Now the second wording *IS* technically correct, but come on, it's 24 words long whereas the original one was 3 -- and hopefully anybody reading the shorter phrase *would* have known anyway what was meant, without having to be pedantic about it :-) Well you were talking pretty formal (and

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
#define forget(a) __asm__ __volatile__ ( :=m (a) :m (a)) [ This is exactly equivalent to using +m in the constraints, as recently explained on a GCC list somewhere, in response to the patch in my bitops series a few weeks back where I thought +m was bogus. ] [It wasn't explained

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
Here, I should obviously admit that the semantics of *(volatile int *) aren't any neater or well-defined in the _language standard_ at all. The standard does say (verbatim) precisely what constitutes as access to object of volatile-qualified type is implementation-defined, but GCC does help us

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
#define forget(a) __asm__ __volatile__ ( :=m (a) :m (a)) [ This is exactly equivalent to using +m in the constraints, as recently explained on a GCC list somewhere, in response to the patch in my bitops series a few weeks back where I thought +m was bogus. ] [It wasn't explained

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
atomic_dec() writes to memory, so it _does_ have volatile semantics, implicitly, as long as the compiler cannot optimise the atomic variable away completely -- any store counts as a side effect. I don't think an atomic_dec() implemented as an inline asm volatile or one that uses a forget macro

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
No it does not have any volatile semantics. atomic_dec() can be reordered at will by the compiler within the current basic unit if you do not add a barrier. volatile has nothing to do with reordering. If you're talking of volatile the type-qualifier keyword, then

Re: [PATCH] powerpc: Implement atomic{,64}_{read,write}() without volatile

2007-08-17 Thread Segher Boessenkool
Instead, use asm() like all other atomic operations already do. +static __inline__ long atomic64_read(const atomic_t *v) +static __inline__ void atomic64_set(atomic_t *v, long i) s/atomic_t/atomic64_t/ in both lines. I've edited my copy of the patch. Ouch, sorry about that. Segher

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
The asm volatile implementation does have exactly the same reordering guarantees as the volatile cast thing, I don't think so. asm volatile creates a side effect. Yeah. Side effects aren't allowed to be reordered wrt sequence points. Yeah. This is exactly the same reason as why

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-17 Thread Segher Boessenkool
The documentation simply doesn't say +m is allowed. The code to allow it was added for the benefit of people who do not read the documentation. Documentation for +m might get added later if it is decided this [the code, not the documentation] is a sane thing to have (which isn't directly

Re: [PATCH 1/23] document preferred use of volatile with atomic_t

2007-08-16 Thread Segher Boessenkool
the atomic operations in assembler, like Segher Boessenkool did for powerpc in response to my previous patchset. Puh-lease. I DO NOT DISTRUST THE COMPILER, I just don't assume it will do whatever I would like it to do without telling it. It's a machine you know, and it is very well documented.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Here, I should obviously admit that the semantics of *(volatile int *)& aren't any neater or well-defined in the _language standard_ at all. The standard does say (verbatim) "precisely what constitutes as access to object of volatile-qualified type is implementation-defined", but GCC does help

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Note that "volatile" is a type-qualifier, not a type itself, so a cast of the _object_ itself to a qualified-type i.e. (volatile int) would not make the access itself volatile-qualified. There is no such thing as "volatile-qualified access" defined anywhere; there only is the concept of a

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
I can't speak for this particular case, but there could be similar code examples elsewhere, where we do the atomic ops on an atomic_t object inside a higher-level locking scheme that would take care of the kind of problem you're referring to here. It would be useful for such or similar code if

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
I'd go so far as to say that anywhere where you want a non-"volatile" atomic_read, either your code is buggy, or else an int would work just as well. Even, the only way to implement a "non-volatile" atomic_read() is essentially as a plain int (you can do some tricks so you cannot assign to the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
The only thing volatile on an asm does is create a side effect on the asm statement; in effect, it tells the compiler "do not remove this asm even if you don't need any of its outputs". It's not disabling optimisation likely to result in bugs, heisen- or otherwise; _not_ putting the volatile on

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
The only thing volatile on an asm does is create a side effect on the asm statement; in effect, it tells the compiler do not remove this asm even if you don't need any of its outputs. It's not disabling optimisation likely to result in bugs, heisen- or otherwise; _not_ putting the volatile on an

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
I'd go so far as to say that anywhere where you want a non-volatile atomic_read, either your code is buggy, or else an int would work just as well. Even, the only way to implement a non-volatile atomic_read() is essentially as a plain int (you can do some tricks so you cannot assign to the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
I can't speak for this particular case, but there could be similar code examples elsewhere, where we do the atomic ops on an atomic_t object inside a higher-level locking scheme that would take care of the kind of problem you're referring to here. It would be useful for such or similar code if

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Note that volatile is a type-qualifier, not a type itself, so a cast of the _object_ itself to a qualified-type i.e. (volatile int) would not make the access itself volatile-qualified. There is no such thing as volatile-qualified access defined anywhere; there only is the concept of a

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-16 Thread Segher Boessenkool
Here, I should obviously admit that the semantics of *(volatile int *) aren't any neater or well-defined in the _language standard_ at all. The standard does say (verbatim) precisely what constitutes as access to object of volatile-qualified type is implementation-defined, but GCC does help us

Re: [PATCH 1/23] document preferred use of volatile with atomic_t

2007-08-16 Thread Segher Boessenkool
operations in assembler, like Segher Boessenkool did for powerpc in response to my previous patchset. Puh-lease. I DO NOT DISTRUST THE COMPILER, I just don't assume it will do whatever I would like it to do without telling it. It's a machine you know, and it is very well documented. (And most

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
A volatile default would disable optimizations for atomic_read. atomic_read without volatile would allow for full optimization by the compiler. Seems that this is what one wants in many cases. Name one such case. An atomic_read should do a load from memory. If the programmer puts an

Re:

2007-08-15 Thread Segher Boessenkool
"compilation unit" is a C standard term. It typically boils down to "single .c file". As you mentioned later, "single .c file with all the other files (headers or other .c files) that it pulls in via #include" is actually "translation unit", both in the C standard as well as gcc docs.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Precisely the point -- use of volatile (whether in casts or on asms) in these cases are intended to disable those

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Please check the definition of "cache coherence". Which of the twelve thousand such definitions? :-) Summary: the CPU is indeed within its rights to execute loads and stores to a single variable out of order, -but- only if it gets the same result that it would have obtained by executing

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Possibly these were too trivial to expose any potential problems that you may have been referring to, so would be helpful if you could write a more concrete example / sample code. The trick is to have a sufficiently complicated expression to force the compiler to run out of registers. You

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have some

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Of course, if we find there are more callers in the kernel who want the volatility behaviour than those who don't care, we can re-define the existing ops to such variants, and re-name the existing definitions to somethine else, say "atomic_read_nonvolatile" for all I care. Do we really need

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
I think this was just terminology confusion here again. Isn't "any code that it cannot currently see" the same as "another compilation unit", and wouldn't the "compilation unit" itself expand if we ask gcc to compile more than one unit at once? Or is there some more specific "definition" for

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
What volatile does are a) never optimise away a read (or write) to the object, since the data can change in ways the compiler cannot see; and b) never move stores to the object across a sequence point. This does not mean other accesses cannot be reordered wrt the volatile access. If the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
What you probably mean is that the compiler has to assume any code it cannot currently see can do anything (insofar as allowed by the relevant standards etc.) I think this was just terminology confusion here again. Isn't "any code that it cannot currently see" the same as "another compilation

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
How does the compiler know that msleep() has got barrier()s? Because msleep_interruptible() is in a separate compilation unit, the compiler has to assume that it might modify any arbitrary global. No; compilation units have nothing to do with it, GCC can optimise across compilation unit

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. If the two reads are to the same location, all CPUs I am aware of will

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. If the two reads are to the same location, all CPUs I am aware of

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. Consider that smp_rmb basically will do anything from flushing the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
"Volatile behaviour" itself isn't consistently defined (at least definitely not consistently implemented in various gcc versions across platforms), It should be consistent across platforms; if not, file a bug please. but it is /expected/ to mean something like: "ensure that every such access

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. Consider that smp_rmb basically will do anything from flushing the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Volatile behaviour itself isn't consistently defined (at least definitely not consistently implemented in various gcc versions across platforms), It should be consistent across platforms; if not, file a bug please. but it is /expected/ to mean something like: ensure that every such access

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. If the two reads are to the same location, all CPUs I am aware of

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Well if there is only one memory location involved, then smp_rmb() isn't going to really do anything anyway, so it would be incorrect to use it. rmb() orders *any* two reads; that includes two reads from the same location. If the two reads are to the same location, all CPUs I am aware of will

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
How does the compiler know that msleep() has got barrier()s? Because msleep_interruptible() is in a separate compilation unit, the compiler has to assume that it might modify any arbitrary global. No; compilation units have nothing to do with it, GCC can optimise across compilation unit

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
I think this was just terminology confusion here again. Isn't any code that it cannot currently see the same as another compilation unit, and wouldn't the compilation unit itself expand if we ask gcc to compile more than one unit at once? Or is there some more specific definition for compilation

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
What you probably mean is that the compiler has to assume any code it cannot currently see can do anything (insofar as allowed by the relevant standards etc.) I think this was just terminology confusion here again. Isn't any code that it cannot currently see the same as another compilation

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
What volatile does are a) never optimise away a read (or write) to the object, since the data can change in ways the compiler cannot see; and b) never move stores to the object across a sequence point. This does not mean other accesses cannot be reordered wrt the volatile access. If the

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Of course, if we find there are more callers in the kernel who want the volatility behaviour than those who don't care, we can re-define the existing ops to such variants, and re-name the existing definitions to somethine else, say atomic_read_nonvolatile for all I care. Do we really need

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have some

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Possibly these were too trivial to expose any potential problems that you may have been referring to, so would be helpful if you could write a more concrete example / sample code. The trick is to have a sufficiently complicated expression to force the compiler to run out of registers. You

Re: [PATCH 6/24] make atomic_read() behave consistently on frv

2007-08-15 Thread Segher Boessenkool
Please check the definition of cache coherence. Which of the twelve thousand such definitions? :-) Summary: the CPU is indeed within its rights to execute loads and stores to a single variable out of order, -but- only if it gets the same result that it would have obtained by executing them

Re:

2007-08-15 Thread Segher Boessenkool
compilation unit is a C standard term. It typically boils down to single .c file. As you mentioned later, single .c file with all the other files (headers or other .c files) that it pulls in via #include is actually translation unit, both in the C standard as well as gcc docs. Yeah.

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Almost everything is a tradeoff; and so is this. I don't believe most people would find disabling all compiler

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
No; compilation units have nothing to do with it, GCC can optimise across compilation unit boundaries just fine, if you tell it to compile more than one compilation unit at once. Last I checked, the Linux kernel build system did compile each .c file as a separate compilation unit. I have

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
Part of the motivation here is to fix heisenbugs. If I knew where they By the same token we should probably disable optimisations altogether since that too can create heisenbugs. Precisely the point -- use of volatile (whether in casts or on asms) in these cases are intended to disable those

Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

2007-08-15 Thread Segher Boessenkool
A volatile default would disable optimizations for atomic_read. atomic_read without volatile would allow for full optimization by the compiler. Seems that this is what one wants in many cases. Name one such case. An atomic_read should do a load from memory. If the programmer puts an

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
Yeah. Compiler errors are more annoying though I dare say ;-) Actually, compile-time errors are fine, Yes, they don't cause data corruption or anything like that, but I still don't think the 390 people want to ship a kernel that doesn't build -- and it seems they still need to support GCC

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
"+m" works. We use it. It's better than the alternatives. Pointing to stale documentation doesn't change anything. Well, perhaps on i386. I've seen some older versions of the s390 gcc die with an ICE because I have used "+m" in some kernel inline assembly. I'm happy to hear that this issue

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
Yes, though I would use "=m" on the output list and "m" on the input list. The reason is that I've seen gcc fall on its face with an ICE on s390 due to "+m". The explanation I've got from our compiler people was quite esoteric, as far as I remember gcc splits "+m" to an input operand and an

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
Note that last line. Segher, how about you just accept that Linux uses gcc as per reality, and that sometimes the reality is different from your expectations? "+m" works. It works _most of the time_. Ask Martin. Oh you don't even have to, he told you two mails ago. My last mail simply

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
Note that last line. Segher, how about you just accept that Linux uses gcc as per reality, and that sometimes the reality is different from your expectations? +m works. It works _most of the time_. Ask Martin. Oh you don't even have to, he told you two mails ago. My last mail simply

Re: [PATCH] make atomic_t volatile on all architectures

2007-08-12 Thread Segher Boessenkool
Yes, though I would use =m on the output list and m on the input list. The reason is that I've seen gcc fall on its face with an ICE on s390 due to +m. The explanation I've got from our compiler people was quite esoteric, as far as I remember gcc splits +m to an input operand and an output

<    1   2   3   4   5   6   7   8   9   >