Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> >
> > On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
> >
> > > "Richard B. Johnson" wrote:
> > > [snip]
> > > > Another problem with 'volatile' has to do with pointers. When
> > > > it's possible for some object to be modified by some external
> > > > influence, we see:
> > > >
> > > >         volatile struct whatever *ptr;
> > > >
> > > > Now, it's unclear if gcc knows that we don't give a damn about
> > > > the address contained in 'ptr'. We know that it's not going to
> > > > change. What we are concerned with are the items within the
> > > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > > pointer.
> > > >
> > > [snip]
> >
> > Yes. My point is that a lot of authors have declared just about everything
> > 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> > be "safe". It's likely that there are many hundreds of thousands of
> > unneeded register-reloads because of this.
> >
> > It might be useful for somebody who has a lot of time on his/her
> > hands to go through some of these drivers.
>
> I would be willing to do this (on the slow boat - I don't have THAT much
> spare time :), but only if we can be sure that the gcc optimizer will
> correctly handle a normal pointer to volatile data. Your experiences
> would seem to indicate that the optimizer needs fixing before much
> effort should be spent on this.
>

Well the question for that is; "What compiler?". I'm currently using
egcs-2.91.66, one of the "approved" versions for compiling the kernel.

It treats all volatiles about the same:

    volatile int i;
    volatile int *p;
    int volatile *q;
    volatile int * volatile r;

    void foo()
    {
        while(*p == i)
            ;
        while(*q == i)
            ;
        while(*r == i)
            ;
    }

...makes :

        .file   "main.c"
        .version        "01.01"
    gcc2_compiled.:
    .text
        .align 4
    .globl foo
        .type    foo,@function
    foo:
        pushl %ebp
        movl %esp,%ebp
        nop
        .align 4
    .L2:
        movl p,%eax
        movl (%eax),%edx
        movl i,%eax
        cmpl %eax,%edx
        je .L4
        jmp .L3
        .align 4
    .L4:
        jmp .L2
        .align 4
    .L3:
        nop
        .align 4
    .L5:
        movl q,%eax
        movl (%eax),%edx
        movl i,%eax
        cmpl %eax,%edx
        je .L7
        jmp .L6
        .align 4
    .L7:
        jmp .L5
        .align 4
    .L6:
        nop
        .align 4
    .L8:
        movl r,%eax
        movl (%eax),%edx
        movl i,%eax
        cmpl %eax,%edx
        je .L10
        jmp .L9
        .align 4
    .L10:
        jmp .L8
        .align 4
    .L9:
    .L1:
        movl %ebp,%esp
        popl %ebp
        ret
    .Lfe1:
        .size    foo,.Lfe1-foo
        .comm   i,4,4
        .comm   p,4,4
        .comm   q,4,4
        .comm   r,4,4
        .ident  "GCC: (GNU) egcs-2.91.66 19990314 (egcs-1.1.2 release)"

Since there seems to be a rather big difference between what is
expected to be done, and what happens to be the result, this
certainly contributes to the possible over-use of 'volatile' in
some kernel code. It's certainly better to be safe than sorry, but
in some cases "safe" is just a bit "strange".

FYI, ../linux/drivers/net/atp.c doesn't use 'volatile' at all.
However, ../linux/drivers/net/bmac.c uses it 40 times. I'll bet
a buck that both of the drivers work and the one without 'volatile'
keywords does the work with fewer instructions. These are just two
drivers chosen at random.

The driver I've been working on to make 'bullet proof', pcnet32.c,
uses 'volatile' twice. And, on at least one occasion, the wrong
thing is declared volatile (the value in a pointer to a structure),
however gcc doesn't seem to care because it reloads the values of
the structure members every time, anyway. So, in this case, the
address-value in the pointer will never change, but gcc reloads all
the pointed-to members anyway, so the 'volatile' keyword is not useful.

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
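A note for anyone repeating the experiment above. The listing has the shape of unoptimized output (a full frame setup, nops, and a jump to a jump in every loop), and without optimization gcc reloads every value from memory whether it is volatile or not. Also, `volatile int *p` and `int volatile *q` declare the same type, and `i` is itself volatile, so reloading `*p` and `i` on every pass is required in all three loops; only the load of the non-volatile pointers `p` and `q` is something an optimizer would be permitted to hoist. The following is a minimal sketch for comparing a pointer to volatile data with a plain pointer, assuming it is compiled with something like `gcc -O2 -S` and the two loops are compared by eye. The names `status_word`, `plain_word`, `polls_while_equal` and `polls_while_equal_plain` are invented for illustration and do not come from the thread.

```c
/* Hypothetical status word standing in for memory that something else
 * (hardware, an interrupt handler) may change behind the compiler's back. */
static volatile int status_word = 3;

/* Ordinary object used by the non-volatile comparison case. */
static int plain_word = 7;

/* p is an ordinary pointer to volatile data: every *p must be a real
 * load, but p itself may stay in a register for the whole loop. */
static int polls_while_equal(volatile int *p, int expected, int max_polls)
{
    int polls = 0;

    while (polls < max_polls && *p == expected)
        polls++;
    return polls;
}

/* Same loop through a pointer to non-volatile data: an optimizing
 * compiler is allowed to read *p once and reuse the value. */
static int polls_while_equal_plain(const int *p, int expected, int max_polls)
{
    int polls = 0;

    while (polls < max_polls && *p == expected)
        polls++;
    return polls;
}
```

Both functions return the same counts when nothing changes the data; the difference only shows up in the generated code, where the volatile version is expected to keep a load inside the loop.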
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
"Richard B. Johnson" wrote:
>
> On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:
>
> > "Richard B. Johnson" wrote:
> > [snip]
> > > Another problem with 'volatile' has to do with pointers. When
> > > it's possible for some object to be modified by some external
> > > influence, we see:
> > >
> > >         volatile struct whatever *ptr;
> > >
> > > Now, it's unclear if gcc knows that we don't give a damn about
> > > the address contained in 'ptr'. We know that it's not going to
> > > change. What we are concerned with are the items within the
> > > 'struct whatever'. From what I've seen, gcc just reloads the
> > > pointer.
> > >
> > [snip]
>
> Yes. My point is that a lot of authors have declared just about everything
> 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
> be "safe". It's likely that there are many hundreds of thousands of
> unneeded register-reloads because of this.
>
> It might be useful for somebody who has a lot of time on his/her
> hands to go through some of these drivers.

I would be willing to do this (on the slow boat - I don't have THAT much
spare time :), but only if we can be sure that the gcc optimizer will
correctly handle a normal pointer to volatile data. Your experiences
would seem to indicate that the optimizer needs fixing before much
effort should be spent on this.

--
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
[EMAIL PROTECTED]
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Stephen Wille Padnos wrote:

> "Richard B. Johnson" wrote:
> [snip]
> > Another problem with 'volatile' has to do with pointers. When
> > it's possible for some object to be modified by some external
> > influence, we see:
> >
> >         volatile struct whatever *ptr;
> >
> > Now, it's unclear if gcc knows that we don't give a damn about
> > the address contained in 'ptr'. We know that it's not going to
> > change. What we are concerned with are the items within the
> > 'struct whatever'. From what I've seen, gcc just reloads the
> > pointer.
> >
> > Cheers,
> > Dick Johnson
> >
> gcc should treat
>         volatile struct whatever *ptr;
>
> as a different case than
>         struct whatever * volatile ptr;
>
> which is also different from
>         volatile struct whatever * volatile ptr;
>
> I think (but can't find my K&R C book to confirm :) that the first case
> declares the struct as volatile, and the second case declares the
> pointer volatile (the third case declares a volatile pointer to a
> structure with volatile parts). So, the programmer should have the
> choice, if gcc is dealing with volatile correctly.
>
> Of course, that doesn't mean that the authors have made the right choice
> :)
>

Yes. My point is that a lot of authors have declared just about everything
'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to
be "safe". It's likely that there are many hundreds of thousands of
unneeded register-reloads because of this.

It might be useful for somebody who has a lot of time on his/her
hands to go through some of these drivers.

Cheers,
Dick Johnson
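The reading of the three declarations quoted above matches how C parses them: a qualifier left of the `*` applies to the pointed-to object, and a qualifier right of the `*` applies to the pointer itself. Here is a small sketch of the three forms side by side. The struct layout, the objects `device` and `other`, and the function names are all made up for illustration; nothing here is taken from a real driver.

```c
/* Hypothetical structure; the thread only calls it 'struct whatever'. */
struct whatever {
    int status;
    int count;
};

static struct whatever device = { 1, 2 };
static struct whatever other  = { 10, 20 };

/* Pointer to volatile struct: member accesses are volatile, the
 * pointer variable is an ordinary one and can be cached or reassigned. */
static int read_via_ptr_to_volatile(void)
{
    volatile struct whatever *ptr = &device;
    int sum = ptr->status + ptr->count;   /* two real loads of members */

    ptr = &other;                         /* plain assignment to ptr */
    return sum + ptr->status;
}

/* Volatile pointer to ordinary struct: ptr itself is re-read on each
 * use, but the members it points at may be cached by the compiler. */
static int read_via_volatile_ptr(void)
{
    struct whatever * volatile ptr = &device;

    return ptr->status + ptr->count;
}

/* Volatile pointer to volatile struct: both the pointer and the
 * members are re-read on every access. */
static int read_via_both(void)
{
    volatile struct whatever * volatile ptr = &other;

    return ptr->status + ptr->count;
}
```

For a driver that polls a fixed memory-mapped structure, the first form is usually the one that expresses the intent described in the thread: the address does not change, the contents might.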
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
"Richard B. Johnson" wrote:
[snip]
> Another problem with 'volatile' has to do with pointers. When
> it's possible for some object to be modified by some external
> influence, we see:
>
>         volatile struct whatever *ptr;
>
> Now, it's unclear if gcc knows that we don't give a damn about
> the address contained in 'ptr'. We know that it's not going to
> change. What we are concerned with are the items within the
> 'struct whatever'. From what I've seen, gcc just reloads the
> pointer.
>
> Cheers,
> Dick Johnson
>

gcc should treat
        volatile struct whatever *ptr;

as a different case than
        struct whatever * volatile ptr;

which is also different from
        volatile struct whatever * volatile ptr;

I think (but can't find my K&R C book to confirm :) that the first case
declares the struct as volatile, and the second case declares the
pointer volatile (the third case declares a volatile pointer to a
structure with volatile parts). So, the programmer should have the
choice, if gcc is dealing with volatile correctly.

Of course, that doesn't mean that the authors have made the right choice
:)

--
Stephen Wille Padnos
Programmer, Engineer, Problem Solver
[EMAIL PROTECTED]
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Hugh Dickins wrote:

> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > >
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> >
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
>
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s. But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
>
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden). Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
>
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
>
> That is not an inviting path to me, at least not any time soon!
>
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here. But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.
>
> Hugh

The use of the key word 'volatile' has gone just a bit too far in
some cases. Given:

    funct()
    {
        volatile unsigned int;
    }

Is plain dumb. There is nobody else that can touch that local variable
except the code in funct(). Even if it's recursive, the Nth invocation
still can't (using legal 'C' code) touch that variable. Therefore, it
should not be declared volatile.

Another problem with 'volatile' has to do with pointers. When
it's possible for some object to be modified by some external
influence, we see:

        volatile struct whatever *ptr;

Now, it's unclear if gcc knows that we don't give a damn about
the address contained in 'ptr'. We know that it's not going to
change. What we are concerned with are the items within the
'struct whatever'. From what I've seen, gcc just reloads the
pointer.

Cheers,
Dick Johnson
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Hugh Dickins wrote:

> On Thu, 8 Feb 2001, David Weinehall wrote:
> >
> > Well, after all, it's debugging code, and the code now is easy to read.
> > Your code, while more efficient, isn't. I think that clarity takes
> > priority over efficiency in non-critical code such as debugging
> > code. Of course, this is my personal opinion...
>
> I agree my version isn't _as_ easy, and if this code only got built
> into DEBUG kernels, I would never have bothered about it; but it's
> built into every kernel, on executed paths, so it's no less critical.

Since it's DEBUG code only and nicely "hidden" in a .h file,
why not have the efficient code with a well-written comment
documenting what the code does and why it is there?

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/
http://distro.conectiva.com/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, David Weinehall wrote:

>
> Well, after all, it's debugging code, and the code now is easy to read.
> Your code, while more efficient, isn't. I think that clarity takes
> priority over efficiency in non-critical code such as debugging
> code. Of course, this is my personal opinion...

I agree my version isn't _as_ easy, and if this code only got built
into DEBUG kernels, I would never have bothered about it; but it's
built into every kernel, on executed paths, so it's no less critical.

Hugh
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Wed, 7 Feb 2001, Linus Torvalds wrote:

> On Wed, 7 Feb 2001, Hugh Dickins wrote:
> >
> > None of those optimizes this: I believe the semantics of "||" (don't
> > try next test if first succeeds) forbid the optimization "|" gives?
>
> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
with three "mov"s. But take the "volatile"s out of constant_test_bit(),
and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
jumps - which is what originally offended me.

But Mark (in test program in private mail) shows gcc combining bits
into one test and one jump, just as we'd hope (and I wrongly thought
forbidden). Perhaps the inline function nature of constant_test_bit()
(which Mark didn't use) gets in the way of combining those tests.

> You could try to remove the volatile from test_bit, and see if that fixes
> it - but then we'd have to find and add the proper "rmb()" calls to people
> who do the endless loop kind of thing like above.

That is not an inviting path to me, at least not any time soon!

I think this all argues for the little patch I suggested - just avoid
test_bit() here. But it was only intended as a quick little suggestion:
looks like our tastes differ, and you prefer taking the _tiny_ hit of
using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Hugh
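The trade-off mentioned above (drop `volatile` from the bit test, then add explicit barriers to code that polls in a loop) can be sketched in user space. This is only an illustration under stated assumptions: `compiler_barrier()` is a GCC-style compiler-only barrier used as a stand-in and is not the kernel's `rmb()`, which may also order reads at the hardware level; `flags_word`, `plain_test_bit()` and `spin_while_set()` are hypothetical names.

```c
/* Assumption: GCC-style inline asm. The "memory" clobber tells the
 * compiler that memory may have changed, so cached values are dropped.
 * It emits no instruction and is NOT the kernel's rmb(). */
#define compiler_barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical flag word with bit 3 set. */
static unsigned long flags_word = 1UL << 3;

/* Bit test through a non-volatile pointer: several of these can, in
 * principle, be merged by the compiler into one masked test. */
static int plain_test_bit(int nr, const unsigned long *addr)
{
    return (int)((*addr >> nr) & 1UL);
}

/* Polling loop: without volatile, the explicit barrier is what forces
 * the word to be re-read on each pass. Bounded by max_spins so the
 * sketch always terminates. */
static int spin_while_set(int nr, const unsigned long *addr, int max_spins)
{
    int spins = 0;

    while (spins < max_spins && plain_test_bit(nr, addr)) {
        compiler_barrier();
        spins++;
    }
    return spins;
}
```

The design point is that the cost of re-reading memory is paid only in the loops that need it, instead of in every caller of the bit test.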
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, Feb 08, 2001 at 04:24:23PM +, Hugh Dickins wrote:
> On Wed, 7 Feb 2001, Linus Torvalds wrote:
> > On Wed, 7 Feb 2001, Hugh Dickins wrote:
> > >
> > > None of those optimizes this: I believe the semantics of "||" (don't
> > > try next test if first succeeds) forbid the optimization "|" gives?
> >
> > No. The optimization is entirely legal - but the fact that
> > "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> > gcc thinks it can't optimize it.
>
> Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up
> with three "mov"s. But take the "volatile"s out of constant_test_bit(),
> and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97)
> jumps - which is what originally offended me.
>
> But Mark (in test program in private mail) shows gcc combining bits
> into one test and one jump, just as we'd hope (and I wrongly thought
> forbidden). Perhaps the inline function nature of constant_test_bit()
> (which Mark didn't use) gets in the way of combining those tests.
>
> > You could try to remove the volatile from test_bit, and see if that fixes
> > it - but then we'd have to find and add the proper "rmb()" calls to people
> > who do the endless loop kind of thing like above.
>
> That is not an inviting path to me, at least not any time soon!
>
> I think this all argues for the little patch I suggested - just avoid
> test_bit() here. But it was only intended as a quick little suggestion:
> looks like our tastes differ, and you prefer taking the _tiny_ hit of
> using the regular macros, to seeing "1<<PG_bitshift"s in DEBUG_ADD_PAGE.

Well, after all, it's debugging code, and the code now is easy to read.
Your code, while more efficient, isn't. I think that clarity takes
priority over efficiency in non-critical code such as debugging
code. Of course, this is my personal opinion...

/David
David Weinehall [EMAIL PROTECTED]      Northern lights wander
Project MCA Linux hacker             Dance across the winter sky
http://www.acc.umu.se/~tao/          Full colour fire
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, David Weinehall wrote: Well, after all, it's debugging code, and the code now is easy to read. Your code, while more efficient, isn't. I think that clarity takes priority over efficiency in non-critical code such as debugging code. Of course, this is my personal opinion... I agree my version isn't _as_ easy, and if this code only got built into DEBUG kernels, I would never have bothered about it; but it's built into every kernel, on executed paths, so it's no less critical. Hugh - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Hugh Dickins wrote: On Thu, 8 Feb 2001, David Weinehall wrote: Well, after all, it's debugging code, and the code now is easy to read. Your code, while more efficient, isn't. I think that clarity takes priority over efficiency in non-critical code such as debugging code. Of course, this is my personal opinion... I agree my version isn't _as_ easy, and if this code only got built into DEBUG kernels, I would never have bothered about it; but it's built into every kernel, on executed paths, so it's no less critical. Since it's DEBUG code only and nicely "hidden" in a .h file, why not have the efficient code with a well-written comment documenting what the code does and why it is there ? regards, Rik -- Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Hugh Dickins wrote: On Wed, 7 Feb 2001, Linus Torvalds wrote: On Wed, 7 Feb 2001, Hugh Dickins wrote: None of those optimizes this: I believe the semantics of "||" (don't try next test if first succeeds) forbid the optimization "|" gives? No. The optimization is entirely legal - but the fact that "constant_test_bit()" uses a "volatile unsigned int *" is the reason why gcc thinks it can't optimize it. Ah, yes, I hadn't noticed that, the "volatile" is indeed why it ends up with three "mov"s. But take the "volatile"s out of constant_test_bit(), and DEBUG_ADD_PAGE still expands to three tests and three (four if 2.97) jumps - which is what originally offended me. But Mark (in test program in private mail) shows gcc combining bits into one test and one jump, just as we'd hope (and I wrongly thought forbidden). Perhaps the inline function nature of constant_test_bit() (which Mark didn't use) gets in the way of combining those tests. You could try to remove the volatile from test_bit, and see if that fixes it - but then we'd have to find and add the proper "rmb()" calls to people who do the endless loop kind of thing like above. That is not an inviting path to me, at least not any time soon! I think this all argues for the little patch I suggested - just avoid test_bit() here. But it was only intended as a quick little suggestion: looks like our tastes differ, and you prefer taking the _tiny_ hit of using the regular macros, to seeing "1PG_bitshift"s in DEBUG_ADD_PAGE. The use of the key word 'volatile' has gone just a bit too far in some cases. given: funct() { volatile unsigned int; } Is plain dumb. There is nobody else that can touch that local variable except the code in funct(). Even if it's recursive, the Nth invocation still can't (using legal 'C' code) touch that variable. Therefore, it should not be declared volatile. Another problem with 'volatile' has to do with pointers. 
When it's possible for some object to be modified by some external influence, we see: volatile struct whatever *ptr; Now, it's unclear if gcc knows that we don't give a damn about the address contained in 'ptr'. We know that it's not going to change. What we are concerned with are the items within the 'struct whatever'. From what I've seen, gcc just reloads the pointer. Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). "Memory is like gasoline. You use it up when you are running. Of course you get it all back when you reboot..."; Actual explanation obtained from the Micro$oft help desk. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
"Richard B. Johnson" wrote: [snip] Another problem with 'volatile' has to do with pointers. When it's possible for some object to be modified by some external influence, we see: volatile struct whatever *ptr; Now, it's unclear if gcc knows that we don't give a damn about the address contained in 'ptr'. We know that it's not going to change. What we are concerned with are the items within the 'struct whatever'. From what I've seen, gcc just reloads the pointer. Cheers, Dick Johnson gcc should treat volatile struct whatever *ptr; as a different case than struct whatever * volatile ptr; which is also different from volatile struct whatever * volatile ptr; I think (but can't find my KR C book to confirm :) that the first case declares the struct as volatile, and the second case declares the pointer volatile (the third case declares a volatile pointer to a structure with volatile parts). So, the programmer should have the choice, if gcc is dealing with volatile correctly. Of course, that doesn't mean that the authors have made the right choice :) -- Stephen Wille Padnos Programmer, Engineer, Problem Solver [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Stephen Wille Padnos wrote: "Richard B. Johnson" wrote: [snip] Another problem with 'volatile' has to do with pointers. When it's possible for some object to be modified by some external influence, we see: volatile struct whatever *ptr; Now, it's unclear if gcc knows that we don't give a damn about the address contained in 'ptr'. We know that it's not going to change. What we are concerned with are the items within the 'struct whatever'. From what I've seen, gcc just reloads the pointer. Cheers, Dick Johnson gcc should treat volatile struct whatever *ptr; as a different case than struct whatever * volatile ptr; which is also different from volatile struct whatever * volatile ptr; I think (but can't find my KR C book to confirm :) that the first case declares the struct as volatile, and the second case declares the pointer volatile (the third case declares a volatile pointer to a structure with volatile parts). So, the programmer should have the choice, if gcc is dealing with volatile correctly. Of course, that doesn't mean that the authors have made the right choice :) Yes. My point is that a lot of authors have declared just about everything 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to be "safe". It's likely that there are many hundreds of thousands of unneeded register-reloads because of this. It might be useful for somebody who has a lot of time on his/her hands to go through some of these drivers. Cheers, Dick Johnson Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips). "Memory is like gasoline. You use it up when you are running. Of course you get it all back when you reboot..."; Actual explanation obtained from the Micro$oft help desk. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
"Richard B. Johnson" wrote: On Thu, 8 Feb 2001, Stephen Wille Padnos wrote: "Richard B. Johnson" wrote: [snip] Another problem with 'volatile' has to do with pointers. When it's possible for some object to be modified by some external influence, we see: volatile struct whatever *ptr; Now, it's unclear if gcc knows that we don't give a damn about the address contained in 'ptr'. We know that it's not going to change. What we are concerned with are the items within the 'struct whatever'. From what I've seen, gcc just reloads the pointer. [snip] Yes. My point is that a lot of authors have declared just about everything 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to be "safe". It's likely that there are many hundreds of thousands of unneeded register-reloads because of this. It might be useful for somebody who has a lot of time on his/her hands to go through some of these drivers. I would be willing to do this (on the slow boat - I don't have THAT much spare time :), but only if we can be sure that the gcc optimizer will correctly handle a normal pointer to volatile data. Your experiences would seem to indicate that the optimizer needs fixing before much effort should be spent on this. -- Stephen Wille Padnos Programmer, Engineer, Problem Solver [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Thu, 8 Feb 2001, Stephen Wille Padnos wrote: "Richard B. Johnson" wrote: On Thu, 8 Feb 2001, Stephen Wille Padnos wrote: "Richard B. Johnson" wrote: [snip] Another problem with 'volatile' has to do with pointers. When it's possible for some object to be modified by some external influence, we see: volatile struct whatever *ptr; Now, it's unclear if gcc knows that we don't give a damn about the address contained in 'ptr'. We know that it's not going to change. What we are concerned with are the items within the 'struct whatever'. From what I've seen, gcc just reloads the pointer. [snip] Yes. My point is that a lot of authors have declared just about everything 'volatile' `grep volatile /usr/src/linux/drivers/net/*.c`, just to be "safe". It's likely that there are many hundreds of thousands of unneeded register-reloads because of this. It might be useful for somebody who has a lot of time on his/her hands to go through some of these drivers. I would be willing to do this (on the slow boat - I don't have THAT much spare time :), but only if we can be sure that the gcc optimizer will correctly handle a normal pointer to volatile data. Your experiences would seem to indicate that the optimizer needs fixing before much effort should be spent on this. Well the question for that is; "What compiler?". I'm currently using egcs-2.91.66, one of the "approved" versions for compiling the kernel. 
It treats all volatiles about the same: volatile int i; volatile int *p; int volatile *q; volatile int * volatile r; void foo() { while(*p == i) ; while(*q == i) ; while(*r == i) ; } ...makes : .file "main.c" .version"01.01" gcc2_compiled.: .text .align 4 .globl foo .typefoo,@function foo: pushl %ebp movl %esp,%ebp nop .align 4 .L2: movl p,%eax movl (%eax),%edx movl i,%eax cmpl %eax,%edx je .L4 jmp .L3 .align 4 .L4: jmp .L2 .align 4 .L3: nop .align 4 .L5: movl q,%eax movl (%eax),%edx movl i,%eax cmpl %eax,%edx je .L7 jmp .L6 .align 4 .L7: jmp .L5 .align 4 .L6: nop .align 4 .L8: movl r,%eax movl (%eax),%edx movl i,%eax cmpl %eax,%edx je .L10 jmp .L9 .align 4 .L10: jmp .L8 .align 4 .L9: .L1: movl %ebp,%esp popl %ebp ret .Lfe1: .sizefoo,.Lfe1-foo .comm i,4,4 .comm p,4,4 .comm q,4,4 .comm r,4,4 .ident "GCC: (GNU) egcs-2.91.66 19990314 (egcs-1.1.2 release)" Since there seems to be a rather big difference between what is expected to be done, and what happens to be the result, this certainly contributes to the possible over-use of 'volatile' in some kernel code. It's certainly better to be safe than sorry, but in some cases "safe" is just a bit "strange". FYI, ../linux/drivers/net/atp.c doesn't use 'volatile' at all. However, ../linux/drivers/net/bmac.c uses it 40 times. I'll bet a buck that both of the drivers work and the one without 'volatile' keywords does the work with fewer instructions. These are just two drivers chosen at random. The driver I've been working on to make 'bullet proof', pcnet32.c uses 'volatile' twice. And, at least in one occasion, the wrong thing is declared volatile (the value in a pointer to a structure ), however gcc doesn't seem to care because it reloads the values of the structure members every time, anyway. So, in this case, the address-value in the pointer will never change, but gcc reloads all the pointed-to members anyway, so the 'volatile' keyword not useful. 
Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (799.53 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of course you get it all back when you reboot..."; Actual explanation obtained from the Micro$oft help desk.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Wed, 7 Feb 2001, Linus Torvalds wrote:

> No. The optimization is entirely legal - but the fact that
> "constant_test_bit()" uses a "volatile unsigned int *" is the reason why
> gcc thinks it can't optimize it.

This thing did attract me somewhat and I decided to learn a little about compilers. Result: Unfortunately it's not just the volatile, there's a bunch of conditions you have to fulfill to have the compiler optimize this. (Sounds like work for the compiler guys.)

The test program is attached. Inspecting the code (egcs 2.91.66 and gcc-2.96 (-69) generate the same code) gives the following conclusions:

- f1(unsigned long f): manually optimized
      if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {
  -> optimized code (of course)

- f2(unsigned long f): leave some work to the compiler
      if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {
  -> optimized code (good)

- f3(unsigned int f): use constant_test_bit macro
      if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
  -> optimized code
  where
      #define constant_test_bit(nr, addr) \
          (((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)
  (doesn't optimize when putting *const* unsigned int there)

- f4: same thing as f3, but use (unsigned long f) instead of (unsigned int f)
  -> no optimization

- f5: same thing as f3, but use inline function for constant_test_bit
  -> no optimization

- f6: same thing as f3, but use test_bit instead of constant_test_bit, where
      #define test_bit(nr,addr) \
          (__builtin_constant_p(nr) ? \
           constant_test_bit((nr),(addr)) : \
           variable_test_bit((nr),(addr)))
  -> no optimization

Conclusion: With the compilers tested, lots of cases are not optimized although they could be in theory:

- casting even from unsigned int to unsigned long breaks optimization
- macros are better than inline
- Even though evaluated at compile time, __builtin_constant_p breaks optimization here, too.
BTW: volatile makes optimization impossible as well, of course: it leads to repeated reloads of the variable, whereas otherwise it's cached in a register in the above "no optimization" cases. That's expected behavior.

--Kai

Test code:

--
#define ADDR (*(volatile long *) addr)

static __inline__ int inl_constant_test_bit(int nr, const void * addr)
{
    return ((1UL << (nr & 31)) & (((unsigned int *) addr)[nr >> 5])) != 0;
}

#define constant_test_bit(nr, addr) (((1UL << (nr & 31)) & ((unsigned int*)(addr))[nr >> 5]) != 0)

static __inline__ int variable_test_bit(int nr, volatile void * addr)
{
    int oldbit;

    __asm__ __volatile__(
        "btl %2,%1\n\tsbbl %0,%0"
        :"=r" (oldbit)
        :"m" (ADDR),"Ir" (nr));
    return oldbit;
}

#define test_bit(nr,addr) \
    (__builtin_constant_p(nr) ? \
     constant_test_bit((nr),(addr)) : \
     variable_test_bit((nr),(addr)))

int f1(unsigned long f)
{
    if (f & ((1 << 1) | (1 << 2) | (1 << 4))) {
        return 1;
    }
    return 0;
}

int f2(unsigned long f)
{
    if ((f & (1 << 1)) || (f & (1 << 2)) || (f & (1 << 4))) {
        return 1;
    }
    return 0;
}

int f3(unsigned int f)
{
    if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
        return 1;
    }
    return 0;
}

int f4(unsigned long f)
{
    if (constant_test_bit(1, &f) || constant_test_bit(2, &f) || constant_test_bit(4, &f)) {
        return 1;
    }
    return 0;
}

int f5(unsigned int f)
{
    if (inl_constant_test_bit(1, &f) || inl_constant_test_bit(2, &f) || inl_constant_test_bit(4, &f)) {
        return 1;
    }
    return 0;
}

int f6(unsigned int f)
{
    if (test_bit(1, &f) || test_bit(2, &f) || test_bit(4, &f)) {
        return 1;
    }
    return 0;
}
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Wed, 7 Feb 2001, Hugh Dickins wrote:

> The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked
> activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0
> kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).
>
> None of those optimizes this: I believe the semantics of "||" (don't
> try next test if first succeeds) forbid the optimization "|" gives?

No. The optimization is entirely legal - but the fact that "constant_test_bit()" uses a "volatile unsigned int *" is the reason why gcc thinks it can't optimize it.

Oh, well. That "volatile" is really totally bogus. But it's there because there are probably drivers that do

    while (test_bit(...))
        /* nothing */;

and the compiler would optimize it away a bit too much without the volatile.

Dang. You could try to remove the volatile from test_bit, and see if that fixes it - but then we'd have to find and add the proper "rmb()" calls to people who do the endless loop kind of thing like above.

        Linus
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
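On Wed, 7 Feb 2001, Linus Torvalds wrote:

> I'd rather not do these kinds of things that the compiler should be able
> to trivially do for us.
>
> (gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it
> here? Did you check the code? Have you asked the gcc lists?)

The "(1<<PG_bitshift)" part of it is done, sure; but I've rechecked activate_page_nolock() compiled -O2 -march=i686 with egcs-2.91.66 (RH7.0 kgcc), gcc-2.96-69 (RH7.0 gcc+fixes), gcc-2.97 (gcc-snapshot-20010207-1).

None of those optimizes this: I believe the semantics of "||" (don't try next test if first succeeds) forbid the optimization "|" gives?

2.91 and 2.96 give three movs (two unnecessary), three tests, three jumps (first two not usually taken):

 232:   8b 43 18    mov    0x18(%ebx),%eax
 235:   a8 40       test   $0x40,%al
 237:   75 0f       jne    248 <activate_page_nolock+0x4c>
 239:   8b 43 18    mov    0x18(%ebx),%eax
 23c:   a8 80       test   $0x80,%al
 23e:   75 08       jne    248 <activate_page_nolock+0x4c>
 240:   8b 43 18    mov    0x18(%ebx),%eax
 243:   f6 c4 08    test   $0x8,%ah
 246:   74 19       je     261 <activate_page_nolock+0x65>

2.97 is jumpier: mov and je mov test jne mov test jne jmp. That looks worse to me: David, earlier on you advertized http://www.codesourcery.com/gcc-snapshots/ Is this something worth your pursuing with the gcc guys?

Hugh

--- linux-2.4.2-pre1/include/linux/swap.h   Wed Feb  7 15:21:13 2001
+++ linux/include/linux/swap.h  Wed Feb  7 17:21:25 2001
@@ -200,8 +200,8 @@
  * with the pagemap_lru_lock held!
  */
 #define DEBUG_ADD_PAGE \
-   if (PageActive(page) || PageInactiveDirty(page) || \
-       PageInactiveClean(page)) BUG();
+   if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+               (1<<PG_inactive_clean))) BUG();
 
 #define ZERO_PAGE_BUG \
    if (page_count(page) == 0) BUG();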
Re: [PATCH] micro-opt DEBUG_ADD_PAGE
On Wed, 7 Feb 2001, Hugh Dickins wrote:

> Micro-optimization season?

I'd rather not do these kinds of things that the compiler should be able to trivially do for us.

(gcc sometimes _does_ do these things. I've seen it. Why doesn't it do it here? Did you check the code? Have you asked the gcc lists?)

        Linus
[PATCH] micro-opt DEBUG_ADD_PAGE
On Tue, 6 Feb 2001, Linus Torvalds wrote:

> > -    if (bh->b_size % correct_size) {
> > +    if (bh->b_size != correct_size) {
>
> Actually, I'd rather leave it in, but speed it up with the saner and
> faster
>      if (bh->b_size & (correct_size-1)) {

Micro-optimization season?

--- linux-2.4.2-pre1/include/linux/swap.h   Wed Feb  7 15:21:13 2001
+++ linux/include/linux/swap.h  Wed Feb  7 17:21:25 2001
@@ -200,8 +200,8 @@
  * with the pagemap_lru_lock held!
  */
 #define DEBUG_ADD_PAGE \
-   if (PageActive(page) || PageInactiveDirty(page) || \
-       PageInactiveClean(page)) BUG();
+   if ((page)->flags & ((1<<PG_active)|(1<<PG_inactive_dirty)| \
+               (1<<PG_inactive_clean))) BUG();
 
 #define ZERO_PAGE_BUG \
    if (page_count(page) == 0) BUG();