Re: Is this a valid construct?
Matthew Dharm wrote: > > Does the following pseudocode do what I think it does? > > Assume the semaphore is properly initialized to locked. > > int flagvar = 0; > struct semaphore blocking_sem; > > void function_called_from_kernel_thread(void) > { > chew_on_hardware(); > flagvar = 1; > down(blocking_sem); > > if (flagvar) > printk("something went wrong") > else > printk("everything okay") > } > > void function_called_from_interrupt_context() > { > flagvar = 0; > up(blocking_sem); > } > > void function_to_call_from_timeout() > { > up(blocking_sem); > } > > The idea is this -- I chew on the hardware, then sleep on the semaphore. I > then either get woken up by an IRQ (which may never come), or the timeout. > I then try to use the flagvar to determine which of the two happened. > > This _looks_ valid to me... but I'm seeing occurances where I get the IRQ > (yes, I'm sure of it) but flagvar == 1, which confuses me. > What if the timeout happens, does the "up", *then* the irq happens while still in the timeout routine? Consider using two flags, as a debugging aid. flag_irq_happened flag_timeout_happened You can then check 4 possible outcomes: 1. irq happened, no timeout (expected operation) 2. no irq, normal timeout (expected failure) 3. both irq and timeout happened. The window for this may be small, but small windows eventually happen - perhaps several times in a minute. 4. neither happened, so something else went wrong with the semaphore. > Is this one of those places where I need the "volatile" keyword? Maybe. Take a look at the generated code, check if the memory read is optimized away. (I.e. flagvar is "cached" in a register in function_called_from_kernel_thread(). Or substituted for the constant "1", as gcc notices that "flagvar" isn't changed between the assignment and the test using a false assumption about no parallell execution. "volatile" is supposed to help with that, telling the compiler that "this thing may change in mysterious ways, unknown to the generated code, i.e. other unknown code or hardware. Helge Hafting - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Is this a valid construct?
You know, it would help if I posted the right pseudocode. Yes, I found this race condition a while ago. I do set flagvar _before_ I chew on the hardware. Nevertheless, what I'm seeing looks like either (a) the change to flagvar isn't being propagated from the IRQ handler to the thread, or (b) the down() fails to sleep for no good reason. Matt On Mon, Oct 16, 2000 at 05:34:02PM -0500, Sudhindra Herle wrote: > You have a race condition. > > > int flagvar = 0; > > struct semaphore blocking_sem; > > > > void function_called_from_kernel_thread(void) > > { > > chew_on_hardware(); > ^^^ > As soon as you do/did this, the IRQ must've happened. So, before the next > statement is executed, the IRQ handler is called which clears flagvar. > > > flagvar = 1; > This is executed _after_ the IRQ handler completes. So, you stomp over > whatever > the IRQ handler did. > > > down(blocking_sem); > Since the IRQ handler did up(), this down() won't sleep. > > Try setting flagvar = 1 _before_ you chew_on_hardware(). > Make sure that what ever you do is SMP safe. > > Store the tick count of when you do the IRQ handler and when you do > chew_on_hardware(). You'll see if I am right. > i.e., try: > > before = system_ticks_NOW > flagvar = 1; > chew_on_hardware (); > after = system_ticks_NOW > > printk (..., before, after, irq) > > > interrupt_handler ( ..) > { > irq = system_ticks_NOW > ... > } > > > Cheers, > -Sudhi. -- Matthew Dharm Home: [EMAIL PROTECTED] Maintainer, Linux USB Mass Storage Driver DP: And judging from the scores, Stef has the sma... T: LET'S NOT GO THERE! -- Dust Puppy and Tanya User Friendly, 12/11/1997 PGP signature
Re: Is this a valid construct?
Also sprach Matthew Dharm: } Does the following pseudocode do what I think it does? } } Assume the semaphore is properly initialized to locked. } } int flagvar = 0; } struct semaphore blocking_sem; } } void function_called_from_kernel_thread(void) } { } chew_on_hardware(); } flagvar = 1; } down(blocking_sem); } } if (flagvar) } printk("something went wrong") } else } printk("everything okay") } } } } void function_called_from_interrupt_context() } { } flagvar = 0; } up(blocking_sem); } } } } void function_to_call_from_timeout() } { } up(blocking_sem); } } } } The idea is this -- I chew on the hardware, then sleep on the semaphore. I } then either get woken up by an IRQ (which may never come), or the timeout. } I then try to use the flagvar to determine which of the two happened. } } This _looks_ valid to me... but I'm seeing occurances where I get the IRQ } (yes, I'm sure of it) but flagvar == 1, which confuses me. } Are you sure there isn't a race on the flagvar variable? Like, the interrupt happens, it gets set to 0, then before we can ``up'' the semaphore, it's set to 1 in the function_called_from_kernel_thread()? -- || Bill Wendling[EMAIL PROTECTED] PGP signature
RE: Is this a valid construct?
You have a race condition. > int flagvar = 0; > struct semaphore blocking_sem; > > void function_called_from_kernel_thread(void) > { > chew_on_hardware(); ^^^ As soon as you do/did this, the IRQ must've happened. So, before the next statement is executed, the IRQ handler is called which clears flagvar. > flagvar = 1; This is executed _after_ the IRQ handler completes. So, you stomp over whatever the IRQ handler did. > down(blocking_sem); Since the IRQ handler did up(), this down() won't sleep. Try setting flagvar = 1 _before_ you chew_on_hardware(). Make sure that what ever you do is SMP safe. Store the tick count of when you do the IRQ handler and when you do chew_on_hardware(). You'll see if I am right. i.e., try: before = system_ticks_NOW flagvar = 1; chew_on_hardware (); after = system_ticks_NOW printk (..., before, after, irq) interrupt_handler ( ..) { irq = system_ticks_NOW ... } Cheers, -Sudhi. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
RE: Is this a valid construct?
You have a race condition. int flagvar = 0; struct semaphore blocking_sem; void function_called_from_kernel_thread(void) { chew_on_hardware(); ^^^ As soon as you do/did this, the IRQ must've happened. So, before the next statement is executed, the IRQ handler is called which clears flagvar. flagvar = 1; This is executed _after_ the IRQ handler completes. So, you stomp over whatever the IRQ handler did. down(blocking_sem); Since the IRQ handler did up(), this down() won't sleep. Try setting flagvar = 1 _before_ you chew_on_hardware(). Make sure that what ever you do is SMP safe. Store the tick count of when you do the IRQ handler and when you do chew_on_hardware(). You'll see if I am right. i.e., try: before = system_ticks_NOW flagvar = 1; chew_on_hardware (); after = system_ticks_NOW printk (..., before, after, irq) interrupt_handler ( ..) { irq = system_ticks_NOW ... } Cheers, -Sudhi. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: Is this a valid construct?
Also sprach Matthew Dharm: } Does the following pseudocode do what I think it does? } } Assume the semaphore is properly initialized to locked. } } int flagvar = 0; } struct semaphore blocking_sem; } } void function_called_from_kernel_thread(void) } { } chew_on_hardware(); } flagvar = 1; } down(blocking_sem); } } if (flagvar) } printk("something went wrong") } else } printk("everything okay") } } } } void function_called_from_interrupt_context() } { } flagvar = 0; } up(blocking_sem); } } } } void function_to_call_from_timeout() } { } up(blocking_sem); } } } } The idea is this -- I chew on the hardware, then sleep on the semaphore. I } then either get woken up by an IRQ (which may never come), or the timeout. } I then try to use the flagvar to determine which of the two happened. } } This _looks_ valid to me... but I'm seeing occurances where I get the IRQ } (yes, I'm sure of it) but flagvar == 1, which confuses me. } Are you sure there isn't a race on the flagvar variable? Like, the interrupt happens, it gets set to 0, then before we can ``up'' the semaphore, it's set to 1 in the function_called_from_kernel_thread()? -- || Bill Wendling[EMAIL PROTECTED] PGP signature
Re: Is this a valid construct?
You know, it would help if I posted the right pseudocode. Yes, I found this race condition a while ago. I do set flagvar _before_ I chew on the hardware. Nevertheless, what I'm seeing looks like either (a) the change to flagvar isn't being propagated from the IRQ handler to the thread, or (b) the down() fails to sleep for no good reason. Matt On Mon, Oct 16, 2000 at 05:34:02PM -0500, Sudhindra Herle wrote: You have a race condition. int flagvar = 0; struct semaphore blocking_sem; void function_called_from_kernel_thread(void) { chew_on_hardware(); ^^^ As soon as you do/did this, the IRQ must've happened. So, before the next statement is executed, the IRQ handler is called which clears flagvar. flagvar = 1; This is executed _after_ the IRQ handler completes. So, you stomp over whatever the IRQ handler did. down(blocking_sem); Since the IRQ handler did up(), this down() won't sleep. Try setting flagvar = 1 _before_ you chew_on_hardware(). Make sure that what ever you do is SMP safe. Store the tick count of when you do the IRQ handler and when you do chew_on_hardware(). You'll see if I am right. i.e., try: before = system_ticks_NOW flagvar = 1; chew_on_hardware (); after = system_ticks_NOW printk (..., before, after, irq) interrupt_handler ( ..) { irq = system_ticks_NOW ... } Cheers, -Sudhi. -- Matthew Dharm Home: [EMAIL PROTECTED] Maintainer, Linux USB Mass Storage Driver DP: And judging from the scores, Stef has the sma... T: LET'S NOT GO THERE! -- Dust Puppy and Tanya User Friendly, 12/11/1997 PGP signature