On 11/11/15 21:05, Richard Weinberger wrote:
> On Wed, Nov 11, 2015 at 9:46 PM, Thomas Meyer <tho...@m3y3r.de> wrote:
>> On Monday, 09.11.2015, at 15:03 +0000, Anton Ivanov wrote:
>>> It throws a couple of harmless "epoll del fd" warnings on reboot,
>>> which result from the fact that disable_fd/enable_fd are not removed
>>> in the terminal/line code.
>>>
>>> These are harmless and will go away once the term/line code gets
>>> support for real write IRQs in addition to read at some point in the
>>> future.
>>>
>>> I have fixed the file descriptor leak in the reboot case.
>> Hi,
>>
>> now with the list on copy!
>>
>> Richard: can you make some sense out of these stack traces? I can
>> provide the config if you want!
>>
>> I see a lot of bugs of type "BUG: spinlock recursion on CPU#0" with
>> this patch:
>>
>> I did look over your patch and found two errors in the irq_lock
>> spinlock handling:
>>
>> http://m3y3r.dyndns.org:9100/gerrit/#/c/2/1..2/arch/um/kernel/irq.c
>>
>> But it still seems to miss something, as the above bugs still occur,
>> though the system now boots a bit further at least.
>>
>> Example:
>>   [  225.570000] BUG: spinlock lockup suspected on CPU#0, chmod/516
>>   [  225.570000]  lock: irq_lock+0x0/0x18, .magic: dead4ead, .owner:
> Hmmm, UML is UP and does not support PREEMPT, so all spinlocks
> should be a no-op.

In that case, if I understand correctly what is going on, there are a
couple of places (free_irqs(), activate_fd and the sigio handler
itself) where it should be a mutex, not a spinlock. The lock is there to
ensure that the data it protects cannot be used in an interrupt context
while it is being modified.

If the spinlock is a no-op, it fails to perform this duty. The code
should also be different: it should return when try_lock fails so that
it does not deadlock. That makes spin_lock_irqsave the wrong locking
primitive, as it does not have this functionality.
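
To make the try-lock point concrete, here is a rough userspace sketch of
the pattern, using pthreads rather than kernel primitives. It is an
analogy only: struct fd_table, fd_table_add and fd_table_handle are
hypothetical names and are not the UML irq.c code. Note also that
pthread_mutex_trylock is not async-signal-safe, so this shows the
control flow, not something to call from a real SIGIO handler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_FDS 8

/* Hypothetical stand-in for the list of fds with registered IRQs. */
struct fd_table {
    pthread_mutex_t lock;
    int fds[MAX_FDS];
    int count;
};

static void fd_table_init(struct fd_table *t)
{
    int rc = pthread_mutex_init(&t->lock, NULL);

    assert(rc == 0);
    (void)rc;
    t->count = 0;
}

/* Modifier path (analogue of adding an fd): allowed to wait for the lock. */
static bool fd_table_add(struct fd_table *t, int fd)
{
    bool ok = false;

    pthread_mutex_lock(&t->lock);
    if (t->count < MAX_FDS) {
        t->fds[t->count++] = fd;
        ok = true;
    }
    pthread_mutex_unlock(&t->lock);
    return ok;
}

/*
 * Handler path (analogue of the sigio handler): must never wait.
 * Returns the number of entries walked, or -1 if the table is being
 * modified and the handler backed off instead of deadlocking.
 */
static int fd_table_handle(struct fd_table *t)
{
    int seen;

    if (pthread_mutex_trylock(&t->lock) != 0)
        return -1;          /* busy: return and let the caller retry later */

    seen = t->count;        /* a real handler would walk t->fds here */
    pthread_mutex_unlock(&t->lock);
    return seen;
}
```

The handler returning -1 is the "return on try_lock" behaviour: the
caller has to arrange for the event to be picked up again later, which
is the part a plain blocking lock cannot express.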

That is an issue both with this patch and with the original poll-based
controller: there, free_irq, add_fd and reactivate_fd can all
theoretically produce a race if you are adding/removing devices while
under high IO load.

A.

> Do you have lock debugging enabled?
>
> In this case I'd start gdb and inspect the memory. Maybe a stack corruption.
>

------------------------------------------------------------------------------
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel