On Monday 07 August 2006 23:13, Jeff Dike wrote:
> On Sun, Aug 06, 2006 at 05:44:31PM +0200, Paolo 'Blaisorblade' Giarrusso
wrote:
> > I had this patch in my queue since some time, because it fixes some
> > spinlocks vs sleeps issues; please verify whether after your
> > restructuring it is still needed (it applied before this restructuring).
>
> I believe this patch is no longer needed. It looks like all calls to
> activate_fd are in process context. However, there are a few places
> where it can be reached inside a spinlock. These cases look like the
> spinlock is held for too long, and needs to be narrowed.
> Here's my work, if you feel like checking it.
Very interesting, thanks (and sorry for the long delay).
> Increasing indendation
> is going up the call chain.
Have you found a tool for this or just done by hand? (I've seen Understanding
the Linux VMM talking about tools for call graphs).
> At the top of each chain, there's a
> "proc", along with my reason for believing the procedure is only
> called in process context. When that's not there, it's because that
> call tree had already been covered ealier.
> There's some stuff in the network which I didn't go into because it's
> code I have no clue about.
After the networking locking patch, I'll recheck, but uml_net_open should now
be covered by a simple mutex so there's no problem.
I also think that most of them should become mutexes or disappear.
> Every procedure can call activate_fd under a spinlock is so marked.
> activate_fd
> um_request_irq
> line_setup_irq
> enable_chan
> line_open - SPINLOCK
This should be a mutex, but it's not very simple to do this since the tty
layer is not so nice as char/block/network layer, to my current knowledge.
So, the other road is that you simply reduce spinlock holding time, or that
you merge this patch when I resend (I've done some changes). Everything
however applies to SMP only, luckily, for now; IRQ disabling can cause
hangs/races on UP kernels.
And, sadly, sigio_lock must become a irq disabling spinlock; so once, at boot
time, we'll have a call to sleeping kmalloc with disabled irqs, if you don't
merge this, and this call can crash.
> uml_net_open - struct net_device.open, SPINLOCK
Same stuff, converted in my patch about network (which I'll resend).
> write_sigio_irq
> write_sigio_workaround - SPINLOCK
> maybe_sigio_broken
> activate_fd
You deleted (in [PATCH 1/4] UML - SIGIO cleanups) a comment about turning that
spinlock into a semaphore - the comment talked about reactivate_fd, however
it is still a fact that maybe_sigio_broken() should be an initcall.
Some empiric debug about _when_ it is called (printk's) would help doing the
right thing; currently it uses a spinlock to check the flag saying whether it
has already been called.
It should not - when the variable is first set in an initcall, you can simply
add a memory barrier to ensure visibility; maybe you do not even need the
barrier.
I say you don't need from existing practice, it seems, but it seems you need
because there's no reason to assume otherwise.
I do not want to touch that code since I don't understand it fully right now;
probably it makes sense, but probably documenting it would help, and it could
lead to some restructuring. The following considerations are based on the
little I have seen - I have decided not to study fully the code, I'm looking
at it but understanding it fully is not easy (btw, I've studied the code
pre-"SIGIO cleanups" and then looked at the patch).
I hope many of the following notes are wrong and that you can document it and
that I can understand my errors. However I cannot ignore my feelings about
code & data structures, which are truely redundant.
current_poll is maybe distinct since all SIGIO handling also triggers a single
IRQ (requested by write_sigio_irq), but that IRQ simply consumes the wakeup
from the fd and reactivates itself, so:
* I don't see its purpose: statistics? Triggering IRQs processing?
* This is somehow strange - and leads to the above recursion.
* I'd like you to write up _why_ you _actively_ build a model where
write_sigio_thread(), after there is activity on a single fd, removes it from
current_poll, while the irq handler almost always calls reactivate_fd() to
undo this; most drivers do it for each IRQ, except for TELNETD_IRQ and
XTERM_IRQ, which must be one-shot; but they call free_irq, so this is not
very useful for them, and anyway they should request this behaviour with a
flag if they need it (or they can ignore subsequent irqs until irq freeing by
setting their own flags).
If this is used to serialize irqs, it is a bad solution, and irq serialization
is probably a bad idea this way - it should be handled at another level, i.e.
the existing irq disabling inside irq processing; for IRQs running on
multiple CPUs the solution is likely to use a spinlock anyway in most cases
(this is the usual rule I think).
In particular (and a comment patch is preferred to a mail) why
arch/um/os-Linux/sigio.c has a list of fds to poll, i.e. current_poll, and
arch/um/os-Linux/irq.c has another one, i.e. pollfds, and there is finally
active_fds used for irq too; also if current_poll excludes just the "SIGIO
write" irq it is also redundant.
arch/um/os-Linux/irq.c:os_free_irq_by_cb, line 94 reads as:
printk("os_free_irq_by_cb - mismatch between "
"active_fds and pollfds, fd %d vs %d\n",
which can be probably considered as a sign of non-normalization (in DB
parlance) of data structure.
active_fds could probably be a set of pointers to pollfds array - there should
theoretically be a single array holding everything and active_fds pointing to
it.
Failing that (since the pollfds array must be such to be used by poll), I do
not see why active_fds must contain fds instead of relying on ones in
pollfds.
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
Chiacchiera con i tuoi amici in tempo reale!
http://it.yahoo.com/mail_it/foot/*http://it.messenger.yahoo.com
-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
User-mode-linux-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel