Re: Problems with the demo CD and qemu

2005-02-21 Thread Adam Lackorzynski

On Fri Feb 18, 2005 at 11:24:07 +0100, Cedric Roux wrote:
 (I just tried a boot without CONFIG_HANDLE_SEGMENTS, it crashed,
 I did not try to find out why.)

Probably because Linux tries to use it and the kernel rejects it.

 It would be nice to have some info about this issue, though,
 just to understand what exactly is going on. Do you, Adam, or
 someone else, have pointers or info about it? A second question

Well, it's hard to stay calm when talking about this issue, and you'll
find some nice rants out there. But I'll try.
Once upon a time someone decided that fast access to thread-local
storage (TLS) is necessary for user programs. As x86 doesn't have many
registers, a segment register is used to point to the TLS: %gs. L4 also
uses %gs, to point to the UTCB, hence the conflict. In Linux, TLS is
implemented with 3 GDT entries per thread, which are reloaded on every
thread switch. glibc still supports the old method with LDT entries, but
that didn't work very well the last time I tried. I guess nobody's using
it and it's buggy somehow, but I don't really know. Fiasco can play the
LDT game, which can also be useful for other things. Fortunately, TLS
usage can be disabled either by rm'ing the tls dirs or by using
LD_ASSUME_KERNEL, so we can live with it.
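
To make the %gs dependence concrete, here is a minimal C sketch (not
taken from glibc or L4Linux) of what a TLS variable looks like from the
program's side. It assumes GCC or Clang and a pthreads-capable libc;
tls_counter, worker and tls_is_per_thread are names made up for this
illustration. On 32-bit x86 Linux the compiler typically turns each
access to a __thread variable into a %gs-relative load or store, which
is why the kernel has to set up that segment for every thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Each thread gets its own copy of this variable (GCC/Clang extension;
 * C11 spells it _Thread_local). */
static __thread int tls_counter = 0;

/* Thread body: bump the thread-private counter and report its final value. */
static void *worker(void *arg)
{
    int *final_value = arg;

    assert(final_value != NULL);
    for (int i = 0; i < 1000; i++)
        tls_counter++;
    *final_value = tls_counter;
    return NULL;
}

/* Returns 1 if every thread saw a private counter, 0 if not,
 * and -1 if a thread could not be created. */
int tls_is_per_thread(void)
{
    pthread_t t1, t2;
    int v1 = -1, v2 = -1;

    tls_counter = 5;    /* the calling thread's own copy */

    if (pthread_create(&t1, NULL, worker, &v1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &v2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Both workers started from 0, and the caller's copy is untouched. */
    return v1 == 1000 && v2 == 1000 && tls_counter == 5;
}
```

Build with -pthread. Neither worker needs a lock, because each one only
ever touches its own copy of the counter.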

Standard pointers:
http://people.redhat.com/drepper/nptl-design.pdf
http://people.redhat.com/drepper/tls.pdf

 is: is CONFIG_HANDLE_SEGMENTS mandatory, and why? More generally,
 which options in the fiasco configuration are mandatory and which
 are optional, and what's the best configuration (having speed
 constraints in mind)?

The option is mandatory right now; I'll need some support outside of
L4Linux to make it optional. As for performance, Fiasco prints on
startup which options you should enable or disable, so just try that.




Adam
-- 
Adam [EMAIL PROTECTED]
  Lackorzynski http://os.inf.tu-dresden.de/~adam/

___
l4-hackers mailing list
l4-hackers@os.inf.tu-dresden.de
http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers


Re: Problems with the demo CD and qemu

2005-02-18 Thread Cedric Roux
On Mon, 14 Feb 2005, Adam Lackorzynski wrote:

 On Mon Feb 14, 2005 at 10:40:17 +0100, Cedric Roux wrote:
 
 I know, but haven't found time yet to look deeper into it. Something
 broke, that's sure. Disabling TLS could help for the time being (rm -r
 /lib/tls...).

L4Linux is there, thanks.

The protection fault was occurring when fiasco calls switch_cpu:
there is a pop gs that generates the protection fault.
That's all I can do to help you. The TLS stuff is beyond my knowledge,
and I have run out of time for this. The system boots without the
TLS handling of the tls/libc (or libpthread), that's fine for me :)
(I just tried a boot without CONFIG_HANDLE_SEGMENTS, it crashed,
I did not try to find out why.)

It would be nice to have some info about this issue, though,
just to understand what exactly is going on. Do you, Adam, or
someone else, have pointers or info about it? A second question
is: is CONFIG_HANDLE_SEGMENTS mandatory, and why? More generally,
which options in the fiasco configuration are mandatory and which
are optional, and what's the best configuration (having speed
constraints in mind)?

Thanks,
Cedric.

 
 
 Adam
 




Re: Problems with the demo CD and qemu

2005-02-14 Thread Adam Lackorzynski
On Mon Feb 14, 2005 at 10:40:17 +0100, Cedric Roux wrote:
 Maybe you can test this specific case and not call switch_to_irq_idle_loop
 when the calling thread is the IRQ one?

I guess that's broken then, I'll look into it.

 Now, I have some General Protection Faults occurring at l4linux boot time.
 I don't know where and I don't know why. Is it a known issue?
 I ask because I don't have much time to investigate it.
 And if it is a known issue, what should I do to fix it?
 (Otherwise, I'll live with it and try to get to the bottom of it
 when I have the time.)

I know, but haven't found time yet to look deeper into it. Something
broke, that's sure. Disabling TLS could help for the time being (rm -r
/lib/tls...).


Adam
-- 
Adam [EMAIL PROTECTED]
  Lackorzynski http://os.inf.tu-dresden.de/~adam/



Re: Problems with the demo CD and qemu

2005-02-12 Thread Adam Lackorzynski
On Fri Feb 11, 2005 at 20:05:31 +0100, Cedric Roux wrote:
 ethernet card (ne2k-pci) sends an IRQ (number 9).
 The IRQ thread passes wait_for_irq_message_hw then calls do_IRQ.
 do_IRQ does its stuff, then calls irq_exit.
 
 In irq_exit, we have a softirq pending (don't ask me why, that's just
 the way it is), so we call do_softirq.
 
 We then enter net_tx_action.
 
 I'll skip the details. In short, we enter the TCP/IP stack,
 do some stuff, then go back into the ethernet driver code,
 in ei_start_xmit (8390.c).
 
 This function calls disable_irq_nosync, which calls
 desc->handler->disable, which is in fact do_l4lx_irq_dev_disable.
 
 This one will call switch_to_irq_idle_loop.
 
 I don't exactly know what happens next (lack of time), but if
 I remove the call to switch_to_irq_idle_loop (and of course
 the corresponding call to switch_to_irq_thread) in
 do_l4lx_irq_dev_disable (respectively do_l4lx_irq_dev_enable)
 everything works fine (well, I don't get crashes when I do
 my telnet anymore).

Thanks for this ample explanation.
 
 My questions are:
   1 - why call switch_to_irq_idle_loop at all? what's
   the purpose of it?

The purpose is to prevent interrupts from getting through. The tricky
part here has been IRQ probing. I guess I need to reevaluate this issue...

   2 - if I remove this call, do I get a broken system or
   is it ok? what do I lose if it is ok (speed?)?

Should be ok if it works on your system.

   3 - a comment in switch_to_irq_idle_loop says:
 /* Looks like interrupts are disabled multiple times in 2.6 */
   shouldn't you use a counter in switch_to_irq_thread and
   only do the switch if it's back to zero? (I mean, imagine 2
   calls to switch_to_irq_idle_loop followed by 1 call to
   switch_to_irq_thread, should it really come back from idle
   at this point?)

That's not what I would expect from the hardware; disable just disables
it, no matter how often you do it.
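
The two behaviours being discussed can be contrasted in a small sketch.
It is only a model under stated assumptions: struct irq_line and the
hw_* / counted_* helpers are invented for this illustration and are not
the L4Linux functions. The hw_* pair models the "disable just disables"
view; the counted_* pair models the counter proposed in question 3.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one interrupt line. */
struct irq_line {
    bool enabled;   /* may the interrupt be delivered? */
    int depth;      /* outstanding disables (counted variant only) */
};

/* Hardware-like semantics: disabling twice is the same as disabling once. */
void hw_disable(struct irq_line *l) { l->enabled = false; }
void hw_enable(struct irq_line *l)  { l->enabled = true; }

/* Counted semantics: every disable must be matched by an enable. */
void counted_disable(struct irq_line *l)
{
    l->depth++;
    l->enabled = false;
}

void counted_enable(struct irq_line *l)
{
    assert(l->depth > 0);       /* unbalanced enable is a caller bug */
    if (--l->depth == 0)
        l->enabled = true;      /* only the last enable re-arms the line */
}
```

After two disables and one enable, the hardware-like line is enabled
again, while the counted line stays disabled until the second enable.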


 (By the way, the l4linux kernel won't compile with 4k stacks,

It compiled for me as of today but I had to fix some small issues to
make it actually work (but I only tested this slightly). Should hit CVS
by tomorrow.

 you never call irq_ctx_init, maybe you should call it in
 init_IRQ?)

No, l4linux has always worked more like the 4k-IRQ way, not like the old
way with 8k stacks.


Adam
-- 
Adam [EMAIL PROTECTED]
  Lackorzynski http://os.inf.tu-dresden.de/~adam/



Re: Problems with the demo CD and qemu

2005-02-11 Thread Cedric Roux
Hi again L4 Hackers,

here is what's going on.

ethernet card (ne2k-pci) sends an IRQ (number 9).

The IRQ thread passes wait_for_irq_message_hw then calls do_IRQ.

do_IRQ does its stuff, then calls irq_exit.

In irq_exit, we have a softirq pending (don't ask me why, that's just
the way it is), so we call do_softirq.

We then enter net_tx_action.

I'll skip the details. In short, we enter the TCP/IP stack,
do some stuff, then go back into the ethernet driver code,
in ei_start_xmit (8390.c).

This function calls disable_irq_nosync, which calls
desc->handler->disable, which is in fact do_l4lx_irq_dev_disable.

This one will call switch_to_irq_idle_loop.

I don't exactly know what happens next (lack of time), but if
I remove the call to switch_to_irq_idle_loop (and of course
the corresponding call to switch_to_irq_thread) in
do_l4lx_irq_dev_disable (respectively do_l4lx_irq_dev_enable)
everything works fine (well, I don't get crashes when I do
my telnet anymore).

My questions are:
  1 - why call switch_to_irq_idle_loop at all? what's
  the purpose of it?
  2 - if I remove this call, do I get a broken system or
  is it ok? what do I lose if it is ok (speed?)?
  3 - a comment in switch_to_irq_idle_loop says:
/* Looks like interrupts are disabled multiple times in 2.6 */
  shouldn't you use a counter in switch_to_irq_thread and
  only do the switch if it's back to zero? (I mean, imagine 2
  calls to switch_to_irq_idle_loop followed by 1 call to
  switch_to_irq_thread, should it really come back from idle
  at this point?)

Thank you in advance.

Best regards,
Cedric.

(By the way, the l4linux kernel won't compile with 4k stacks,
you never call irq_ctx_init, maybe you should call it in
init_IRQ?)

On Thu, 10 Feb 2005, Cedric Roux wrote:

 Hello L4 Hackers,
 
 here follows a description of what I did. My questions come at the end
 of the message. Sorry for the length, but I wanted to be clear.

[SNIP]

 <0>Kernel panic: Aiee, killing interrupt handler!
 <0>In interrupt handler - not syncing
 
 I would like to know:
   1 - what's going on? I suspected some kind of weird IRQ firing because
   of the use of qemu, but as far as my investigations have told me,
   it doesn't seem to be that. I believed that one IRQ came too fast
   after a first one, so Linux was not yet out of the driver
   code, but the interrupts were enabled, thus crashing everything.
   I think I was wrong, no?
   2 - how to solve this. What code/doc should I read to debug it, where
   to dig. I am a bit confused for now.
 
 Thank you in advance for your help.
 
 Best regards,
 Cedric.

