Hi,

After some more debugging I think I've solved the problem (at last :)
An explanation follows.

First, although it initially seemed that the modules causing the
problem were the networking-related ones (as David Fernández said
in his mail), that assumption was wrong. In fact, the modules causing
the problem are the ones that print a message to the "kernel message
buffer" (I don't know the right name, but I'm referring to the
message log shown by the dmesg command) when modprobe loads them.
For example, ip_tables prints something like "ip_tables: (C)
2000-2006 Netfilter Core Team".
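A quick way to check whether a given module is one of the "talkative"
ones is to compare the dmesg output before and after loading it. The
following is only a minimal sketch: kmsg_on_load is a hypothetical
helper name, the DMESG/MODPROBE override variables are an assumption
added so the function can be tried without root, and counting lines is
unreliable if the kernel log buffer wraps in between.

```shell
#!/bin/sh
# Sketch: show what a module writes to the kernel log when it is loaded.
# DMESG and MODPROBE are overridable (an assumption of this sketch).
DMESG=${DMESG:-dmesg}
MODPROBE=${MODPROBE:-modprobe}

kmsg_on_load() {
    module=$1
    # Number of log lines before the module is loaded.
    before=$($DMESG | wc -l)
    $MODPROBE "$module" || return 1
    # Print only the lines that appeared after loading the module.
    $DMESG | tail -n +"$((before + 1))"
}

# Example (run as root inside the UML guest):
#   kmsg_on_load ip_tables
```

If the function prints anything for a module, that module writes to the
kernel log on load and is a candidate for triggering the hang.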

The solution to the problem is to use con1 instead of con0, that is:

./linux ubd0=/tmp/root_fs_debug con=null con1=pts uml_dir=/tmp umid=run

instead of

./linux ubd0=/tmp/root_fs_debug con=null con0=pts uml_dir=/tmp umid=run

In that case 'iptables -L' works without problems.
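To make the difference between the two invocations explicit, here is a
small sketch that builds the command line from the number of the
console that gets the pts. uml_cmdline is a hypothetical helper, not
part of UML; the paths and umid are the ones used in this thread, and
the con0=pts form is taken from the explanation of the hang in this
message.

```shell
#!/bin/sh
# Sketch: build the UML command line, choosing which console gets a pts.
# All other consoles stay on null because of con=null.
uml_cmdline() {
    pts_console=${1:-1}   # 1 = working setup, 0 = setup that hangs
    echo "./linux ubd0=/tmp/root_fs_debug con=null con${pts_console}=pts uml_dir=/tmp umid=run"
}

# Example:
#   uml_cmdline      # console 1 on a pts (workaround)
#   uml_cmdline 0    # console 0 on a pts (hangs on modprobe)
```

The function only prints the command; run the printed line by hand to
boot the guest.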

Why? When the module is loaded, it prints its message not only to the
internal kernel message buffer but also to con0 (this can be checked
by simply running "./linux ubd0=/tmp/root_fs_debug", without any con=
redirections). When con0 is redirected to null (con0=null) there is no
problem, but if con0 is redirected to a pts (con0=pts) I guess that, at
the moment the message is printed, some problem occurs with the output
(as Jeff suggests in his mail), causing the VM to hang.

This also explains why putting the module in /etc/modules works. The
modules listed in /etc/modules are loaded before UML assigns the
virtual consoles to pts devices (the sequence can be checked in the
boot log).

I think that my solution is more a workaround than a definitive
solution. Why can't a module print its message when con0 is assigned
to a pts (it hangs the VM), while it works when con0 is assigned to
null? Is there a bug in the UML kernel that needs to be fixed? Or maybe
the bug is in modprobe? I leave the question open for the experts in
UML internals... :)

Regarding the tests suggested by Paolo:

> Please try logging in via SSH and reproducing the problem and the stacktrace, 
> and also removing con=null - also have you double checked con=null is ok 
> (maybe it was con=none, I'm not sure). I'm not sure screen is perfectly safe 
> to use (it should be).

Do you really need me to perform these tests, or do you consider the
report above enough? If they are really needed I can do them, but it
would take me some time (and maybe it isn't a good idea now, because
they won't provide additional useful information :)

Best regards,

--------------------
Fermín Galán Márquez
CTTC - Centre Tecnològic de Telecomunicacions de Catalunya
Parc Mediterrani de la Tecnologia, Av. del Canal Olímpic s/n, 08860
Castelldefels, Spain
Room 1.02
Tel : +34 93 645 29 12
Fax : +34 93 645 29 01
Email address: fermin dot galan at cttc dot es

_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
