Ok, I'm finally back from all the traveling I have been doing, and I can
focus some effort on this. Thanks for sending the full logs, that gave
me enough information to go on.
I just pulled the current git and I didn't have any problem, but I don't
have a machine with a SMIC interface. However, I can't see how that
would make a difference in this case.
Looking at your trace and the code, I can't see anything wrong.
The place where it says:
start_smic_transaction - 18 08
is the last message sent to the BMC as part of startup. That's what the
"get_guid()" function in ipmi_msghandler.c does, and it does get a
response. It returns an error response, but that's ok, as many systems
do not have a GUID and either way it should kick off the initialization
code again (and the GUID code has been there a while). At this point
the IPMI interface is fully set up and operational; it's just doing
some housekeeping for setting up the sysfs and proc information. (The
channel scan will not occur because this is an IPMI 1.0 system.)
I don't think bisecting is going to help, as the code changes in
question are not going to have anything to do with where the break
appears to occur. From what I can tell, one of the following things is
happening:
* Somehow the wakeup to get_guid() is not happening properly.
* add_proc_entries() in ipmi_msghandler.c is hanging someplace.
* The sysfs initialization in ipmi_bmc_register() is hanging.
* The proc entries added in try_smi_init() in ipmi_si_intf.c are
hanging.
So, can you check the following after attempting to load the module?
* Can you look in /proc/ipmi and /proc/ipmi/0 to see what exists? If
/proc/ipmi/0 exists, and the version, ipmb, and stats files exist
in it, that means add_proc_entries() succeeded. If type,
si_stats, and params exist, that means initialization should be
complete.
* Can you look in /sys/class/ipmi? If ipmi0 exists there, that
means the sysfs code probably worked ok.
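If it is easier than poking around by hand, the two checks above could be
scripted roughly like this. It is only a sketch: the file names are the ones
listed above, interface number 0 is assumed, and the function name, the
result keys, and the "root" parameter (there only so it can be tried against
a scratch directory) are made up for illustration.

```python
import os

# File names taken from the checks described above.
MSGHANDLER_FILES = ("version", "ipmb", "stats")   # expected from add_proc_entries()
SI_FILES = ("type", "si_stats", "params")         # expected from try_smi_init()


def check_ipmi_init(root="/", intf=0):
    """Report which of the expected proc and sysfs entries exist.

    root and intf are illustrative parameters: root lets the check run
    against a scratch directory, intf is the assumed interface number.
    """
    proc_dir = os.path.join(root, "proc", "ipmi", str(intf))
    sysfs_dev = os.path.join(root, "sys", "class", "ipmi", "ipmi%d" % intf)

    def all_exist(names):
        return all(os.path.exists(os.path.join(proc_dir, n)) for n in names)

    return {
        "proc_dir": os.path.isdir(proc_dir),
        "add_proc_entries": all_exist(MSGHANDLER_FILES),
        "si_proc_entries": all_exist(SI_FILES),
        "sysfs": os.path.exists(sysfs_dev),
    }


if __name__ == "__main__":
    for name, ok in check_ipmi_init().items():
        print("%-18s %s" % (name, "present" if ok else "MISSING"))
```

Anything reported as MISSING points at the step that probably did not
finish, though it cannot say why.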
I'm guessing this is either some transient bug someplace else or some
latent bug in the IPMI code that is being exercised by some other change.
If you are really adventurous, you could compile the kernel with the
MAGIC_SYSRQ config enabled, then do a sysrq-T to get a backtrace of all
tasks. Then hunt down the modprobe task. A serial console is the best
for this, if you have one, because you can capture the output to a file easily.
That would tell exactly where it is hanging. However, we are getting
pretty far into the kernel hacker realm here.
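If you do capture the sysrq-T output to a file, something like the
following could help find the modprobe entry in it. This is only a rough
sketch: it does a plain substring match and prints a fixed number of lines
after each hit, so the 25-line default and the function name are arbitrary
choices, and the layout of the dump itself is not assumed.

```python
import sys


def lines_after_match(lines, needle="modprobe", context=25):
    """Yield each line containing needle plus the next `context` lines."""
    remaining = 0
    for line in lines:
        if needle in line:
            # Restart the window on every hit so overlapping hits merge.
            remaining = context + 1
        if remaining > 0:
            yield line.rstrip("\n")
            remaining -= 1


# Pass the path of the saved console log as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as log:
        for out in lines_after_match(log):
            print(out)
```

If the trace is longer than the window, raise the context value until the
whole call trace for the task shows up.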
Thanks,
-corey
_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer