On 5/19/07, Eric Saxe <[EMAIL PROTECTED]> wrote:
Thomas De Schampheleire wrote:
> This is run on a simulated serengeti machine.
>
> In the meanwhile, I have tried two things:
>
> 1/ I made my own sbdp_cpu_poweroff() function, which is correctly
> recognized. The execution *seems* to go well, but I nevertheless
> suspect that it is not correct. Using `psradm -v -a -n` I can verify
> which cpus were already online and which weren't. When I run this
> command a first time after my module has powered a processor off, all
> seems ok. The cpus that were powered off are put back online.
> BUT when I run the psradm command a second time (without reloading the
> module), I get an error saying that lpl_topo_verify() failed.
> Based on the error code, the problem seems to be
> LPL_TOPO_LPL_BAD_NCPU, which is triggered in
> http://src.opensolaris.org/source/xref/onnv/aside/usr/src/uts/common/os/lgrp.c
>
> on lines 2185 and 2244.
> The lpl_topo_verify() function was executed from the following stack:
> lgrp_config
> cpu_add_active_internal
> cpu_online
> p_online_internal
> p_online
Somewhere you are introducing an lgrp topology inconsistency, and
lpl_topo_verify() has noticed. You might want to look for existing callers
of lgrp_config() in the code paths you have changed, and verify that you
aren't sidestepping any calls in your new implementation. CPU_ADD/DEL are
invoked from cpu_add_unit/cpu_del_unit, and CPU_ONLINE/OFFLINE are invoked
from cpu_add_active_internal/cpu_remove_active.

The challenge you are facing is that you're changing code paths that
various subsystems hang off of to deal with reconfiguration events. :) If
you inadvertently change those paths, you risk sidestepping those hooks,
which can confuse the subsystems. :P
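
The failure mode described above can be illustrated with a small
user-space toy model. This is only a sketch, not kernel code: every name
in it (toy_topo, toy_config, toy_cpu_poweroff_buggy, and so on) is
hypothetical, and the real lgrp code tracks far more state. It assumes a
subsystem that caches a CPU count updated only through a notification
hook, plus a verifier that recounts and compares.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the OpenSolaris kernel. */
#define TOY_NCPU 4

enum toy_event { TOY_CPU_ONLINE, TOY_CPU_OFFLINE };

struct toy_topo {
    bool cpu_active[TOY_NCPU]; /* ground truth: which CPUs are active   */
    int  cached_ncpu;          /* subsystem's count, updated by the hook */
};

/* The hook a subsystem relies on to keep its cached count in sync. */
void toy_config(struct toy_topo *t, enum toy_event ev)
{
    t->cached_ncpu += (ev == TOY_CPU_ONLINE) ? 1 : -1;
}

/* Correct paths: the state change and the notification travel together. */
void toy_cpu_online(struct toy_topo *t, int id)
{
    assert(id >= 0 && id < TOY_NCPU);
    if (t->cpu_active[id])
        return;
    t->cpu_active[id] = true;
    toy_config(t, TOY_CPU_ONLINE);
}

void toy_cpu_offline(struct toy_topo *t, int id)
{
    assert(id >= 0 && id < TOY_NCPU);
    if (!t->cpu_active[id])
        return;
    t->cpu_active[id] = false;
    toy_config(t, TOY_CPU_OFFLINE);
}

/* Buggy path: changes the state but sidesteps the hook. */
void toy_cpu_poweroff_buggy(struct toy_topo *t, int id)
{
    assert(id >= 0 && id < TOY_NCPU);
    t->cpu_active[id] = false;
}

/* Verifier: recount the active CPUs and compare with the cached count.
 * Returns 0 when consistent, -1 on a mismatch. */
int toy_topo_verify(const struct toy_topo *t)
{
    int n = 0;
    for (int i = 0; i < TOY_NCPU; i++)
        if (t->cpu_active[i])
            n++;
    return (n == t->cached_ncpu) ? 0 : -1;
}
```

In this model, any sequence of toy_cpu_online()/toy_cpu_offline() calls
keeps toy_topo_verify() happy, while a single toy_cpu_poweroff_buggy()
call leaves the cached count stale, so the mismatch surfaces whenever the
verifier next runs rather than at the point of the mistake.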


That's what I was afraid of. The code in sbdp_cpu_poweroff() has quite
a few things I don't completely understand, and it's all pretty
low-level.

This is where my other question comes in: do CPU_OFFLINE and
CPU_POWEROFF actually differ from the OS point of view? Remember that
I am running OpenSolaris in a simulator. Since both CPU states
appear to be equivalent in their behavior, I can just stay in the
CPU_OFFLINE state (this is how the OS perceives it), and signal to the
simulator that we actually want to be powered off. As I mentioned,
offlining worked without a problem, and since my time is limited I
will use this approach instead.

Thanks again for the help,
Thomas

-Eric

_______________________________________________
opensolaris-code mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/opensolaris-code
