Hi,

Something has changed in the last few weeks that has affected
the symbol binding in mdb modules.  I have not worked out
what influenced this change, nor even whether it is really
a new bug or a newly-exposed existing bug, but it does
break at least one dcmd and perhaps more.

This email is both a heads-up and a request for theories or
confessions.  A glance around the mdb source finds no recent
changes in this area (the closest is the BrandZ stuff, and it
seems innocent), so my guesses are the switch to Studio 11 or
the recent elimination of spec files (6357230).  I'll log a bug
later today.

I have found in recent weeks (in my project bfu archives, which are
in sync with onnv) that ::nvlist works from kmdb but not from
mdb -k or from mdb with a process target.  I finally took the time
to debug this today, and it turns out that the following dcmd table

usr/src/cmd/mdb/common/modules/libnvpair/libnvpair.c:

static const mdb_dcmd_t dcmds[] = {
         { NVPAIR_DCMD_NAME, NVPAIR_DCMD_USAGE, NVPAIR_DCMD_DESCR,
                 nvpair_print },
         { NVLIST_DCMD_NAME, NVLIST_DCMD_USAGE, NVLIST_DCMD_DESCR,
                 nvlist_print },
         { NULL }
};

is at runtime now picking up the nvlist_print from /usr/lib/libnvpair.so.1
rather than the intended one in usr/src/cmd/mdb/common/modules/genunix/nvpair.c
which it has certainly picked up in the past.  We get an immediate
segmentation fault.

Running mdb with LD_DEBUG=bindings on snv_47 shows:

07054: 1: binding file=/usr/lib/mdb/kvm/amd64/genunix.so to file=/usr/lib/mdb/kvm/amd64/genunix.so: symbol `nvlist_print'

Doing the same on onnv gate archives of last night shows:

100728: 1: binding file=/usr/lib/mdb/kvm/amd64/genunix.so to file=/lib/64/libnvpair.so.1: symbol `nvlist_print'
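
The two binding lines differ only in the object the symbol is bound to.
For anyone who wants to scan a long LD_DEBUG=bindings log for other
symbols that have started leaving their module, here is a minimal sketch
of a line classifier.  The function names are made up for illustration,
and the line format is assumed to match the two samples shown in this
mail (each record on a single line); it is not taken from any mdb or
linker source.

```c
#include <string.h>

/*
 * Copy the value following the next "file=" at or after *cursor into buf,
 * stopping at whitespace or ':'.  Advances *cursor past the value.
 * Returns 1 on success, 0 if no "file=" was found.
 * (Hypothetical helper, not part of any Solaris library.)
 */
static int
next_file(const char **cursor, char *buf, size_t buflen)
{
	const char *p = strstr(*cursor, "file=");
	size_t n;

	if (p == NULL)
		return (0);
	p += strlen("file=");
	n = strcspn(p, " \t\n:");
	if (n >= buflen)
		n = buflen - 1;
	memcpy(buf, p, n);
	buf[n] = '\0';
	*cursor = p + n;
	return (1);
}

/*
 * Classify one line of LD_DEBUG=bindings output.
 * Returns 1 if the symbol is bound to a different object than the one
 * referencing it, 0 if it is bound within the same object, and -1 if the
 * line does not look like a binding record.
 */
int
binding_crosses_objects(const char *line)
{
	char from[1024], to[1024];
	const char *cursor = line;

	if (strstr(line, "binding file=") == NULL)
		return (-1);
	if (!next_file(&cursor, from, sizeof (from)) ||
	    !next_file(&cursor, to, sizeof (to)))
		return (-1);
	return (strcmp(from, to) != 0);
}
```

Fed the snv_47 line this reports a same-object binding, and fed the
line from last night's archives it reports a cross-object one.  Most
cross-object bindings in a real log are of course legitimate (libc and
so on), so this only narrows down what to look at.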

To reproduce, just run ::nvlist on any nvlist you can find.  For example,
on a Nevada amd64 system:

mdb -k <<EOM
*mc_list::print mc_t mc_nvl | ::nvlist
EOM

My current fix is to rename the dcmd function to 'print_nvlist' and so avoid
the clash.  I'm not familiar with the rules for loading mdb modules and
satisfying their symbols, so I don't know whether using nvlist_print as before
was legal (if ill-advised).  I have not checked for other dcmds or walkers
that could be affected or exposed by this.
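
To illustrate the rename: the name the user types after "::" comes from
the string in the dcmd table, not from the C symbol, so changing the
function name should not change how the dcmd is invoked.  The sketch
below uses a simplified stand-in for the table; demo_dcmd_t,
demo_dcmd_lookup and DEMO_DCMD_OK are made up for illustration, the
real mdb_dcmd_t has more fields, and "nvlist" as the table string is
assumed from the ::nvlist invocation.

```c
#include <stddef.h>
#include <string.h>

#define	DEMO_DCMD_OK	0	/* hypothetical stand-in for a success code */

/* Simplified, hypothetical stand-in for a dcmd function and table entry. */
typedef int (*demo_dcmd_f)(unsigned long addr);

typedef struct demo_dcmd {
	const char *dc_name;	/* what the user types after "::" */
	demo_dcmd_f dc_funcp;	/* C function that implements it */
} demo_dcmd_t;

/*
 * Named print_nvlist rather than nvlist_print, so the symbol no longer
 * shares a name with the function exported by libnvpair.so.1.
 */
int
print_nvlist(unsigned long addr)
{
	(void) addr;		/* a real dcmd would read and print the list */
	return (DEMO_DCMD_OK);
}

static const demo_dcmd_t demo_dcmds[] = {
	{ "nvlist", print_nvlist },	/* dcmd name is unchanged */
	{ NULL, NULL }
};

/* Look up a dcmd by its user-visible name; returns NULL if not found. */
demo_dcmd_f
demo_dcmd_lookup(const char *name)
{
	const demo_dcmd_t *dc;

	for (dc = demo_dcmds; dc->dc_name != NULL; dc++) {
		if (strcmp(dc->dc_name, name) == 0)
			return (dc->dc_funcp);
	}
	return (NULL);
}
```

This only sidesteps the name clash; it says nothing about whether the
module's own definition ought to have won in the first place.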

Thanks

Gavin

-- 
Gavin Maltby, Solaris Kernel Development.
