Yesterday I noticed that VirtualBox no longer works on my snv_91 system, which I
had bfu'ed to post-snv_94 OpenSolaris bits (compiled from 2008-07-11 sources).

A quick check with an unmodified snv_91 and with the xen matrix-unstable bits
(based on snv_93, booted on metal) shows that VirtualBox starts fine on those
older onnv-gate bits.

The problem with the post-snv_94 bits is that the VBoxSVC process crashes with a
SIGSEGV:

    % virtualbox
    ERROR: 0 bytes read from child process

dmesg:

    Jul 14 00:18:01 max genunix: [ID 603404 kern.notice] NOTICE: core_log: VBoxSVC[869] core dumped: /cores/VBoxSVC-869


# pflags /cores/VBoxSVC-869
core '/cores/VBoxSVC-869' of 869:       /opt/VirtualBox//VBoxSVC --automate
        data model = _ILP32  flags = ORPHAN|MSACCT|MSFORK
 /1:    flags = 0
        sigmask = 0xffffbefc,0x0000ffff  cursig = SIGSEGV

# pstack /cores/VBoxSVC-869
core '/cores/VBoxSVC-869' of 869:       /opt/VirtualBox//VBoxSVC --automate
 afc83d14 PR_EnterMonitor (0) + 23
 af41133a _PR_InitLinker (af57b000, 80418f8, af56493c, 0, 1, 0) + 3e
 af41a4e1 PR_Init  (0, 1, 0) + 195
 af56493c prldap_nspr_init (afffc7dc, af5a0a00, 9, 8041944, affd38fc, afffc178) + 74
 af5658e9 _init    (afffc178, af7e0948, afffc7dc, 804196c, affd671e, af156e5c) + 25
 affd38fc call_init (af1a0ae8, 1) + f8
 affd3e9f load_completion (af7e0948) + ef
 affd90f6 dlsym_intn (af7e0e48, afc9c94a, afb70018, 8041a24) + 19a
 affd9172 dlsym_check (af7e0e48, afc9c94a, afb70018, 8041a24) + 6e
 affd91ea dlsym    (af7e0e48, afc9c94a, af7e0948, afb9ee94, afbf16d4, afffc350) + 4e
 afc6d758 pr_FindSymbolInProg (afc9c94a, affc7ed4, afb70018, 2f08) + 38
 afc6d790 _PR_InitZones (afc84b79, afcb43e4, 8041aa8, afc7254b, 8041ab8, afb70018) + 21
 afc723b7 _PR_InitStuff (8041ab8, afb70018, 8041ab8, afc84b79, 8041ad8, afcb43e4) + 2d
 afc7254b _PR_ImplicitInitialization (8041ad8, afcb43e4, 8041ad8, afc50d32, afc50e22, afcf86c8) + b
 afc84b79 PR_GetCurrentThread (afc50e22, afcf86c8, 0, afcf86c8, 8041af8, afcb43e4) + 23
 afc50d32 _ZN9nsIThread10GetCurrentEPPS_ (afcf86c8, 0, 0, afc50dfe) + 1e
 afc50e22 _ZN9nsIThread13SetMainThreadEv (af722a00, 8041b20, af946f74, 3, 8041b2c, affd003d) + 30
 afc891a0 NS_InitXPCOM2 (8046cf0, 8278980, 82771a8, 0) + 30
 0818c585 _ZN3com10InitializeEv (0, 0, 8273a18, 1) + 4c1
 08179c8f main     (2, 8046e80, 8046e8c) + 20b
 080b26e8 _start   (2, 8047020, 8047039, 0, 8047044, 8047064) + 80

In mdb we see that libnspr4.so`_PR_InitLinker+0x39 calls PR_EnterMonitor,
which got resolved to "VBoxXPCOM.so`PR_EnterMonitor". Apparently there is
also a "libnspr4.so`PR_EnterMonitor" symbol.  Most likely the expected behavior
is to call libnspr4.so`PR_EnterMonitor.
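
One possible way to double-check which object PR_EnterMonitor gets bound to,
without going through mdb, is the runtime linker's binding diagnostics
(LD_DEBUG=bindings, see ld.so.1(1)). The sketch below is only a small filter
for that output; the helper name bindings_for is made up for illustration, and
I have not captured the actual output on the affected build, so none is shown.

```shell
# bindings_for SYMBOL (hypothetical helper, not part of any tool):
# read runtime-linker debug output on stdin and keep only the
# "binding" lines that mention SYMBOL.
bindings_for() {
    grep -F "binding" | grep -F -- "$1"
}

# Intended use on the affected system (assumption: LD_DEBUG output goes
# to stderr, hence the 2>&1):
#   LD_DEBUG=bindings /opt/VirtualBox/VBoxSVC --automate 2>&1 | bindings_for PR_EnterMonitor
```

If the reading of the mdb session is right, the line for the reference from
libnspr4.so should name VBoxXPCOM.so as the target on the post-snv_94 bits.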

It seems that part of the problem is the lazy-loaded libnspr4.so, which is
used by /usr/lib/libldap.so.5. Both libnspr4.so and VBoxXPCOM.so seem
to contain lots of duplicate symbol definitions. (This might be a VirtualBox
problem; maybe they should link dynamically against libnspr4.so, or something
like that ...)  It seems that with older Nevada builds the lazy-loaded
libnspr4.so isn't loaded at all, but with current OpenSolaris bits it gets
loaded, and this causes problems.


Workaround:
===========

Start VBoxSVC with the environment variable LD_NODIRECT=1 set:


# cd /opt/VirtualBox

# mv VBoxSVC VBoxSVC.real

# cat > VBoxSVC <<'EOF'
#!/bin/sh

LD_NODIRECT=1
export LD_NODIRECT

exec /opt/VirtualBox/VBoxSVC.real "$@"
EOF

# chmod +x VBoxSVC
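
The same steps could also be wrapped in a small function, so that running them
a second time does not move the wrapper over the real binary. This is just a
sketch: install_nodirect_wrapper is a name I made up, and it takes the install
directory as an argument instead of hard-coding /opt/VirtualBox.

```shell
# install_nodirect_wrapper DIR  (hypothetical helper, not shipped with VirtualBox)
# Moves DIR/VBoxSVC to DIR/VBoxSVC.real once, then writes a wrapper script
# that sets LD_NODIRECT=1 before exec'ing the real binary.
install_nodirect_wrapper() {
    dir=$1
    if [ ! -f "$dir/VBoxSVC" ]; then
        echo "no VBoxSVC in $dir" >&2
        return 1
    fi
    # Only rename when VBoxSVC.real does not exist yet; on a second run
    # VBoxSVC is already the wrapper and must not replace the real binary.
    if [ ! -f "$dir/VBoxSVC.real" ]; then
        mv "$dir/VBoxSVC" "$dir/VBoxSVC.real" || return 1
    fi
    # Unquoted here-document: $dir is expanded now, \$@ is kept for the wrapper.
    cat > "$dir/VBoxSVC" <<EOF
#!/bin/sh
LD_NODIRECT=1
export LD_NODIRECT
exec "$dir/VBoxSVC.real" "\$@"
EOF
    chmod +x "$dir/VBoxSVC"
}
```

As root this would be called as "install_nodirect_wrapper /opt/VirtualBox".
Note that a VirtualBox upgrade that reinstalls VBoxSVC would need the function
to be run again after removing the stale VBoxSVC.real.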
 
 
This message posted from opensolaris.org
