Hello !
I am helping some guys with migration to DTrace in their (quite big)
application. during the migration they hit very strange problem.
Here is the background:
1. The application is 64 bit (amd64).
2. The application manages storage and uses config_admin(3CFGADM) to
do it dynamically.
3. The application is heavily instrumented with USDT probes (thousands of them).
The issue we observe is as that when the application is running under
dtrace as soon as one of the APIs from config_admin(3CFGADM) is called
the application hangs. top shows everything is idle, but dtrace, which
consumes around 24% of CPU (entirely in the kernel, according to top).
At this point it can last for hours without making any visible
progress. We've tried different things to pinpoint the problem. For
example, recompiling everything to 32bit doesn't exhibit this issue at
al. Also if we ran the the application with only small number of
probes enabled (couple of dozens) it manages to get over config_admin
call, albeit _very_ slowly. When stuck, as soon as the dtrace process
is terminated, the application resumes and continues as expected.
Judging by other means we feel that the problem manifests itself when
config_admin loads /usr/lib/64/libsun_fc.so.1 object. This object
comes as the final link of the following chain:
/usr/lib/amd64/libcfgadm.so.1
/usr/lib/cfgadm/amd64/fp.so.1
/usr/lib/64/libHBAAPI.so.1
/usr/lib/64/libsun_fc.so.1
A number of times we've managed to crash the application when
terminated the dtrace with Ctrl-C. When this happened the stack always
showed the rtld_db_dlactivity at the top of the stack (see below).
I am scratching my head trying to figure out where the problem is, or
what is the possible workaround can be. Is there any way to diagnose
the problem further ? I'd appreciate any help.
cc5r1:/var/core# mdb core.1746
Loading modules: [ libumem.so.1 libc.so.1 libuutil.so.1 libavl.so.1
libnvpair.so.1 libsysevent.so.1 ld.so.1 ]
> $C
fffffd7fd11786b0 ld.so.1`rtld_db_dlactivity()
fffffd7fd11786f0 ld.so.1`leave+0x3d()
fffffd7fd1178760 ld.so.1`call_init+0x126()
fffffd7fd1178790 ld.so.1`load_completion+0x10e()
fffffd7fd1178810 ld.so.1`dlmopen_intn+0x223()
fffffd7fd1178830 ld.so.1`dlmopen_check+0xa6()
fffffd7fd1178860 ld.so.1`dlopen+0x43()
fffffd7fd1178ba0 libHBAAPI.so.1`HBA_LoadLibrary+0x287()
fffffd7fd1179840 fp.so.1`findMatchingAdapterPort+0x3a()
fffffd7fd1179b10 fp.so.1`fca_change_state+0x38()
fffffd7fd1179e50 fp.so.1`cfga_change_state+0x334()
fffffd7fd117ab40 libcfgadm.so.1`config_change_state+0x11e()
fffffd7fd117abd0
iopd.so`__1cLConfigAdminLchangeState6MnKcfga_cmd_t_ipkpcbbI_nKcfga_err_t__+0x177()
fffffd7fd117ac50
iopd.so`__1cLConfigAdminLchangeState6MnKcfga_cmd_t_pnPFixLenStrVector4UbE__bbI_nKcfga_err_t__+0xd4()
fffffd7fd117ac80
iopd.so`__1cLConfigAdminRunconfigureStatic6Mb_nKcfga_err_t__+0x3f()
fffffd7fd117acd0
iopd.so`__1cLConfigAdminRunconfigureStatic6Mpkc_nKcfga_err_t__+0x7f()
fffffd7fd117ad50
iopd.so`__1cLConfigAdminFretry6Mm0A_Mpkc_nKcfga_err_t_iII2_b_+0x96()
fffffd7fd117b150
iopd.so`__1cPiop_smon_serverIiop_init6MpnPsmon_iop_init_s__nNcllf_status_e__+0x13d()
fffffd7fd117bcf0
iopd.so`__1cPiop_smon_serverGserver6MpcLpnJdoor_desc_I_v_+0x61d()
fffffd7fd117fd70 libcommon.so.1`server_proc+0xc3()
0000000000000000 libc.so.1`__door_return+0x4d()
--
Regards,
Cyril
_______________________________________________
dtrace-discuss mailing list
[email protected]