Hi,

Thread update:
> *cladm_dbg/s
0xffffff04f55ad008: th ffffff04f834b580 tm 17107404: failfastd(385):start
th ffffff04f834b580 tm 17107413: failfastd(385):fork1
th ffffff04f834b580 tm 17107464: failfastd(385):fork1
th ffffff04f834b580 tm 17107465: failfastd(385):done
th ffffff04f8349880 tm 17107509: failfastd(393):fork1
th ffffff04f8351c20 tm 17107541: cl_exec384,1:Main: Default sched class = 1
th ffffff04f8351c20 tm 17107543: cl_exec384,1:Main: starting the cl_exec service
th ffffff04f8351c20 tm 17107744: cl_exec384,1:Main: wait for daemon to be ready
th ffffff04f8351c20 tm 17107753: cl_exec384,1:Main: cl_exec server object : <cl_exec.1>
th ffffff04f8349880 tm 17107857: failfastd(393):ready
th ffffff04f8349880 tm 17107866: failfastd(393):synchro file
th ffffff04f8349880 tm 17107869: failfastd(393):write pipe
th ffffff04f834b580 tm 17107870: failfastd(385):read pipe
th ffffff04f834b580 tm 17107871: failfastd(385):exit
th ffffff04ea72be20 tm 17107892: cl_exec394,1:Worker: create daemon process
th ffffff04ea72be20 tm 17107945: cl_exec394,1:Worker: starting
th ffffff04ea72be20 tm 17107950: cl_exec394,1:Worker: create signals thread
th ffffff04f8348a80 tm 17107961: cl_exec394,3:signal thread starting
th ffffff04f8348e00 tm 17108028: cl_exec395,1:Daemon: starting daemon process
th ffffff04f8348e00 tm 17108101: cl_exec395,1:Daemon: create signals thread
th ffffff04eb1e0580 tm 17108110: cl_exec395,3:signal thread starting
th ffffff04f8348e00 tm 17108223: cl_exec395,1:Daemon: bind server object <cl_exec.1>
th ffffff04f8351c20 tm 17117775: cl_exec384,1:Main: wait for cl_exec obj
th ffffff04f8351c20 tm 17117839: cl_exec384,1:Main: cl_exec obj resolved in name server
th ffffff04f8351c20 tm 17117841: cl_exec384,1:Main: daemon is ready
th ffffff04f8351c20 tm 17117841: cl_exec384,1:Main: service is online
th ffffff04f833cb00 tm 17122308: clexec405,1:main
th ffffff04f833cb00 tm 17122312: clexec405,1:daemonize
th ffffff04f833cb00 tm 17122363: clexec405,1:daemonize fork
th ffffff04f833cb00 tm 17122363: clexec405,1:wait_for_daemon
th ffffff04f833c780 tm 17122420: clexec406,1:create_process_pair fork1
th ffffff04f833c780 tm 17122464: clexec406,1:daemon_process
th ffffff04f833c400 tm 17122513: clexec407,1:worker_process
th ffffff04f833c780 tm 17122927: clexec406,1:daemon_process ready
th ffffff04f833cb00 tm 17132542: clexec405,1:wait ha_mounter
th ffffff04f833c400 tm 17132593: clexec407,1:wait signal thread
th ffffff04f833cb00 tm 17132603: clexec405,1:nameserver resolved
th ffffff04f833cb00 tm 17132612: clexec405,1:end file created
th ffffff04f833cb00 tm 17132613: clexec405,1:main_end
th ffffff04f9f54760 tm 17698695: cmm_callback_worker:ha_mounter.1 exec /usr/cluster/lib/sc/run_reserve -c reset_shared_bus
th ffffff04f8339c20 tm 17698705: clexec406,11:execit</usr/cluster/lib/sc/run_reserve -c reset_shared_bus>
th ffffff04faf808a0 tm 17698729: clexec407,3:worker_thread fork1 </usr/cluster/lib/sc/run_reserve -c reset_shared_bus> fd 3
th ffffff04f9f48740 tm 17698876: clexec682,3:execl </usr/cluster/lib/sc/run_reserve -c reset_shared_bus>
th ffffff04f833c080 tm 17707441: clexec407,2:catch signal 18 18 si_code 1 si_pid 682 si_uid 3
th ffffff04faf808a0 tm 17707444: clexec407,3:<0> fd 3 retval 0 data.len 6
th ffffff04f8339c20 tm 17707457: clexec406,11:execit</usr/cluster/lib/sc/run_reserve -c reset_shared_bus> error 0
th ffffff04f9f54760 tm 17707462: cmm_callback_worker:ha_mounter.1 exec /usr/cluster/lib/sc/run_reserve -c reset_shared_bus excep 0
th ffffff04fa50cc80 tm 17709101: mount_client_impl::activate:ha_mounter.1 alive 1 except 0
th ffffff04f833c080 tm 17717030: clexec407,2:do_log clexecd: Got an unexpected signal 18 in process work_process (pid=407, ppid=406)
th ffffff04f833c080 tm 17717055: clexec407,2:do_exit 1
th ffffff04f833b880 tm 17717415: clexec406,3:do_log clexecd: Daemon exiting because child died.
th ffffff04f833b880 tm 17717437: clexec406,3:do_exit 4

So this command fails on boot:
/usr/cluster/lib/sc/reserve -c reset_shared_bus -h ohac-test-2

Btw, I tried that on b124 and b127.

On Wed, Nov 25, 2009 at 11:53 AM, Piotr Jasiukajtis <estseg at gmail.com> wrote:
> Hi,
>
> I have built OHAC core+agents against ON b127.
>
> Once the cluster is booted, the 'clexecd' daemon crashes and failfast
> restarts the node.
>
> Oct 10 14:43:10 ohac-test-2 : [ID 719008 daemon.error] clexecd: Got an unexpected signal 18 in process work_process (pid=427, ppid=426)
> Oct 10 14:43:10 ohac-test-2 Cluster.Framework: [ID 899305 daemon.error] clexecd: Daemon exiting because child died.
> Oct 10 14:43:13 ohac-test-2 savecore: [ID 570001 auth.error] reboot after panic: Failfast: Aborting zone "global" (zone ID 0) because "clexecd" died 30 seconds ago.
>
> Has anyone from Sun tried to build OHAC core against the latest ON
> build? If so, do you have a patch for that or something?
>
> Thanks and regards,
>
> --
> Piotr Jasiukajtis | estibi | SCA OS0072
> http://estseg.blogspot.com

--
Piotr Jasiukajtis | estibi | SCA OS0072
http://estseg.blogspot.com