Hi,

On 11/6/2015 1:08 PM, Girish Nagaraj wrote:
> Does it help to check this issue on a physical machine?

Yes, that would be a good option for debugging: it lets us rule out all problems related to the VM's virtual network adapter.
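On the "conn lost with dh server" question: going only by the log line quoted further down (MDTM:socket_recv() = 0, ... err :Success), the receive call returned 0. On a stream socket that normally means the peer closed its end of the connection. Because that is not an error, errno is not set, which would explain the trailing "Success". Below is a minimal Python sketch of that socket behavior. It is not OpenSAF code: the function name demo_peer_close is made up for illustration, and a local socketpair stands in for the real TCP connection.

```python
import socket


def demo_peer_close():
    """Show that recv() returns 0 bytes once the peer has closed.

    Hypothetical stand-in: 'client' plays the process that logs the
    message, 'server' plays the process it was connected to.
    """
    client, server = socket.socketpair()

    # While the peer is alive, recv() returns the bytes it sent.
    server.sendall(b"ping")
    first = len(client.recv(4096))

    # The peer closes its end (or exits). No exception is raised on
    # the client side; recv() simply reports end-of-stream with an
    # empty result, i.e. a return value of 0 bytes.
    server.close()
    second = len(client.recv(4096))

    client.close()
    return first, second


if __name__ == "__main__":
    print(demo_peer_close())
```

Running it prints (4, 0): four bytes while the peer is up, then zero bytes once it has gone away. So the message by itself only says that the other end closed the connection. Which process that was, and why it closed, has to come from that process's own logs.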
-AVM

On 11/6/2015 1:08 PM, Girish Nagaraj wrote:
> Hi Mahesh,
>
> If this is an issue in OpenSAF or in some configuration that I have missed,
> is it possible to get any clue from the logs? Which logs should I
> enable/check?
>
> I believe "connection lost with dh server" means it interprets that the
> other process it is communicating with went down?
>
> Which process is rde communicating with after OpenSAF starts? I want
> to see whether it is still up or went down for some reason, and whether
> it prints any log message.
>
> Does it help to check this issue on a physical machine?
>
> Regards,
> Girish
>
> On Thu, Nov 5, 2015 at 4:13 AM, A V Mahesh
> <[email protected] <mailto:[email protected]>> wrote:
>
>     Hi Girish Nagaraj,
>
>     I tested with the configuration below: internal network,
>     Intel PRO/1000 MT Desktop, with static IP 192.168.56.101.
>     We didn't find any issue; OpenSAF comes up.
>
>     It looks like a system-specific issue.
>
>     -AVM
>
>     On 11/5/2015 2:00 PM, Girish Nagaraj wrote:
>> Hi Mahesh,
>>
>> I tried with the internal network, Intel PRO/1000 MT Desktop. The same
>> behavior is seen.
>>
>> Can you please try with the internal network?
>>
>> Regards,
>> Girish
>>
>> -----Original Message-----
>> From: A V Mahesh [mailto:[email protected]]
>> Sent: Thursday, November 05, 2015 12:32 PM
>> To: Girish Nagaraj; [email protected]
>> <mailto:[email protected]>
>> Subject: Re: [users] opensaf start fails in debian
>>
>> Hi,
>>
>> Currently I don't have a "Realtek PCIe GBE Family Network Controller",
>> so I may not be able to reproduce the problem. However, I have seen
>> reports on the net that the "Realtek PCIe GBE Family Network Controller"
>> "Keeps Losing Connection", so try some other adapter or try to
>> update the Realtek drivers.
>>
>> -AVM
>>
>> On 11/5/2015 11:36 AM, Girish Nagaraj wrote:
>>> Hi Mahesh,
>>>
>>> I am using a 32-bit machine; I haven't checked with a 64-bit machine.
>>>
>>> Yes, it is a debian-i386 VM on Oracle VirtualBox. Ethernet adapter type:
>>> Bridged Adapter - Realtek PCIe GBE Family Controller
>>>
>>> Regards,
>>> Girish
>>>
>>> -----Original Message-----
>>> From: A V Mahesh [mailto:[email protected]]
>>> Sent: Thursday, November 05, 2015 11:03 AM
>>> To: [email protected]
>>> <mailto:[email protected]>
>>> Subject: Re: [users] opensaf start fails in debian
>>>
>>> Hi Girish Nagaraj,
>>>
>>> Is this issue reproducible only on debian-i386 (that is, specific to
>>> 32-bit systems)? We just brought up OpenSAF on a 64-bit system
>>> configured for simplex with `./immxml-clustersize -s 1`, and it works.
>>>
>>> Based on the logs, it doesn't look like an imm.xml configuration issue.
>>>
>>> Can you please describe your node setup? Is this debian-i386 a virtual
>>> machine, a physical system, or a container? If it is a virtual machine,
>>> what is your Ethernet adapter type (Host-only networking / Bridged
>>> networking adapter)?
>>>
>>> -AVM
>>>
>>> On 11/5/2015 10:08 AM, Girish Nagaraj wrote:
>>>> Hi,
>>>>
>>>> I have Debian Jessie, on which I have installed opensaf-4.6 and
>>>> configured it for simplex with ./immxml-clustersize -s 1
>>>>
>>>> Starting opensaf fails:
>>>>
>>>> #/usr/local/sbin>/etc/init.d/opensafd start
>>>>
>>>> [....] Starting opensafd (via systemctl): opensafd.serviceJob for opensafd.service failed. See 'systemctl status opensafd.service' and 'journalctl -xn' for details.
>>>>
>>>> failed!
>>>>
>>>> #/usr/local/sbin>systemctl status opensafd.service
>>>>
>>>> ● opensafd.service - OpenSAF daemon
>>>>    Loaded: loaded (/lib/systemd/system/opensafd.service; disabled)
>>>>    Active: failed (Result: timeout) since Wed 2015-11-04 23:22:13 EST; 14s ago
>>>>   Process: 5508 ExecStart=/etc/init.d/opensafd start (code=exited, status=0/SUCCESS)
>>>>
>>>> Nov 04 23:20:52 debian-i386 osafamfnd[5658]: NO Assigning 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
>>>> Nov 04 23:20:52 debian-i386 osafimmnd[5585]: NO Implementer connected: 5 (safCheckPointService) <217, 2060f>
>>>> Nov 04 23:20:52 debian-i386 osafamfnd[5658]: NO Assigned 'safSi=SC-2N,safApp=OpenSAF' ACTIVE to 'safSu=SC-1,safSg=2N,safApp=OpenSAF'
>>>> Nov 04 23:20:52 debian-i386 opensafd[5799]: ZebHA(1.2.0 - 0:000000000000) services successfully started
>>>> Nov 04 23:20:52 debian-i386 osafimmnd[5585]: NO Implementer connected: 6 (safEvtService) <216, 2060f>
>>>> Nov 04 23:20:52 debian-i386 opensafd[5508]: Starting ZebHA Services (Using TCP):.
>>>> Nov 04 23:22:13 debian-i386 osafrded[5553]: MDTM:socket_recv() = 0, conn lost with dh server, exiting library err :Success
>>>> Nov 04 23:22:13 debian-i386 systemd[1]: opensafd.service start operation timed out. Terminating.
>>>> Nov 04 23:22:13 debian-i386 systemd[1]: Failed to start OpenSAF daemon.
>>>> Nov 04 23:22:13 debian-i386 systemd[1]: Unit opensafd.service entered failed state.
>>>>
>>>> #/usr/local/sbin>
>>>>
>>>> Nov 04 23:22:13 debian-i386 osafrded[5553]: MDTM:socket_recv() = 0, conn lost with dh server, exiting library err :Success
>>>>
>>>> It fails at:
>>>>
>>>> Breakpoint 1, mdtm_process_poll_recv_data_tcp () at mds_dt_trans.c:588
>>>>
>>>> 588 syslog(LOG_ERR, "MDTM:socket_recv() = %d, conn lost with dh server, exiting library err :%s", recd_bytes, strerror(errno));
>>>>
>>>> #0 mdtm_process_poll_recv_data_tcp () at mds_dt_trans.c:588
>>>> #1 0xb766d89e in mdtm_process_recv_events_tcp () at mds_dt_trans.c:767
>>>> #2 0xb76b5efb in start_thread (arg=0xb744df40) at pthread_create.c:309
>>>> #3 0xb755a62e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:129
>>>>
>>>> Please let me know how to fix/debug this issue.
>>>>
>>>> I tried changing cluster_id many times, didn’t help.
>>>>
>>>> Regards,
>>>> Girish
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Opensaf-users mailing list
>>> [email protected]
>>> <mailto:[email protected]>
>>> https://lists.sourceforge.net/lists/listinfo/opensaf-users

------------------------------------------------------------------------------
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
