[email protected] writes:
> Hello,
> 
> Thank you for using Dell EMC OMSA.
> 
> Kindly note that, OMSA is not officially certified and supported on CentOS. 
> However, we have tried to reproduce on R730 server and CentOS 6.8 64-bit. We 
> are unable to reproduce the issue.
> 
> Can you please help us providing more details about the issue?
> 
> Whether you have installed directly OMSA8.4 version or it has been upgraded 
> from different version?
 
 In brief: it seems to fail on machines that are not running the latest
 CentOS 6.8, i.e. those running 6.4/6.6.

 All machines were upgraded to 8.4 last October and had been working fine
 since. When I noticed what was happening, I removed 8.4 completely and
 reinstalled from scratch, with no change. By completely, I mean yum remove
 and then rm -rf /opt/dell.

> Can you share us the coredump or stack trace of the crash?
 
 I don't get much for either dsm_sa_eventmgrd or dsm_sa_snmpd.

[root@host ]# /opt/dell/srvadmin/sbin/srvadmin-services.sh start ; strace -p $(ps -ef |awk '/\/opt\/dell\/srvadmin\/sbin\/dsm_sa_snmpd/{print $2}')
Starting Systems Management Device Drivers:
Starting dell_rbu:                                         [  OK  ]
Starting ipmi driver: 
Already started                                            [  OK  ]
Starting Systems Management Data Engine:
Starting dsm_sa_datamgrd:                                  [  OK  ]
Starting dsm_sa_eventmgrd:                                 [  OK  ]
Starting dsm_sa_snmpd:                                     [  OK  ]
Starting DSM SA Shared Services:                           [  OK  ]

Starting DSM SA Connection Service:                        [  OK  ]

Process 12017 attached - interrupt to quit
rt_sigtimedwait([INT TERM], NULL, NULL, 8 <unfinished ...>
+++ killed by SIGSEGV +++
[root@host ~]# 

> Have you upgraded any OS pkg's?

 Since last July, no packages have been updated or installed other than
 dell-system-update, dsucatalog and invcol. That applies to the 6.4/6.6
 systems.

> Any other useful data for reproducing the issue?
 
 It's a bizarre problem: it affects machines that have seen no updates in the
 past year other than OMSA/dsu etc. The output below is from a machine where
 everything was working fine up until a few minutes ago. It runs CentOS 6.6
 and had OMSA 8.4 installed last October.

[root@host log]# cat yum.log
Jan 24 11:25:35 Updated: dsucatalog-17.01.00-TDDR9.noarch
Jan 24 11:25:36 Updated: dell-system-update-1.3.1-17.01.00.x86_64
Jan 24 13:05:16 Erased: dsucatalog
Jan 24 13:17:57 Installed: dsucatalog-17.01.00-TDDR9.noarch
Jan 24 13:29:58 Installed: invcol_WF06C_LN64_16.12.200.896_A00-16.12.200.896-WF06C.x86_64
Feb 28 14:00:50 Updated: dsucatalog-17.02.00-WF25X.noarch
Feb 28 14:00:50 Updated: dell-system-update-1.4.0-17.02.00.x86_64
Mar 01 11:49:16 Erased: dsucatalog
Mar 01 12:01:41 Installed: dsucatalog-17.02.00-WF25X.noarch
[root@host log]# 
[root@host log]# cd
[root@host ~]# /opt/dell/srvadmin/sbin/srvadmin-services.sh status
dell_rbu (module) is running
ipmi driver is running
dsm_sa_datamgrd (pid 31664 31382) is running
dsm_sa_eventmgrd (pid 31617) is running
dsm_sa_snmpd (pid 31645) is running
dsm_om_shrsvcd (pid 31898) is running
dsm_om_connsvcd (pid 31964 31963) is running
[root@host log]# /opt/dell/srvadmin/sbin/srvadmin-services.sh stop

Shutting down DSM SA Shared Services:                      [  OK  ]


Shutting down DSM SA Connection Service:                   [  OK  ]


Stopping Systems Management Data Engine:
Stopping dsm_sa_snmpd:                                     [  OK  ]
Stopping dsm_sa_eventmgrd:                                 [  OK  ]
Stopping dsm_sa_datamgrd:                                  [  OK  ]
Stopping Systems Management Device Drivers:
Stopping dell_rbu:                                         [  OK  ]
[root@host log]# ps -ef |grep dsm
root     27396 18309  0 17:50 pts/1    00:00:00 grep dsm
[root@host log]# /opt/dell/srvadmin/sbin/srvadmin-services.sh start
Starting Systems Management Device Drivers:
Starting dell_rbu:                                         [  OK  ]
Starting ipmi driver: 
Already started                                            [  OK  ]
Starting Systems Management Data Engine:
Starting dsm_sa_datamgrd:                                  [  OK  ]
Starting dsm_sa_eventmgrd:                                 [  OK  ]
Starting dsm_sa_snmpd:                                     [  OK  ]
Starting DSM SA Shared Services:                           [  OK  ]

Starting DSM SA Connection Service:                        [  OK  ]

[root@host log]# /opt/dell/srvadmin/sbin/srvadmin-services.sh status
dell_rbu (module) is running
ipmi driver is running
dsm_sa_datamgrd (pid 27888 27616) is running
dsm_sa_eventmgrd is stopped
dsm_sa_snmpd is stopped
dsm_om_shrsvcd (pid 27937) is running
dsm_om_connsvcd (pid 28005 28004) is running

Mar 10 17:51:11 host kernel: dsm_sa_eventmgr[27854]: segfault at 0 ip 00007f24a56e6220 sp 00007f24a640e0f8 error 4 in libc-2.12.so[7f24a55bd000+18a000]
Mar 10 17:51:11 host snmpd[2130]: [smux_process] peek failed: Success
Mar 10 17:51:11 host kernel: dsm_sa_snmpd[27882]: segfault at 7fc800000000 ip 00007fc8139d2220 sp 00007fc8146d9828 error 4 in libc-2.12.so[7fc8138a9000+18a000]
Mar 10 17:52:47 host kernel: MaserIE[28384]: segfault at 7f8b00000000 ip 00007f8bda6f0220 sp 00007ffc0e967098 error 4 in libc-2.12.so[7f8bda5c7000+18a000]
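
 The three kernel lines above allow a quick check of where inside libc each
 fault occurred: subtracting the mapping base (the first number in the square
 brackets) from the instruction pointer (ip) gives the offset into
 libc-2.12.so. The following is only a minimal sketch; the helper name
 libc_offset is hypothetical and not part of OMSA or any Dell tool, and it
 assumes a shell with 64-bit arithmetic.

```shell
# Offset of the faulting instruction inside the mapped library:
#   offset = ip - base
# Both values are copied from the kernel "segfault at ..." lines.
# libc_offset is a hypothetical helper name used only for illustration.
libc_offset() {
    # $1 = instruction pointer (hex, without 0x)
    # $2 = mapping base address (hex, without 0x)
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

libc_offset 00007f24a56e6220 7f24a55bd000   # dsm_sa_eventmgr -> 0x129220
libc_offset 00007fc8139d2220 7fc8138a9000   # dsm_sa_snmpd    -> 0x129220
libc_offset 00007f8bda6f0220 7f8bda5c7000   # MaserIE         -> 0x129220
```

 All three processes fault at the same offset, 0x129220, which suggests (but
 does not prove) that they fail in the same libc routine. Resolving that
 offset to a function name would need the matching glibc symbols.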

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
