Dell - Internal Use - Confidential
Can you please share the following details and logs so that we can investigate this issue?

1) Operating system (version)
2) OMSA version

Please also collect the logs. Steps to collect the dcomsm.log file:

1) Stop the Data Manager Service (./srvadmin-services.sh stop)
2) Go to /opt/dell/srvadmin/etc/srvadmin-storage and open the stsvc.ini file.
3) Change "Debug=Off" to "Debug=On"
4) Change all the debug levels from "DebugLevels=0,0,0,0,0,0,0,0,0,0,0" to "DebugLevels=3,3,3,3,3,3,3,3,3,3,3"
5) Start the Data Manager Service (./srvadmin-services.sh start)
6) The dcomsm.log file will be generated in /opt/dell/srvadmin/var/log/openmanage.

Regards,
Deepesh CP

-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of linux-poweredge-request-Lists
Sent: Friday, February 10, 2017 11:30 PM
To: linux-poweredge-Lists <[email protected]>
Subject: Linux-PowerEdge Digest, Vol 153, Issue 4

Today's Topics:

   1. dsm_sa_datamgrd segfault that crahses whole system (lejeczek)

----------------------------------------------------------------------

Message: 1
Date: Fri, 10 Feb 2017 13:55:03 +0000
From: lejeczek <[email protected]>
Subject: [Linux-PowerEdge] dsm_sa_datamgrd segfault that crahses whole system
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"

hi everybody
also hopefully Dell tech as this should be directly hardware related I believe.
I managed to segfault OMSA (which then crashes the whole system):

[ 1117.103438] dsm_sa_datamgrd[28952]: segfault at 0 ip 00007f2e1ab57b46 sp 00007f2e1197c020 error 4 in libdsm_sm_sasvil.so[7f2e1aae8000+bb000]

simply by having one H700 in one specific PCI slot in my R815 servers. The server setup is somewhat unusual; I stumbled upon this segfault purely by chance.

I have an "embedded" H200 in the "integrated storage controller card slot"
I have a Dell Broadcom 4-port NIC in "expansion-card slot 2"
I have an H700 in "expansion-card riser 1"
Lastly, I have an H800 in "expansion-card slot 5"

The H700 was installed because we are going to move the HDD array from the H200 to the H700 (but not just yet, so we have put in only the H700 card).

1) Now: when the H700 is in "expansion-card slot 2" everything works perfectly fine, no segfaults. But "expansion-card slot 1" is PCIe x8, which matches the H700, and "expansion-card slot 2" is only PCIe x4.

2) Now: the segfault seems to occur only when I run omxxx storage on that specific H700 controller.

$ omreport storage vdisk vdisk=0 controller=1 - H200, no segfaults
$ omreport storage vdisk vdisk=0 controller=0 - H700, segfault!

omreport system summary - does not cause it.

About 20 seconds after the segfault the system suffers a hard/cold power cycle, so this seems critical. I would expect the tech team to investigate it.
Separately, I'm going to file a tech support request, but felt I should share this with other R815 users.

b.w.&r.
L

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.us.dell.com/pipermail/linux-poweredge/attachments/20170210/77419bf9/attachment-0001.html

------------------------------

_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge

End of Linux-PowerEdge Digest, Vol 153, Issue 4
***********************************************
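
The stsvc.ini edits from the log-collection steps at the top of this message (steps 3 and 4) could be scripted roughly as follows. This is only a sketch: the function names and the .bak backup file are hypothetical additions, and it assumes stsvc.ini contains exactly the strings quoted in those steps. Stopping and starting the Data Manager Service (steps 1 and 5) is left to srvadmin-services.sh, as described in the steps.

```shell
#!/bin/sh
# Sketch only. The path comes from the steps in this message; the function
# names and the ".bak" backup are hypothetical, not part of OMSA.

STSVC_INI="${STSVC_INI:-/opt/dell/srvadmin/etc/srvadmin-storage/stsvc.ini}"

# Steps 3-4: switch Debug on and raise every debug level from 0 to 3.
# Assumes the file contains the exact strings quoted in the steps.
enable_storage_debug() {
    ini="$1"
    cp -p "$ini" "$ini.bak" || return 1   # keep a copy so the change can be undone
    sed -e 's/Debug=Off/Debug=On/' \
        -e 's/DebugLevels=0,0,0,0,0,0,0,0,0,0,0/DebugLevels=3,3,3,3,3,3,3,3,3,3,3/' \
        "$ini.bak" > "$ini"
}

# Put the original settings back once dcomsm.log has been collected.
restore_storage_debug() {
    ini="$1"
    [ -f "$ini.bak" ] && mv "$ini.bak" "$ini"
}
```

A possible sequence: run "./srvadmin-services.sh stop", call enable_storage_debug "$STSVC_INI", run "./srvadmin-services.sh start", reproduce the problem, copy dcomsm.log from /opt/dell/srvadmin/var/log/openmanage, then call restore_storage_debug "$STSVC_INI" with the service stopped again.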
