Is there any way to troubleshoot the disks to see which one is defective?

Regards
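For the disk question, the platform CLI exposes a few hardware checks. A minimal sketch, assuming an MCS appliance with the LSI controller seen in the alerts below (command availability and output vary by CUCM version):

    admin:show hardware
    admin:utils diagnose test
    admin:utils create report hardware

show hardware should print the RAID controller status along with the logical and physical drive states, utils diagnose test runs the built-in self-tests (including the disk checks), and utils create report hardware writes a downloadable report that includes the disk-array data. A failing drive should be identifiable by its enclosure/slot position, like the PD 01(e0xfc/s1) referenced in the alerts below.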
On Mon, Dec 3, 2018 at 15:08, Ryan Ratliff (rratliff) <rratl...@cisco.com> wrote:

> #1 0x044a9935 in raise () from /lib/tls/libc.so.6
> #2 0x044ab399 in abort () from /lib/tls/libc.so.6
> #3 0x0842e457 in preabort () at ProcessCMProcMon.cpp:80
> #4 0x0842fe7c in CMProcMon::verifySdlRouterServices () at ProcessCMProcMon.cpp:720
>
> The ccm process is killing itself because it isn’t getting enough resources.
>
> Nov 29 17:26:12 CMBL-03-01 local7 2 : 1: CMBL-03-01.localdomain: Nov 29 2018 19:26:12.340 UTC : %UC_CALLMANAGER-2-CallManagerFailure: %[HostName=CMBL-03-01][IPAddress=192.168.183.3][Reason=4][Text=CCM Intentional Abort: SignalName: SIPSetupInd, DestPID: SIPD[1:100:67:7]][AppID=Cisco CallManager][ClusterID=StandAloneCluster][NodeID=CMBL-03-01]: Indicates an internal failure in Unified CM
>
> So much good info in the syslog. Here’s a super-useful tidbit.
>
> Nov 28 03:59:23 CMBL-03-01 local7 2 : 1543: CMBL-03-01.localdomain: Nov 28 2018 05:59:23.840 UTC : %UC_RTMT-2-RTMT_ALERT: %[AlertName=CallProcessingNodeCpuPegging][AlertDetail= Processor load over configured threshold for configured duration of time. Configured high threshold is 90 %. tomcat (2 percent) uses most of the CPU.
>
> Processor_Info:
> For processor instance 1: %CPU= 99, %User= 2, %System= 2, %Nice= 0, %Idle= 0, %IOWait= 97, %softirq= 0, %irq= 0.
> For processor instance _Total: %CPU= 93, %User= 2, %System= 1, %Nice= 0, %Idle= 7, %IOWait= 90, %softirq= 0, %irq= 0.
> For processor instance 0: %CPU= 86, %User= 2, %System= 1, %Nice= 0, %Idle= 14, %IOWait= 83, %softirq= 0, %irq= 0.
> For processor instance 3: %CPU= 87, %User= 2, %System= 2, %Nice= 0, %Idle= 13, %IOWait= 83, %softirq= 0, %irq= 0.
> For processor instance 2: %CPU= 99, %User= 4, %System= 1, %Nice= 0, %Idle= 0, %IOWait= 96, %softirq= 0, %irq= 0.
> ][AppID=Cisco AMC Service][ClusterID=][NodeID=CMBL-03-01]: RTMT Alert
>
> Looking back just a bit further, and there are a TON of these.
>
> Nov 15 21:22:00 CMBL-03-01 local7 2 : 582: CMBL-03-01.localdomain: Nov 15 2018 23:22:00.256 UTC : %UC_RTMT-2-RTMT_ALERT: %[AlertName=HardwareFailure][AlertDetail= At Thu Nov 15 21:22:00 BRST 2018 on node 192.168.183.3, the following HardwareFailure events generated: hwStringMatch : Nov 15 21:21:26 CMBL-03-01 daemon 4 Director Agent: LSIESG_DiskDrive_Modified 500605B0027C6D50 Command timeout on PD 01(e0xfc/s1) Path 500000e116ac4ce2, CDB: 2a 00 10 98 b9 9d 00 00 08 00 Sev: 3. AppID : Cisco Syslog Agent ClusterID : NodeID : CMBL-03-01 TimeStamp : Thu Nov 15 21:21:26 BRST 2018 hwStringMatch : Nov 15 21:21:26 CMBL-03-01 daemon 4 Director Agent: LSIESG_AlertIndication 500605B0027C6D50 Command timeout on PD 01(e0xfc/s1) Path 500000e116ac4ce2, CDB: 2a 00 10 98 b9 9d 00 00 08 00 Sev: 3. AppID : Cisco Syslog Agent ClusterID : NodeID : CMBL-03-01 TimeStamp : Thu Nov 15 21:21:27 BRST 2018 hwStringMatch : Nov 15 21:21:26 CMBL-03-01][AppID=Cisco AMC Service][ClusterID=][NodeID=CMBL-03-01]: RTMT Alert
>
> You’ve lost or are in the middle of losing at least one disk drive. The server probably lost them all at the same time on the 13th, and the OS marked the entire filesystem read-only.
>
> -Ryan
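To line the drive errors up against the November 13th outage, the attached syslog can be searched offline; a minimal sketch using standard grep against the attached file name (any machine with grep works):

    grep -c "Command timeout on PD" publiser-syslog-29-11.txt
    grep "hwStringMatch" publiser-syslog-29-11.txt | head -n 5

The first command counts the LSI command-timeout events; since syslog is chronological, the second shows the earliest matches and their timestamps, which should bracket when the drive actually started failing.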
> On Dec 3, 2018, at 9:28 AM, Nilson Costa <nilsonl...@gmail.com> wrote:
>
> Hello All,
>
> I'm deploying a new CUCM at a customer that has an old one working just as call routing for a Genesys call-center system.
>
> As you can see in the picture below, they have some MGCP gateways connected to this CUCM, where the calls come in and, via CTI route points controlled by Genesys, are routed to two Avaya PBXs or to another CUCM.
>
> <image.png>
>
> On November 13th they lost access to Tomcat on the Publisher. When we looked at the server, several services were restarting, including Cisco CallManager, just on the Publisher. We decided to reboot the whole cluster, but after the reboot we are facing some weird issues. Most are not that relevant, I think, but there is one we are really worried about.
>
> The Cisco CallManager process is still restarting randomly and generating core dumps. I'm attaching those logs here, along with the syslogs from the publisher.
>
> Can anybody here on the group help me find out what is triggering the Cisco CallManager restarts?
>
> --
> Nilson Lino da Costa Junior
> <coredump.txt><publiser-syslog-29-11.txt>

--
Nilson Lino da Costa Junior
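Since the restarts are producing core files, the platform CLI can also confirm which process is dumping and why; a minimal sketch, assuming the core tools present on recent CUCM versions (the analyze step is CPU-heavy, so best run off-hours; <core file name> is a placeholder for a name from the list):

    admin:utils core active list
    admin:utils core active analyze <core file name>

The list command shows each core with the process that produced it, and the analyze step generates the same style of backtrace as the attached coredump.txt (the raise/abort/preabort frames quoted at the top of this thread).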
_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip