Re: [c-nsp] ouch 7204vxr reloaded
How far apart are these issues geographically? Honestly, it sounds like you are just having stuff break. It happens. I've had weeks like that, where stuff that has run for years without issue starts to fail. None of the problems you are having are 'never been seen before'. I've had a disk array controller with block errors. I've had an interface go whacked and routers restart. It happens.

On Thu, Apr 29, 2010 at 8:25 PM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

This is becoming a crisis. The logical problem-solving procedure here is producing no leads or answers; I just have stuff that's beginning to die and 'never been seen before' malfunctions all across the network. From today, I have:

An Adtran TA5000 in a telco collocation space suddenly experienced the restart of a single (ADSL) card. No reason given.

A customer router (Soekris Engineering SBC) had an Ethernet port simply lock up and require a power cycle.

A disk array controller in my NOC suddenly threw up disk block errors.

ALL of these have _nothing_ in common. No mains power, no network connections, nothing. They are further physically separated by substantial distance and administrative domains. So every day now I am experiencing these exceptional 'never in a lifetime' events. I am beginning to think there's something environmental happening that is having a wide area of effect, maybe like an exceptional electromagnetic or alpha particle storm of some kind? I can't possibly be the only one here.

Mike-

eNinja wrote:

Let's apply logic...
1 - What changed prior to the 'events'?
2 - What's common to all the impacted devices experiencing the 'events'? Location, vendor, etc.
3 - Which other devices could be experiencing similar 'events' but aren't?
Eninja ___ cisco-nsp mailing list cisco-...@puck.nether.net https://puck.nether.net/mailman/listinfo/cisco-nsp archive at http://puck.nether.net/pipermail/cisco-nsp/
Re: [c-nsp] ouch 7204vxr reloaded
Joseph Jackson wrote:

How far apart are these issues geographically? Honestly, it sounds like you are just having stuff break. It happens. I've had weeks like that, where stuff that has run for years without issue starts to fail. None of the problems you are having are 'never been seen before'. I've had a disk array controller with block errors. I've had an interface go whacked and routers restart. It happens.

Blocks apart and 30 miles or more. I've been running this network for the past 8 years and have also had 'stuff break', but never on the scale or frequency of what is happening now. The types of issues are simply unbelievable: a 7200 parity error, following a disk RAID controller error, following peculiar Ethernet interface errors on several devices resulting in lockups, following ADSL cards in a telco facility restarting, following a 'burnt out' microwave point-to-point transceiver, following an access point that locked up for 45 minutes, came back and never hiccuped again. It's all too unbelievable to me. This has all happened in the space of 4 days; there is something more than 'shit happens' at work here.

Mike-
Re: [c-nsp] ouch 7204vxr reloaded
We can't help unless you post some data or logs.

Richard Golodner

-Original Message-
From: Mike mike-cisconspl...@tiedyenetworks.com
Date: Fri, 30 Apr 2010 10:02:14
To: Joseph Jackson recou...@gmail.com
Cc: Cisco-nsp cisco-nsp@puck.nether.net
Subject: Re: [c-nsp] ouch 7204vxr reloaded

Joseph Jackson wrote:

How far apart are these issues geographically? Honestly, it sounds like you are just having stuff break. It happens. I've had weeks like that, where stuff that has run for years without issue starts to fail. None of the problems you are having are 'never been seen before'. I've had a disk array controller with block errors. I've had an interface go whacked and routers restart. It happens.

Blocks apart and 30 miles or more. I've been running this network for the past 8 years and have also had 'stuff break', but never on the scale or frequency of what is happening now. The types of issues are simply unbelievable: a 7200 parity error, following a disk RAID controller error, following peculiar Ethernet interface errors on several devices resulting in lockups, following ADSL cards in a telco facility restarting, following a 'burnt out' microwave point-to-point transceiver, following an access point that locked up for 45 minutes, came back and never hiccuped again. It's all too unbelievable to me. This has all happened in the space of 4 days; there is something more than 'shit happens' at work here.

Mike-
Re: [c-nsp] ouch 7204vxr reloaded
Where is 'here'? Where is this equipment located (geographically)? You still need to research what changed in the locale, atmosphere, network, environment, etc. prior to these events occurring. Stay calm, the answer is out there ;-)

eninja

On Apr 30, 2010, at 3:25 AM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

This is becoming a crisis. The logical problem-solving procedure here is producing no leads or answers; I just have stuff that's beginning to die and 'never been seen before' malfunctions all across the network. From today, I have:

An Adtran TA5000 in a telco collocation space suddenly experienced the restart of a single (ADSL) card. No reason given.

A customer router (Soekris Engineering SBC) had an Ethernet port simply lock up and require a power cycle.

A disk array controller in my NOC suddenly threw up disk block errors.

ALL of these have _nothing_ in common. No mains power, no network connections, nothing. They are further physically separated by substantial distance and administrative domains. So every day now I am experiencing these exceptional 'never in a lifetime' events. I am beginning to think there's something environmental happening that is having a wide area of effect, maybe like an exceptional electromagnetic or alpha particle storm of some kind? I can't possibly be the only one here.

Mike-

eNinja wrote:

Let's apply logic...
1 - What changed prior to the 'events'?
2 - What's common to all the impacted devices experiencing the 'events'? Location, vendor, etc.
3 - Which other devices could be experiencing similar 'events' but aren't?

Eninja
Re: [c-nsp] ouch 7204vxr reloaded
FWIW - I ran into something like this on a couple of sites next to a Navy base many years ago. The issues coincided with tests of the ship-board long-range radar. The only way we could tell was by being on site and watching as the dish swept; following it was the path of devastation...

My point - there may be a commonality that is not obvious. Look for something that could generate a sizable EM field somewhere near the centre of the issues. It could be a lot of different things; don't rule out even the slightly outlandish.

Good luck
Brian

On 10-04-30 10:02 AM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

Joseph Jackson wrote:

How far apart are these issues geographically? Honestly, it sounds like you are just having stuff break. It happens. I've had weeks like that, where stuff that has run for years without issue starts to fail. None of the problems you are having are 'never been seen before'. I've had a disk array controller with block errors. I've had an interface go whacked and routers restart. It happens.

Blocks apart and 30 miles or more. I've been running this network for the past 8 years and have also had 'stuff break', but never on the scale or frequency of what is happening now. The types of issues are simply unbelievable: a 7200 parity error, following a disk RAID controller error, following peculiar Ethernet interface errors on several devices resulting in lockups, following ADSL cards in a telco facility restarting, following a 'burnt out' microwave point-to-point transceiver, following an access point that locked up for 45 minutes, came back and never hiccuped again. It's all too unbelievable to me. This has all happened in the space of 4 days; there is something more than 'shit happens' at work here.

Mike-
Re: [c-nsp] ouch 7204vxr reloaded
For an additional data point, the error has not recurred. HOWEVER, all across my network (wireless ISP), I have been having 'events' with all sorts of malfunctioning equipment; I've had blown-out microwave point-to-point links, locked-up wireless access points and strange reboots, all in devices that otherwise have been up and running for years now without issue. There's been so much stuff that I've been seriously considering the possibility of sabotage, except that the types of issues and the types of gear simply don't lend themselves to any sort of sabotage scenario I can think up, short of someone with an x-matter-mitter standing on the moon shooting alpha particles at us. Anyone else experiencing 'odd' failures lately??

Mike-

Tony wrote:

What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case and the reason I got for it from Cisco was:

= the device experienced a CPU parity error ... These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation. =

The resolution was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur and so the case was closed.
Re: [c-nsp] ouch 7204vxr reloaded
In my experience, a PMPE is usually exactly what it sounds like - a flipped bit in RAM that the ECC was unable to correct. If you don't run into another one anytime soon, you're probably OK - occasional electromagnetic (or even cosmic ray) interference can cause this - but if you get another anytime soon you'll probably want to replace your DIMM, which will probably cost much less than a Smartnet contract.*

-C

* Assuming you don't buy the memory from Cisco.

On Apr 27, 2010, at 5:53:28 PM, Mike wrote:

Howdy,

Well that was fun. I discovered that my trusty 7204VXR reloaded unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
System restarted at 04:10:28 PDT Tue Apr 27 2010
System image file is disk0:c7200-is-mz.123-26.bin

and digging thru the 'show tech' gave me:

Pid 3: Process OSPF Hello stack 0x63D733F0 savedsp 0x63D75660
Flags: analyze on_old_queue
Status 0x Orig_ra 0x Routine 0x Signal 0
Caller_pc 0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x State 1
Totmalloc 548640 Totfree 441612 Totgetbuf 15876 Totretbuf 0
Edisms 0x60CD71A4 Eparm 0x64A91E7C Elapsed 0xC6E634
Ncalls 0xE09B9AA Ngiveups 0x491E Priority_q 3 Ticks_5s 2
Cpu_5sec 81 Cpu_1min 40 Cpu_5min 8
Stacksize 0x2328 Lowstack 0x2328 Ttyptr 0x63D55FA8
Mem_holding 0x0 Thrash_count 0
Wakeup_reasons 0x0FFF Default_wakeup_reasons 0x0FFF
Direct_wakeup_major 0x Direct_wakeup_minor 0x

So my inexperienced glance would say it was something to do with OSPF. My question tho is: #1, how do I really debug a problem like this, and #2, what would be the minimum Cisco contract required to make sure I have access to the CCO/bug advisor and possibly updated IOS for this device? It's been a tank with absolutely zero issues in this environment for more than a year, but this event underscores the fact that we have no real support route and probably should get on some program even for our little operation.

Thanks.

Mike-
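To illustrate the kind of parity check discussed in this reply, here is a minimal Python sketch of even parity over a single word. It is a toy model only: the function names are made up for illustration, and nothing here is meant to describe how the NPE's memory controller is actually implemented. It does show why one flipped bit is caught while two flipped bits can slip through.

```python
def parity_bit(value: int) -> int:
    """Even-parity bit: 1 if the value has an odd number of set bits, else 0."""
    return bin(value).count("1") % 2


def parity_ok(value: int, stored_parity: int) -> bool:
    """True if the value still matches the parity bit stored alongside it."""
    return parity_bit(value) == stored_parity


# Illustrative data word (hypothetical, not taken from the router).
word = 0b10110010
stored = parity_bit(word)          # parity recorded when the word was written

single_flip = word ^ (1 << 3)      # one bit upset, e.g. by a stray particle
double_flip = single_flip ^ (1 << 6)  # a second upset in the same word

print(parity_ok(word, stored))         # intact word passes the check
print(parity_ok(single_flip, stored))  # single flip is detected
print(parity_ok(double_flip, stored))  # double flip goes unnoticed
```

Plain parity can only detect an odd number of flipped bits and cannot say which bit changed, which is consistent with the router reloading rather than repairing the word.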
Re: [c-nsp] ouch 7204vxr reloaded
You may want to check your power feeds for unexplained variances in voltage or phase...

-C

On Apr 29, 2010, at 2:13:51 AM, Mike wrote:

For an additional data point, the error has not recurred. HOWEVER, all across my network (wireless ISP), I have been having 'events' with all sorts of malfunctioning equipment; I've had blown-out microwave point-to-point links, locked-up wireless access points and strange reboots, all in devices that otherwise have been up and running for years now without issue. There's been so much stuff that I've been seriously considering the possibility of sabotage, except that the types of issues and the types of gear simply don't lend themselves to any sort of sabotage scenario I can think up, short of someone with an x-matter-mitter standing on the moon shooting alpha particles at us. Anyone else experiencing 'odd' failures lately??

Mike-

Tony wrote:

What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case and the reason I got for it from Cisco was:

= the device experienced a CPU parity error ... These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation. =

The resolution was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur and so the case was closed.
Re: [c-nsp] ouch 7204vxr reloaded
Let's apply logic...
1 - What changed prior to the 'events'?
2 - What's common to all the impacted devices experiencing the 'events'? Location, vendor, etc.
3 - Which other devices could be experiencing similar 'events' but aren't?

Eninja

On Apr 29, 2010, at 8:13 AM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

For an additional data point, the error has not recurred. HOWEVER, all across my network (wireless ISP), I have been having 'events' with all sorts of malfunctioning equipment; I've had blown-out microwave point-to-point links, locked-up wireless access points and strange reboots, all in devices that otherwise have been up and running for years now without issue. There's been so much stuff that I've been seriously considering the possibility of sabotage, except that the types of issues and the types of gear simply don't lend themselves to any sort of sabotage scenario I can think up, short of someone with an x-matter-mitter standing on the moon shooting alpha particles at us. Anyone else experiencing 'odd' failures lately??

Mike-

Tony wrote:

What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case and the reason I got for it from Cisco was:

= the device experienced a CPU parity error ... These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation. =

The resolution was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur and so the case was closed.
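eNinja's questions 2 and 3 amount to a set comparison: what do all the impacted devices share, and which of those shared traits are absent from the devices that stayed healthy? A minimal Python sketch follows. The device names and attribute tags are entirely hypothetical placeholders, not an inventory of Mike's network, and the function names are invented for this example.

```python
from typing import Dict, Set

Inventory = Dict[str, Set[str]]


def common_attributes(devices: Inventory) -> Set[str]:
    """Attributes shared by every device in the inventory."""
    if not devices:
        return set()
    return set.intersection(*devices.values())


def distinguishing_attributes(impacted: Inventory, healthy: Inventory) -> Set[str]:
    """Attributes common to all impacted devices that no healthy device has."""
    seen_on_healthy = set().union(*healthy.values())
    return common_attributes(impacted) - seen_on_healthy


# Hypothetical tags, for illustration only.
impacted = {
    "router-a": {"vendor:cisco", "region:north", "power:utility"},
    "dslam-b": {"vendor:adtran", "region:north", "power:telco-dc"},
    "raid-c": {"vendor:other", "region:north", "power:utility"},
}
healthy = {
    "router-d": {"vendor:cisco", "region:south", "power:utility"},
    "ap-e": {"vendor:other", "region:south", "power:solar"},
}

print(common_attributes(impacted))
print(distinguishing_attributes(impacted, healthy))
```

An empty result would support Mike's claim that the failures share nothing, at least for the attributes that were recorded; a non-empty one gives a place to start looking.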
Re: [c-nsp] ouch 7204vxr reloaded
This is becoming a crisis. The logical problem-solving procedure here is producing no leads or answers; I just have stuff that's beginning to die and 'never been seen before' malfunctions all across the network. From today, I have:

An Adtran TA5000 in a telco collocation space suddenly experienced the restart of a single (ADSL) card. No reason given.

A customer router (Soekris Engineering SBC) had an Ethernet port simply lock up and require a power cycle.

A disk array controller in my NOC suddenly threw up disk block errors.

ALL of these have _nothing_ in common. No mains power, no network connections, nothing. They are further physically separated by substantial distance and administrative domains. So every day now I am experiencing these exceptional 'never in a lifetime' events. I am beginning to think there's something environmental happening that is having a wide area of effect, maybe like an exceptional electromagnetic or alpha particle storm of some kind? I can't possibly be the only one here.

Mike-

eNinja wrote:

Let's apply logic...
1 - What changed prior to the 'events'?
2 - What's common to all the impacted devices experiencing the 'events'? Location, vendor, etc.
3 - Which other devices could be experiencing similar 'events' but aren't?

Eninja
Re: [c-nsp] ouch 7204vxr reloaded
Mike,

This is a PMPE and, as such, the tracebacks et al. are invalid, and so too the decodes. There is _no_ software fix to prevent PMPEs. Most PMPEs are caused by cosmic radiation and sometimes (albeit rarely) by built-up ESD due to improper handling of components by personnel. Since lightning doesn't strike the same spot twice, chances are there won't be a recurrence on the same processor. Monitor, and replace the NPE with memory if this recurs within the next 6 months; otherwise it was a transient issue and no further action is required. ITMT, ensure safe ESD compliance when handling components.

Eninja

On Apr 27, 2010, at 11:53 PM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

Howdy,

Well that was fun. I discovered that my trusty 7204VXR reloaded unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
System restarted at 04:10:28 PDT Tue Apr 27 2010
System image file is disk0:c7200-is-mz.123-26.bin

and digging thru the 'show tech' gave me:

Pid 3: Process OSPF Hello stack 0x63D733F0 savedsp 0x63D75660
Flags: analyze on_old_queue
Status 0x Orig_ra 0x Routine 0x Signal 0
Caller_pc 0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x State 1
Totmalloc 548640 Totfree 441612 Totgetbuf 15876 Totretbuf 0
Edisms 0x60CD71A4 Eparm 0x64A91E7C Elapsed 0xC6E634
Ncalls 0xE09B9AA Ngiveups 0x491E Priority_q 3 Ticks_5s 2
Cpu_5sec 81 Cpu_1min 40 Cpu_5min 8
Stacksize 0x2328 Lowstack 0x2328 Ttyptr 0x63D55FA8
Mem_holding 0x0 Thrash_count 0
Wakeup_reasons 0x0FFF Default_wakeup_reasons 0x0FFF
Direct_wakeup_major 0x Direct_wakeup_minor 0x

So my inexperienced glance would say it was something to do with OSPF. My question tho is: #1, how do I really debug a problem like this, and #2, what would be the minimum Cisco contract required to make sure I have access to the CCO/bug advisor and possibly updated IOS for this device? It's been a tank with absolutely zero issues in this environment for more than a year, but this event underscores the fact that we have no real support route and probably should get on some program even for our little operation.

Thanks.

Mike-
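The monitor-then-replace rule from eNinja's reply can be written down as a small check over a log of parity-error dates. This is only a sketch under stated assumptions: "6 months" is approximated as 182 days, the function name is invented, and the second event date in the usage lines is hypothetical (only the 27 Apr 2010 reload comes from the thread).

```python
from datetime import date, timedelta
from typing import Iterable

# Assumption: "within the next 6 months" approximated as 182 days.
RECURRENCE_WINDOW = timedelta(days=182)


def should_replace(events: Iterable[date],
                   window: timedelta = RECURRENCE_WINDOW) -> bool:
    """True if any two consecutive parity errors fall within the window."""
    ordered = sorted(events)
    return any(later - earlier <= window
               for earlier, later in zip(ordered, ordered[1:]))


first = date(2010, 4, 27)  # the reload reported in this thread

print(should_replace([first]))                     # single event: keep monitoring
print(should_replace([first, date(2010, 8, 1)]))   # hypothetical repeat inside the window
print(should_replace([first, date(2011, 1, 15)]))  # hypothetical repeat outside the window
```

The same function works for the 48-hour watch Tony mentions by passing a different `window`, though dates would then need to become datetimes.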
Re: [c-nsp] ouch 7204vxr reloaded
In many cases (at least for us), it makes much more sense to have a cold/warm/hot spare than to maintain a support contract. I've also had more than one case when TAC really couldn't help unless I had a reproducible, live problem for them to troubleshoot. Not trying to steer you away from SmartNet, just suggesting you consider options.

Sincerely,
Michael Malitsky

Date: Tue, 27 Apr 2010 14:53:28 -0700
From: Mike mike-cisconspl...@tiedyenetworks.com
To: 'Cisco-nsp' cisco-nsp@puck.nether.net
Subject: [c-nsp] ouch 7204vxr reloaded

Howdy,

Well that was fun. I discovered that my trusty 7204VXR reloaded unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
System restarted at 04:10:28 PDT Tue Apr 27 2010
System image file is disk0:c7200-is-mz.123-26.bin

and digging thru the 'show tech' gave me:

snip

So my inexperienced glance would say it was something to do with OSPF. My question tho is: #1, how do I really debug a problem like this, and #2, what would be the minimum Cisco contract required to make sure I have access to the CCO/bug advisor and possibly updated IOS for this device? It's been a tank with absolutely zero issues in this environment for more than a year, but this event underscores the fact that we have no real support route and probably should get on some program even for our little operation.

Thanks.

Mike-
Re: [c-nsp] ouch 7204vxr reloaded
What's really strange is that we had a Sup720 fail over to the redundant Sup in a 7609 recently. I opened a TAC case and the reason I got for it from Cisco was:

= the device experienced a CPU parity error ... These occur when an energy level within the chip (for example, a one or a zero) changes - most often the result of cosmic radiation. =

The resolution was to monitor it for 48 hours; if it didn't happen again, it was a one-off caused by cosmic radiation and there was nothing they could do about it. The error didn't recur and so the case was closed.

regards,
Tony.

--- On Wed, 28/4/10, eNinja eni...@gmail.com wrote:

From: eNinja eni...@gmail.com
Subject: Re: [c-nsp] ouch 7204vxr reloaded
To: Mike mike-cisconspl...@tiedyenetworks.com
Cc: Cisco-nsp cisco-nsp@puck.nether.net
Received: Wednesday, 28 April, 2010, 11:53 PM

Mike,

This is a PMPE and, as such, the tracebacks et al. are invalid, and so too the decodes. There is _no_ software fix to prevent PMPEs. Most PMPEs are caused by cosmic radiation and sometimes (albeit rarely) by built-up ESD due to improper handling of components by personnel. Since lightning doesn't strike the same spot twice, chances are there won't be a recurrence on the same processor. Monitor, and replace the NPE with memory if this recurs within the next 6 months; otherwise it was a transient issue and no further action is required. ITMT, ensure safe ESD compliance when handling components.

Eninja

On Apr 27, 2010, at 11:53 PM, Mike mike-cisconspl...@tiedyenetworks.com wrote:

Howdy,

Well that was fun. I discovered that my trusty 7204VXR reloaded unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
System restarted at 04:10:28 PDT Tue Apr 27 2010
System image file is disk0:c7200-is-mz.123-26.bin

and digging thru the 'show tech' gave me:

Pid 3: Process OSPF Hello stack 0x63D733F0 savedsp 0x63D75660
Flags: analyze on_old_queue
Status 0x Orig_ra 0x Routine 0x Signal 0
Caller_pc 0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x State 1
Totmalloc 548640 Totfree 441612 Totgetbuf 15876 Totretbuf 0
Edisms 0x60CD71A4 Eparm 0x64A91E7C Elapsed 0xC6E634
Ncalls 0xE09B9AA Ngiveups 0x491E Priority_q 3 Ticks_5s 2
Cpu_5sec 81 Cpu_1min 40 Cpu_5min 8
Stacksize 0x2328 Lowstack 0x2328 Ttyptr 0x63D55FA8
Mem_holding 0x0 Thrash_count 0
Wakeup_reasons 0x0FFF Default_wakeup_reasons 0x0FFF
Direct_wakeup_major 0x Direct_wakeup_minor 0x

So my inexperienced glance would say it was something to do with OSPF. My question tho is: #1, how do I really debug a problem like this, and #2, what would be the minimum Cisco contract required to make sure I have access to the CCO/bug advisor and possibly updated IOS for this device? It's been a tank with absolutely zero issues in this environment for more than a year, but this event underscores the fact that we have no real support route and probably should get on some program even for our little operation.

Thanks.

Mike-
[c-nsp] ouch 7204vxr reloaded
Howdy,

Well that was fun. I discovered that my trusty 7204VXR reloaded unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70, address 0x0 at 03:09:00 PST Tue Apr 27 2010
System restarted at 04:10:28 PDT Tue Apr 27 2010
System image file is disk0:c7200-is-mz.123-26.bin

and digging thru the 'show tech' gave me:

Pid 3: Process OSPF Hello stack 0x63D733F0 savedsp 0x63D75660
Flags: analyze on_old_queue
Status 0x Orig_ra 0x Routine 0x Signal 0
Caller_pc 0x60CDAE84 Callee_pc 0x60806190 Dbg_events 0x State 1
Totmalloc 548640 Totfree 441612 Totgetbuf 15876 Totretbuf 0
Edisms 0x60CD71A4 Eparm 0x64A91E7C Elapsed 0xC6E634
Ncalls 0xE09B9AA Ngiveups 0x491E Priority_q 3 Ticks_5s 2
Cpu_5sec 81 Cpu_1min 40 Cpu_5min 8
Stacksize 0x2328 Lowstack 0x2328 Ttyptr 0x63D55FA8
Mem_holding 0x0 Thrash_count 0
Wakeup_reasons 0x0FFF Default_wakeup_reasons 0x0FFF
Direct_wakeup_major 0x Direct_wakeup_minor 0x

So my inexperienced glance would say it was something to do with OSPF. My question tho is: #1, how do I really debug a problem like this, and #2, what would be the minimum Cisco contract required to make sure I have access to the CCO/bug advisor and possibly updated IOS for this device? It's been a tank with absolutely zero issues in this environment for more than a year, but this event underscores the fact that we have no real support route and probably should get on some program even for our little operation.

Thanks.

Mike-
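As a first debugging step, the reload reason and the gap between the two timestamps can be pulled out of the 'show version' lines quoted in this message. The following Python sketch is one way to do it; the regular expressions, function names and the fixed PST/PDT offsets are assumptions made for this example, not a general IOS parser. One thing it surfaces: if the PST and PDT labels are taken at face value, the two stamps are 88 seconds apart rather than an hour and a minute. Whether the router labelled the first stamp correctly is not something the thread establishes.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Text copied from the message above.
SHOW_VERSION = (
    "System returned to ROM by processor memory parity error at PC 0x60640F70, "
    "address 0x0 at 03:09:00 PST Tue Apr 27 2010\n"
    "System restarted at 04:10:28 PDT Tue Apr 27 2010\n"
)

# Assumption: only these two zone labels need handling.
ZONES = {"PST": timezone(timedelta(hours=-8)), "PDT": timezone(timedelta(hours=-7))}
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

REASON_RE = re.compile(r"System returned to ROM by (.+?) at PC")
STAMP_RE = re.compile(
    r"at (\d{2}):(\d{2}):(\d{2}) (\w+) \w{3} (\w{3}) (\d{1,2}) (\d{4})")


def reload_reason(text: str) -> Optional[str]:
    """Return the reason string from the 'returned to ROM' line, if present."""
    match = REASON_RE.search(text)
    return match.group(1) if match else None


def parse_stamp(line: str) -> datetime:
    """Parse 'at HH:MM:SS ZONE Day Mon DD YYYY' into an aware datetime."""
    match = STAMP_RE.search(line)
    if match is None:
        raise ValueError("no timestamp found in: " + line)
    hh, mm, ss, zone, mon, day, year = match.groups()
    return datetime(int(year), MONTHS.index(mon) + 1, int(day),
                    int(hh), int(mm), int(ss), tzinfo=ZONES[zone])


crash_line, restart_line = SHOW_VERSION.splitlines()
gap = parse_stamp(restart_line) - parse_stamp(crash_line)

print(reload_reason(SHOW_VERSION))
print(gap)
```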
Re: [c-nsp] ouch 7204vxr reloaded
On Tue, 27 Apr 2010, Mike wrote:

unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70,

So my inexperienced glance would say it was something to do with OSPF.

How do you know you don't have some RAM going BAD?

Antonio Querubin
808-545-5282 x3003
e-mail/xmpp: t...@lava.net
Re: [c-nsp] ouch 7204vxr reloaded
Mike,

-Original Message-
Sent: Tuesday, April 27, 2010 6:40 PM
To: Mike
Cc: 'Cisco-nsp'
Subject: Re: [c-nsp] ouch 7204vxr reloaded

On Tue, 27 Apr 2010, Mike wrote:

unexpectedly and I find myself without a good explanation. Show version gives me 'processor memory parity error':

System returned to ROM by processor memory parity error at PC 0x60640F70,

So my inexperienced glance would say it was something to do with OSPF.

How do you know you don't have some RAM going BAD?

Might want to have a look here at hard parity errors:

http://www.ciscotaccc.com/kaidara-advisor/core/showcase?case=K20414285

-ryan
Re: [c-nsp] ouch 7204vxr reloaded
Antonio Querubin wrote:

So my inexperienced glance would say it was something to do with OSPF.

How do you know you don't have some RAM going BAD?

I don't, and that's the point of my message - to get on track to find out, and further to be able to resolve the problem, whatever it may wind up being.

Mike-