Hello,

we're using 17.3R2 with 2x MPC 3D 16x 10GE. PR1312336 looks very interesting, thank you.

Regards
Karl

------------------------------------------------------------------------
*From:* zh73 [mailto:[email protected]]
*Sent:* Wed, Sep 5, 2018 2:22 AM CEST
*To:* Karl Gerhard
*Cc:* [email protected]
*Subject:* [j-nsp] MX960 transient errors on high capacity AC power supplies

> What type of MPCs are you using? Which Junos release?
> Better to upgrade to 17.3R3, which has the PR1312336, PR1325271 and PR1349179 fixes.
> Or open a case with JTAC.
>
> At 2018-09-04 15:38:33, "Karl Gerhard" <[email protected]> wrote:
> >Hello,
> >
> >we have bought two Juniper MX960 and we're having serious trouble with power
> >supplies triggering alarms and then clearing alarms a few seconds later:
> >2x RE-S-X6-64G
> >3x SCBE-2-MX
> >MX960 Premium 3 chassis
> >4x High Capacity AC PEMs
> >
> >$ show log messages | match alarmd
> >Aug 30 08:09:02 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 30 08:09:12 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 30 08:12:30 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 30 08:12:35 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 30 08:14:53 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 30 08:14:58 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >
> >
> >$ show log messages
> >Aug 31 06:12:33 router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug 31 06:13:29 router1 kernel: %KERN-3: PCF8584(RD): target ack timeout
> >Aug 31 06:13:29 router1 kernel: %KERN-3: PCF8584(RD): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug 31 06:13:29 router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 0
> >Aug 31 06:13:29 router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug 31 06:13:50 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 31 06:13:50 router1 craftd[12162]: %DAEMON-4: Major alarm set, PEM 1 Not OK
> >Aug 31 06:13:50 router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 0
> >Aug 31 06:13:50 router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug 31 06:13:50 router1 kernel: %KERN-3: PCF8584(WR): target ack failure on byte 1
> >Aug 31 06:13:50 router1 kernel: %KERN-3: PCF8584(WR): (i2c_s1=0x08, group=0x3, device=0x51)
> >Aug 31 06:13:50 router1 chassisd[12159]: %DAEMON-4-CHASSISD_PEM_INPUT_BAD: status failure for power supply 1 (status bits: 0x0); check circuit breaker
> >Aug 31 06:13:55 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
> >Aug 31 06:13:55 router1 craftd[12162]: %DAEMON-4: Major alarm cleared, PEM 1 Not OK
> >
> >Oddly enough, the errors show up only every few weeks. The power supplies
> >work for weeks without a hitch, then start throwing alerts for a day or a
> >few days, then stop and work flawlessly again for a few weeks.
> >
> >We've checked and swapped everything. It's not the cables, not the
> >connectors, not the power source.
> >Then we started sending power supplies back to our supplier. But the errors
> >keep showing up even with brand new, swapped power supplies.
> >We've found PR1299284, which seems to be related to non-HC power supplies.
> >
> >Could those errors be related to a software problem which affects
> >RE-S-X6-64G/SCBE-2-MX in combination with High Capacity AC PEMs?
> >Has anyone else experienced errors like that?
> >
> >Regards
> >Karl
> >
> >_______________________________________________
> >juniper-nsp mailing list [email protected]
> >https://puck.nether.net/mailman/listinfo/juniper-nsp
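
For anyone wanting to quantify how long each PEM alarm stays raised, here is a minimal Python sketch that pairs each "Alarm set" line from alarmd with the following "Alarm cleared" line for the same reason. It is only an illustration, not a Juniper tool: the function name `flap_durations`, the regular expression and the `year` parameter are assumptions of this sketch (syslog timestamps carry no year; 2018 is taken from the mail date), and it assumes each log entry sits on a single unwrapped line and that month names parse in the default English locale.

```python
import re
from datetime import datetime

# Hypothetical pattern for the alarmd lines quoted in this thread.
ALARM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ alarmd\[\d+\]: "
    r"\S+ Alarm (?P<state>set|cleared): .*reason=(?P<reason>.+)$"
)

def flap_durations(lines, year=2018):
    """Return (reason, start_time, seconds_raised) for each set/cleared pair."""
    open_alarms = {}  # reason -> time the alarm was last set
    flaps = []
    for line in lines:
        m = ALARM_RE.match(line.strip())
        if not m:
            continue  # not an alarmd set/cleared line
        # Syslog omits the year, so it is supplied by the caller (assumption).
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        reason = m["reason"]
        if m["state"] == "set":
            open_alarms[reason] = ts
        elif reason in open_alarms:
            start = open_alarms.pop(reason)
            flaps.append((reason, start, (ts - start).total_seconds()))
    return flaps

# The Aug 30 alarmd entries from the original report, one entry per line.
SAMPLE = """\
Aug 30 08:09:02 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
Aug 30 08:09:12 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
Aug 30 08:12:30 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
Aug 30 08:12:35 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
Aug 30 08:14:53 router1 alarmd[12567]: %DAEMON-4: Alarm set: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
Aug 30 08:14:58 router1 alarmd[12567]: %DAEMON-4: Alarm cleared: Pwr supply color=RED, class=CHASSIS, reason=PEM 1 Not OK
""".splitlines()

for reason, start, seconds in flap_durations(SAMPLE):
    print(f"{start:%b %d %H:%M:%S}  {reason}  raised for {seconds:.0f}s")
```

Run against a saved copy of the messages log, this gives a quick view of how often and for how long the alarm flaps, which may be useful to attach to a JTAC case.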

