Re: [j-nsp] MX80 watchdog

2023-06-12 Thread Saku Ytti via juniper-nsp
Do you monitor RPD task memory use and FreeBSD process memory use?
Is it possible you are leaking memory over time, and getting DRAM
pressure at the 1500d mark?
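
A minimal sketch of how such monitoring data could be checked for a slow leak, assuming memory readings are collected periodically (for example from "show task memory" and "show system processes extensive"). The sample values, the 2048 MiB limit and the function names are hypothetical and only illustrate the trend calculation:

```python
# Hypothetical samples: (uptime in days, RPD memory in MiB). Not real router data.
samples = [(0, 400.0), (500, 650.0), (1000, 900.0), (1500, 1150.0)]


def leak_rate(points):
    """Least-squares slope of memory versus uptime, in MiB per day."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


def days_until(points, limit_mib):
    """Days after the last sample until memory reaches limit_mib.

    Extrapolates from the last sample's value at the fitted rate.
    Returns None when the trend is flat or declining.
    """
    rate = leak_rate(points)
    if rate <= 0:
        return None
    _, last_y = points[-1]
    return (limit_mib - last_y) / rate


print(leak_rate(samples), days_until(samples, 2048.0))
```

A steadily positive slope over hundreds of days would support the leak theory; a flat line would point elsewhere.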

It might be this:
https://prsearch.juniper.net/problemreport/PR108

Since you said it happens during strenuous SSD access, my first thought
was that Junos has RE failover thresholds on disk I/O read/write
latency, which cause false-positive RE switchovers now and again
(more people have hit them than are aware of hitting them).
But that can't be the cause in your case, because the MX80 doesn't
have two REs. For completeness, though:
https://www.juniper.net/documentation/us/en/software/junos/high-availability/topics/ref/statement/not-on-disk-underperform-edit-chassis.html

On Mon, 12 Jun 2023 at 18:35, Tom Bird via juniper-nsp wrote:
>
> Afternoon,
>
> I've been upgrading some MX80 routers from 15.1; they consistently
> seem to fall over during periods of strenuous SSD access, or indeed once
> during a "commit check".
>
> We thought this might be due to the uptime (~1500 days) so have been
> rebooting them prior to the upgrade which has mostly stopped the problem
> from happening.  Not completely, however - they get stuck for about an
> hour doing this, after which they reboot and continue to work.
>
>
> watchdog: scheduling fairness gone for 3540 seconds now.
> (da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
>
>
> I'd like them to wait a bit less than an hour, and I see the watchdog
> can be configured, but I can't find any useful documentation about
> exactly what conditions make it fire and what the defaults are.
>
> Currently there is no configuration under "system processes watchdog",
> and it looks like it can be enabled, disabled and the timeout set up to
> 3600 seconds.
>
> So my question is: is it this watchdog that is resetting the thing after
> an hour, and would it be reasonable to set the timeout to, say, 300 seconds
> so there is less downtime if it goes wrong?
>
> Thanks,
> --
> Tom
>
> :: www.portfast.co.uk / @portfast
> :: hosted services, domains, virtual machines, consultancy
> ___
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp



-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


[j-nsp] MX80 watchdog

2023-06-12 Thread Tom Bird via juniper-nsp

Afternoon,

I've been upgrading some MX80 routers from 15.1; they consistently
seem to fall over during periods of strenuous SSD access, or indeed once
during a "commit check".


We thought this might be due to the uptime (~1500 days) so have been 
rebooting them prior to the upgrade which has mostly stopped the problem 
from happening.  Not completely, however - they get stuck for about an 
hour doing this, after which they reboot and continue to work.



watchdog: scheduling fairness gone for 3540 seconds now.
(da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0

Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
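
As a rough illustration of the arithmetic behind "about an hour", the sketch below pulls the stall duration out of the watchdog console line and converts it to minutes. The regular expression is an assumption based only on the single log line quoted above, and the function name is hypothetical:

```python
import re

# Pattern derived from the one console line quoted above; other Junos
# releases may word this message differently.
WATCHDOG_RE = re.compile(
    r"watchdog: scheduling fairness gone for (\d+) seconds now\."
)


def stall_seconds(console_text):
    """Return the largest stall duration reported, or None if no match."""
    hits = [int(m.group(1)) for m in WATCHDOG_RE.finditer(console_text)]
    return max(hits) if hits else None


console = """watchdog: scheduling fairness gone for 3540 seconds now.
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting..."""

stall = stall_seconds(console)
print(stall, "seconds is", stall / 60, "minutes")
```

The logged 3540 seconds is 59 minutes, which matches the observed hang of about an hour. Whether that figure is tied to the configurable watchdog timeout is exactly the open question; the log alone does not confirm it.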


I'd like them to wait a bit less than an hour, and I see the watchdog
can be configured, but I can't find any useful documentation about
exactly what conditions make it fire and what the defaults are.


Currently there is no configuration under "system processes watchdog", 
and it looks like it can be enabled, disabled and the timeout set up to 
3600 seconds.


So my question is: is it this watchdog that is resetting the thing after
an hour, and would it be reasonable to set the timeout to, say, 300 seconds
so there is less downtime if it goes wrong?


Thanks,
--
Tom

:: www.portfast.co.uk / @portfast
:: hosted services, domains, virtual machines, consultancy


Re: [j-nsp] Juniper MX240-40G License required?

2023-06-12 Thread Ian Couch via juniper-nsp
Hello,
you will not require a license. We have a few of these, and the only license
we've had to apply is the subscriber license for BNG function, which you
don't mention. You might want to look into the type of NAT you want to
do: without an MS-MPC card your options are limited.

Ian

On Fri, Apr 28, 2023 at 3:15 PM Juan C. Crespo R. via juniper-nsp <
juniper-nsp@puck.nether.net> wrote:

> Hello Everyone
>
>
>  We're wondering whether, if we get a second-hand router with this config,
> we would need some kind of license. It is intended for
> routing, OSPF, BGP, and possibly NAT.
>
>   * MX240BASE
>   * MIC3-3D-2X40GE-QSFPP
>   * MX-MPC3E-3D MPC3
>
> Thanks!

-- 


This email and any files transmitted with it are confidential and 
intended solely for the use of the individual or entity to whom they are 
addressed. If you have received this email in error please notify the 
system manager. This message contains confidential information and is 
intended only for the individual named. If you are not the named addressee 
you should not disseminate, distribute or copy this email. Please notify 
the sender immediately by e-mail if you have received this email by mistake 
and delete this e-mail from your system. If you are not the intended 
recipient you are notified that disclosing, copying, distributing or taking 
any action in reliance on the contents of this information is strictly 
prohibited.


Re: [j-nsp] ACX7100-48L

2023-06-12 Thread Aaron Gould via juniper-nsp

I might be hitting PR1664302

keep in mind, I have another 7100 racked right beside this one with no 
problems


me@lab-7100-2> show log messages | grep "cooling|shutdown"
...
Jun  8 20:23:23  eng-lab-7100-2 hwdre: HWD_COOLING_FIRE_SHUTDOWN_INIT: Cooling zone fire action initiated !!
Jun  8 20:23:23  eng-lab-7100-2 hwdre: HWD_COOLING_FIRE_SHUTDOWN_SENSOR: Sensor /Chassis[0]/Fpc[0] Sensor J2 Max Reading crossed fire threshold temp value 136, driving chassis to shutdown


But I just now had someone pull the power cords since I couldn't console
in, so I don't know whether this reboot reason comes from the power cord pull
or from the earlier high-temperature shutdown described in the PR.


me@lab-7100-2> show chassis routing-engine | grep reboot
    Last reboot reason power cycle

Interestingly, the PR is said to be fixed in 22.2R2-EVO. Wouldn't it
follow that it should also be fixed in my version, 22.2R3.13-EVO?


me@lab-7100-2> show version
...
Junos: 22.2R3.13-EVO
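
The question of whether a fix listed for 22.2R2-EVO should be present in 22.2R3.13-EVO comes down to ordering release strings within one train. The sketch below is an illustration only: the regular expression covers just the release formats that appear in these threads, the function names are hypothetical, and it assumes fixes carry forward to later releases of the same train, which the PR record itself would have to confirm.

```python
import re

# Assumed format: <major>.<minor>R<n>[-S<n>][.<build>][-EVO].
# Other release types are deliberately not handled.
RELEASE_RE = re.compile(
    r"^(\d+)\.(\d+)R(\d+)(?:-S(\d+))?(?:\.(\d+))?(-EVO)?$"
)


def parse_release(text):
    """Return ((major, minor, release, service), is_evo) for a release string."""
    m = RELEASE_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised release string: {text!r}")
    major, minor, rel, service, _build, evo = m.groups()
    key = (int(major), int(minor), int(rel), int(service or 0))
    return key, evo is not None


def same_train_at_least(running, fixed_in):
    """True if both releases share major.minor and EVO-ness, and
    the running release is not older than the fixed-in release."""
    run_key, run_evo = parse_release(running)
    fix_key, fix_evo = parse_release(fixed_in)
    same_train = run_key[:2] == fix_key[:2] and run_evo == fix_evo
    return same_train and run_key >= fix_key


print(same_train_at_least("22.2R3.13-EVO", "22.2R2-EVO"))
print(same_train_at_least("22.2R1.12-EVO", "22.2R2-EVO"))
```

By this ordering 22.2R3.13-EVO is newer than 22.2R2-EVO, so seeing the symptom there suggests either a different defect or an incomplete fix; the ordering alone cannot tell which.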


-Aaron


On 6/7/2023 2:29 PM, Roger Wiklund wrote:

Hi

Some generic pointers here:
Checklist for Collecting Crash Data - TechLibrary - Juniper Networks 



show chassis routing-engine
What does "Last reboot reason" say?

I would upgrade to 22.2R3, it's working fine for us so far.

Regards
Roger



On Wed, Jun 7, 2023 at 9:18 PM Aaron Gould via juniper-nsp wrote:


I had an ACX7100-48L suddenly go down in my lab.  Is there a way to find
the cause of it going down?

agould@eng-lab-7100-2> show system information
Model: ACX7100-48L
Family: junos
Junos: 22.2R1.12-EVO
Hostname: eng-lab-7100-2

agould@eng-lab-7100-2> show system core-dumps
re0:
--

agould@eng-lab-7100-2> file ls /var/crash
/var/crash: No such file or directory

agould@eng-lab-7100-2>



-- 
-Aaron




--
-Aaron


[j-nsp] Paragon suite of software

2023-06-12 Thread Ian Couch via juniper-nsp
Has anyone on this list had any experience with any of the components of
Paragon such as Insights, Pathfinder, Planner, Active Assurance or the
additional Anuta ATOM piece?

Thanks in advance,
Ian.



Re: [j-nsp] Unknown Attribute 28 in BGP

2023-06-12 Thread Einar Bjarni Halldórsson via juniper-nsp

On 6/12/23 06:37, Saku Ytti wrote:

Either will help; configure either or both and you're good.

The actual fixed release will behave the same as if drop-path-attributes 28
had been configured. That is: read T, read L, seek past V, without
parsing.



Thanks Saku. I just discovered that "protocols bgp drop-path-attributes 
28" *is* supported on 18.2R3, but it's a hidden config option and so it 
must be entered exactly. When I tried it before, I kept trying 
"protocols bgp drop-path-attribute 28" which doesn't work.


Hopefully this will resolve our problems.

.einar


Re: [j-nsp] Unknown Attribute 28 in BGP

2023-06-12 Thread Saku Ytti via juniper-nsp
Either will help; configure either or both and you're good.

The actual fixed release will behave the same as if drop-path-attributes 28
had been configured. That is: read T, read L, seek past V, without
parsing.
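
The "read T, read L, seek past V" behaviour can be sketched as follows. This is an illustration of the general technique using the path attribute header layout from RFC 4271 (flags octet, type octet, then a one- or two-octet length depending on the Extended Length flag); it is not Junos code, and the function name is hypothetical:

```python
import struct

EXTENDED_LENGTH = 0x10  # RFC 4271 flag bit: length field is two octets


def drop_attributes(path_attrs, drop_types):
    """Copy BGP path attributes, skipping those whose type is in drop_types.

    The value bytes are never interpreted: only the type and length are
    read, so a malformed value inside a dropped attribute cannot cause a
    parse error.
    """
    out = bytearray()
    pos = 0
    total = len(path_attrs)
    while pos < total:
        if pos + 2 > total:
            raise ValueError("truncated attribute header")
        flags = path_attrs[pos]
        type_code = path_attrs[pos + 1]  # read T
        if flags & EXTENDED_LENGTH:
            if pos + 4 > total:
                raise ValueError("truncated extended length")
            (length,) = struct.unpack_from("!H", path_attrs, pos + 2)  # read L
            header = 4
        else:
            if pos + 3 > total:
                raise ValueError("truncated length")
            length = path_attrs[pos + 2]  # read L
            header = 3
        end = pos + header + length
        if end > total:
            raise ValueError("attribute length exceeds buffer")
        if type_code not in drop_types:
            out += path_attrs[pos:end]
        pos = end  # seek past V without parsing it
    return bytes(out)


origin = bytes([0x40, 1, 1, 0])                   # ORIGIN, IGP
attr28 = bytes([0xC0, 28, 3, 0xDE, 0xAD, 0xBE])   # opaque attribute 28
med = bytes([0x80, 4, 4, 0, 0, 0, 100])           # MULTI_EXIT_DISC 100
cleaned = drop_attributes(origin + attr28 + med, {28})
```

Because the attribute is removed before the update is re-advertised, downstream peers never see it, which is why this approach also protects them.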

On Sun, 11 Jun 2023 at 19:36, Einar Bjarni Halldórsson wrote:
>
> On 6/11/23 15:24, Saku Ytti wrote:
> > set protocols bgp drop-path-attributes 28 works if your release is too
> > old for set protocols bgp bgp-error-tolerance, and is preferable in
> > some ways, as it will protect your downstream as well.
> >
>
> 18.2R3-S3.11 supports protocols bgp bgp-error-tolerance, but reading
> through the docs, I see:
>
> > The bgp-error-tolerance statement overrides this behavior so that the 
> > following BGP error handling is in effect:
> >
> > For fatal errors, Junos OS sends a notification message titled Error 
> > Code Update Message and resets the BGP session. An error in the 
> > MP_{UN}REACH attribute is considered to be fatal. The presence of multiple 
> > MP_{UN}REACH attributes in one BGP update is also considered to be a fatal 
> > error. Junos OS resets the BGP session if it cannot parse the NLRI field or 
> > the BGP update correctly. Failure to parse the BGP update packet can happen 
> > when the attribute length does not match the length of the attribute value.
>
> I read this section so that even if I configure bgp-error-tolerance, it
> won't make a difference since junos considers this a fatal error and
> resets the BGP session.
>
> .einar



-- 
  ++ytti