Re: [c-nsp] ASR920: egress ACL on BDIs
--- Begin Message ---
On Tue, 28 Jan 2020 at 17:28, Nathan Lannine wrote:

> FWIW we are actually using object ACLs. What's the behavior then? Copy-swap?
> Is there a real name for that which I'm not remembering?

Object ACLs should indeed be copy-swap (atomic).

--
  ++ytti
--- End Message ---
___
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
Re: [c-nsp] ASR920: egress ACL on BDIs
> Somewhat related, IOS (all flavours) do in-place ACL unless you do
> object ACLs. In-place ACL update behaviour essentially doubles your

FWIW we are actually using object ACLs. What's the behavior then? Copy-swap?
Is there a real name for that which I'm not remembering?
Re: [c-nsp] ASR920: egress ACL on BDIs
> Do you happen to have a bug reference for this? We’ve been seeing this
> behaviour intermittently on some csr 1ks and haven’t had the time/energy to
> debate it with TAC yet.

Sorry, just saw this.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCuw19907 . That's for the
Catalyst 4500x, which is just a 4500 Sup7L and has its own set of limiting
problems for us. We were on 15.2(4)E6 and were advised to update to
15.2(4)E8. After the update we still saw a subset of the problem. You'll
see that bug specifically references the use of the "log" keyword on an
ACE, which was true to our config.

I would doubt a relationship between our experience and yours because of
the significant difference in the platforms.
Re: [c-nsp] ASR920: egress ACL on BDIs
On Mon, 27 Jan 2020 at 23:30, Chris Jones wrote:

> > Aside from this behavior, XE in the enterprise access layer is full of bugs
> > related to ACLs. We've recently begun a practice of maintaining two
> > distinct versions of every ACL so we can swap them on interfaces after
> > modifying the unused one. Modifying a used one in-place results in some
> > degree of data plane failure on affected interfaces, i.e. they stop passing
> > all or some subset of traffic. Even on "fixed" code, the problem persists,
> > though less frequently.
>
> Do you happen to have a bug reference for this? We’ve been seeing this
> behaviour intermittently on some csr 1ks and haven’t had the time/energy to
> debate it with TAC yet.

Somewhat related, IOS (all flavours) does in-place ACL updates unless you
use object ACLs. In-place ACL update behaviour essentially doubles your
ACL scale if you are running exactly 1 large ACL, but it's unpredictable
what happens while the ACL is being changed. Many other devices, such as
Juniper, program a new copy, then switch the ACL pointer to the new copy
and delete the old one, making it predictable but halving the ACL size if
you are running exactly 1 large ACL, as you need double the space during
reprogramming.

Consider old ACL
  100 deny host 1.1.1.1
  200 deny host 2.2.2.2
  300 permit any

Consider new ACL
  100 deny host 1.1.1.1
  200 deny host 2.2.2.2
  300 deny host 3.3.3.3
  400 permit any

This change would cause an interruption of traffic if the implied default
is deny (IOS-XR), because the ACL solver has to remove the
'300 permit any' to fit the new rules, and during this delta all packets
are hitting the implied deny. The implicit default thus optimizes for
security rather than hitlessness.

If instead of 300 permit any you had used 10 permit any, during
reprogramming you might have permitted something you should not have
(not in this case), but you would not have dropped anything you should
not, which may be much more desirable behaviour, for example for iACL
updates: you'd rather let packets pass for a few microseconds than drop
what should not be dropped.

--
  ++ytti
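To make the old/new ACL example above concrete, here is a minimal Python sketch of the two update strategies. It is only a toy model: the rule tuples, the function names and the assumed order of operations for the in-place case (stale rules removed first, new rules then added one at a time) are illustrative assumptions, not a description of how any particular platform actually programs its TCAM.

```python
# Toy model of ACL reprogramming. Each rule is (sequence, action, host);
# host None stands for "any". All names here are hypothetical.
OLD_ACL = [(100, "deny", "1.1.1.1"), (200, "deny", "2.2.2.2"),
           (300, "permit", None)]
NEW_ACL = [(100, "deny", "1.1.1.1"), (200, "deny", "2.2.2.2"),
           (300, "deny", "3.3.3.3"), (400, "permit", None)]


def evaluate(acl, src, implicit="deny"):
    """Return the action of the first matching rule, else the implicit default."""
    for _seq, action, host in sorted(acl, key=lambda rule: rule[0]):
        if host is None or host == src:
            return action
    return implicit


def in_place_states(old, new):
    """Yield the rule sets seen during an in-place update.

    Assumption: rules that are not in the new ACL are removed first,
    then the missing rules are added one at a time.
    """
    current = [rule for rule in old if rule in new]
    yield list(current)
    for rule in new:
        if rule not in current:
            current.append(rule)
            yield list(current)


def copy_swap_states(old, new):
    """Yield the rule sets seen with copy-swap: only the old and the new ACL."""
    yield list(old)
    yield list(new)


if __name__ == "__main__":
    for name, states in (("in-place", in_place_states),
                         ("copy-swap", copy_swap_states)):
        verdicts = [evaluate(s, "9.9.9.9") for s in states(OLD_ACL, NEW_ACL)]
        print(name, verdicts)
```

Under these assumptions a packet from 9.9.9.9, which both the old and the new ACL permit, hits the implied deny for the first two in-place steps, while the copy-swap model never exposes an intermediate state. Passing implicit="permit" to evaluate() models the opposite trade-off: nothing is dropped during the update, but 3.3.3.3 is briefly permitted.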
Re: [c-nsp] ASR920: egress ACL on BDIs
> On 20 Jan 2020, at 00:15, Nathan Lannine wrote:
>
>> This bug not only affects ACLs but other commands as well. Unsure if it is
>> fixed in newest XE versions. Could this also affect you?
>
> Aside from this behavior, XE in the enterprise access layer is full of bugs
> related to ACLs. We've recently begun a practice of maintaining two
> distinct versions of every ACL so we can swap them on interfaces after
> modifying the unused one. Modifying a used one in-place results in some
> degree of data plane failure on affected interfaces, i.e. they stop passing
> all or some subset of traffic. Even on "fixed" code, the problem persists,
> though less frequently.

Do you happen to have a bug reference for this? We’ve been seeing this
behaviour intermittently on some csr 1ks and haven’t had the time/energy to
debate it with TAC yet.
Re: [c-nsp] ASR920: egress ACL on BDIs
> This bug not only affects ACLs but other commands as well. Unsure if it is
> fixed in newest XE versions. Could this also affect you?

Aside from this behavior, XE in the enterprise access layer is full of bugs
related to ACLs. We've recently begun a practice of maintaining two
distinct versions of every ACL so we can swap them on interfaces after
modifying the unused one. Modifying a used one in-place results in some
degree of data plane failure on affected interfaces, i.e. they stop passing
all or some subset of traffic. Even on "fixed" code, the problem persists,
though less frequently.
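The two-copy practice described above could be scripted roughly as follows. This is a sketch under assumptions: the "-A"/"-B" naming scheme, the function name and the interface name are made up for illustration, and the generated lines use only the plain "ip access-list extended" and "ip access-group" commands. It rewrites the copy that is not attached to the interface and only then repoints the interface to it.

```python
def swap_commands(interface, base_name, active_suffix, new_entries,
                  direction="out"):
    """Build config lines that rewrite the unused ACL copy and attach it.

    Hypothetical naming scheme: every ACL exists as <base_name>-A and
    <base_name>-B, and exactly one of them is applied to the interface.
    """
    standby_suffix = "B" if active_suffix == "A" else "A"
    standby = f"{base_name}-{standby_suffix}"
    # Rebuild the copy that is currently not in use.
    lines = [f"no ip access-list extended {standby}",
             f"ip access-list extended {standby}"]
    lines += [f" {entry}" for entry in new_entries]
    # Only after that, point the interface at the rebuilt copy.
    lines += [f"interface {interface}",
              f" ip access-group {standby} {direction}",
              "end"]
    return standby, lines


if __name__ == "__main__":
    name, config = swap_commands(
        "BDI100", "FOOBAR", "A",
        ["permit icmp any any", "deny ip any any"])
    print(f"! now active: {name}")
    print("\n".join(config))
```

The caller has to remember which copy is active after each run (for example by reading the interface config back), since the sketch keeps no state of its own.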
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

On Sun, Jan 19, 2020 at 12:39:18PM +0100, Christian Meutes wrote:
> if you use „copy src dst“ then a „no $something“ line right in the
> beginning of a new block of configuration lines (eg. for being used to
> first deconfigure the whole ACL block and then to reapply it again) might
> miss to apply the „no ...“ initially first, which will lead to a merge
> behavior instead of a full ACL replace.
>
> This bug not only affects ACLs but other commands as well. Unsure if it is
> fixed in newest XE versions. Could this also affect you?

Our ACL config snippets do have

  no ip access-list extended FOOBAR
  ip access-list extended FOOBAR
   permit ...
   permit ...
   deny ...
  end

in them, so yes, this effect would result in "merge" behaviour (which
would very much puzzle me afterwards when looking at the resulting
config diff, I think :-) ).

It does not explain what we currently see - these ACLs have been
installed "from zero", and the resulting running- and startup-config
have all the lines "in". Just the filtering hardware doesn't...

gert
--
"If was one thing all people took for granted, was conviction that if you
 feed honest figures into a computer, honest figures come out. Never doubted
 it myself till I met a computer with a sense of humor."
                             Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany                             g...@greenie.muc.de
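A "merge instead of replace" result like the one discussed above can be caught by comparing the intended entries with what ends up in the running-config. The Python sketch below is a minimal illustration with assumptions: the function names are hypothetical, and it assumes the config dump lists ACL entries as indented lines without sequence numbers directly under the "ip access-list extended" line (some releases print sequence numbers, which would need stripping first). It says nothing about what the hardware has actually programmed.

```python
def extract_acl(config_text, name):
    """Return the entries listed under 'ip access-list extended <name>'."""
    entries, inside = [], False
    for raw in config_text.splitlines():
        if raw.startswith("ip access-list extended "):
            # A new ACL block starts; it is ours only if the name matches.
            inside = raw.split()[-1] == name
            continue
        if inside:
            if raw.startswith(" "):
                entries.append(raw.strip())
            else:
                inside = False  # first unindented line ends the block
    return entries


def compare_acl(intended, running):
    """Return (missing, leftover) entries relative to the intended ACL."""
    missing = [e for e in intended if e not in running]
    leftover = [e for e in running if e not in intended]
    return missing, leftover


if __name__ == "__main__":
    running_config = "\n".join([
        "ip access-list extended FOOBAR",
        " permit icmp any any",
        " permit tcp any any eq 22",   # stale line left behind by a merge
        " deny ip any any",
        "!",
        "interface BDI100",
        " ip access-group FOOBAR out",
    ])
    intended = ["permit icmp any any", "deny ip any any"]
    print(compare_acl(intended, extract_acl(running_config, "FOOBAR")))
```

A non-empty "leftover" list after a push that started with "no ip access-list extended FOOBAR" would point at the merge behaviour; an empty result only confirms the config text, not the filtering hardware.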
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

On Mon, Jan 20, 2020 at 12:28:25AM +1300, Nathan Ward wrote:
> > On 20/01/2020, at 12:22 AM, Gert Doering wrote:
> >
> > Now, IPv6 ACLs are not working right either, but they fail in different
> > ways - short ACLs seem to be working right, long ACLs fail-open, as in
> > "the platform claims it has been programmed, but all packets pass". Yay.
>
> This is what happens on J ACX boxes.. stunningly bad behaviour :-(

Ewww. Does it at least warn in a clearly visible way?

Our Aristas also like to run out of TCAM, but if that happens, a very
clear message is printed *and* the ACL config is not applied to the
interface (= you can see it in your RANCID diffs).

gert
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

On Sun 19. Jan 2020 at 12:23, Gert Doering wrote:

> replying to myself with a few... interesting... discoveries we've made
> in the meantime...
>
> On Mon, Dec 30, 2019 at 11:57:54AM +0100, Gert Doering wrote:
> > quick question to the group - ACLs on BDIs on ASR920s, is this something
> > known as something you want to stay away from?
>
> TAC was not exactly helpful ("can you add a line to that ACL, and take
> another one away, does it work now?" - I'm still waiting for a single
> "let's see what is programmed in the hardware!" question...) - but that
> uncovered quite an interesting effect...
>
> Namely:
>
>  - if I type in the ACL in question, line by line (or remove and re-add
>    the non-working line from "conf term") things *work*
>
>  - if I "bulk-config" the ACL by "copy tftp:$source running-config" or
>    "rcp $source router:running-config" - which is what our ACL provisioning
>    tool uses - things *fail*
>
> So my gut says "it's related to the speed of updates" - push in changes
> too fast (like, 100 lines in basically "a single instant"), and "something
> gets overrun". We've now changed our ACL uploader to use SSH and put
> the ACLs in line by line, and that seems to have fixed it for v4. Maybe.
>
>
> Now, IPv6 ACLs are not working right either, but they fail in different
> ways - short ACLs seem to be working right, long ACLs fail-open, as in
> "the platform claims it has been programmed, but all packets pass". Yay.
>
> Haven't figured out the trigger on that one yet - like "a certain
> combination of protocol/port matches creates a pass-all rule instead"
> (but didn't have much time). Should be somewhat easy to bisect, "just
> need time"...

if you use „copy src dst“ then a „no $something“ line right at the
beginning of a new block of configuration lines (e.g. one used to first
deconfigure the whole ACL block before reapplying it) might not get
applied first, which leads to merge behavior instead of a full ACL
replace.

This bug not only affects ACLs but other commands as well. Unsure if it is
fixed in newest XE versions. Could this also affect you?

Cheers
Chris
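The line-by-line upload quoted above could look roughly like the Python sketch below. It is deliberately transport-agnostic: "send" is a hypothetical callable supplied by the caller (for example a wrapper around an SSH channel), and the delay value is an arbitrary assumption rather than a figure taken from this thread.

```python
import time


def push_line_by_line(lines, send, delay=0.05, sleep=time.sleep):
    """Send config lines one at a time, pausing after each line.

    send  -- hypothetical callable that delivers one line to the device
    delay -- pause in seconds between lines (arbitrary assumption)
    sleep -- injectable so the pacing can be tested without waiting
    """
    sent = 0
    for line in lines:
        send(line)
        sent += 1
        sleep(delay)
    return sent


if __name__ == "__main__":
    snippet = [
        "configure terminal",
        "no ip access-list extended FOOBAR",
        "ip access-list extended FOOBAR",
        " permit icmp any any",
        " deny ip any any",
        "end",
    ]
    # print() stands in for the real transport in this dry run.
    count = push_line_by_line(snippet, print, delay=0.0)
    print(f"! pushed {count} lines")
```

Because each line goes through the CLI parser on its own, a leading "no ..." line is applied before the block that follows it, which is the ordering the bulk "copy" path is reported not to guarantee.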
Re: [c-nsp] ASR920: egress ACL on BDIs
> On 20/01/2020, at 12:22 AM, Gert Doering wrote:
>
> Now, IPv6 ACLs are not working right either, but they fail in different
> ways - short ACLs seem to be working right, long ACLs fail-open, as in
> "the platform claims it has been programmed, but all packets pass". Yay.

This is what happens on J ACX boxes.. stunningly bad behaviour :-(

--
Nathan Ward
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

replying to myself with a few... interesting... discoveries we've made
in the meantime...

On Mon, Dec 30, 2019 at 11:57:54AM +0100, Gert Doering wrote:
> quick question to the group - ACLs on BDIs on ASR920s, is this something
> known as something you want to stay away from?

TAC was not exactly helpful ("can you add a line to that ACL, and take
another one away, does it work now?" - I'm still waiting for a single
"let's see what is programmed in the hardware!" question...) - but that
uncovered quite an interesting effect...

Namely:

 - if I type in the ACL in question, line by line (or remove and re-add
   the non-working line from "conf term") things *work*

 - if I "bulk-config" the ACL by "copy tftp:$source running-config" or
   "rcp $source router:running-config" - which is what our ACL provisioning
   tool uses - things *fail*

So my gut says "it's related to the speed of updates" - push in changes
too fast (like, 100 lines in basically "a single instant"), and "something
gets overrun". We've now changed our ACL uploader to use SSH and put
the ACLs in line by line, and that seems to have fixed it for v4. Maybe.


Now, IPv6 ACLs are not working right either, but they fail in different
ways - short ACLs seem to be working right, long ACLs fail-open, as in
"the platform claims it has been programmed, but all packets pass". Yay.

Haven't figured out the trigger on that one yet - like "a certain
combination of protocol/port matches creates a pass-all rule instead"
(but didn't have much time). Should be somewhat easy to bisect, "just
need time"...

gert
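The bisection mentioned above could be sketched like this in Python. Two loud assumptions: "fails_open" is a hypothetical callback that installs the given entries on a test box and probes whether traffic that should be dropped gets through, and the failure is assumed to be monotonic, meaning that once a prefix of the ACL fails open, every longer prefix fails open as well. If the real trigger depends on a combination of non-adjacent lines, this simple prefix search will only find where the combination becomes complete.

```python
def first_failing_length(entries, fails_open):
    """Return the smallest n for which the first n entries fail open.

    Returns None if the full ACL does not fail. fails_open is a
    hypothetical probe; the empty ACL is assumed to behave correctly.
    """
    if not fails_open(entries):
        return None
    lo, hi = 0, len(entries)  # prefix of length lo works, length hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails_open(entries[:mid]):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    acl = [f"permit tcp any any eq {port}" for port in range(1, 101)]

    # Stand-in for a real probe: pretend anything longer than 63 lines fails.
    def fake_probe(prefix):
        return len(prefix) >= 64

    n = first_failing_length(acl, fake_probe)
    print(f"first failing prefix length: {n}; last line added: {acl[n - 1]}")
```

With about 100 entries this needs roughly seven install-and-probe rounds instead of one per line, which is what makes the "just need time" part tolerable.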
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

On Tue, Dec 31, 2019 at 12:00:00AM +0200, Tassos Chatzithomaoglou wrote:
> We have been using small (<300 ACEs) egress ACLs under BDIs without any
> apparent issues until now.

Which version of IOS XE? How many BDIs?

> Maybe have a look at the following outputs:
>
> show platform hardware pp active tcam utilization acl detail 0
> show platform hardware pp active tcam utilization egress-acl detail 0

That all looks reasonable... (15% and 57%)

> Also check the limitations of your SDM template (i.e.
> https://www.cisco.com/c/en/us/td/docs/routers/asr920/configuration/guide/sdm/16-11-1/b-sys-sdm-xe-16-11-1-asr-920.html)

We're on "default", and that should have enough of everything for what
the box is doing.

gert
Re: [c-nsp] ASR920: egress ACL on BDIs
Hi,

We have been using small (<300 ACEs) egress ACLs under BDIs without any
apparent issues until now.

Maybe have a look at the following outputs:

  show platform hardware pp active tcam utilization acl detail 0
  show platform hardware pp active tcam utilization egress-acl detail 0

Also check the limitations of your SDM template (i.e.
https://www.cisco.com/c/en/us/td/docs/routers/asr920/configuration/guide/sdm/16-11-1/b-sys-sdm-xe-16-11-1-asr-920.html)

--
Tassos
[c-nsp] ASR920: egress ACL on BDIs
Hi,

quick question to the group - ACLs on BDIs on ASR920s, is this something
known as something you want to stay away from?

I'm trying to get rid of one of our remaining 6500/Sup720s - most VLANs
got moved to Aristas, but a few of them have egress ACLs on the SVI/BDI
(which does not really work well with the default Arista TCAM carving,
only 1000 entries...) - so I decided "make good use of the ASR920 on
that site, which isn't really doing much" and moved the three (3!) BDIs
over.

Bäm.

*Some* packets that are supposed to be permitted by very simple IPv4
ACLs are just not arriving. Like, TCP SYNs that should be matched
by a "permit ip host $source host $dest" rule, right at the top of
the ACL in question. Or ping, which is permitted in all our ACLs
with a "permit icmp any any" rule.

Removing and re-adding the ACLs (and checking with a sniffer port) has
confirmed that it's indeed the egress ACLs, not routing or anything else
which might eat packets.

Interestingly enough, the pattern shifts - so when you change something,
a non-working ACL entry "A" starts working, but something in ACL B
starts failing. Nothing interesting in the logs, ever.

This is an ASR920-12CZ with "Cisco IOS XE Software, Version 16.06.05a".

I have a TAC case open, which has proceeded nicely to "I will have a
look at your logs, but first I go on vacation".


I'm not looking for debugging advice right now, more for experience from
the field - like "yes, we've done egress ACLs with 16.06, and it just
does not work!" or "there is a hidden switch to make the ACL compiler
work correctly if you have " or maybe even "there is hidden
command to force re-programming of ACLs, it is needed because "...


This box does IPv4, IPv6 routing (BGP, EIGRP, OSPFv3) and EoMPLS/VPLS
things (LDP), on a fairly small scale (~250 IPv4 routes, ~900 IPv6 routes,
~8 bridge-domains, 2 VPLS groups and 2 EoMPLS circuits). So this should
be well within the limits of the architecture...

(I'm tempted to move these VLANs to an old 7301 - it's the backup uplinks
anyway, so falling down to ~500 Mbit/s in case the primary router fails
would be acceptable. But it irks me that I have this new and shiny box
which is not behaving...)

gert