Re: [j-nsp] Finding drops

2019-01-22 Thread Olivier Benghozi
My 2 cents: it could be interesting to check whether running the system in
hyper-mode makes a difference (normally one would expect it to).

> Le 22 janv. 2019 à 20:42, adamv0...@netconsultings.com a écrit :
> 
> That sort of indicates that for the 64B stream the packets are dropped by the
> platform - do you get confirmation on the RX end of the tester that packets are
> missing? I'm not sure whether this is about misaligned counters or about
> actually dropped packets.
> 
> How I read your test, this is presumably 40G in and 40G out of the same PFE
> (back-to-back packets at 64B or 100B), so we only need to consider single-PFE
> performance - but even so, the resulting PPS rate is nowhere near the
> theoretical PPS budget.
> How are the PFEs on the 204 linked together (are there any sacrifices in the
> PFE BW/PPS budget to account for the fabric)? On an MPC7E all 4 PFEs would be
> connected via fabric.
> So nothing really adds up here... this shouldn't be happening, not at these rates.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] OSPF reference-bandwidth 1T

2019-01-22 Thread Thomas Bellman
On 2019-01-22 12:02 MET, Pavel Lunin wrote:

>> (I am myself running a mostly DC network, with a little bit of campus
>> network on the side, and we use bandwidth-based metrics in our OSPF.
>> But we have standardized on using 3 Tbit/s as our "reference bandwidth",
>> and Junos doesn't allow us to set that, so we set explicit metrics.)

> As Adam has already mentioned, DC networks are becoming more and more
> Clos-based, so you basically don't need OSPF at all for this.
> 
> Fabric uplinks, Backbone/DCI and legacy links still exist; however, in the DC
> we tend to ECMP it all, so you normally don't want unequal-bandwidth links in
> parallel.

Our network is roughly spine-and-leaf.  But we have a fairly small net
(two spines, around twenty leafs, split over two computer rooms a couple
of hundred meters apart the way the fiber goes), and it doesn't make
economic sense to make it a perfectly pure folded Clos network.  So,
there are a couple of leaf switches that are just layer 2 with spanning
tree, and the WAN connections to our partner in the neighbouring city
go directly into our spines instead of into "peering leafs".  (The
border routers for our normal Internet connectivity are connected as
leafs to our spines, but they are really our ISP's CPE routers,
not ours.)

Also, the leaves have wildly different bandwidth needs.  Our DNS, email
and web servers don't need as much bandwidth as a 2000-node HPC cluster,
which in turn needs less bandwidth than the storage cluster for LHC
data.  Most leaves have 10G uplinks (one to each spine), but we also
have leafs with 1G and with 40G uplinks.

I don't want a leaf with 1G uplinks becoming a "transit" node for traffic
between two other leafs in (some) failure cases, because an elephant flow
could easily saturate those 1G links.  Thus, I want higher costs for those
links than for the 10G and 40G links.  Of course, the costs don't have to
be exactly reference-bandwidth / link-bandwidth, but there needs to be some
relation to the bandwidth.
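
(For illustration only, not from our actual config: a quick sketch of what a
straight reference-bandwidth / link-bandwidth calculation gives with our
3 Tbit/s reference, which is why we end up typing the metrics in explicitly.)

    # Hypothetical sketch: OSPF metrics derived from a 3 Tbit/s reference
    # bandwidth, rounded and floored at 1 as OSPF requires.
    REFERENCE_BW = 3_000_000_000_000  # 3 Tbit/s

    LINKS = {"1G": 1e9, "10G": 10e9, "40G": 40e9, "100G": 100e9}

    for name, bw in LINKS.items():
        metric = max(1, round(REFERENCE_BW / bw))
        print(f"{name:>4}: metric {metric}")
    # ->   1G: metric 3000, 10G: metric 300, 40G: metric 75, 100G: metric 30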

> Workarounds happen; sometimes you have no more 100G ports available and
> need to plug in, let's say, 4x40G "temporarily" in addition to two existing
> 100G which are starting to be saturated. In such a case you'd rather
> consciously decide whether you want to ECMP these 200 Gigs among six
> links (2x100 + 4x40) or use the 40G links as a backup only (might not be
> the best idea in this scenario).

Right.  I actually have one leaf switch with unequal-bandwidth uplinks.
On one side it uses a 2×10G link aggregation, but on the other side I
could use an old InfiniBand AOC cable, giving us a 40G uplink.  In that
case, I have explicitly set the two uplinks to have the same cost.


/Bellman, NSC



___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Finding drops

2019-01-22 Thread Tim Warnock
You're looking in the wrong place :)

You might understand it better if you look here:
https://en.wikipedia.org/wiki/Ethernet_frame

> -Original Message-
> From: juniper-nsp [mailto:juniper-nsp-boun...@puck.nether.net] On Behalf
> Of Jason Lixfeld
> Sent: Tuesday, 22 January 2019 6:09 AM
> To: Juniper List 
> Subject: [j-nsp] Finding drops
> 
> Hi all,
> 
> I’m doing some RFC2544 tests through an MX204.  The tester is connected to
> et-0/0/2, and the test destination is somewhere out there via et-0/0/0.  64
> byte packets seem to be getting dropped, and I’m trying to find where on
> the box those drops are being recorded.
> 
> I’ve distilled the test down to generating 100 million 64 byte (UDP) packets
> to the destination, but the counters on et-0/0/2 read as though they’ve only
> received about 76.6% of those packets.
> 
> If I change the test to send 100 million 100 byte packets, the counters on
> et-0/0/2 account for all packets.
> 
> I’ve tried looking at various output to find a counter that registers the
> missing packets, but I’m not having any luck.
> 
> Aside from 'show interface et-0/0/2 extensive’, I’ve looked here with no
> luck:
> 
> show interface queue et-0/0/2
> show pfe statistics traffic detail
> show pfe statistics exceptions
> show pfe statistics error
> 
> Somewhere else I should be looking?
> ___
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Finding drops

2019-01-22 Thread adamv0025
> Jason Lixfeld
> Sent: Monday, January 21, 2019 8:09 PM
> 
> Hi all,
> 
> I’m doing some RFC2544 tests through an MX204.  The tester is connected to
> et-0/0/2, and the test destination is somewhere out there via et-0/0/0.  64
> byte packets seem to be getting dropped, and I’m trying to find where on
> the box those drops are being recorded.
> 
> I’ve distilled the test down to generating 100 million 64 byte (UDP) packets
> to the destination, but the counters on et-0/0/2 read as though they’ve only
> received about 76.6% of those packets.
> 
> If I change the test to send 100 million 100 byte packets, the counters on
> et-0/0/2 account for all packets.
> 
> I’ve tried looking at various output to find a counter that registers the
> missing packets, but I’m not having any luck.
> 
> Aside from 'show interface et-0/0/2 extensive’, I’ve looked here with no
> luck:
> 
> show interface queue et-0/0/2
> show pfe statistics traffic detail
> show pfe statistics exceptions
> show pfe statistics error
> 
> Somewhere else I should be looking?
>
Maybe one of the show commands in the KB below will show the drops:
https://kb.juniper.net/InfoCenter/index?page=content&id=KB26519&cat=FIREWALL&actp=LIST

I appreciate you're just concerned with the packet count at the moment, but what
is interesting is that if you change the rate at which you blast packets at the
box (bigger packets = lower PPS rate), the counters suddenly align.
That sort of indicates that for the 64B stream the packets are dropped by the
platform - do you get confirmation on the RX end of the tester that packets are
missing? I'm not sure whether this is about misaligned counters or about
actually dropped packets.
  
How I read your test, this is presumably 40G in and 40G out of the same PFE
(back-to-back packets at 64B or 100B), so we only need to consider single-PFE
performance - but even so, the resulting PPS rate is nowhere near the
theoretical PPS budget.
How are the PFEs on the 204 linked together (are there any sacrifices in the
PFE BW/PPS budget to account for the fabric)? On an MPC7E all 4 PFEs would be
connected via fabric.
So nothing really adds up here... this shouldn't be happening, not at these rates.
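
(Back-of-the-envelope sketch of the rates involved - my own arithmetic, assuming
the standard 20 B of preamble/SFD/IFG overhead per frame, not anything read off
the box:)

    # Hypothetical sketch: theoretical PPS for a 40G stream at the two frame sizes.
    L1_OVERHEAD = 20  # bytes of preamble + SFD + IFG added per frame on the wire

    def pps(line_rate_bps, frame_bytes):
        return line_rate_bps / ((frame_bytes + L1_OVERHEAD) * 8)

    for size in (64, 100):
        print(f"{size:>3} B frames at 40G: {pps(40e9, size) / 1e6:.2f} Mpps")
    # ->  64 B frames at 40G: 59.52 Mpps
    # -> 100 B frames at 40G: 41.67 Mpps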

adam


___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Finding drops

2019-01-22 Thread Jason Lixfeld


> On Jan 22, 2019, at 4:49 AM, Saku Ytti  wrote:
> 
> On Mon, 21 Jan 2019 at 22:09, Jason Lixfeld  wrote:
> 
>> I’ve distilled the test down to generating 100 million 64 byte (UDP) packets 
>> to the destination, but the counters on et-0/0/2 read as though they’ve only 
>> received about 76.6% of those packets.
> 
> What are you actually doing?

Trying to make sure the counters work :)

Transmitting exactly 100 million 64 byte UDP packets.  SPORT:  49184 DPORT: 7.

> How are you measuring rates?

I’m not interested in rate at this point.  I’m interested in correlating the
number of packets the tester sent vs. the number that its directly connected
MX interface received.  As it stands, there’s a discrepancy, and that’s fine;
I’m trying to find the counter(s) on the box that will ultimately account for
all 100 million packets.


___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Finding drops

2019-01-22 Thread Jason Lixfeld
Hey,

> On Jan 21, 2019, at 3:38 PM, Dave Bell  wrote:
> 
> Are you sure your tester is capable of generating that volume of traffic?

Yes.  I’m quite sure.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Junos Arp Expiration Timer Behavior & Active Flows

2019-01-22 Thread Clarke Morledge

Thanks, Brian.

Unfortunately, the MX policer is not granular enough to trim down the unwanted
traffic in one particular use case that I am dealing with.  Excessive ARPs can
easily overwhelm some downstream hosts, some more than others.


Clarke Morledge
College of William and Mary


___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Junos and single IPv6 link-local address per IFL

2019-01-22 Thread Eldon Koyle
He showed fe80::206:aff:fe0e:fffb/64 in his second example with the same
result.
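
(Quick illustration - my own snippet, not from the thread - of where the two
tested addresses land relative to fe80::/10 and fe80::/64:)

    # Both addresses Martin tried are link-local (fe80::/10), but only the
    # second one also falls inside fe80::/64.
    import ipaddress

    fe80_64 = ipaddress.ip_network("fe80::/64")
    for a in ("fe80:1::206:aff:fe0e:fffa", "fe80::206:aff:fe0e:fffb"):
        addr = ipaddress.ip_address(a)
        print(a, "link-local:", addr.is_link_local, "in fe80::/64:", addr in fe80_64)
    # -> first:  link-local True, in fe80::/64 False
    # -> second: link-local True, in fe80::/64 True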

-- 
Eldon

On Tue, Jan 22, 2019, 07:11 Anderson, Charles R wrote:
> Link-Local addresses should be in fe80::/64, not fe80::/10.  Try
> configuring a second one that meets this criterion, such as:
>
> > +   address fe80::206:aff:fe0e:fffb/64;
>
> On Tue, Jan 22, 2019 at 03:42:43PM +0200, Martin T wrote:
> > Hi,
> >
> > it looks like Junos allows only a single IPv6 link-local address per IFL.
> > For example, here I tested with Junos 18.2R1.9:
> >
> > root@vmx1# show | compare
> > [edit interfaces ge-0/0/9 unit 0 family inet6]
> > address fe80::206:aff:fe0e:fffa/64 { ... }
> > +   address fe80:1::206:aff:fe0e:fffa/64;
> >
> > [edit]
> > root@vmx1# commit check
> > [edit interfaces ge-0/0/9 unit 0 family inet6]
> >   'address fe80:1::206:aff:fe0e:fffa/64'
> >  Link Local address exists
> > error: configuration check-out failed
> >
> > [edit]
> > root@vmx1#
> >
> > ..or:
> >
> > root@vmx1# show | compare
> > [edit interfaces ge-0/0/9 unit 0 family inet6]
> > address fe80::206:aff:fe0e:fffa/64 { ... }
> > +   address fe80::206:aff:fe0e:fffb/64;
> >
> > [edit]
> > root@vmx1# commit check
> > [edit interfaces ge-0/0/9 unit 0 family inet6]
> >   'address fe80::206:aff:fe0e:fffb/64'
> >  Link Local address exists
> > error: configuration check-out failed
> >
> > [edit]
> > root@vmx1#
> >
> > Just out of curiosity, why is there this limitation? For example,
> > FreeBSD 11, which Junos 18.2R1.9 is based on, does not have this
> > limitation:
> >
> > root@FreeBSD-11:~ # ifconfig em0 inet6
> > em0: flags=8843 metric 0 mtu
> > 1500
> >
> > options=209b
> > inet6 fe80::fc69:d3ff:feec:7741%em0 prefixlen 64 scopeid 0x1
> > inet6 fe80::fc69:d3ff:feec:7740%em0 prefixlen 64 scopeid 0x1
> > nd6 options=21
> > root@FreeBSD-11:~ #
> ___
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Junos and single IPv6 link-local address per IFL

2019-01-22 Thread Anderson, Charles R
Link-Local addresses should be in fe80::/64, not fe80::/10.  Try configuring a
second one that meets this criterion, such as:

> +   address fe80::206:aff:fe0e:fffb/64;

On Tue, Jan 22, 2019 at 03:42:43PM +0200, Martin T wrote:
> Hi,
> 
> it looks like Junos allows only a single IPv6 link-local address per IFL.
> For example, here I tested with Junos 18.2R1.9:
> 
> root@vmx1# show | compare
> [edit interfaces ge-0/0/9 unit 0 family inet6]
> address fe80::206:aff:fe0e:fffa/64 { ... }
> +   address fe80:1::206:aff:fe0e:fffa/64;
> 
> [edit]
> root@vmx1# commit check
> [edit interfaces ge-0/0/9 unit 0 family inet6]
>   'address fe80:1::206:aff:fe0e:fffa/64'
>  Link Local address exists
> error: configuration check-out failed
> 
> [edit]
> root@vmx1#
> 
> ..or:
> 
> root@vmx1# show | compare
> [edit interfaces ge-0/0/9 unit 0 family inet6]
> address fe80::206:aff:fe0e:fffa/64 { ... }
> +   address fe80::206:aff:fe0e:fffb/64;
> 
> [edit]
> root@vmx1# commit check
> [edit interfaces ge-0/0/9 unit 0 family inet6]
>   'address fe80::206:aff:fe0e:fffb/64'
>  Link Local address exists
> error: configuration check-out failed
> 
> [edit]
> root@vmx1#
> 
> Just out of curiosity, why is there this limitation? For example,
> FreeBSD 11, which Junos 18.2R1.9 is based on, does not have this
> limitation:
> 
> root@FreeBSD-11:~ # ifconfig em0 inet6
> em0: flags=8843 metric 0 mtu
> 1500
> 
> options=209b
> inet6 fe80::fc69:d3ff:feec:7741%em0 prefixlen 64 scopeid 0x1
> inet6 fe80::fc69:d3ff:feec:7740%em0 prefixlen 64 scopeid 0x1
> nd6 options=21
> root@FreeBSD-11:~ #
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


[j-nsp] Junos and single IPv6 link-local address per IFL

2019-01-22 Thread Martin T
Hi,

It looks like Junos allows only a single IPv6 link-local address per IFL.
For example, here I tested with Junos 18.2R1.9:

root@vmx1# show | compare
[edit interfaces ge-0/0/9 unit 0 family inet6]
address fe80::206:aff:fe0e:fffa/64 { ... }
+   address fe80:1::206:aff:fe0e:fffa/64;

[edit]
root@vmx1# commit check
[edit interfaces ge-0/0/9 unit 0 family inet6]
  'address fe80:1::206:aff:fe0e:fffa/64'
 Link Local address exists
error: configuration check-out failed

[edit]
root@vmx1#

..or:

root@vmx1# show | compare
[edit interfaces ge-0/0/9 unit 0 family inet6]
address fe80::206:aff:fe0e:fffa/64 { ... }
+   address fe80::206:aff:fe0e:fffb/64;

[edit]
root@vmx1# commit check
[edit interfaces ge-0/0/9 unit 0 family inet6]
  'address fe80::206:aff:fe0e:fffb/64'
 Link Local address exists
error: configuration check-out failed

[edit]
root@vmx1#

Just out of curiosity, why is there this limitation? For example,
FreeBSD 11, which Junos 18.2R1.9 is based on, does not have this
limitation:

root@FreeBSD-11:~ # ifconfig em0 inet6
em0: flags=8843 metric 0 mtu
1500

options=209b
inet6 fe80::fc69:d3ff:feec:7741%em0 prefixlen 64 scopeid 0x1
inet6 fe80::fc69:d3ff:feec:7740%em0 prefixlen 64 scopeid 0x1
nd6 options=21
root@FreeBSD-11:~ #


thanks,
Martin
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] OSPF reference-bandwidth 1T

2019-01-22 Thread Pavel Lunin



Hi all,


> Would you advise avoiding bandwidth-based metrics in e.g. datacenter
> or campus networks as well?
>
> (I am myself running a mostly DC network, with a little bit of campus
> network on the side, and we use bandwidth-based metrics in our OSPF.
> But we have standardized on using 3 Tbit/s as our "reference bandwidth",
> and Junos doesn't allow us to set that, so we set explicit metrics.)

As Adam has already mentioned, DC networks are becoming more and more 
Clos-based, so you basically don't need OSPF at all for this.

Fabric uplinks, Backbone/DCI and legacy links still exist; however, in the DC
we tend to ECMP it all, so you normally don't want unequal-bandwidth links in
parallel.

Workarounds happen; sometimes you have no more 100G ports available and need to
plug in, let's say, 4x40G "temporarily" in addition to two existing 100G which
are starting to be saturated. In such a case you'd rather consciously decide
whether you want to ECMP these 200 Gigs among six links (2x100 + 4x40) or use
the 40G links as a backup only (might not be the best idea in this scenario).
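
(A rough sketch of that trade-off - my own numbers, purely illustrative: with
equal-cost ECMP the 40G members get the same hash share as the 100G ones, so
the bundle effectively tops out well before its nominal capacity.)

    # Hypothetical 2x100G + 4x40G bundle with equal-share flow hashing.
    links_gbps = [100, 100, 40, 40, 40, 40]

    offered = 200  # Gbit/s currently offered to the bundle
    per_link = offered / len(links_gbps)
    print(f"per-link load: {per_link:.1f}G")            # ~33.3G, still under 40G

    effective = min(links_gbps) * len(links_gbps)
    print(f"usable before the 40G links saturate: {effective}G "
          f"of {sum(links_gbps)}G nominal")             # 240G of 360G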

So it's not the reference bandwidth itself that is bad in the DC; rather, the
use cases where it can technically work are not the best fit for modern DC
networks.

--
Pavel
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Finding drops

2019-01-22 Thread Saku Ytti
On Mon, 21 Jan 2019 at 22:09, Jason Lixfeld  wrote:

> I’ve distilled the test down to generating 100 million 64 byte (UDP) packets 
> to the destination, but the counters on et-0/0/2 read as though they’ve only 
> received about 76.6% of those packets.

This caught my eye.

64B / 84B = 76.19% - awfully close to what you're seeing.

What are you actually doing? How are you measuring rates?

JunOS SNMP and 'show int' measure _L3 speed_, which is 20/84 = 23.8%.
So worst case you see 23.8Gbps out of a 100Gbps interface when it's
fully congested, because you're measuring the wrong thing.

The SNMP standard says L2 rate, which is 76.19Gbps out of 100Gbps - also
measuring the wrong thing. So everyone is looking at stats which are wrong,
unless they post-process and approximate the bps as bps + (pps * overhead).

You actually want to measure the true L1 rate (preamble 7, sfd 1, dmac 6,
smac 6, etype 2, payload 46, crc 4, ifg 12).
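
(Spelled out as a quick sketch - my arithmetic, not counters from the router:)

    # Per-frame bytes for a 64 B test frame and what the different layers see.
    preamble, sfd, dmac, smac, etype, payload, crc, ifg = 7, 1, 6, 6, 2, 46, 4, 12

    l2 = dmac + smac + etype + payload + crc   # 64 B Ethernet frame
    l1 = preamble + sfd + l2 + ifg             # 84 B on the wire
    print(f"L2/L1 = {l2}/{l1} = {l2 / l1:.2%}")                          # 76.19%
    print(f"64 B wire rate at 100G: {100e9 / (l1 * 8) / 1e6:.1f} Mpps")  # ~148.8 Mpps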

The MX204 is a single Eagle Trio; it certainly can't do wire-rate on all
ports, but it can on a single port.

-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp