Re: Trident3 vs Jericho2

2021-04-09 Thread Jeff Tantsura
Buffer size has nothing to do with feature richness.
Assuming you are asking about DC: in a wide-radix, low-oversubscription
network, shallow buffers do just fine. Some applications (think map-reduce/ML
model training) have many-to-one traffic patterns and suffer from incast as a
result, so deep buffers might be helpful there. DCI/DC-GW is another case where
deep buffers could be justified.
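
As a back-of-the-envelope illustration of that incast case (the worker count,
shard size, and port speeds below are assumptions picked for the example, not
measurements from any real fabric), a minimal sketch of how much buffer a single
egress port would need to absorb a synchronized many-to-one burst:

# Rough incast estimate -- illustrative assumptions only, no TCP dynamics.
# N synchronized senders each burst `burst_bytes` toward one receiver behind
# a single egress port. The bursts arrive at the aggregate fabric rate but
# drain at the egress rate; whatever cannot drain during the arrival window
# has to sit in the buffer or be dropped.

def incast_buffer_bytes(n_senders, burst_bytes, fabric_gbps, egress_gbps):
    arrival_s = n_senders * burst_bytes * 8 / (fabric_gbps * 1e9)
    drained = egress_gbps * 1e9 / 8 * arrival_s
    return max(0.0, n_senders * burst_bytes - drained)

# Example: 64 workers each return a 256 KB shard to one 10G-attached host
# across a 40G fabric.
need = incast_buffer_bytes(64, 256 * 1024, fabric_gbps=40, egress_gbps=10)
print(f"~{need / 1e6:.1f} MB of buffer needed to absorb the burst without loss")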

Regards,
Jeff

> On Apr 9, 2021, at 05:59, Dmitry Sherman  wrote:
> 
> Once again, which is better: shared-buffer feature-rich or fat-buffer switches?
> When is it better to put in a big-buffer switch? When is it better to drop
> and retransmit instead of queueing?
> 
> Thanks.
> Dmitry


Re: My First BGP-Hijacking Explanation

2021-04-09 Thread Mark Tinka




On 4/9/21 00:19, Eric Kuhnke wrote:




As an anecdotal data point, the only effect this has had is teaching 
random 14 year olds how to use ordinary consumer grade VPNs, which 
work just fine.


One way or the other, you can't keep the kids from what they want :-).

Mark.


Weekly Routing Table Report

2021-04-09 Thread Routing Analysis Role Account
This is an automated weekly mailing describing the state of the Internet
Routing Table as seen from APNIC's router in Japan.

The posting is sent to APOPS, NANOG, AfNOG, SANOG, PacNOG, SAFNOG
TZNOG, MENOG, BJNOG, SDNOG, CMNOG, LACNOG and the RIPE Routing WG.

Daily listings are sent to bgp-st...@lists.apnic.net

For historical data, please see http://thyme.rand.apnic.net.

If you have any comments please contact Philip Smith .

Routing Table Report   04:00 +10GMT Sat 10 Apr, 2021

Report Website: http://thyme.rand.apnic.net
Detailed Analysis:  http://thyme.rand.apnic.net/current/

Analysis Summary


BGP routing table entries examined:  855145
Prefixes after maximum aggregation (per Origin AS):  324141
Deaggregation factor:  2.64
Unique aggregates announced (without unneeded subnets):  406808
Total ASes present in the Internet Routing Table: 71034
Prefixes per ASN: 12.04
Origin-only ASes present in the Internet Routing Table:   61106
Origin ASes announcing only one prefix:   25216
Transit ASes present in the Internet Routing Table:9928
Transit-only ASes present in the Internet Routing Table:307
Average AS path length visible in the Internet Routing Table:   4.4
Max AS path length visible:  38
Max AS path prepend of ASN ( 37385)  33
Prefixes from unregistered ASNs in the Routing Table:  1015
Number of instances of unregistered ASNs:  1021
Number of 32-bit ASNs allocated by the RIRs:  35637
Number of 32-bit ASNs visible in the Routing Table:   29671
Prefixes from 32-bit ASNs in the Routing Table:  138154
Number of bogon 32-bit ASNs visible in the Routing Table:21
Special use prefixes present in the Routing Table:1
Prefixes being announced from unallocated address space:530
Number of addresses announced to Internet:   2955025792
Equivalent to 176 /8s, 34 /16s and 29 /24s
Percentage of available address space announced:   79.8
Percentage of allocated address space announced:   79.8
Percentage of available address space allocated:  100.0
Percentage of address space in use by end-sites:   99.5
Total number of prefixes smaller than registry allocations:  290123

APNIC Region Analysis Summary
-

Prefixes being announced by APNIC Region ASes:   226451
Total APNIC prefixes after maximum aggregation:   65430
APNIC Deaggregation factor:3.46
Prefixes being announced from the APNIC address blocks:  222562
Unique aggregates announced from the APNIC address blocks:89330
APNIC Region origin ASes present in the Internet Routing Table:   11459
APNIC Prefixes per ASN:   19.42
APNIC Region origin ASes announcing only one prefix:   3264
APNIC Region transit ASes present in the Internet Routing Table:   1624
Average APNIC Region AS path length visible:4.5
Max APNIC Region AS path length visible: 30
Number of APNIC region 32-bit ASNs visible in the Routing Table:   6613
Number of APNIC addresses announced to Internet:  772040448
Equivalent to 46 /8s, 4 /16s and 103 /24s
APNIC AS Blocks        4608-4864, 7467-7722, 9216-10239, 17408-18431
(pre-ERX allocations)  23552-24575, 37888-38911, 45056-46079, 55296-56319,
   58368-59391, 63488-64098, 64297-64395, 131072-143673
APNIC Address Blocks 1/8,  14/8,  27/8,  36/8,  39/8,  42/8,  43/8,
49/8,  58/8,  59/8,  60/8,  61/8, 101/8, 103/8,
   106/8, 110/8, 111/8, 112/8, 113/8, 114/8, 115/8,
   116/8, 117/8, 118/8, 119/8, 120/8, 121/8, 122/8,
   123/8, 124/8, 125/8, 126/8, 133/8, 150/8, 153/8,
   163/8, 171/8, 175/8, 180/8, 182/8, 183/8, 202/8,
   203/8, 210/8, 211/8, 218/8, 219/8, 220/8, 221/8,
   222/8, 223/8,

ARIN Region Analysis Summary


Prefixes being announced by ARIN Region ASes:245433
Total ARIN prefixes after maximum aggregation:   112906
ARIN Deaggregation factor: 2.17
Prefixes being announced from the ARIN address blocks:   245868
Unique aggregates announced from the ARIN address blocks:117633
ARIN Region origin ASes present in the Internet Routing Table:18781
ARIN Prefixes per ASN:13.09
ARIN Region 

Re: Trident3 vs Jericho2

2021-04-09 Thread lobna gouda
It will not be easy to get a straight answer; I would say it depends more on
your environment and applications. If you consider the classical TCP algorithm
and ignore latency, a large buffer wins, yet what about microbursts?

LG


From: NANOG  on behalf of 
William Herrin 
Sent: Friday, April 9, 2021 1:07 PM
To: Mike Hammett 
Cc: nanog@nanog.org 
Subject: Re: Trident3 vs Jericho2

On Fri, Apr 9, 2021 at 6:05 AM Mike Hammett  wrote:
> What I've observed is that it's better to have a big buffer device
> when you're mixing port speeds. The more dramatic the port
> speed differences (and the more of them), the more buffer you need.
>
> If you have all the same port speed, small buffers are fine. If you have
> 100G and 1G ports, you'll need big buffers wherever the transition to
> the smaller port speed is located.


When a network is behaving well (losing few packets to data
corruption), TCP throughput is impacted by exactly two factors:

1. Packet round trip time
2. The size to which the congestion window has grown when the first
packet is lost

Assuming the sender has data ready, it will (after the initial
negotiation) slam out 10 packets back to back at the local wire speed.
Those 10 packets are the initial congestion window. After sending 10
packets it will wait and wait and wait until it hits a timeout or the
other side responds with an acknowledgement. So those initial packets
start out crammed right at the front of the round trip time with lots
of empty space afterwards.

The receiver gets the packets in a similar burst and sends its acks.
As the sender receives acknowledgement for each of the original
packets, it sends two more. This doubling effect is called "slow
start," and it's slow in the sense that the sender doesn't just throw
the entire data set at the wire and hope. So, having received acks for
10 packets, it sends 20 more. These 20 have spread out a little bit,
more or less based on the worst link speed in the path, but they're
still all crammed up in a bunch at the start of the round trip time.

Next round trip time it doubles to 40 packets. Then 80. Then 160. All
crammed up at the start of the round trip time causing them to hit
that one slowest link in the middle all at once. This doubling
continues until one of the buffers in the middle is too small to hold
the trailing part of the burst of packets while the leading part is
sent. With a full buffer, a packet is dropped. Whatever the congestion
window size is when that first packet is dropped, that number divided by
the round trip time is more or less the throughput you're going to see
on that TCP connection.

The various congestion control algorithms for TCP do different things
after they see that first packet drop. Some knock the congestion
window in half right away. Others back down more cautiously. Some
reduce growth all the way down to 1 packet per round trip time. Others
will allow faster growth as the packets spread out over the whole
round trip time and demonstrate that they don't keep getting lost. But
in general, the throughput you're going to see on that TCP connection
has been decided as soon as you lose that first packet.

So, TCP will almost always get better throughput with more buffers.
The flip side is latency: packets sitting in a buffer extend the time
before the receiver gets them. So if you make a buffer that's 500
milliseconds long and then let a TCP connection fill it up, apps which
work poorly in high latency environments (like games and ssh) will
suffer.

Regards,
Bill Herrin


--
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Trident3 vs Jericho2

2021-04-09 Thread William Herrin
On Fri, Apr 9, 2021 at 6:05 AM Mike Hammett  wrote:
> What I've observed is that it's better to have a big buffer device
> when you're mixing port speeds. The more dramatic the port
> speed differences (and the more of them), the more buffer you need.
>
> If you have all the same port speed, small buffers are fine. If you have
> 100G and 1G ports, you'll need big buffers wherever the transition to
> the smaller port speed is located.


When a network is behaving well (losing few packets to data
corruption), TCP throughput is impacted by exactly two factors:

1. Packet round trip time
2. The size to which the congestion window has grown when the first
packet is lost

Assuming the sender has data ready, it will (after the initial
negotiation) slam out 10 packets back to back at the local wire speed.
Those 10 packets are the initial congestion window. After sending 10
packets it will wait and wait and wait until it hits a timeout or the
other side responds with an acknowledgement. So those initial packets
start out crammed right at the front of the round trip time with lots
of empty space afterwards.

The receiver gets the packets in a similar burst and sends its acks.
As the sender receives acknowledgement for each of the original
packets, it sends two more. This doubling effect is called "slow
start," and it's slow in the sense that the sender doesn't just throw
the entire data set at the wire and hope. So, having received acks for
10 packets, it sends 20 more. These 20 have spread out a little bit,
more or less based on the worst link speed in the path, but they're
still all crammed up in a bunch at the start of the round trip time.

Next round trip time it doubles to 40 packets. Then 80. Then 160. All
crammed up at the start of the round trip time causing them to hit
that one slowest link in the middle all at once. This doubling
continues until one of the buffers in the middle is too small to hold
the trailing part of the burst of packets while the leading part is
sent. With a full buffer, a packet is dropped. Whatever the congestion
window size is when that first packet is dropped, that number divided by
the round trip time is more or less the throughput you're going to see
on that TCP connection.

The various congestion control algorithms for TCP do different things
after they see that first packet drop. Some knock the congestion
window in half right away. Others back down more cautiously. Some
reduce growth all the way down to 1 packet per round trip time. Others
will allow faster growth as the packets spread out over the whole
round trip time and demonstrate that they don't keep getting lost. But
in general, the throughput you're going to see on that TCP connection
has been decided as soon as you lose that first packet.
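
As a toy illustration of the dynamic above (the RTT, link speed, buffer sizes,
and the Reno-style halving are assumptions chosen for the example, not the
behaviour of any particular device or stack), a sketch of how the bottleneck
buffer ends up deciding the throughput:

# Toy model of the behaviour described above. The congestion window doubles
# every RTT until the burst no longer fits in "what the bottleneck drains in
# one RTT" plus the bottleneck's buffer; the first loss then happens, a
# Reno-style sender halves its window, and that halved window (times MSS,
# divided by RTT, capped at line rate) is roughly where throughput settles.
# All numbers are illustrative assumptions.

MSS = 1460                # bytes per segment
RTT = 0.05                # 50 ms round trip time
BOTTLENECK_BPS = 1e9      # 1 Gb/s slowest link in the path

def settle_rate_bps(buffer_pkts):
    bdp_pkts = BOTTLENECK_BPS * RTT / (8 * MSS)   # packets needed to fill the pipe
    cwnd = 10                                     # initial congestion window
    while cwnd <= bdp_pkts + buffer_pkts:         # burst still fits: slow start doubles
        cwnd *= 2
    cwnd //= 2                                    # first drop: window cut in half
    return min(cwnd * MSS * 8 / RTT, BOTTLENECK_BPS)

for buf in (50, 1000):
    print(f"{buf:4d}-packet buffer -> ~{settle_rate_bps(buf) / 1e6:.0f} Mb/s "
          f"of a {BOTTLENECK_BPS / 1e6:.0f} Mb/s path")

With the 50-packet buffer the halved window falls below the path's
bandwidth-delay product and the flow settles around 600 Mb/s; with the
1000-packet buffer the halved window still covers the BDP and the flow fills
the 1 Gb/s link.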

So, TCP will almost always get better throughput with more buffers.
The flip side is latency: packets sitting in a buffer extend the time
before the receiver gets them. So if you make a buffer that's 500
milliseconds long and then let a TCP connection fill it up, apps which
work poorly in high latency environments (like games and ssh) will
suffer.
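
The latency side is easy to put rough numbers on: the worst-case added delay is
simply the queued bytes divided by the rate at which the port drains them (the
buffer and port sizes below are illustrative assumptions):

# Worst-case queuing delay added by a standing buffer: buffered bytes divided
# by the egress drain rate. Buffer and port sizes are illustrative only.

def queue_delay_ms(buffer_megabytes, port_gbps):
    return buffer_megabytes * 1e6 * 8 / (port_gbps * 1e9) * 1000

print(f"12 MB queued behind a 100G port:    ~{queue_delay_ms(12, 100):.1f} ms")
print(f"12 MB queued behind a 1G port:      ~{queue_delay_ms(12, 1):.0f} ms")
print(f"4 GB deep buffer behind a 10G port: ~{queue_delay_ms(4000, 10):.0f} ms")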

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Trident3 vs Jericho2

2021-04-09 Thread Vincent Bernat
 ❦  9 April 2021 17:20 +03, Saku Ytti:

> If we changed the TCP sender to do bandwidth estimation, and the newly created
> window space were serialised at the estimated receiver rate, we would need
> dramatically less buffering. However, this less aggressive TCP algorithm would
> be outcompeted by New Reno, driving its bandwidth estimate toward zero.
>
> Luckily, almost all traffic is handled by a few players; if they agree to
> change to a well-behaved TCP (or QUIC) algorithm, it doesn't matter much if
> the long tail is badly behaving TCP.

I think many of them are now using BBR or BBR v2. It would be
interesting to know how it impacted switch buffering.
-- 
As flies to wanton boys are we to the gods; they kill us for their sport.
-- Shakespeare, "King Lear"


Re: Trident3 vs Jericho2

2021-04-09 Thread Saku Ytti
The reason we need larger buffers for some applications is a TCP
implementation detail. When the TCP window grows in size (it grows
exponentially), the newly created window space is bursted onto the wire at
sender speed.

If the sender is significantly faster than the receiver, someone needs to
store these bytes while they are serialised at receiver speed. If we
cannot store them, then the window cannot grow to accommodate the
bandwidth*delay product and the receiver cannot observe the ideal TCP
receive rate.
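
As a rough sketch of that mismatch (the window increment, port speeds, and RTT
below are illustrative assumptions): the difference between the sender's
arrival rate and the receiver's drain rate is what has to be stored, and the
bandwidth*delay product is the window the receiver ultimately needs to see
full rate:

# Illustrative sketch of the speed mismatch described above; the numbers are
# assumptions, not measurements. A window increment arrives at the sender's
# line rate but is serialised toward the receiver at the receiver's rate, so
# the difference has to be stored somewhere in between.

def bytes_to_store(burst_bytes, sender_gbps, receiver_gbps):
    arrival_s = burst_bytes * 8 / (sender_gbps * 1e9)   # time the burst takes to arrive
    drained = receiver_gbps * 1e9 / 8 * arrival_s       # what drains toward the receiver meanwhile
    return burst_bytes - drained

# A 100G-attached sender grows its window by 4 MB toward a 10G receiver:
print(f"~{bytes_to_store(4e6, 100, 10) / 1e6:.1f} MB must be buffered")

# Window (bandwidth*delay product) the 10G receiver needs to see full rate
# over an 80 ms path:
print(f"BDP: ~{10e9 * 0.080 / 8 / 1e6:.0f} MB")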

If we changed the TCP sender to do bandwidth estimation, and the newly created
window space were serialised at the estimated receiver rate, we would need
dramatically less buffering. However, this less aggressive TCP algorithm would
be outcompeted by New Reno, driving its bandwidth estimate toward zero.

Luckily, almost all traffic is handled by a few players; if they agree to
change to a well-behaved TCP (or QUIC) algorithm, it doesn't matter much if
the long tail is badly behaving TCP.



On Fri, 9 Apr 2021 at 17:13, Mike Hammett  wrote:

> I have seen the opposite, where small buffers impacted throughput.
>
> Then again, it was observation only, no research into why, other than
> superficial.
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
> --
> *From: *"Tom Beecher" 
> *To: *"Mike Hammett" 
> *Cc: *"Dmitry Sherman" , "NANOG" 
> *Sent: *Friday, April 9, 2021 8:40:00 AM
> *Subject: *Re: Trident3 vs Jericho2
>
> If you have all the same port speed, small buffers are fine. If you have
>> 100G and 1G ports, you'll need big buffers wherever the transition to the
>> smaller port speed is located.
>
>
> With the larger buffer there, you are likely to be severely impacting
> application throughput.
>
> On Fri, Apr 9, 2021 at 9:05 AM Mike Hammett  wrote:
>
>> What I've observed is that it's better to have a big buffer device when
>> you're mixing port speeds. The more dramatic the port speed differences
>> (and the more of them), the more buffer you need.
>>
>> If you have all the same port speed, small buffers are fine. If you have
>> 100G and 1G ports, you'll need big buffers wherever the transition to the
>> smaller port speed is located.
>>
>>
>>
>> -
>> Mike Hammett
>> Intelligent Computing Solutions
>> Midwest Internet Exchange
>> The Brothers WISP
>>
>> --
>> *From: *"Dmitry Sherman" 
>> *To: *nanog@nanog.org
>> *Sent: *Friday, April 9, 2021 7:57:05 AM
>> *Subject: *Trident3 vs Jericho2
>>
>> Once again, which is better: shared-buffer feature-rich or fat-buffer
>> switches?
>> When is it better to put in a big-buffer switch? When is it better to drop
>> and retransmit instead of queueing?
>>
>> Thanks.
>> Dmitry
>>
>>
>

-- 
  ++ytti


Re: Trident3 vs Jericho2

2021-04-09 Thread Mike Hammett
I have seen the opposite, where small buffers impacted throughput. 

Then again, it was observation only, no research into why, other than 
superficial. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

- Original Message -

From: "Tom Beecher"  
To: "Mike Hammett"  
Cc: "Dmitry Sherman" , "NANOG"  
Sent: Friday, April 9, 2021 8:40:00 AM 
Subject: Re: Trident3 vs Jericho2 




If you have all the same port speed, small buffers are fine. If you have 100G 
and 1G ports, you'll need big buffers wherever the transition to the smaller 
port speed is located. 




With the larger buffer there, you are likely to be severely impacting
application throughput.



On Fri, Apr 9, 2021 at 9:05 AM Mike Hammett < na...@ics-il.net > wrote: 




What I've observed is that it's better to have a big buffer device when you're 
mixing port speeds. The more dramatic the port speed differences (and the more 
of them), the more buffer you need. 


If you have all the same port speed, small buffers are fine. If you have 100G 
and 1G ports, you'll need big buffers wherever the transition to the smaller 
port speed is located. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 



From: "Dmitry Sherman" < dmi...@interhost.net > 
To: nanog@nanog.org 
Sent: Friday, April 9, 2021 7:57:05 AM 
Subject: Trident3 vs Jericho2 

Once again, which is better: shared-buffer feature-rich or fat-buffer switches?
When is it better to put in a big-buffer switch? When is it better to drop and
retransmit instead of queueing?

Thanks. 
Dmitry 






Re: Trident3 vs Jericho2

2021-04-09 Thread Tom Beecher
>
> If you have all the same port speed, small buffers are fine. If you have
> 100G and 1G ports, you'll need big buffers wherever the transition to the
> smaller port speed is located.


With the larger buffer there, you are likely to be severely impacting
application throughput.

On Fri, Apr 9, 2021 at 9:05 AM Mike Hammett  wrote:

> What I've observed is that it's better to have a big buffer device when
> you're mixing port speeds. The more dramatic the port speed differences
> (and the more of them), the more buffer you need.
>
> If you have all the same port speed, small buffers are fine. If you have
> 100G and 1G ports, you'll need big buffers wherever the transition to the
> smaller port speed is located.
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>
> --
> *From: *"Dmitry Sherman" 
> *To: *nanog@nanog.org
> *Sent: *Friday, April 9, 2021 7:57:05 AM
> *Subject: *Trident3 vs Jericho2
>
> Once again, which is better: shared-buffer feature-rich or fat-buffer
> switches?
> When is it better to put in a big-buffer switch? When is it better to drop
> and retransmit instead of queueing?
>
> Thanks.
> Dmitry
>
>


Re: Trident3 vs Jericho2

2021-04-09 Thread Tom Beecher
There is no easy, one-size-fits-all answer to this question. It's a complex
subject, and the answer will often be different depending on the
environment and traffic profile.

On Fri, Apr 9, 2021 at 8:58 AM Dmitry Sherman  wrote:

> Once again, which is better: shared-buffer feature-rich or fat-buffer
> switches?
> When is it better to put in a big-buffer switch? When is it better to drop
> and retransmit instead of queueing?
>
> Thanks.
> Dmitry
>


Re: My First BGP-Hijacking Explanation

2021-04-09 Thread Tom Beecher
>
> As an anecdotal data point, the only effect this has had is teaching
> random 14 year olds how to use ordinary consumer grade VPNs, which work
> just fine.
>

Or, perhaps some kid watched that and said "Oh that's cool, I want to know
more about how that works!", and planted a seed for a future career.

(More likely they went back to Fortnite videos, but one can dream, right?)

On Thu, Apr 8, 2021 at 6:21 PM Eric Kuhnke  wrote:

> If one follows the social media accounts of the Pakistan version of the
> FCC, nowadays they're just banning anything they find insulting or illegal
> in the local legal system, and ordering ISPs to null route big chunks of IP
> space.
>
> As an anecdotal data point, the only effect this has had is teaching
> random 14 year olds how to use ordinary consumer grade VPNs, which work
> just fine.
>
> https://www.pta.gov.pk/en
>
>
>
> On Thu, Apr 8, 2021 at 9:52 AM Jay R. Ashworth  wrote:
>
>> Sam 'Half As Interesting' Denby actually did a surprisingly good job
>> explaining
>> this for the average only-vaguely-technical viewer...
>>
>>https://www.youtube.com/watch?v=K9gnRs33NOk
>>
>> [ For all the bad dad jokes he tells on HAI, he's got really good research
>>   skills/staff, and his long-form stuff on Wendover Productions is
>> excellent ]
>>
>>
>> Cheers,
>> -- jra
>>
>> --
>> Jay R. Ashworth  Baylink   j...@baylink.com
>> Designer The Things I Think   RFC 2100
>> Ashworth & Associates   http://www.bcp38.info  2000 Land Rover DII
>> St Petersburg FL USA  BCP38: Ask For It By Name!   +1 727 647 1274
>>
>


Re: Trident3 vs Jericho2

2021-04-09 Thread Mike Hammett
What I've observed is that it's better to have a big buffer device when you're 
mixing port speeds. The more dramatic the port speed differences (and the more 
of them), the more buffer you need. 


If you have all the same port speed, small buffers are fine. If you have 100G 
and 1G ports, you'll need big buffers wherever the transition to the smaller 
port speed is located. 




- 
Mike Hammett 
Intelligent Computing Solutions 

Midwest Internet Exchange 

The Brothers WISP 

- Original Message -

From: "Dmitry Sherman"  
To: nanog@nanog.org 
Sent: Friday, April 9, 2021 7:57:05 AM 
Subject: Trident3 vs Jericho2 

Once again, which is better: shared-buffer feature-rich or fat-buffer switches?
When is it better to put in a big-buffer switch? When is it better to drop and
retransmit instead of queueing?

Thanks. 
Dmitry 



Trident3 vs Jericho2

2021-04-09 Thread Dmitry Sherman
Once again, which is better: shared-buffer feature-rich or fat-buffer switches?
When is it better to put in a big-buffer switch? When is it better to drop and
retransmit instead of queueing?

Thanks.
Dmitry