Re: ECN

2019-11-13 Thread Tore Anderson
* Saku Ytti

> Not true. Hash result should indicate discrete flow, more importantly
> discrete flow should not result in two unique hash numbers. Using
> whole TOS byte breaks this promise and thus breaks ECMP.
> 
> Platforms allow you to configure which bytes are part of hash
> calculation, whole TOS byte should not be used as discrete flow SHOULD
> have unique ECN bits during congestion. Toke has diagnosed the problem
> correctly, solution is to remove TOS from ECMP hash calculation.

Agreed. This also goes for the other bits, so the whole byte must be excluded.

For example, the OpenSSH client will by default change the code point from zero 
(during authentication) to af21 or cs1 (when it enters an interactive or 
non-interactive session, respectively).

I have experienced this break IPv6 SSH sessions to an anycasted SSH server 
instance that was reached through old Juniper DPC cards with ECMP enabled. 
Symptom was that authentication went fine, only for the connection to be reset 
immediately after (unless default IPQoS config was changed). The «solution» was 
to simply disable ECMP for all IPv6 traffic, since I could not figure out how 
to make the Juniper exclude the DiffServ byte from the ECMP hash calculation.
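
To illustrate the failure mode, here is a minimal Python sketch. It is not how the Juniper DPC (or any real platform) computes its hash: `flow_hash`, `pick_path`, the SHA-256 choice, the addresses and the ports are all made up for illustration. It only shows that mixing the DiffServ/TOS byte into the hash input lets a mid-session code point change (zero to af21) alter the hash for an otherwise unchanged flow, while a pure 5-tuple hash stays put.

```python
import hashlib

def flow_hash(src, dst, proto, sport, dport, tos=0, include_tos=False):
    """Toy ECMP hash over the 5-tuple, optionally mixing in the TOS byte."""
    fields = [src, dst, str(proto), str(sport), str(dport)]
    if include_tos:
        fields.append(str(tos))
    digest = hashlib.sha256("|".join(fields).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def pick_path(hash_value, n_paths):
    """Map a hash value onto one of n_paths equal-cost next hops."""
    return hash_value % n_paths

# Hypothetical IPv6 SSH flow (documentation addresses, made-up ports).
flow = ("2001:db8::1", "2001:db8:ffff::22", 6, 50000, 22)

CS0 = 0x00       # code point zero, as used during authentication
AF21 = 18 << 2   # DSCP 18 (af21) shifted into the TOS/Traffic Class byte

for include_tos in (True, False):
    before = flow_hash(*flow, tos=CS0, include_tos=include_tos)
    after = flow_hash(*flow, tos=AF21, include_tos=include_tos)
    print(f"include_tos={include_tos}: hash changed = {before != after}, "
          f"path {pick_path(before, 2)} -> {pick_path(after, 2)}")
```

Whether the changed hash actually lands on a different next hop depends on the number of paths and the hash itself; with an anycasted destination, any run where it does would reset the session right after authentication, as described above.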

Tore


Re: Brocade CER MPLS

2019-11-13 Thread Brandon Martin

On 11/13/19 3:24 PM, Fawcett, Nick via NANOG wrote:
> I have one CER (let’s call him Charlie) that is not able to build an LSP 
> tunnel to another CER.  Shows “Currently found no route. Will schedule 
> for retry”.  Both CER’s can ping and traceroute each other.  When I add 
> the LSP to the destination router (Snoopy) it adds the LSP and shows UP 
> in status field.  Both have the same number of routes in their ospf 
> table.  Charlie has other LSP’s built to other CER’s on the network with 
> productive VLL’s and VPLS’s.  I have removed both lsp’s from both 
> Charlie and Snoopy and re-added them and only one, Snoopy’s LSP, comes 
> up.  Any ideas?


What's your CAM profile set to?  Or, if you don't have one, do you have 
any MPLS stuff specified as system-max?  NetIron does, uh, "bad things" 
if it thinks it's out of CAM space for a given type of entry.


There's also fairly extensive MPLS debugging under "debug mpls ..." that 
might prove useful.


If you don't need traffic engineering at all, you might be able to shut 
off "traffic-eng ospf" at which point you'll just get total best-effort 
LDP over IP using next-hop IP connectivity (regardless of how that 
connectivity is signaled) only AFAIK.  It's braindead simple which is 
handy if you're having problems and don't need anything more.

--
Brandon Martin


TCP and anycast (was Re: ECN)

2019-11-13 Thread Anoop Ghanwani
RFC 7094 (https://tools.ietf.org/html/rfc7094) describes the pitfalls &
risks of using TCP with an anycast address.  It recognizes that there are
valid use cases for it, though.

Specifically, section 3.1 says this:
>>>

   Most stateful transport protocols (e.g., TCP), without modification,
   do not understand the properties of anycast; hence, they will fail
   probabilistically, but possibly catastrophically, when using anycast
   addresses in the presence of "normal" routing dynamics.

...

   This can lead
   to a protocol working fine in, say, a test lab but not in the global
   Internet.

>>>

On Wed, Nov 13, 2019 at 3:33 PM Warren Kumari  wrote:

> On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo  wrote:
> >
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> > TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Err. I really don't think that there is any sort of spec that
> covers that :-P
>
> Using Anycast for TCP is incredibly common - the DNS root servers for
> one obvious example.
> More TCP centric well-known examples are Fastly and LinkedIn -
> LinkedIn in particular did a really good podcast on their experience
> with this.
>
> There is also a good NANOG talk from the ~2000s (?) on people using
> TCP anycast for long lived (serving ISO files, which were long-lived
> in those days) flows, and how reliable it is - perhaps that's the talk
> Todd mentioned?
>
> W
>
> >
> > > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG
> > > <nanog@nanog.org> wrote:
> > >
> > > 
> > >>
> > >> Hello
> > >>
> > >> I have a customer that believes my network has a ECN problem. We do
> > >> not, we just move packets. But how do I prove it?
> > >>
> > >> Is there a tool that checks for ECN trouble? Ideally something I could
> > >> run on the NLNOG Ring network.
> > >>
> > >> I believe it likely that it is the destination that has the problem.
> > >
> > > Hi Baldur
> > >
> > > I believe I may be that customer :)
> > >
> > > First of all, thank you for looking into the issue! We've been having
> > > great fun over on the ecn-sane mailing list trying to figure out what's
> > > going on. I'll summarise below, but see this thread for the discussion
> > > and debugging details:
> > >
> > > https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> > >
> > > The short version is that the problem appears to come from a combination
> > > of the ECMP routing in your network, and Cloudflare's heavy use of
> > > anycast. Specifically, a router in your network appears to be doing ECMP
> > > by hashing on the packet header, *including the ECN bits*. This breaks
> > > TCP connections with ECN because the TCP SYN (with no ECN bits set) end
> > > up taking a different path than the rest of the flow (which is marked as
> > > ECT(0)). When the destination is anycasted, this means that the data
> > > packets go to a different server than the SYN did. This second server
> > > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > > router's ECMP hash.
> > >
> > > For a longer exposition, see below. You should be able to verify this
> > > from somewhere else in the network, but if there's anything else you
> > > want me to test, do let me know. Also, would you mind sharing the router
> > > make and model that does this? We're trying to collect real-world
> > > examples of network problems caused by ECN and this is definitely an
> > > interesting example.
> > >
> > > -Toke
> > >
> > >
> > >
> > > The long version:
> > >
> > > From my end I can see that I have two paths to Cloudflare; which is
> > > taken appears to be based on a hash of the packet header, as can be seen
> > > by varying the source port:
> > >
> > > $ traceroute -q 1 --sport=1 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > > 1  _gateway (10.42.3.1)  0.357 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> > >
> > > $ traceroute -q 1 --sport=10001 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > > 1  _gateway (10.42.3.1)  0.293 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> > >
> > >
> > > This is fine in itself. However, the problem stems from the fact that
> > > the ECN bits 

Re: Puerto Rico IX (operational)

2019-11-13 Thread Mehmet Akcin
No.

We do not have any local caches yet.

On Wed, Nov 13, 2019 at 16:29 Darin Steffl  wrote:

> My guess is Aeronet already has a Netflix OCA but Netflix may still be
> interested in adding some gear on the IX for other ISPs that don't qualify
> for their own appliance.
>
> Also, Cloudflare would likely want to add a POP here as well. They're
> trying to be within 10ms of every ISP in the world or something like that.
>
> On Wed, Nov 13, 2019, 4:56 PM Mehmet Akcin  wrote:
>
>> Hey there,
>>
>> Puerto Rico IX, famously known as PRIX, is now operational.
>>
>> You can visit www.puertoricoix.net to see the sites where you can connect to
>> PRIX, view the members, and join the IX mailing list.
>>
>> There are no fees to join the IX. We hope to keep it this way until
>> there is enough interest to form a non-profit org.
>>
>> I would like to thank once again Aeronet, Arista and Cloudsmash for their
>> donations of equipment, time and energy!
>>
>> Now, we are looking for those who want to deploy 1-2RU caching boxes,
>> anything from DNS to content.
>>
>> We are working on more stats, a looking glass, etc. soon!
>>
>> Mehmet
>> --
>> Mehmet
>> +1-424-298-1903
>>
> --
Mehmet
+1-424-298-1903


Re: Puerto Rico IX (operational)

2019-11-13 Thread Darin Steffl
My guess is Aeronet already has a Netflix OCA but Netflix may still be
interested in adding some gear on the IX for other ISPs that don't qualify
for their own appliance.

Also, Cloudflare would likely want to add a POP here as well. They're
trying to be within 10ms of every ISP in the world or something like that.

On Wed, Nov 13, 2019, 4:56 PM Mehmet Akcin  wrote:

> Hey there,
>
> Puerto Rico IX, famously known as PRIX, is now operational.
>
> You can visit www.puertoricoix.net to see the sites where you can connect to
> PRIX, view the members, and join the IX mailing list.
>
> There are no fees to join the IX. We hope to keep it this way until
> there is enough interest to form a non-profit org.
>
> I would like to thank once again Aeronet, Arista and Cloudsmash for their
> donations of equipment, time and energy!
>
> Now, we are looking for those who want to deploy 1-2RU caching boxes,
> anything from DNS to content.
>
> We are working on more stats, a looking glass, etc. soon!
>
> Mehmet
> --
> Mehmet
> +1-424-298-1903
>


Re: ECN

2019-11-13 Thread Warren Kumari
On Thu, Nov 14, 2019 at 12:25 AM Matt Corallo  wrote:
>
> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.

Err. I really don't think that there is any sort of spec that
covers that :-P

Using Anycast for TCP is incredibly common - the DNS root servers for
one obvious example.
More TCP centric well-known examples are Fastly and LinkedIn -
LinkedIn in particular did a really good podcast on their experience
with this.

There is also a good NANOG talk from the ~2000s (?) on people using
TCP anycast for long lived (serving ISO files, which were long-lived
in those days) flows, and how reliable it is - perhaps that's the talk
Todd mentioned?

W

>
> > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG 
> >  wrote:
> >
> > 
> >>
> >> Hello
> >>
> >> I have a customer that believes my network has a ECN problem. We do
> >> not, we just move packets. But how do I prove it?
> >>
> >> Is there a tool that checks for ECN trouble? Ideally something I could
> >> run on the NLNOG Ring network.
> >>
> >> I believe it likely that it is the destination that has the problem.
> >
> > Hi Baldur
> >
> > I believe I may be that customer :)
> >
> > First of all, thank you for looking into the issue! We've been having
> > great fun over on the ecn-sane mailing list trying to figure out what's
> > going on. I'll summarise below, but see this thread for the discussion
> > and debugging details:
> > https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> >
> > The short version is that the problem appears to come from a combination
> > of the ECMP routing in your network, and Cloudflare's heavy use of
> > anycast. Specifically, a router in your network appears to be doing ECMP
> > by hashing on the packet header, *including the ECN bits*. This breaks
> > TCP connections with ECN because the TCP SYN (with no ECN bits set) end
> > up taking a different path than the rest of the flow (which is marked as
> > ECT(0)). When the destination is anycasted, this means that the data
> > packets go to a different server than the SYN did. This second server
> > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > router's ECMP hash.
> >
> > For a longer exposition, see below. You should be able to verify this
> > from somewhere else in the network, but if there's anything else you
> > want me to test, do let me know. Also, would you mind sharing the router
> > make and model that does this? We're trying to collect real-world
> > examples of network problems caused by ECN and this is definitely an
> > interesting example.
> >
> > -Toke
> >
> >
> >
> > The long version:
> >
> > From my end I can see that I have two paths to Cloudflare; which is
> > taken appears to be based on a hash of the packet header, as can be seen
> > by varying the source port:
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.357 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> >
> > $ traceroute -q 1 --sport=10001 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.293 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> >
> >
> > This is fine in itself. However, the problem stems from the fact that
> > the ECN bits in the IP header are also included in the ECMP hash (-t
> > sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
> > ECT(1)):
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.336 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> > 6  104.24.125.13 (104.24.125.13)  1.210 ms
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 2
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.339 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk 

Puerto Rico IX (operational)

2019-11-13 Thread Mehmet Akcin
Hey there,

Puerto Rico IX, famously known as PRIX, is now operational.

You can visit www.puertoricoix.net to see the sites where you can connect to
PRIX, view the members, and join the IX mailing list.

There are no fees to join the IX. We hope to keep it this way until
there is enough interest to form a non-profit org.

I would like to thank once again Aeronet, Arista and Cloudsmash for their
donations of equipment, time and energy!

Now, we are looking for those who want to deploy 1-2RU caching boxes,
anything from DNS to content.

We are working on more stats, a looking glass, etc. soon!

Mehmet
-- 
Mehmet
+1-424-298-1903


Re: Apple http/https

2019-11-13 Thread Jared Mauch
I can confirm this works well.  It’s a bit trickier w/ IPv6 but with IPv4 it 
works and you can serve a lot of software updates out of the cache.

A Mac mini w/ a large SSD is a common setup for this.

# AssetCacheManagerUtil  status
..
CacheDetails = {
"Apple TV Software" = 732593844;
"Mac Software" = 28762157836;
Other = 5047787109;
iCloud = 25227716657;
"iOS Software" = 15765184801;
};
TotalBytesAreSince = "2019-11-04 16:44:33 +";
TotalBytesImported = 2198032684;
TotalBytesReturnedToClients = 22369111463;
TotalBytesStoredFromOrigin = 9485169600;
TotalBytesStoredFromPeers = 4171229773;
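
For a quick read of those counters, a small Python sketch that totals the CacheDetails categories and prints each one's share. The numbers are copied from the output above; the dictionary layout and the `to_gb` helper are just for illustration and are not part of AssetCacheManagerUtil.

```python
# Byte counts copied from the CacheDetails output above.
cache_details = {
    "Apple TV Software": 732593844,
    "Mac Software": 28762157836,
    "Other": 5047787109,
    "iCloud": 25227716657,
    "iOS Software": 15765184801,
}

def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return n_bytes / 1e9

total = sum(cache_details.values())
by_size = sorted(cache_details.items(), key=lambda kv: kv[1], reverse=True)
for name, size in by_size:
    print(f"{name:<20} {to_gb(size):7.2f} GB  ({size / total:5.1%})")
print(f"{'Total':<20} {to_gb(total):7.2f} GB")
```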


> On Nov 13, 2019, at 4:51 PM, Michael Gehrmann  wrote:
> 
> Hi Ahmed,
> 
> We have been using the Apple specific content caching feature for a while now.
> 
> It's something you enable on a Mac (we use a Mac mini) which then gets 
> discovered on your local network via a DNS TXT record or Bonjour.
> 
> https://support.apple.com/en-au/guide/mac-help/mchl3b6c3720/mac
> 
> Hope this helps.
> 
> MIKE G
> 
> 
> On Thu, 14 Nov 2019 at 06:22, ahmed.dala...@hrins.net 
>  wrote:
> Does anyone know if there is an apple cache? 
> Today we noticed that apple store applications and updates are not caching 
> anymore by HTTPs cache servers, and when we checked through DPI, we found 
> that it's been changed into HTTPS! Does anyone know what is going on? 
> 
> Ahmed



Re: Brocade CER MPLS

2019-11-13 Thread Kaiser, Erich
What version of NetIron are you running on each and do they both have the
adv license key enabled?  "show license" should confirm it.

Erich Kaiser
The Fusion Network
er...@gotfusion.net
Office: 815-570-3101





On Wed, Nov 13, 2019 at 2:54 PM Fawcett, Nick via NANOG 
wrote:

> All mpls-interfaces are listed and policy is just traffic-eng ospf.
>
> Nick
>
> From: James Cornman
> Sent: Wednesday, November 13, 2019 2:43 PM
> To: Fawcett, Nick
> Cc: nanog@nanog.org
> Subject: Re: Brocade CER MPLS
>
> Ensure that 'mpls-interface xxx' is on for all of the interfaces, and also
> ensure that any traffic-engineering options match on both sides.  "show
> mpls lsp detail" should show some other info that may be helpful here.
>
> On Wed, Nov 13, 2019 at 3:24 PM Fawcett, Nick via NANOG
> wrote:
>
> I have one CER (let’s call him Charlie) that is not able to build an LSP
> tunnel to another CER.  Shows “Currently found no route. Will schedule for
> retry”.  Both CER’s can ping and traceroute each other.  When I add the LSP
> to the destination router (Snoopy) it adds the LSP and shows UP in status
> field.  Both have the same number of routes in their ospf table.  Charlie
> has other LSP’s built to other CER’s on the network with productive VLL’s
> and VPLS’s.  I have removed both lsp’s from both Charlie and Snoopy and
> re-added them and only one, Snoopy’s LSP, comes up.  Any ideas?
>
>
>
> Nick
>
> --
> Checked by SOPHOS http://www.sophos.com
>
> --
> James Cornman
> Chief Technology Officer
> jcorn...@atlanticmetro.net
> 212.792.9950 - ext 101
>
> Atlantic Metro Communications
> 4 Century Drive, Parsippany NJ  07054
> www.atlanticmetro.net


Re: Apple http/https

2019-11-13 Thread Michael Gehrmann
Hi Ahmed,

We have been using the Apple specific content caching feature for a
while now.

It's something you enable on a Mac (we use a Mac mini) which then gets
discovered on your local network via a DNS TXT record or Bonjour.

https://support.apple.com/en-au/guide/mac-help/mchl3b6c3720/mac

Hope this helps.

Mike G


On Thu, 14 Nov 2019 at 06:22, ahmed.dala...@hrins.net <
ahmed.dala...@hrins.net> wrote:

> Does anyone know if there is an apple cache?
> Today we noticed that apple store applications and updates are not caching
> anymore by HTTPs cache servers, and when we checked through DPI, we found
> that it's been changed into HTTPS! Does anyone know what is going on?
>
> Ahmed


Re: Apple http/https

2019-11-13 Thread Brielle
Apple has had requirements in place for a while that developers and the 
like had to start only supporting secure connections (App Transport 
Security - basically apps are no longer allowed to make http connections).


They have likely thrown the switch on some backend stuff to finally 
enforce that for other things too (like app store downloads).




On 11/13/2019 12:21 PM, ahmed.dala...@hrins.net wrote:

Does anyone know if there is an apple cache?
Today we noticed that apple store applications and updates are not caching 
anymore by HTTPs cache servers, and when we checked through DPI, we found that 
it's been changed into HTTPS! Does anyone know what is going on?

Ahmed




--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org/ http://www.ahbl.org


Re: ECN

2019-11-13 Thread Lukas Tribus
Hello,

On Wed, Nov 13, 2019 at 8:35 PM Saku Ytti  wrote:
>
> On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:
>
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> > is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Not true. Hash result should indicate discrete flow, more importantly
> discrete flow should not result in two unique hash numbers. Using
> whole TOS byte breaks this promise and thus breaks ECMP.
>
> Platforms allow you to configure which bytes are part of hash
> calculation, whole TOS byte should not be used as discrete flow SHOULD
> have unique ECN bits during congestion. Toke has diagnosed the problem
> correctly, solution is to remove TOS from ECMP hash calculation.

In fact I believe everything beyond the 5-tuple is just a bad idea to
base your hash on. Here are some examples (not quite as straightforward
as the TOS/ECN case here):

TTL:
https://mailman.nanog.org/pipermail/nanog/2018-September/096871.html

IPv6 flow label:
https://blog.apnic.net/2018/01/11/ipv6-flow-label-misuse-hashing/
https://pc.nanog.org/static/published/meetings/NANOG71/1531/20171003_Jaeggli_Lightning_Talk_Ipv6_v1.pdf
https://www.youtube.com/watch?v=b0CRjOpnT7w



Lukas


Re: ECN

2019-11-13 Thread William Herrin
On Wed, Nov 13, 2019 at 11:36 AM Saku Ytti  wrote:

> On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> > TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>
> Not true. Hash result should indicate discrete flow, more importantly
> discrete flow should not result in two unique hash numbers. Using
> whole TOS byte breaks this promise and thus breaks ECMP.
>

Yes true.

Equal Cost MultiPath (ECMP) consistency over the life of a TCP connection
is not a promise. Anycasters would love it to be but it's not.

ECMP's only promise is that packets for a particular connection will tend
to prefer a particular path so that throughput doesn't suffer overly much
from the packet reordering you'd get by round-robining the packets on
different links. Choosing an alternate path during congestion is a
perfectly reasonable thing for ECMP to do.

Don't blame the network. This is Cloudflare choosing not to handle the
anycast spray corner case because it happens rarely enough with symptoms
obscure enough that they only occasionally get called on the carpet. Their BGP
announcements make the claim they're ready for your packet at any of their
sites, but they're not.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


RE: Brocade CER MPLS

2019-11-13 Thread Fawcett, Nick via NANOG
All mpls-interfaces are listed and policy is just traffic-eng ospf.

Nick

From: James Cornman 
Sent: Wednesday, November 13, 2019 2:43 PM
To: Fawcett, Nick 
Cc: nanog@nanog.org
Subject: Re: Brocade CER MPLS

Ensure that 'mpls-interface xxx' is on for all of the interfaces, and also 
ensure that any traffic-engineering options match on both sides.  "show mpls 
lsp detail" should show some other info that may be helpful here.

On Wed, Nov 13, 2019 at 3:24 PM Fawcett, Nick via NANOG 
<nanog@nanog.org> wrote:
I have one CER (let’s call him Charlie) that is not able to build an LSP tunnel 
to another CER.  Shows “Currently found no route. Will schedule for retry”.  
Both CER’s can ping and traceroute each other.  When I add the LSP to the 
destination router (Snoopy) it adds the LSP and shows UP in status field.  Both 
have the same number of routes in their ospf table.  Charlie has other LSP’s 
built to other CER’s on the network with productive VLL’s and VPLS’s.  I have 
removed both lsp’s from both Charlie and Snoopy and re-added them and only one, 
Snoopy’s LSP, comes up.  Any ideas?

Nick

--

James Cornman
Chief Technology Officer
jcorn...@atlanticmetro.net
212.792.9950 - ext 101

Atlantic Metro Communications
4 Century Drive, Parsippany NJ  07054
www.atlanticmetro.net


Brocade CER MPLS

2019-11-13 Thread Fawcett, Nick via NANOG
I have one CER (let's call him Charlie) that is not able to build an LSP tunnel 
to another CER.  Shows "Currently found no route. Will schedule for retry".  
Both CER's can ping and traceroute each other.  When I add the LSP to the 
destination router (Snoopy) it adds the LSP and shows UP in status field.  Both 
have the same number of routes in their ospf table.  Charlie has other LSP's 
built to other CER's on the network with productive VLL's and VPLS's.  I have 
removed both lsp's from both Charlie and Snoopy and re-added them and only one, 
Snoopy's LSP, comes up.  Any ideas?

Nick



Re: 2000::/3 Being Announced and Accepted

2019-11-13 Thread Billy Crook
Agreed. This is a problem, and it has happened before.  This is not
the first time.

I asked Job Snijders (a maintainer of IRRExplorer) about it, and
here's what he had to say.

I don't think he should set an arbitrary threshold for excluding large
prefixes from IRRExplorer.  I think the prefix probably shouldn't be
advertised at all.  But is there a technical distinction between this
/3 and other advertisements, aside from the size, that could flag it for
ignoring?

---------- Forwarded message ---------
From: Job Snijders 
Date: Fri, Oct 4, 2019 at 9:43 PM
Subject: Re: IRR Eplorer weirdness 2000::/3 route?
To: Billy Crook 


On Fri, Oct 04, 2019 at 12:25:27PM -0500, Billy Crook wrote:
> I'm seeing that all of 2000::/3 is being advertised by 24785.
>
> That can't be right, right?  Maybe they're your default upstream?  I
> didn't see this route in sprint's v6 looking glass, so I'm assuming
> it's a local anomaly to your system.

Yeah, from time to time people (usually by accident) leak very large
blocks to route collectors. Often these blocks exist internal in
networks as a replacement for default routes and are not meant to leak
to the wider world, but you know how things go.

Here you can see the source of that data:
http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2000::/3

I could make irrexplorer ignore such large announcements, but where to
draw the line?
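
As a rough illustration of the two checks being discussed (this is not IRRExplorer's code, and the cutoff is purely hypothetical), Python's standard `ipaddress` module can show both why 2000::/3 turns up for every global unicast query and what a simple length-based filter would look like. `MIN_PREFIXLEN_V6`, `covers` and `is_suspiciously_large` are made-up names, and 2001:db8::/32 is just the documentation prefix standing in for a real one.

```python
import ipaddress

# Hypothetical cutoff: anything shorter than this is treated as "too large
# to be a real announcement". Illustrative only, not a recommendation.
MIN_PREFIXLEN_V6 = 16

def covers(announced, queried):
    """True if the announced prefix contains the queried prefix."""
    a = ipaddress.ip_network(announced)
    q = ipaddress.ip_network(queried)
    return a.version == q.version and q.subnet_of(a)

def is_suspiciously_large(prefix, min_prefixlen=MIN_PREFIXLEN_V6):
    """True if an IPv6 prefix is shorter than the (hypothetical) cutoff."""
    net = ipaddress.ip_network(prefix)
    return net.version == 6 and net.prefixlen < min_prefixlen

leak = "2000::/3"
print(covers(leak, "2001:db8::/32"))            # the /3 covers this prefix
print(is_suspiciously_large(leak))              # a /3 is below the cutoff
print(is_suspiciously_large("2001:db8::/32"))   # a /32 is not
```

The length check is exactly the "arbitrary threshold" in question; the sketch does not answer where the line should be, it only shows how little code either choice takes.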

On Wed, Nov 13, 2019 at 1:00 PM Douglas Fischer
 wrote:
>
> I have been recommending to many friends to check in daily at 
> http://irrexplorer.nlnog.net/ to make sure everything is healthy with their 
> prefixes ...
>
> Today a colleague reported a problem with an AS58299 announcement appearing in 
> "their prefixes".
> I went to look and it was showing up on our ASNs too.
>
> It took me a while (duh) to understand what was going on ...
> Why was irrexplorer showing that prefix in our query?
>
>
> Could anyone reach somebody from OpenFactory/NetShelter/level66network about 
> this?
>
>
>
> http://lg.ring.nlnog.net/prefix_bgpmap/lg01/ipv6?q=2000::/3
>
> 2000::/3
> [LEVEL66NETWORK1 11:35:27 from 2a09:11c0::1] * (100/-) [AS58299i]
> Type: BGP unicast univ
> BGP.origin: IGP
> BGP.as_path: 209844 49697 58299
> BGP.next_hop: 2a09:11c0::1
> BGP.local_pref: 100
> BGP.community: (49697,1000) (49697,1007) (49697,2302)
> BGP.ext_community: (RPKI Origin Validation State: not-found)
> BGP.large_community: (209844, 100, 13)
>
>
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação


Re: 2000::/3 Being Announced and Accepted

2019-11-13 Thread Douglas Fischer
The route has already been removed!
Thanks!

On Wed, Nov 13, 2019 at 14:00, Douglas Fischer <
fischerdoug...@gmail.com> wrote:

> I have been recommending to many friends to check in daily at
> http://irrexplorer.nlnog.net/ to make sure everything is healthy with
> their prefixes ...
>
> Today a colleague reported a problem with an AS58299 announcement appearing in
> "their prefixes".
> I went to look and it was showing up on our ASNs too.
>
> It took me a while (duh) to understand what was going on ...
> Why was irrexplorer showing that prefix in our query?
>
>
> Could anyone reach somebody from OpenFactory/NetShelter/level66network
> about this?
>
>
>
> http://lg.ring.nlnog.net/prefix_bgpmap/lg01/ipv6?q=2000::/3
>
> 2000::/3
> [LEVEL66NETWORK1 11:35:27 from 2a09:11c0::1] * (100/-) [AS58299i]
> Type: BGP unicast univ
> BGP.origin: IGP
> BGP.as_path: 209844 49697 58299
> BGP.next_hop: 2a09:11c0::1
> BGP.local_pref: 100
> BGP.community: (49697,1000) (49697,1007) (49697,2302)
> BGP.ext_community: (RPKI Origin Validation State: not-found)
> BGP.large_community: (209844, 100, 13)
>
>
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação
>


-- 
Douglas Fernando Fischer
Engº de Controle e Automação


Re: ECN

2019-11-13 Thread Owen DeLong
Like it or not (and I really don’t), the majority of modern CDNs are using TCP 
over Anycast.

It’s ugly and it’s prone to problems like this. It’s nice to see a customer 
with know-how actually publicizing and digging into the problem.

Until now, I believe an unknown number of customers have been suffering in 
silence or relegated to the ISPs’ “We can’t reproduce your problem” bin without 
resolution.

I’ve had lots of discussions on the subject and the usual end result is “It’s 
too hard to measure or quantify and there’s no visible contingent of impacted 
users”.

Now we at least have one visible impacted user.

Owen


> On Nov 13, 2019, at 09:19 , Anoop Ghanwani  wrote:
> 
> Not to condone what cloudflare is doing, but...
> 
> An ECN connection will have different bits on various packets for the 
> duration of the connection -- pure ACKs (ACKs not piggybacking on data) will 
> have the ECN bits as 00b, while all other packets will have either 01b, 10b 
> (when no congestion was experienced) or 11b (when congestion was 
> experienced).  So using the ECN bits as part of the hash would affect 
> performance throughout the life of the connection.
> 
> On Wed, Nov 13, 2019 at 9:00 AM Matt Corallo wrote:
> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), 
> splitting the flow shouldn’t have material performance degradation? 
> 
> > On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen wrote:
> > 
> > 
> > 
> >> On 13 November 2019 17:20:18 CET, Matt Corallo wrote:
> >> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> >> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> > 
> > Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
> > split the flow over multiple paths; avoiding that is the whole point of 
> > doing the flow-based hashing in the first place.
> > 
> > Anycast "only" turns a potential degradation of TCP performance into a hard 
> > failure... :)
> > 
> > -Toke
> 



Re: ECN

2019-11-13 Thread Saku Ytti
On Wed, 13 Nov 2019 at 18:27, Matt Corallo  wrote:

> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.

Not true. Hash result should indicate discrete flow, more importantly
discrete flow should not result in two unique hash numbers. Using
whole TOS byte breaks this promise and thus breaks ECMP.

Platforms allow you to configure which bytes are part of hash
calculation, whole TOS byte should not be used as discrete flow SHOULD
have unique ECN bits during congestion. Toke has diagnosed the problem
correctly, solution is to remove TOS from ECMP hash calculation.

-- 
  ++ytti


Apple http/https

2019-11-13 Thread ahmed.dala...@hrins.net
Does anyone know if there is an Apple cache?
Today we noticed that Apple store applications and updates are no longer being
cached by our HTTP cache servers, and when we checked through DPI, we found that
the traffic has been changed to HTTPS! Does anyone know what is going on?

Ahmed

Re: ECN

2019-11-13 Thread Toke Høiland-Jørgensen via NANOG



On 13 November 2019 17:20:18 CET, Matt Corallo  wrote:
>This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>TCP is... out of spec to say the least), not a bug in ECN/ECMP.

Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
split the flow over multiple paths; avoiding that is the whole point of doing 
the flow-based hashing in the first place.

Anycast "only" turns a potential degradation of TCP performance into a hard 
failure... :)
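
A minimal sketch of the flow-splitting effect described above, in Python. The
hash function (SHA-256 truncated to 32 bits), the client address 10.42.3.2 and
the helper names are assumptions made for illustration only; real routers use
vendor-specific hashes. It counts how many flows send their SYN (Not-ECT) and
their data packets (ECT(0)) down different paths, with and without the TOS
byte in the hash key.

```python
import hashlib

NOT_ECT = 0b00  # no ECN bits set, as on the initial SYN
ECT0 = 0b10     # ECT(0) codepoint per RFC 3168, as on later data packets


def ecmp_path(src, dst, sport, dport, proto, tos, n_paths, include_tos):
    """Pick a next-hop index from a hash of the flow key (illustrative hash)."""
    fields = [src, dst, str(sport), str(dport), str(proto)]
    if include_tos:
        # This is the misbehaviour under discussion: TOS/ECN in the key.
        fields.append(str(tos))
    digest = hashlib.sha256("|".join(fields).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


def count_split_flows(include_tos, flows=1000, n_paths=2):
    """Count flows whose SYN and data packets are hashed to different paths."""
    split = 0
    for sport in range(10000, 10000 + flows):
        # 10.42.3.2 is a hypothetical client address, not taken from the thread.
        key = ("10.42.3.2", "104.24.125.13", sport, 443, 6)
        syn_path = ecmp_path(*key, NOT_ECT, n_paths, include_tos)
        data_path = ecmp_path(*key, ECT0, n_paths, include_tos)
        split += (syn_path != data_path)
    return split
```

With the TOS byte excluded, count_split_flows(False) is always 0. With it
included, roughly half of the flows would be expected to split across two
paths.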

-Toke


GeoIP issue with dvd.netflix.com

2019-11-13 Thread Mark Thompson
Can anyone share a contact at Netflix who can help work through this?

-- 
Mark Thompson
(408) 202-1278


2000::/3 Being Announced and Accepted

2019-11-13 Thread Douglas Fischer
I have been recommending to many friends to check in daily at
http://irrexplorer.nlnog.net/ to make sure everything is healthy with their
prefixes ...

Today a colleague reported a problem with an AS58299 announcement appearing in
"their prefixes".
I went to look, and it was showing up on our ASNs too.

It took me a while (duh) to understand what was going on ...
Why was irrexplorer showing that prefix in our query?


Could anyone reach somebody from OpenFactory/NetShelter/level66network
about this?



http://lg.ring.nlnog.net/prefix_bgpmap/lg01/ipv6?q=2000::/3

2000::/3
[LEVEL66NETWORK1 11:35:27 from 2a09:11c0::1] * (100/-) [AS58299i]
Type: BGP unicast univ
BGP.origin: IGP
BGP.as_path: 209844 49697 58299
BGP.next_hop: 2a09:11c0::1
BGP.local_pref: 100
BGP.community: (49697,1000) (49697,1007) (49697,2302)
BGP.ext_community: (RPKI Origin Validation State: not-found)
BGP.large_community: (209844, 100, 13)


--
Douglas Fernando Fischer
Engº de Controle e Automação


Re: ECN

2019-11-13 Thread Baldur Norddahl
ZTE M6000-S V3.00.20(3.40.1)

We are moving away from this platform so I can not be bothered with
requesting a fix. In the past they have made fixes for us, so I
believe they would also fix this issue if we asked them to do so.

Also I would like to state that I have not personally verified that the
equipment is doing hashing based on the ECN bits. I just turned off ECMP so
the customer can test. If it works we will either let ECMP stay off or move
the customer to the new platform.

Regards,

Baldur


On Wed, Nov 13, 2019 at 7:30 PM Mikael Abrahamsson  wrote:

> On Wed, 13 Nov 2019, Baldur Norddahl wrote:
>
> > In any case, is it not recommended that users of anycast proxy packets
> > that arrive at the wrong place? To avoid this kind of issue.
>
> In typical anycast deployments there is no feasible way to figure out
> where the "right place" is.
>
> It would be very interesting if you could share what equipment you're
> using that is doing ECMP hashing based on ECN bits. That vendor needs to
> fix that or people should avoid their devices.
>
> --
> Mikael Abrahamsson    email: swm...@swm.pp.se
>


Re: ECN

2019-11-13 Thread Mikael Abrahamsson via NANOG

On Wed, 13 Nov 2019, Baldur Norddahl wrote:

> In any case, is it not recommended that users of anycast proxy packets
> that arrive at the wrong place? To avoid this kind of issue.


In typical anycast deployments there is no feasible way to figure out 
where the "right place" is.


It would be very interesting if you could share what equipment you're 
using that is doing ECMP hashing based on ECN bits. That vendor needs to 
fix that or people should avoid their devices.


--
Mikael Abrahamsson    email: swm...@swm.pp.se


RE: Disney+ Streaming

2019-11-13 Thread Aaron Gould
Justin’s original question was “….. Is it well known where the newly released 
Disney+ streaming service content is sourced?...”

 

With Eric’s finding of “I saw various content being served from Akamai, Amazon, 
Fastly and Limelight so far. I'm in Montreal.”

 

Is this an absolute answer as to how Disney+ is handling delivery of their 
content?  If not, are there any Disney folks listening that could respond to me 
either off list or on the community thread here about how we should expect to 
see this Disney+ content sourced and whether or not Disney+ has or is planning 
on building out an ISP-located CDN type of network, much like all the others? 
(OCA, FNA, AANP, AEC, ACE, GGC)

 

-Aaron



RE: Disney+ Geolocation issues

2019-11-13 Thread Aaron Gould
That email (cl...@disneystreaming.com) bounced back as undeliverable.

 

-Aaron

 

From: NANOG [mailto:nanog-boun...@nanog.org] On Behalf Of Michael Crapse
Sent: Tuesday, November 12, 2019 7:27 PM
Cc: NANOG list
Subject: Re: Disney+ Geolocation issues

 

There has been a continued flurry of trouble tickets from our eyeballs. I did 
find a contact  cl...@disneystreaming.com that i have reached out to in hope 
that they can hear our pleas.

 

On Tue, 12 Nov 2019 at 16:53, Cassidy B. Larson  wrote:

We're seeing the same thing.  Actually we saw it during pre-signup.  Reached 
out to Disney+ weeks ago as well, with no response.  Now it's launched, our 
support lines are flooded with people unable to give Disney all their moneys.   
 We finally got through to Disney+ support after 2.5hrs on hold to supply them 
the error code, IP address, and zip code.. we'll see if it's passed to the 
right folks. 

 

On Tue, Nov 12, 2019 at 3:30 PM Michael Crapse  wrote:

Myself and a few other ISPs are having our eyeballs complain about disney+ 
saying that they're on a VPN. Does anyone have any idea, or who to contact 
regarding this issue?

This is most likely improper geolocation databases. Anyone have an idea who 
they use?

 

Mike



Re: Marseille Colocation

2019-11-13 Thread Jürgen Jaritsch
+1 for the IX in Marseille!

Cross-connect charges are always the same with IX: you need to buy the
pre-cabling (see below) and afterwards you pay CC with MRC:

Costs for pre-cabling from your rack to the MMR (NRC):
6 SMD pairs: 2.475,00 Eur
12 SMD pairs: 4.125,00 Eur
24 SMD pairs: 5.960,00 Eur

Copper-CC pre-cabling from your rack to the MMR (NRC):
6x UTP/STP RJ45: 1.950,00 Eur
12x UTP/STP RJ45: 2.600,00 Eur
24x UTP/STP RJ45: 3.650,00 Eur

Copper CC MRC (same building): 45,00 Eur 
Copper CC MRC (doesn't matter to which IX building on the IX campus): 85,00
Eur 
Copper CC MRC (between your own racks): 25,00 Eur

SMD CC MRC (same building): 85,00 Eur 
SMD CC MRC (doesn't matter to which IX building on the IX campus): 85,00 Eur
SMD CC MRC (between your own racks): 45,00 Eur


I heard something about a raise from 85,00 to 95,00 Eur MRC, but I have not
seen any confirmation of this yet.

Connecting the CC to your equipment: 275,00 Eur NRC

Best regards
Jürgen




Re: ECN

2019-11-13 Thread Baldur Norddahl
I am testing disabling our use of ECMP as it is not strictly necessary and
we are moving to a new platform anyway. Waiting for feedback from the
customer to hear if this fixes the issue.

In any case, is it not recommended that users of anycast proxy packets that
arrive at the wrong place? To avoid this kind of issue.

Regards,

Baldur


On Wed, Nov 13, 2019 at 6:35 PM Todd Underwood  wrote:

> as one of the authors of that talk, it definitely is "a thing", has been
> for years and years and years, and indeed, mostly works.
>
> t
>
> On Wed, Nov 13, 2019 at 12:18 PM Hunter Fuller 
> wrote:
>
>> It is certainly odd, but it's definitely a "thing."
>>
>> https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
>>
>> On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
>> >
>> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>> >
>> > > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG <
>> nanog@nanog.org> wrote:
>> > >
>> > > 
>> > >>
>> > >> Hello
>> > >>
>> > >> I have a customer that believes my network has an ECN problem. We do
>> > >> not, we just move packets. But how do I prove it?
>> > >>
>> > >> Is there a tool that checks for ECN trouble? Ideally something I
>> could
>> > >> run on the NLNOG Ring network.
>> > >>
>> > >> I believe it likely that it is the destination that has the problem.
>> > >
>> > > Hi Baldur
>> > >
>> > > I believe I may be that customer :)
>> > >
>> > > First of all, thank you for looking into the issue! We've been having
>> > > great fun over on the ecn-sane mailing list trying to figure out
>> what's
>> > > going on. I'll summarise below, but see this thread for the discussion
>> > > and debugging details:
>> > >
>> https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
>> > >
>> > > The short version is that the problem appears to come from a
>> combination
>> > > of the ECMP routing in your network, and Cloudflare's heavy use of
>> > > anycast. Specifically, a router in your network appears to be doing
>> ECMP
>> > > by hashing on the packet header, *including the ECN bits*. This breaks
>> > > TCP connections with ECN because the TCP SYN (with no ECN bits set)
>> ends
>> > > up taking a different path than the rest of the flow (which is marked
>> as
>> > > ECT(0)). When the destination is anycasted, this means that the data
>> > > packets go to a different server than the SYN did. This second server
>> > > doesn't recognise the connection, and so replies with a TCP RST. To
>> fix
>> > > this, simply exclude the ECN bits (or the whole TOS byte) from your
>> > > router's ECMP hash.
>> > >
>> > > For a longer exposition, see below. You should be able to verify this
>> > > from somewhere else in the network, but if there's anything else you
>> > > want me to test, do let me know. Also, would you mind sharing the
>> router
>> > > make and model that does this? We're trying to collect real-world
>> > > examples of network problems caused by ECN and this is definitely an
>> > > interesting example.
>> > >
>> > > -Toke
>> > >
>> > >
>> > >
>> > > The long version:
>> > >
>> > > From my end I can see that I have two paths to Cloudflare; which is
>> > > taken appears to be based on a hash of the packet header, as can be
>> seen
>> > > by varying the source port:
>> > >
>> > > $ traceroute -q 1 --sport=1 104.24.125.13
>> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
>> packets
>> > > 1  _gateway (10.42.3.1)  0.357 ms
>> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
>> > > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
>> > > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
>> > > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
>> > > 6  104.24.125.13 (104.24.125.13)  1.322 ms
>> > >
>> > > $ traceroute -q 1 --sport=10001 104.24.125.13
>> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
>> packets
>> > > 1  _gateway (10.42.3.1)  0.293 ms
>> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
>> > > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
>> > > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
>> > > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
>> > > 6  149.6.142.130 (149.6.142.130)  6.925 ms
>> > > 7  104.24.125.13 (104.24.125.13)  1.501 ms
>> > >
>> > >
>> > > This is fine in itself. However, the problem stems from the fact that
>> > > the ECN bits in the IP header are also included in the ECMP hash (-t
>> > > sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
>> > > ECT(1)):
>> > >
>> > > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
>> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
>> packets
>> > > 1  _gateway (10.42.3.1)  0.336 ms
>> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  

Re: ECN

2019-11-13 Thread Jon Lewis
It does when the split flows land in different anycast origin POPs. 
Making a few assumptions from the traceroutes, the ECMP paths are sending 
some packets to Hamburg and some to Denmark.  Each POP may be getting 
parts of what should be a single TCP stream, and I doubt they have 
anything to cope with that (another assumption).
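
A rough sketch of that failure mode in Python, assuming (as above) that each
anycast POP keeps its own connection state and shares none of it. The Pop
class and the POP names are made up for this example and do not describe how
Cloudflare actually works.

```python
class Pop:
    """Toy anycast POP that only knows about connections whose SYN it saw."""

    def __init__(self, name):
        self.name = name
        self.connections = set()

    def receive(self, flow, kind):
        """Return the reply this POP would send for a packet of the given kind."""
        if kind == "SYN":
            self.connections.add(flow)
            return "SYN-ACK"
        # Data for a flow this POP has never seen gets reset.
        return "ACK" if flow in self.connections else "RST"


# Hypothetical flow key: (src, sport, dst, dport)
flow = ("10.42.3.2", 10001, "104.24.125.13", 443)
copenhagen = Pop("cph")
hamburg = Pop("ham")

syn_reply = copenhagen.receive(flow, "SYN")   # SYN hashed to one path
data_reply = hamburg.receive(flow, "DATA")    # ECT(0) data hashed to the other
```

Here syn_reply is "SYN-ACK" and data_reply is "RST", which matches the
symptom reported in the thread: the handshake works, then the connection is
reset.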


On Wed, 13 Nov 2019, Matt Corallo wrote:

> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate),
> splitting the flow shouldn’t have material performance degradation?
>
>> On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen wrote:
>>
>>> On 13 November 2019 17:20:18 CET, Matt Corallo wrote:
>>> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
>>
>> Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will
>> split the flow over multiple paths; avoiding that is the whole point of doing
>> the flow-based hashing in the first place.
>>
>> Anycast "only" turns a potential degradation of TCP performance into a hard
>> failure... :)
>>
>> -Toke





--
 Jon Lewis, MCP :)   |  I route
 StackPath, Sr. Neteng   |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: ECN

2019-11-13 Thread Todd Underwood
as one of the authors of that talk, it definitely is "a thing", has been
for years and years and years, and indeed, mostly works.

t

On Wed, Nov 13, 2019 at 12:18 PM Hunter Fuller  wrote:

> It is certainly odd, but it's definitely a "thing."
>
> https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
>
> On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
> >
> > This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> >
> > > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG <
> nanog@nanog.org> wrote:
> > >
> > > 
> > >>
> > >> Hello
> > >>
> > >> I have a customer that believes my network has an ECN problem. We do
> > >> not, we just move packets. But how do I prove it?
> > >>
> > >> Is there a tool that checks for ECN trouble? Ideally something I could
> > >> run on the NLNOG Ring network.
> > >>
> > >> I believe it likely that it is the destination that has the problem.
> > >
> > > Hi Baldur
> > >
> > > I believe I may be that customer :)
> > >
> > > First of all, thank you for looking into the issue! We've been having
> > > great fun over on the ecn-sane mailing list trying to figure out what's
> > > going on. I'll summarise below, but see this thread for the discussion
> > > and debugging details:
> > >
> https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> > >
> > > The short version is that the problem appears to come from a
> combination
> > > of the ECMP routing in your network, and Cloudflare's heavy use of
> > > anycast. Specifically, a router in your network appears to be doing
> ECMP
> > > by hashing on the packet header, *including the ECN bits*. This breaks
> > > > TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
> > > up taking a different path than the rest of the flow (which is marked
> as
> > > ECT(0)). When the destination is anycasted, this means that the data
> > > packets go to a different server than the SYN did. This second server
> > > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > > router's ECMP hash.
> > >
> > > For a longer exposition, see below. You should be able to verify this
> > > from somewhere else in the network, but if there's anything else you
> > > want me to test, do let me know. Also, would you mind sharing the
> router
> > > make and model that does this? We're trying to collect real-world
> > > examples of network problems caused by ECN and this is definitely an
> > > interesting example.
> > >
> > > -Toke
> > >
> > >
> > >
> > > The long version:
> > >
> > > From my end I can see that I have two paths to Cloudflare; which is
> > > taken appears to be based on a hash of the packet header, as can be
> seen
> > > by varying the source port:
> > >
> > > $ traceroute -q 1 --sport=1 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.357 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> > >
> > > $ traceroute -q 1 --sport=10001 104.24.125.13
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.293 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> > >
> > >
> > > This is fine in itself. However, the problem stems from the fact that
> > > the ECN bits in the IP header are also included in the ECMP hash (-t
> > > sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
> > > ECT(1)):
> > >
> > > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.336 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> > > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> > > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> > > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> > > 6  104.24.125.13 (104.24.125.13)  1.210 ms
> > >
> > > $ traceroute -q 1 --sport=1 104.24.125.13 -t 2
> > > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte
> packets
> > > 1  _gateway (10.42.3.1)  0.339 ms
> > > 2  albertslund-edge1-lo.net.gigabit.dk 

Re: ECN

2019-11-13 Thread Anoop Ghanwani
Not to condone what cloudflare is doing, but...

An ECN connection will have different bits on various packets for the
duration of the connection -- pure ACKs (ACKs not piggybacking on data)
will have the ECN bits as 00b, while all other packets will have either
01b, 10b (when no congestion was experienced) or 11b (when congestion was
experienced).  So using the ECN bits as part of the hash would affect
performance throughout the life of the connection.
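
For reference, a small Python sketch of how the two ECN bits sit in the
low-order end of the TOS / Traffic Class byte, using the codepoint names from
RFC 3168. The function name split_tos is made up for this example.

```python
# ECN codepoints as defined in RFC 3168 (low two bits of the TOS byte).
ECN_NAMES = {
    0b00: "Not-ECT",  # e.g. SYNs and pure ACKs
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",       # congestion experienced
}


def split_tos(tos):
    """Return (dscp, ecn_name) for an 8-bit TOS / Traffic Class value."""
    dscp = tos >> 2            # upper six bits: DiffServ code point
    ecn = tos & 0b11           # lower two bits: ECN field
    return dscp, ECN_NAMES[ecn]
```

For example, split_tos(0x02) gives (0, "ECT(0)") and split_tos(0x03) gives
(0, "CE"), so a hash over the whole byte sees a different key as soon as a
packet of the same flow is marked.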

On Wed, Nov 13, 2019 at 9:00 AM Matt Corallo  wrote:

> Not ideal, sure, but if it’s only for the SYN (as you seem to indicate),
> splitting the flow shouldn’t have material performance degradation?
>
> > On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen  wrote:
> >
> > 
> >
> >> On 13 November 2019 17:20:18 CET, Matt Corallo 
> wrote:
> >> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
> >> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> >
> > Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so
> will split the flow over multiple paths; avoiding that is the whole point
> of doing the flow-based hashing in the first place.
> >
> > Anycast "only" turns a potential degradation of TCP performance into a
> hard failure... :)
> >
> > -Toke
>
>


Re: ECN

2019-11-13 Thread Hunter Fuller
It is certainly odd, but it's definitely a "thing."

https://archive.nanog.org/meetings/nanog37/presentations/matt.levine.pdf

On Wed, Nov 13, 2019 at 10:24 AM Matt Corallo  wrote:
>
> This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
> is... out of spec to say the least), not a bug in ECN/ECMP.
>
> > On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG 
> >  wrote:
> >
> > 
> >>
> >> Hello
> >>
> >> I have a customer that believes my network has an ECN problem. We do
> >> not, we just move packets. But how do I prove it?
> >>
> >> Is there a tool that checks for ECN trouble? Ideally something I could
> >> run on the NLNOG Ring network.
> >>
> >> I believe it likely that it is the destination that has the problem.
> >
> > Hi Baldur
> >
> > I believe I may be that customer :)
> >
> > First of all, thank you for looking into the issue! We've been having
> > great fun over on the ecn-sane mailing list trying to figure out what's
> > going on. I'll summarise below, but see this thread for the discussion
> > and debugging details:
> > https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> >
> > The short version is that the problem appears to come from a combination
> > of the ECMP routing in your network, and Cloudflare's heavy use of
> > anycast. Specifically, a router in your network appears to be doing ECMP
> > by hashing on the packet header, *including the ECN bits*. This breaks
> > TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
> > up taking a different path than the rest of the flow (which is marked as
> > ECT(0)). When the destination is anycasted, this means that the data
> > packets go to a different server than the SYN did. This second server
> > doesn't recognise the connection, and so replies with a TCP RST. To fix
> > this, simply exclude the ECN bits (or the whole TOS byte) from your
> > router's ECMP hash.
> >
> > For a longer exposition, see below. You should be able to verify this
> > from somewhere else in the network, but if there's anything else you
> > want me to test, do let me know. Also, would you mind sharing the router
> > make and model that does this? We're trying to collect real-world
> > examples of network problems caused by ECN and this is definitely an
> > interesting example.
> >
> > -Toke
> >
> >
> >
> > The long version:
> >
> > From my end I can see that I have two paths to Cloudflare; which is
> > taken appears to be based on a hash of the packet header, as can be seen
> > by varying the source port:
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.357 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> > 6  104.24.125.13 (104.24.125.13)  1.322 ms
> >
> > $ traceroute -q 1 --sport=10001 104.24.125.13
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.293 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> > 6  149.6.142.130 (149.6.142.130)  6.925 ms
> > 7  104.24.125.13 (104.24.125.13)  1.501 ms
> >
> >
> > This is fine in itself. However, the problem stems from the fact that
> > the ECN bits in the IP header are also included in the ECMP hash (-t
> > sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
> > ECT(1)):
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.336 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> > 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> > 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> > 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> > 6  104.24.125.13 (104.24.125.13)  1.210 ms
> >
> > $ traceroute -q 1 --sport=1 104.24.125.13 -t 2
> > traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> > 1  _gateway (10.42.3.1)  0.339 ms
> > 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  2.565 ms
> > 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.301 ms
> > 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.339 ms
> > 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.570 ms
> > 6  149.6.142.130 (149.6.142.130)  6.888 ms
> > 7  104.24.125.13 (104.24.125.13)  1.785 ms
> >
> >
> > So why is this a problem? The TCP SYN packet first needs to negotiate
> > ECN, so it is sent 

Re: ECN

2019-11-13 Thread Matt Corallo
Not ideal, sure, but if it’s only for the SYN (as you seem to indicate), 
splitting the flow shouldn’t have material performance degradation? 

> On Nov 13, 2019, at 11:51, Toke Høiland-Jørgensen  wrote:
> 
> 
> 
>> On 13 November 2019 17:20:18 CET, Matt Corallo  wrote:
>> This sounds like a bug on Cloudflare’s end (cause trying to do anycast
>> TCP is... out of spec to say the least), not a bug in ECN/ECMP.
> 
> Even without anycast, an ECMP shouldn't hash on the ECN bits. Doing so will 
> split the flow over multiple paths; avoiding that is the whole point of doing 
> the flow-based hashing in the first place.
> 
> Anycast "only" turns a potential degradation of TCP performance into a hard 
> failure... :)
> 
> -Toke



Re: Disney+ Streaming

2019-11-13 Thread Ross Tajvar
I think it would be more on topic if everyone weren't just guessing what
users will do based on hypothetical behavior patterns and hypothetical
content shifts.

I WOULD be interested to see some data showing e.g. a drop in traffic to
one service and a boost in traffic to another service when a particular bit
of media was moved from the former to the latter. (Or a boost in both, etc.)

On Wed, Nov 13, 2019, 11:04 AM Stephen Satchell  wrote:

> CAVEAT: I don't have a dog in this hunt.
>
> On 11/13/19 6:46 AM, Mel Beckman wrote:
> > This is silly off-topic. You don’t have to go home, but you can’t
> > stay here, according to NANOG guidelines.
>
> > https://www.nanog.org/resources/usage-guidelines/
> https://www.nanog.org/bylaws/
>
> "The NANOG mailing list was established in 1994 to provide an open forum
> for the exchange of technical information, and lively discussion of
> SPECIFIC IMPLEMENTATION CHALLENGES (emphasis mine) that require
> cooperation among network service providers.
>
> "Posts to NANOG’s mailing list should be focused on operational and
> technical content only, as described by the NANOG Bylaws."
>
> Yes, some of the Disney Plus thread has strayed outside the four corners
> of the rules of the mailing list, but the bulk of the thread has to do
> with two things: geolocation inaccuracies, and traffic capacity shifts.
>   For some network operators on this list, the discussion does not
> describe issues on their networks.  But "some" is not "all".
>


Re: ECN

2019-11-13 Thread Matt Corallo
This sounds like a bug on Cloudflare’s end (cause trying to do anycast TCP 
is... out of spec to say the least), not a bug in ECN/ECMP.

> On Nov 13, 2019, at 11:07, Toke Høiland-Jørgensen via NANOG  
> wrote:
> 
> 
>> 
>> Hello
>> 
>> I have a customer that believes my network has an ECN problem. We do
>> not, we just move packets. But how do I prove it?
>> 
>> Is there a tool that checks for ECN trouble? Ideally something I could
>> run on the NLNOG Ring network.
>> 
>> I believe it likely that it is the destination that has the problem.
> 
> Hi Baldur
> 
> I believe I may be that customer :)
> 
> First of all, thank you for looking into the issue! We've been having
> great fun over on the ecn-sane mailing list trying to figure out what's
> going on. I'll summarise below, but see this thread for the discussion
> and debugging details:
> https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html
> 
> The short version is that the problem appears to come from a combination
> of the ECMP routing in your network, and Cloudflare's heavy use of
> anycast. Specifically, a router in your network appears to be doing ECMP
> by hashing on the packet header, *including the ECN bits*. This breaks
> TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
> up taking a different path than the rest of the flow (which is marked as
> ECT(0)). When the destination is anycasted, this means that the data
> packets go to a different server than the SYN did. This second server
> doesn't recognise the connection, and so replies with a TCP RST. To fix
> this, simply exclude the ECN bits (or the whole TOS byte) from your
> router's ECMP hash.
> 
> For a longer exposition, see below. You should be able to verify this
> from somewhere else in the network, but if there's anything else you
> want me to test, do let me know. Also, would you mind sharing the router
> make and model that does this? We're trying to collect real-world
> examples of network problems caused by ECN and this is definitely an
> interesting example.
> 
> -Toke
> 
> 
> 
> The long version:
> 
> From my end I can see that I have two paths to Cloudflare; which is
> taken appears to be based on a hash of the packet header, as can be seen
> by varying the source port:
> 
> $ traceroute -q 1 --sport=1 104.24.125.13
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.357 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
> 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
> 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
> 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
> 6  104.24.125.13 (104.24.125.13)  1.322 ms
> 
> $ traceroute -q 1 --sport=10001 104.24.125.13
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.293 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
> 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
> 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
> 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
> 6  149.6.142.130 (149.6.142.130)  6.925 ms
> 7  104.24.125.13 (104.24.125.13)  1.501 ms
> 
> 
> This is fine in itself. However, the problem stems from the fact that
> the ECN bits in the IP header are also included in the ECMP hash (-t
> sets the TOS byte; -t 1 ends up as ECT(0) on the wire and -t 2 is
> ECT(1)):
> 
> $ traceroute -q 1 --sport=1 104.24.125.13 -t 1
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.336 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
> 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
> 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
> 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
> 6  104.24.125.13 (104.24.125.13)  1.210 ms
> 
> $ traceroute -q 1 --sport=1 104.24.125.13 -t 2
> traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
> 1  _gateway (10.42.3.1)  0.339 ms
> 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  2.565 ms
> 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.301 ms
> 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.339 ms
> 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.570 ms
> 6  149.6.142.130 (149.6.142.130)  6.888 ms
> 7  104.24.125.13 (104.24.125.13)  1.785 ms
> 
> 
> So why is this a problem? The TCP SYN packet first needs to negotiate
> ECN, so it is sent without any ECN bits set in the header; after
> negotiation succeeds, the data packets will be marked as ECT(0). But
> because that becomes part of the ECMP hash, those packets will take
> another path. And since the destination is anycasted, that means they
> will also end up at a different endpoint. This second endpoint won't
> recognise the connection, and reply with 

Marseille Colocation

2019-11-13 Thread Rod Beck
Any suggestions on a good telecom hotel for a cloud provider in Marseille? 
Interxion has a campus there with 2 buildings, a huge number of carriers and 
serves as the hand off point for a large number of undersea cables. Does anyone 
know anything about the facility in terms of space and power availability and 
pricing, service and cross connect charges? Any alternatives to Interxion worth 
considering?

Regards,

Roderick.


Roderick Beck

VP of Business Development

United Cable Company

www.unitedcablecompany.com

New York City & Budapest

rod.b...@unitedcablecompany.com

36-70-605-5144




Re: Disney+ Geolocation issues

2019-11-13 Thread Cassidy B. Larson
We're seeing the same thing.  Actually we saw it during pre-signup.
Reached out to Disney+ weeks ago as well, with no response.  Now it's
launched, our support lines are flooded with people unable to give Disney
all their moneys.  We finally got through to Disney+ support after 2.5 hours
on hold to supply them the error code, IP address, and zip code. We'll see
if it's passed to the right folks.

On Tue, Nov 12, 2019 at 3:30 PM Michael Crapse  wrote:

> Myself and a few other ISPs are having our eyeballs complain about
> disney+ saying that they're on a VPN. Does anyone have any idea, or who to
> contact regarding this issue?
> This is most likely improper geolocation databases. Anyone have an idea
> who they use?
>
> Mike
>


Disney+ Geolocation Issues

2019-11-13 Thread Mat Perkins
Hey Everyone,

I'm currently working with 3 ISPs whose clients are being told they are
outside the US. When their IPs are checked via a free GeoIP tool, everything
shows up correctly. Does anyone have any insight into who Disney+ is using
for their geolocation services, and can anyone here help with getting the
records updated to the correct location / type of service provider?

Mat


Re: ECN

2019-11-13 Thread Toke Høiland-Jørgensen via NANOG
> Hello
> 
> I have a customer that believes my network has an ECN problem. We do
> not, we just move packets. But how do I prove it?
> 
> Is there a tool that checks for ECN trouble? Ideally something I could
> run on the NLNOG Ring network.
> 
> I believe it likely that it is the destination that has the problem.

Hi Baldur

I believe I may be that customer :)

First of all, thank you for looking into the issue! We've been having
great fun over on the ecn-sane mailing list trying to figure out what's
going on. I'll summarise below, but see this thread for the discussion
and debugging details:
https://lists.bufferbloat.net/pipermail/ecn-sane/2019-November/000527.html

The short version is that the problem appears to come from a combination
of the ECMP routing in your network, and Cloudflare's heavy use of
anycast. Specifically, a router in your network appears to be doing ECMP
by hashing on the packet header, *including the ECN bits*. This breaks
TCP connections with ECN because the TCP SYN (with no ECN bits set) ends
up taking a different path than the rest of the flow (which is marked as
ECT(0)). When the destination is anycasted, this means that the data
packets go to a different server than the SYN did. This second server
doesn't recognise the connection, and so replies with a TCP RST. To fix
this, simply exclude the ECN bits (or the whole TOS byte) from your
router's ECMP hash.
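
As a rough illustration of the failure mode described above, here is a
minimal sketch of an ECMP path selector. Everything in it is an assumption
made for illustration: the function names are hypothetical, SHA-256 merely
stands in for whatever hash the router really uses, and the two-path setup
and port range are arbitrary. The addresses are the ones from the traces in
this message.

```python
import hashlib

NOT_ECT = 0b00  # ECN field of the TCP SYN (nothing negotiated yet)
ECT0 = 0b10     # ECN field of data packets after negotiation (RFC 3168)


def ecmp_path(src, dst, sport, dport, proto, tos, n_paths=2, include_tos=True):
    """Pick a next hop by hashing the flow key (hypothetical router logic)."""
    fields = [src, dst, sport, dport, proto]
    if include_tos:
        fields.append(tos)  # the behaviour that breaks ECN flows
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()  # stand-in for the real hash
    return int.from_bytes(digest[:4], "big") % n_paths


def flow_splits(sport, include_tos):
    """True if the SYN and the ECT(0) data packets hash to different paths."""
    flow = ("10.42.3.130", "104.24.125.13", sport, 80, 6)
    syn = ecmp_path(*flow, tos=NOT_ECT, include_tos=include_tos)
    data = ecmp_path(*flow, tos=ECT0, include_tos=include_tos)
    return syn != data


ports = range(34000, 35000)
broken = sum(flow_splits(p, include_tos=True) for p in ports)
fixed = sum(flow_splits(p, include_tos=False) for p in ports)
print(f"TOS in hash:     {broken} of {len(ports)} flows change path after the SYN")
print(f"TOS not in hash: {fixed} of {len(ports)} flows change path after the SYN")
```

With the TOS byte left out of the key, every packet of a flow hashes to the
same next hop regardless of its ECN marking, which is the property an
anycasted destination depends on.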

For a longer exposition, see below. You should be able to verify this
from somewhere else in the network, but if there's anything else you
want me to test, do let me know. Also, would you mind sharing the router
make and model that does this? We're trying to collect real-world
examples of network problems caused by ECN and this is definitely an
interesting example.

-Toke



The long version:

From my end I can see that I have two paths to Cloudflare; which is
taken appears to be based on a hash of the packet header, as can be seen
by varying the source port:

$ traceroute -q 1 --sport=1 104.24.125.13
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.357 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  4.707 ms
 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.283 ms
 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.667 ms
 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.406 ms
 6  104.24.125.13 (104.24.125.13)  1.322 ms

$ traceroute -q 1 --sport=10001 104.24.125.13
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.293 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  3.430 ms
 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.194 ms
 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.297 ms
 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.805 ms
 6  149.6.142.130 (149.6.142.130)  6.925 ms
 7  104.24.125.13 (104.24.125.13)  1.501 ms


This is fine in itself. However, the problem stems from the fact that
the ECN bits in the IP header are also included in the ECMP hash (-t
sets the TOS byte; -t 1 ends up as ECT(1) on the wire and -t 2 is
ECT(0)):

$ traceroute -q 1 --sport=1 104.24.125.13 -t 1
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.336 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  6.964 ms
 3  customer-185-24-168-46.ip4.gigabit.dk (185.24.168.46)  1.056 ms
 4  te0-1-1-5.rcr21.cph01.atlas.cogentco.com (149.6.137.49)  1.512 ms
 5  netnod-ix-cph-blue-9000.cloudflare.com (212.237.192.246)  1.313 ms
 6  104.24.125.13 (104.24.125.13)  1.210 ms

$ traceroute -q 1 --sport=1 104.24.125.13 -t 2
traceroute to 104.24.125.13 (104.24.125.13), 30 hops max, 60 byte packets
 1  _gateway (10.42.3.1)  0.339 ms
 2  albertslund-edge1-lo.net.gigabit.dk (185.24.171.254)  2.565 ms
 3  customer-185-24-168-38.ip4.gigabit.dk (185.24.168.38)  1.301 ms
 4  10ge1-2.core1.cph1.he.net (216.66.83.101)  1.339 ms
 5  be2306.ccr42.ham01.atlas.cogentco.com (130.117.3.237)  6.570 ms
 6  149.6.142.130 (149.6.142.130)  6.888 ms
 7  104.24.125.13 (104.24.125.13)  1.785 ms
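
For reference when reading the -t values above, the TOS byte splits into a
six-bit DSCP and a two-bit ECN field, and RFC 3168 assigns the ECN codepoints
as 00 Not-ECT, 01 ECT(1), 10 ECT(0) and 11 CE. A small sketch of that
decoding (the function name is hypothetical):

```python
# ECN codepoints as assigned by RFC 3168 (low two bits of the TOS byte).
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def split_tos(tos):
    """Split a TOS / Traffic Class byte into (DSCP, ECN codepoint name)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return tos >> 2, ECN_NAMES[tos & 0x03]


for tos in (0, 1, 2, 3):
    dscp, ecn = split_tos(tos)
    print(f"tos 0x{tos:02x}: DSCP {dscp}, ECN {ecn}")
```

So a TOS byte of 2 carries ECT(0), the marking that ECN-enabled TCP data
packets normally use, while a TOS byte of 0 matches the unmarked SYN.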


So why is this a problem? The TCP SYN packet first needs to negotiate
ECN, so it is sent without any ECN bits set in the header; after
negotiation succeeds, the data packets will be marked as ECT(0). But
because that becomes part of the ECMP hash, those packets will take
another path. And since the destination is anycasted, that means they
will also end up at a different endpoint. This second endpoint won't
recognise the connection, and will reply with a TCP RST. This is clearly
visible in tcpdump; notice the different TOS values, and that the RST
packet has a different TTL than the SYN-ACK:

12:21:47.816359 IP (tos 0x0, ttl 64, id 25687, offset 0, flags [DF], proto TCP (6), length 60)
    10.42.3.130.34420 > 104.24.125.13.80: Flags [SEW], cksum 0xf2ff (incorrect -> 0x0853), seq 3345293502, win 64240, options [mss 1460,sackOK,TS 

Re: Disney+ Streaming

2019-11-13 Thread Stephen Satchell

CAVEAT: I don't have a dog in this hunt.

On 11/13/19 6:46 AM, Mel Beckman wrote:

> This is silly off-topic. You don’t have to go home, but you can’t
> stay here, according to NANOG guidelines.



https://www.nanog.org/resources/usage-guidelines/
https://www.nanog.org/bylaws/


"The NANOG mailing list was established in 1994 to provide an open forum 
for the exchange of technical information, and lively discussion of 
SPECIFIC IMPLEMENTATION CHALLENGES (emphasis mine) that require 
cooperation among network service providers.


"Posts to NANOG’s mailing list should be focused on operational and 
technical content only, as described by the NANOG Bylaws."


Yes, some of the Disney Plus thread has strayed outside the four corners 
of the rules of the mailing list, but the bulk of the thread has to do 
with two things: geolocation inaccuracies, and traffic capacity shifts. 
For some network operators on this list, the discussion does not 
describe issues on their networks.  But "some" is not "all".


Re: Disney+ Geolocation issues

2019-11-13 Thread Michael Crapse
For everyone in this thread and future ones: we were successful in
reaching Disney by emailing our subnet to
netad...@disneystreaming.com

On Wed, 13 Nov 2019 at 08:26, Robert Blayzor  wrote:

> On 11/13/19 9:49 AM, Matthew Huff wrote:
> > It’s not about optimization, it’s about the contract with the content
> providers. The agreement is to restrict content by geographical regions
> mainly for marketing purposes. They block VPN access to keep people from
> bypassing those restrictions. It’s true of all the streaming providers.
>
>
> Build a better mousetrap, because it's clearly not working. We still get
> tons of people calling into first level support asking why ESPN+ doesn't
> work and that ESPN told them to call their ISP's, which can do NOTHING
> to fix the problem.
>
> Guessing Disney stole a page from that book...
>
> --
> inoc.net!rblayzor
> XMPP: rblayzor.AT.inoc.net
> PGP:  https://pgp.inoc.net/rblayzor/
>


Re: Disney+ Geolocation issues

2019-11-13 Thread Robert Blayzor
On 11/13/19 9:49 AM, Matthew Huff wrote:
> It’s not about optimization, it’s about the contract with the content 
> providers. The agreement is to restrict content by geographical regions 
> mainly for marketing purposes. They block VPN access to keep people from 
> bypassing those restrictions. It’s true of all the streaming providers.


Build a better mousetrap, because it's clearly not working. We still get
tons of people calling into first level support asking why ESPN+ doesn't
work and saying that ESPN told them to call their ISP, which can do NOTHING
to fix the problem.

Guessing Disney stole a page from that book...

-- 
inoc.net!rblayzor
XMPP: rblayzor.AT.inoc.net
PGP:  https://pgp.inoc.net/rblayzor/


Re: Disney+ Geolocation issues

2019-11-13 Thread Matthew Huff
It’s not about optimization, it’s about the contract with the content 
providers. The agreement is to restrict content by geographical regions mainly 
for marketing purposes. They block VPN access to keep people from bypassing 
those restrictions. It’s true of all the streaming providers.

> On Nov 13, 2019, at 9:44 AM, Robert Blayzor  wrote:
> 
> On 11/12/19 5:28 PM, Michael Crapse wrote:
>> Myself and a few other ISPs are having our eyeballs complain about
>> disney+ saying that they're on a VPN. Does anyone have any idea, or who
>> to contact regarding this issue?
>> This is most likely improper geolocation databases. Anyone have an idea
>> who they use?
>> 
> 
> 
> Same boat here. ARIN ISP with all valid SWIP clearly showing stateside
> USA. So who knows what Disney+ is doing to block their viewers. Seems
> rather silly to block viewing based on the connecting IP address.
> Wouldn't you base it on the authorized viewer who is logged in and using
> the service? I mean, that's what they are paying for. I get the whole
> CDN steering thing, but the error message sent back to the
> viewer should not be to "call your ISP". Now you have support desks
> taking thousands of worthless calls
> 
> ESPN+ is guilty of the same garbage
> 
> -- 
> inoc.net!rblayzor
> XMPP: rblayzor.AT.inoc.net
> PGP:  https://pgp.inoc.net/rblayzor/



Re: Disney+ Streaming

2019-11-13 Thread Mel Beckman
I concur. This is silly off-topic. You don’t have to go home, but you can’t 
stay here, according to NANOG guidelines. 

-mel 

> On Nov 13, 2019, at 4:57 AM, Bryan Holloway  wrote:
> 
> 
> 
>> On 11/13/19 1:06 PM, Niels Bakker wrote:
>> * mikeboli...@gmail.com (Mike Bolitho) [Wed 13 Nov 2019, 12:05 CET]:
>>> This has gone well beyond out of scope of the NANOG list. Discussing who
>>> watches what kind of content has nothing to do with networking. Can you
>>> guys take the conversation elsewhere?
>> On the contrary.  This discussion informs eyeball networks' capacity 
>> planning requirements for the upcoming years.
>> It'd be nice to go from anecdata to data, though.
>> -- Niels.
> 
> 
> Indeed ... as an eyeball network, this is all very relevant.
> 
> Another aspect that hasn't been mentioned in this thread (I think), is that 
> besides there being a potential saturation of streaming services, there's 
> also the backroom dealings between content and content-providers.
> 
> Here's some data: Netflix just lost "Friends", one of its most popular 
> offerings (and probably more than a blip on my bandwidth graphs) to HBO Max. 
> This is but one example, but, as a whole, stuff like this is very important 
> for capacity-planning.
> 
> Not saying it's gonna happen, but if Disney "lost" the Star Wars franchise 
> to, say, Amazon, you better believe there are likely to be traffic shifts. 
> (Yes, I know they own it.)


Re: Disney+ Geolocation issues

2019-11-13 Thread Robert Blayzor
On 11/12/19 5:28 PM, Michael Crapse wrote:
> Myself and a few other ISPs are having our eyeballs complain about
> disney+ saying that they're on a VPN. Does anyone have any idea, or who
> to contact regarding this issue?
> This is most likely improper geolocation databases. Anyone have an idea
> who they use?
> 


Same boat here. ARIN ISP with all valid SWIP clearly showing stateside
USA. So who knows what Disney+ is doing to block their viewers. Seems
rather silly to block viewing based on the connecting IP address.
Wouldn't you base it on the authorized viewer who is logged in and using
the service? I mean, that's what they are paying for. I get the whole
CDN steering thing, but the error message sent back to the
viewer should not be to "call your ISP". Now you have support desks
taking thousands of worthless calls.

ESPN+ is guilty of the same garbage.

-- 
inoc.net!rblayzor
XMPP: rblayzor.AT.inoc.net
PGP:  https://pgp.inoc.net/rblayzor/


Re: Disney+ Geolocation issues

2019-11-13 Thread Ca By
On Tue, Nov 12, 2019 at 9:18 PM Michael Crapse  wrote:

> IPv6 is a lot more granular when it comes to geolocation data. It is also
> very very unlikely that the block has been used before, and you never know
> what the previous owner did or what geolocation/VPN blacklists it was added
> to. Let me put it this way, this is a familiar song and dance for us, and
> it never happens on ipv6 for us, always IPv4.
>

Michael — good info from the field. Thanks for sharing.

CB


> On Tue, Nov 12, 2019, 10:02 PM Randy Bush  wrote:
>
>> > IPv6 support by disney(using AWS) would obviate this issue.
>>
>> ok.  i give.  exactly how?  i mean technically.
>>
>> randy
>>
>


Re: Disney+ Geolocation issues

2019-11-13 Thread jim deleskie
Using a TPIA provider here at home in Nova Scotia, same issue.

-jim

On Tue., Nov. 12, 2019, 6:29 p.m. Michael Crapse, 
wrote:

> Myself and a few other ISPs are having our eyeballs complain about
> disney+ saying that they're on a VPN. Does anyone have any idea, or who to
> contact regarding this issue?
> This is most likely improper geolocation databases. Anyone have an idea
> who they use?
>
> Mike
>


Re: Disney+ Streaming

2019-11-13 Thread Bryan Holloway




On 11/13/19 1:06 PM, Niels Bakker wrote:
> * mikeboli...@gmail.com (Mike Bolitho) [Wed 13 Nov 2019, 12:05 CET]:
>> This has gone well beyond out of scope of the NANOG list. Discussing who
>> watches what kind of content has nothing to do with networking. Can you
>> guys take the conversation elsewhere?
>
> On the contrary.  This discussion informs eyeball networks' capacity
> planning requirements for the upcoming years.
>
> It'd be nice to go from anecdata to data, though.
>
>     -- Niels.



Indeed ... as an eyeball network, this is all very relevant.

Another aspect that hasn't been mentioned in this thread (I think), is 
that besides there being a potential saturation of streaming services, 
there's also the backroom dealings between content and content-providers.


Here's some data: Netflix just lost "Friends", one of its most popular 
offerings (and probably more than a blip on my bandwidth graphs) to HBO 
Max. This is but one example, but, as a whole, stuff like this is very 
important for capacity-planning.


Not saying it's gonna happen, but if Disney "lost" the Star Wars 
franchise to, say, Amazon, you better believe there are likely to be 
traffic shifts. (Yes, I know they own it.)


Re: Disney+ Streaming

2019-11-13 Thread Niels Bakker

* mikeboli...@gmail.com (Mike Bolitho) [Wed 13 Nov 2019, 12:05 CET]:
> This has gone well beyond out of scope of the NANOG list. Discussing who
> watches what kind of content has nothing to do with networking. Can you
> guys take the conversation elsewhere?


On the contrary.  This discussion informs eyeball networks' capacity 
planning requirements for the upcoming years.


It'd be nice to go from anecdata to data, though.


-- Niels.


Re: Disney+ Streaming

2019-11-13 Thread Mike Bolitho
This has gone well beyond out of scope of the NANOG list. Discussing who
watches what kind of content has nothing to do with networking. Can you
guys take the conversation elsewhere?

- Mike Bolitho


On Tue, Nov 12, 2019 at 4:34 PM Matthew Petach 
wrote:

>
> My point was that Disney has a lock on much of the content kids love.
>
> Netflix/HBO/AmazonPrime, not so much.
>
> So, the new eyeballs aren't going to be from parents watching different
> shows, it'll be from parents watching their adult-ish stuff, while the kids
> are happily ensconced with Disney+.
>
> I called out Game of Thrones and Good Omens as shows that are popular with
> adults but that aren't terribly family friendly, so you won't be getting
> many 12-and-unders watching them.
>
> That's where the new eyeballs come from.
>
> Matt
>
>
> On Tue, Nov 12, 2019, 13:17 Mark Andrews  wrote:
>
>> They can already stream different content to multiple devices
>> simultaneously.
>> All this does is make some content that wasn’t available previously now
>> available.
>>
>> People can really only watch one thing at a time.  Net streaming of the
>> last mile
>> is unlikely to change much.  Just where that content is coming from may
>> change.
>>
>> Mark
>>
>> > On 13 Nov 2019, at 07:53, Matthew Petach  wrote:
>> >
>> >
>> > Different target audiences.
>> >
>> > Now the parents can be watching "Good Omens" or "Game of Thrones" on
>> Netflix while the kids are streaming "The Lion King" on Disney+ streaming.
>> Instead of the whole family watching one show together, now we have
>> segmentation in the marketplace.
>> >
>> > End result is more total overall bandwidth consumption.
>> >
>> > Matt
>> >
>> >
>> > On Tue, Nov 12, 2019, 12:38 Brian J. Murrell 
>> wrote:
>> > On Tue, 2019-11-12 at 15:26 -0500, Valdis Klētnieks wrote:
>> > >
>> > > I can foresee a lot of families subscribing to Netflix *and* Disney+
>> > > because neither one has all the content the family wants to watch.
>> >
>> > Absolutely.  But the time spent watching Disney would *replace* (not be
>> > in addition to, or would it?  Would Disney's content result in existing
>> > streamers watching more hours of streaming than they did before?)
>> > Netflix watching.
>> >
>> > > Has anybody seen a significant drop in total streaming traffic due to
>> > > Netflix
>> > > users jumping ship to Amazon/Hulu, or are consumers just biting the
>> > > bullet,
>> > > coughing up the $$, and streaming more total because across the
>> > > services
>> > > there's more stuff they want to watch?
>> >
>> > I actually suspect streaming is going to decline (at least in
>> > comparison to where it could have grown to) if this streaming service
>> > fragmentation continues.
>> >
>> > I think people are going to reject the idea that they need to subscribe
>> > to a dozen streaming services at $10-$20/mo. each and will be driven
>> > back to the good old "single source" (piracy) they used to use before 1
>> > (or perhaps 2) streaming services kept them happy enough to abandon
>> > piracy.
>> >
>> > The content providers are going to piss in their bed again due to
>> > greed.  Again.
>> >
>> > Cheers,
>> > b.
>> >
>>
>> --
>> Mark Andrews, ISC
>> 1 Seymour St., Dundas Valley, NSW 2117, Australia
>> PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org
>>
>>
>>