RE: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-13 Thread adamv0025
> From: Saku Ytti 
> Sent: Tuesday, March 12, 2019 6:14 PM
> 
> On Tue, Mar 12, 2019 at 8:09 PM  wrote:
> 
> > Yes right, but the lookup principle is the same whether you look at the IPv6
> > flow label or at the Entropy label.
> 
> Correct, FAT, Entropy and IPv6 Flow Label are all in principle the same: a
> way for the source node to communicate what constitutes a flow. And in every
> case, there is no guarantee that an implementation has any performance gains,
> as it may choose to do normal flow speculation in addition to doing the fast
> thing.
> 
That's right, and I didn't test that by sending forged packets (with 
conflicting L3+L4 keys and flow label) at the DUT to see if DUT uses L3+L4 keys 
or indeed relies on the flow information.

adam



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread Saku Ytti
On Tue, Mar 12, 2019 at 8:09 PM  wrote:

> Yes right, but the lookup principle is the same whether you look at the IPv6
> flow label or at the Entropy label.

Correct, FAT, Entropy and IPv6 Flow Label are all in principle the same: a
way for the source node to communicate what constitutes a flow. And in
every case, there is no guarantee that an implementation has any
performance gains, as it may choose to do normal flow speculation in
addition to doing the fast thing.

-- 
  ++ytti


RE: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread adamv0025
> From: Saku Ytti 
> Sent: Tuesday, March 12, 2019 6:01 PM
> 
> On Tue, Mar 12, 2019 at 7:55 PM  wrote:
> 
> > This was on Trio and sorry I should have clarified we did test with default
> > L3+L4 keys on MPLS labelled packets -default in Junos (as baseline).
> > And then repeated the test using flow labels -which forced Trio to ignore
> > the L3+L4 keys and act solely on flow label.
> > PPS performance wise we couldn’t really tell the difference (was in the
> > noise).
> 
> Are you sure we are talking about the same thing? This thread is about the
> 20-bit IPv6 header Flow Label. I feel like you're talking about FAT
> pseudowires?
> 
Yes right, but the lookup principle is the same whether you look at the IPv6
flow label or at the Entropy label.

> By default JNPR will in pseudowire transit look for IP keys, with or without
> FAT. Optionally it can look even with existence of CW.  You need to
> specifically ask it not to look for IP keys in pseudowires or add CW (and not
> explicitly tell it to look).
> 
We didn't use FAT PWs, but rather entropy labels for VPNv4 traffic.

adam



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread Saku Ytti
On Tue, Mar 12, 2019 at 7:55 PM  wrote:

> This was on Trio and sorry I should have clarified we did test with default 
> L3+L4 keys on MPLS labelled packets -default in Junos (as baseline).
> And then repeated the test using flow labels -which forced Trio to ignore the 
> L3+L4 keys and act solely on flow label.
> PPS performance wise we couldn’t really tell the difference (was in the 
> noise).

Are you sure we are talking about the same thing? This thread is about
the 20-bit IPv6 header Flow Label. I feel like you're talking about FAT
pseudowires?

By default JNPR will, in pseudowire transit, look for IP keys, with or
without FAT. Optionally it can look even when a CW is present.  You
need to specifically ask it not to look for IP keys in pseudowires, or
add a CW (and not explicitly tell it to look).


-- 
  ++ytti


RE: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread adamv0025
Hey Saku,
> From: Saku Ytti 
> Sent: Tuesday, March 12, 2019 11:54 AM
> 
> Hey Adam,
> 
> > We did this exact testing a while back on Juniper 2nd and 3rd gen PFEs.
> > The results showed it doesn't matter a tiny bit whether you do 5-tuple hash
> > or use flow label.
> > So the bottom line is on modern NPUs it doesn't really matter.
> 
> Does PFE mean PE or Trio? What exactly did you test? I don't see way to
> disable L3+L4 keys and enable flow_label.
> 
This was on Trio, and sorry, I should have clarified: we tested with the
default L3+L4 keys on MPLS-labelled packets (the default in Junos) as a
baseline, and then repeated the test using flow labels, which forced Trio to
ignore the L3+L4 keys and act solely on the flow label.
PPS-performance-wise we couldn't really tell the difference (it was in the
noise).
  

adam



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread Saku Ytti
Hey Adam,

> We did this exact testing a while back on Juniper 2nd and 3rd gen PFEs.
> The results showed it doesn't matter a tiny bit whether you do 5-tuple hash 
> or use flow label.
> So the bottom line is on modern NPUs it doesn't really matter.

Does PFE mean PE or Trio? What exactly did you test? I don't see a way
to disable L3+L4 keys and enable flow_label.

Doing flow_label + sip + dip + sport + dport would indeed cost almost
the same as sip + dip + sport + dport; the cost difference will be
very marginal.

Doing flow_label alone versus sip + dip + sport + dport, the cost
difference is non-marginal; whether that actually holds for any
specific implementation is a separate matter.

-- 
  ++ytti


RE: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-12 Thread adamv0025


> Töma Gavrichenkov
> Sent: Friday, March 8, 2019 5:07 PM
> 
> On Fri, Mar 8, 2019 at 7:48 PM Saku Ytti  wrote:
> > Why do you think it would be expensive? It's cheaper than how ECMP is
> > done for L3 keys, because you just read the flow label and not
> > calculate any hash.
> 
> The most honest answer would be: I have no idea. That's just what I've seen,
> rather briefly though, as we weren't going to investigate that part at the
> time.
> 
> It's been a while since then, and maybe there was a mistake on our side (at
> least within a perfectly academic context I must assume that there was, as
> there was no peer review — we were not in academy after all!), but I'm still
> inclined to, first, see the benchmarks of any proposed piece of hardware
> that's promising you ECMP with flow labels, second, make any statements
> about the latter.
> 
We did this exact testing a while back on Juniper 2nd and 3rd gen PFEs.
The results showed it doesn't matter a tiny bit whether you do 5-tuple hash or 
use flow label.
So the bottom line is on modern NPUs it doesn't really matter. 
 

adam



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-09 Thread Masataka Ohta

Mark Andrews wrote:

> Why should the rest of the world have to put up with their inability
> to purchase devices that work with RFC compliant data streams.

Because RFCs specifying IPv6 are broken.

That is, as PTB is generated against multicast, we should block
them. Then, not blocking PTB against unicast needs very deep
inspection, which is not possible with some network processors.

See

https://meetings.apnic.net/32/pdf/pathMTU.pdf

for details.

William Herrin wrote:

> IPv4's inventors did a brilliant job with what they knew at the
> time. IPv6's inventors not so much. Sadly, they were too busy
> figuring out how to make IPv6 integrate well with ATM. Seriously,
> if you dig up a copy of the original IPng book I think it's chapter 3.


Indeed.

IPv6 replaced link broadcast with various kinds of multicast addresses,
only to increase MLDP overhead, because the IPng WG believed that
simple broadcast does not work with IP over ATM but the more
complicated multicast does.

Masataka Ohta


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread William Herrin
On Fri, Mar 8, 2019 at 5:45 AM Brandon Martin 
wrote:

> ICMP is nice in that it's totally protocol agnostic and doesn't require
> altering of packets in transit.  It's a shame we can't reasonably rely
> on it being delivered.
>

Path MTU discovery is broken. It's the one place in TCP/IP where the
end-to-end principle was thrown out the window and we keep on paying for it.

A correct solution would have been for the intermediate router to truncate
the packet. Not fragment, truncate. On receiving the truncated packet, the
RECIPIENT (not the intermediate router) would report the truncation to the
sender. This could easily have been done at layer 3, just like existing
PMTUD.

IPv4's inventors did a brilliant job with what they knew at the time.
IPv6's inventors not so much. Sadly, they were too busy figuring out how to
make IPv6 integrate well with ATM. Seriously, if you dig up a copy of the
original IPng book I think it's chapter 3.

Regards,
Bill Herrin


-- 
William Herrin  her...@dirtside.com  b...@herrin.us
Dirtside Systems . Web: 


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Saku Ytti
On Fri, Mar 8, 2019 at 7:07 PM Töma Gavrichenkov  wrote:

> It's been a while since then, and maybe there was a mistake on our
> side (at least within a perfectly academic context I must assume that
> there was, as there was no peer review — we were not in academy after
> all!), but I'm still inclined to, first, see the benchmarks of any
> proposed piece of hardware that's promising you ECMP with flow labels,
> second, make any statements about the latter.

1) current implementation
- set offset byte to 8
- read 128 bits to memory1
- read 128 bits to memory2
- return hash_function(memory1, memory2)

This is _JUST_ for L3 keys; in reality customers want L4 keys too, so
it's more expensive. Particularly in IPv6 the L4 keys could be
_anywhere_, potentially gigabytes into the packet in future. For the
same reason you can bypass ACL filters in IPv6 in many cases, because
the HW device won't know what the L4 keys are.

2) flow label implementation
 - set offset to 12 bits
 - read 20 bits to memory1
 - return memory1

Seems cheaper to me. But it is still not a good solution, as it is AFI
specific and requires us to actually use the flow label consistently,
which is not universally true. ECMP on the packet embedded in the ICMP
message would actually work without any changes anywhere else but the
device calculating the hash.
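
The two lookups listed above can be written out in a few lines of Python.
This is only an illustration of the offsets involved, not how any NPU is
programmed: the helper names, the example addresses and the use of SHA-256
as a stand-in for a hardware hash are all assumptions.

```python
import hashlib
import ipaddress
import struct


def build_ipv6_header(src, dst, flow_label, next_header=6):
    """Build a 40-byte IPv6 fixed header (hypothetical helper for this sketch)."""
    # First 32-bit word: version (4 bits), traffic class (8 bits), flow label (20 bits).
    first_word = (6 << 28) | (flow_label & 0xFFFFF)
    return (
        struct.pack("!IHBB", first_word, 0, next_header, 64)
        + ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
    )


def l3_hash_key(header):
    """Case 1: read both 128-bit addresses starting at byte offset 8, then hash."""
    src = header[8:24]
    dst = header[24:40]
    digest = hashlib.sha256(src + dst).digest()  # stand-in for a hardware hash
    return int.from_bytes(digest[:4], "big")


def flow_label_key(header):
    """Case 2: skip 12 bits (version + traffic class), read the next 20 bits."""
    (first_word,) = struct.unpack("!I", header[:4])
    return first_word & 0xFFFFF


header = build_ipv6_header("2001:db8::1", "2001:db8::2", flow_label=0xABCDE)
n_paths = 4  # assumed number of equal-cost next hops
print("L3 hash path:", l3_hash_key(header) % n_paths)
print("flow label path:", flow_label_key(header) % n_paths)
```

Case 2 is a mask on a single 32-bit word, while case 1 has to read 256 bits
and run them through a hash before it can pick a path.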

-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Töma Gavrichenkov
On Fri, Mar 8, 2019 at 7:48 PM Saku Ytti  wrote:
> Why do you think it would be expensive? It's cheaper than how ECMP is
> done for L3 keys, because you just read the flow label and not
> calculate any hash.

The most honest answer would be: I have no idea. That's just what I've
seen, rather briefly though, as we weren't going to investigate that
part at the time.

It's been a while since then, and maybe there was a mistake on our
side (at least within a perfectly academic context I must assume that
there was, as there was no peer review — we were not in academy after
all!), but I'm still inclined to, first, see the benchmarks of any
proposed piece of hardware that's promising you ECMP with flow labels,
second, make any statements about the latter.

--
Töma


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Saku Ytti
On Fri, Mar 8, 2019 at 5:44 PM Töma Gavrichenkov  wrote:

> My point is that it might be hard to find an affordable device that
> implements ECMP with v6 flow labels without a considerable performance
> impact. I would personally be happy to see what others have tested in
> that regard.

Why do you think it would be expensive? It's cheaper than how ECMP is
done for L3 keys, because you just read the flow label and do not
calculate any hash. Much, much cheaper than how ECMP is done for L3+L4
keys, if that is done right, which it is not, because no device
implements IPv6 correctly, as it's not possible in reasonably
performing hardware, but this has nothing to do with ECMP.
But in any case, flow labels are not the right solution here; this is
not an IPv6 problem, this is an IP problem. The right solution is to look
at the L3+L4 keys of the packet embedded in the ICMP message, as that
solves the problem for both AFIs. This costs at most one branch
(negligible in a typical NPU), as you set a different static offset based
on whether you're parsing ICMP or not. In all likelihood it costs nothing,
as the code likely already contains a branch for ICMP where you can just
reset the ECMP offset.
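
A minimal sketch of that branch, in Python for readability. It assumes the
TCP/UDP header directly follows the fixed IPv6 header (no extension
headers), uses SHA-256 as a stand-in for a hardware hash, and leaves the
ICMPv6 checksum at zero; all names and addresses are illustrative. Because
the embedded packet travelled in the opposite direction, its keys are
swapped before hashing so that the result matches the client-to-server flow.

```python
import hashlib
import ipaddress
import struct

TCP, UDP, ICMPV6 = 6, 17, 58
PACKET_TOO_BIG = 2  # ICMPv6 type, RFC 4443


def ipv6_header(src, dst, next_header, payload_len):
    """40-byte IPv6 fixed header with a zero flow label (helper for this sketch)."""
    return (
        struct.pack("!IHBB", 6 << 28, payload_len, next_header, 64)
        + ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
    )


def flow_hash(src, dst, sport, dport):
    data = src + dst + struct.pack("!HH", sport, dport)
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def keys_at(packet, offset):
    """Read addresses and (for TCP/UDP) ports of the IPv6 packet at a byte offset."""
    next_header = packet[offset + 6]
    src = packet[offset + 8:offset + 24]
    dst = packet[offset + 24:offset + 40]
    sport = dport = 0
    if next_header in (TCP, UDP) and len(packet) >= offset + 44:
        sport, dport = struct.unpack("!HH", packet[offset + 40:offset + 44])
    return src, dst, sport, dport


def ecmp_key(packet):
    is_ptb = (
        packet[6] == ICMPV6
        and len(packet) >= 88
        and packet[40] == PACKET_TOO_BIG
    )
    if is_ptb:
        # The one extra branch: move the parse offset past the outer IPv6
        # header (40 bytes) and the ICMPv6 header (8 bytes).
        src, dst, sport, dport = keys_at(packet, 48)
        # Swap, because the embedded packet went server -> client.
        return flow_hash(dst, src, dport, sport)
    return flow_hash(*keys_at(packet, 0))


client, server, router = "2001:db8:c::1", "2001:db8:a::1", "2001:db8:ffff::1"
tcp_to_server = struct.pack("!HH", 50000, 443) + bytes(16)  # bare 20-byte TCP header
tcp_to_client = struct.pack("!HH", 443, 50000) + bytes(16)
request = ipv6_header(client, server, TCP, 20) + tcp_to_server
response = ipv6_header(server, client, TCP, 20) + tcp_to_client
icmp = struct.pack("!BBHI", PACKET_TOO_BIG, 0, 0, 1280) + response
ptb = ipv6_header(router, server, ICMPV6, len(icmp)) + icmp

print("PTB follows the flow:", ecmp_key(ptb) == ecmp_key(request))
```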

I still fail to understand why you think this particular problem has
anything to do with attacks or ICMP volume. I find no such indications,
and the two Cloudflare blog articles do not state attacks as motivators
for this; it's just a technical problem of delivering the ICMP packets to
the correct host. A real problem affecting other networks too, but a
problem we can fix, if we start asking our vendors for a fix.





-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Töma Gavrichenkov
On Fri, Mar 8, 2019 at 5:11 PM Saku Ytti  wrote:
> Personally I'm surprised if ICMP volume is relevant based on our
> netflow data.

Legitimate ICMP traffic volume — oh, that's for sure.

But when it comes to attack volumes, it's a different story, and
current netflow measurements might be a bad indicator here, as in
"peacetime generals are always fighting the last war instead of the
next one".

> You are proposing that in this case, there is no such issue of
> delivering ICMPv6 messages to correct host

Guaranteed delivery of untrusted remote messages to exactly the
particular host behind an equal cost fanout, if allowed in a DDoS
mitigation network, is itself a problem, but that has been discussed
in detail in Section 6 of RFC 6437.

My point is that it might be hard to find an affordable device that
implements ECMP with v6 flow labels without a considerable performance
impact. I would personally be happy to see what others have tested in
that regard.

--
Töma


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Tarko Tikan

hey,

> The Cloudflare blog
> entry is 4 years old, if they had started actively pursuing proper fix
> to the ECMP problem, the fix would be in production right about now.

You can find a more recent overview at
https://blog.cloudflare.com/increasing-ipv6-mtu/

--
tarko


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Saku Ytti
Hey Töma,

> NB: Cloudflare is basically busy filtering excessive amounts of spoofed ICMP 
> packets containing whatever parameters and payload criminals could fit into, 
> at virtually no cost for a customer. Your list might become somewhat short 
> then.

I don't know what the problem is here, but the Cloudflare blog
documents one specific problem related to ECMP, where the ICMPv6
messages arrive at the wrong host, and some solutions they are using to
overcome that problem.
You are proposing that in this case there is no such issue of
delivering ICMPv6 messages to the correct host, but that the issue is a
voluntary protection mechanism against too high a volume of bad ICMPv6
packets. Is this something you personally are aware of, or is this
something you suspect might explain the problem?

Personally I'd be surprised if ICMP volume is relevant, based on our
netflow data. And I've personally been affected by the ECMP problem in
my own deployments and have solved it by just sending smaller
packets. I understand it to be a common problem and it would be good if
we'd start asking vendors to fix the problem. The Cloudflare blog
entry is 4 years old; if they had started actively pursuing a proper fix
to the ECMP problem, the fix would be in production right about now.


-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Jeroen Massar
On 2019-03-08 14:45, Brandon Martin wrote:
> On 3/8/19 8:38 AM, Saku Ytti wrote:
>> Hey,
>>
>>> now for UDP, I don't know yet how does things like QUIC can be handled ...
>>
>> Unfortunately the magic answer you were hoping does not exist, what
>> they do is they just send smaller packets.
>>
> 
> What we almost seem to be moving toward in this discussion is an IP header 
> where the path can reduce the reported MTU which can then be read at the 
> receiving end.  This would be somewhat like ECN just with more than a couple 
> bits.

Something like what I once described in:
https://jeroen.massar.ch/archive/drafts/draft-massar-v6man-mtu-label-02.txt

? :)

Greets,
 Jeroen



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Töma Gavrichenkov
On Tue, Mar 5, 2019, 7:27 AM Mark Andrews  wrote:

> [..]
> their inability to purchase
> devices that work with RFC compliant data streams.

To prove your point, you may want to provide a sample list of devices that
work that way, along with the benchmarks showing that those devices could
still handle arbitrary junk ICMP packets at a line rate.

NB: Cloudflare is basically busy filtering excessive amounts of spoofed
ICMP packets containing whatever parameters and payload criminals could fit
into, at virtually no cost for a customer. Your list might become somewhat
short then.

--
Töma



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Brandon Martin

On 3/8/19 8:38 AM, Saku Ytti wrote:
> Hey,
>
>> now for UDP, I don't know yet how does things like QUIC can be handled ...
>
> Unfortunately the magic answer you were hoping does not exist, what
> they do is they just send smaller packets.



What we almost seem to be moving toward in this discussion is an IP 
header where the path can reduce the reported MTU which can then be read 
at the receiving end.  This would be somewhat like ECN just with more 
than a couple bits.


Of course, we know how well extension headers, much less hop-by-hop 
headers, are handled on IPv6...


Re-writing a field in the L4 header works, but it seems ugly since it 
means every hop that reduces the MTU of the link has to know every L4 
that participates in such a scheme.


ICMP is nice in that it's totally protocol agnostic and doesn't require 
altering of packets in transit.  It's a shame we can't reasonably rely 
on it being delivered.

--
Brandon Martin


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Saku Ytti
Hey,

> now for UDP, I don't know yet how does things like QUIC can be handled ...

Unfortunately the magic answer you were hoping for does not exist; what
they do is just send smaller packets.

-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-08 Thread Jean-Daniel Pauget
hello,

Tore Anderson, you're right, clamping MSS is very efficient and very
certainly solves most of the problems.

now for UDP, I don't know yet how things like QUIC can be handled ...

regards,

-- 
Jean-Daniel Pauget http://rezopole.net/
Rezopole/LyonIX    +33 (0)4 27 46 00 50


On Wed, Mar 06, 2019 at 08:17:42AM +0100, Tore Anderson wrote:
> * Jean-Daniel Pauget
> 
> > I confess using IPv6 behind a 6in4 tunnel because the "Business-Class"
> > service of the concerned operator doesn't handle IPv6 yet.
> > 
> > as such, I realised that, as far as I can figure, ICMPv6 "too-big"
> > packets (RFC 4443) seem to be ignored or filtered at ~60% of
> > Cloudflare's http farms
> > 
> > as a result, random sites such as http://nanog.org/ or
> > https://www.ansible.com/ are badly reachable whenever small MTUs are
> > involved ...
> 
> Hi Jean-Daniel.
> 
> If you're using tunnels you'll want to have your tunnel endpoint
> adjust down the TCP MSS value to match the MTU of the tunnel interface.
> That way, you'll avoid problems with Path MTU Discovery. Even in those
> situations where PMTUD does work fine, doing TCP MSS adjustment will
> improve performance as the server does not need to spend an RTT to
> discover your reduced MTU.
> 
> (This isn't really an IPv6 issue, by the way - ISPs using PPPoE will
> typically perform MSS adjustment for IPv4 packets too.)
> 
> If you're using Linux as your tunnel endpoint, try:
> 
> ip6tables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
> 
> Tore


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Tore Anderson
* Jean-Daniel Pauget

> I confess using IPv6 behind a 6in4 tunnel because the "Business-Class"
> service of the concerned operator doesn't handle IPv6 yet.
> 
> as such, I realised that, as far as I can figure, ICMPv6 "too-big"
> packets (RFC 4443) seem to be ignored or filtered at ~60% of
> Cloudflare's http farms
> 
> as a result, random sites such as http://nanog.org/ or
> https://www.ansible.com/ are badly reachable whenever small MTUs are
> involved ...

Hi Jean-Daniel.

If you're using tunnels you'll want to have your tunnel endpoint
adjust down the TCP MSS value to match the MTU of the tunnel interface.
That way, you'll avoid problems with Path MTU Discovery. Even in those
situations where PMTUD does work fine, doing TCP MSS adjustment will
improve performance as the server does not need to spend an RTT to
discover your reduced MTU.

(This isn't really an IPv6 issue, by the way - ISPs using PPPoE will
typically perform MSS adjustment for IPv4 packets too.)

If you're using Linux as your tunnel endpoint, try:

ip6tables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
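
For reference, the MSS that fits a given tunnel can be worked out by hand.
The sketch below assumes a 6in4 tunnel over a 1500-byte IPv4 path and
headers without options or extension headers; the function names are
illustrative only.

```python
IPV4_HEADER = 20  # bytes, no options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, no options


def tunnel_mtu_6in4(underlay_mtu):
    """IPv6 MTU of a 6in4 tunnel: the outer IPv4 header is pure overhead."""
    return underlay_mtu - IPV4_HEADER


def tcp_mss_ipv6(mtu):
    """Largest TCP payload that fits in one IPv6 packet of the given MTU."""
    return mtu - IPV6_HEADER - TCP_HEADER


for label, mtu in [
    ("plain Ethernet", 1500),
    ("6in4 over Ethernet", tunnel_mtu_6in4(1500)),
    ("IPv6 minimum MTU", 1280),
]:
    print(f"{label}: MTU {mtu}, MSS {tcp_mss_ipv6(mtu)}")
```

With these assumptions a 6in4 tunnel ends up with an MTU of 1480 and an MSS
of 1420, against 1440 for an untunnelled 1500-byte path.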

Tore


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Mark Andrews



> On 6 Mar 2019, at 1:36 pm, Fernando Gont  wrote:
> 
> On 5/3/19 03:26, Mark Andrews wrote:
>> 
>> 
>>> On 5 Mar 2019, at 5:18 pm, Mark Tinka  wrote:
>>> 
>>> 
>>> 
>>> On 5/Mar/19 00:25, Mark Andrews wrote:
>>> 
 
>>>> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
>>>> they have installed broken ECMP devices.  The simplest way to do that
>>>> is to set the interface MTUs to 1280 on all the servers.  Why should
>>>> the rest of the world have to put up with their inability to purchase
>>>> devices that work with RFC compliant data streams.
>>> 
>>> I've had this issue with cdnjs.cloudflare.com for the longest time at my
>>> house. But as some of you may recall, my little unwanted TCP MSS hack
>>> for IPv6 last weekend fixed that issue for me.
>>> 
>>> Not ideal, and I so wish IPv6 would work as designed, but…
>> 
>> It does work as designed except when crap middleware is added.  ECMP
>> should be using the flow label with IPv6.  It has the advantage that
>> it works for non-0-offset fragments as well as 0-offset fragments and
>> also works for transports other than TCP and UDP.  This isn’t a protocol
>> failure.  It is shitty implementations.
> 
> Not to play devil's advocate but the IETF got to publish a spec for ECMP
> use of Flow Labels only a few years ago.
> 
> For quite a while, they were unusable... and might still be, for some
> implementations.

And if it is still using the quintuple the PTB has all the necessary information
for unfragmented and 0 offset fragment packets (which there shouldn’t be with a
working TCP stack) to be passed through.

> -- 
> Fernando Gont
> SI6 Networks
> e-mail: fg...@si6networks.com
> PGP Fingerprint:  31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Fernando Gont
On 5/3/19 03:26, Mark Andrews wrote:
> 
> 
>> On 5 Mar 2019, at 5:18 pm, Mark Tinka  wrote:
>>
>>
>>
>> On 5/Mar/19 00:25, Mark Andrews wrote:
>>
>>>
>>> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
>>> they have installed broken ECMP devices.  The simplest way to do that
>>> is to set the interface MTUs to 1280 on all the servers.  Why should
>>> the rest of the world have to put up with their inability to purchase
>>> devices that work with RFC compliant data streams.
>>
>> I've had this issue with cdnjs.cloudflare.com for the longest time at my
>> house. But as some of you may recall, my little unwanted TCP MSS hack
>> for IPv6 last weekend fixed that issue for me.
>>
>> Not ideal, and I so wish IPv6 would work as designed, but…
> 
> It does work as designed except when crap middleware is added.  ECMP
> should be using the flow label with IPv6.  It has the advantage that
> it works for non-0-offset fragments as well as 0-offset fragments and
> also works for transports other than TCP and UDP.  This isn’t a protocol
> failure.  It is shitty implementations.

Not to play devil's advocate but the IETF got to publish a spec for ECMP
use of Flow Labels only a few years ago.

For quite a while, they were unusable... and might still be, for some
implementations.


-- 
Fernando Gont
SI6 Networks
e-mail: fg...@si6networks.com
PGP Fingerprint:  31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492






Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Fernando Gont
On 27/2/19 07:01, Jean-Daniel Pauget wrote:
> hello,
> 
> I confess using IPv6 behind a 6in4 tunnel because the "Business-Class"
> service of the concerned operator doesn't handle IPv6 yet.
> 
> as such, I realised that, as far as I can figure, ICMPv6 "too-big"
> packets (RFC 4443) seem to be ignored or filtered at ~60% of
> Cloudflare's http farms
> 
> as a result, random sites such as http://nanog.org/ or
> https://www.ansible.com/ are badly reachable whenever small MTUs are
> involved ...
> 
> support@cloudflare answered me that because I'm not the owner of 
> concerned site,
> and because of security reasons, they wouldn't investigate further.
> 
> are there security concerns with ICMP-too-big ?

Please see: https://tools.ietf.org/html/rfc5927

and also: https://tools.ietf.org/html/rfc8021

Thanks,
-- 
Fernando Gont
SI6 Networks
e-mail: fg...@si6networks.com
PGP Fingerprint:  31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492






Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Hunter Fuller
On Tue, Mar 5, 2019 at 10:09 AM Bjørn Mork  wrote:
> Stephen Satchell  writes:
> > Did you submit a bug report?
>
> I believe this was fixed 5 years ago (in Linux v3.17):
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb1ce2ef387b01686469487edd45994872d52d73
>
> But RHEL and CentOS are using kernels from the stone age, so they
> haven't noticed yet.

For those who might need this feature, and have a Red Hat contract, a
suggestion:

If you submit a ticket, someone at Red Hat might backport the patch for you.


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Bjørn Mork
Stephen Satchell  writes:

> On 3/5/19 2:54 AM, Thomas Bellman wrote:
>> Out of curiosity, which operating systems put anything useful (for use
>> in ECMP) into the flow label of IPv6 packets?  At the moment, I only
>> have access to CentOS 6 and CentOS 7 machines, and both of them set the
>> flow label to zero for all traffic.
>
> Did you submit a bug report?

I believe this was fixed 5 years ago (in Linux v3.17):
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb1ce2ef387b01686469487edd45994872d52d73

But RHEL and CentOS are using kernels from the stone age, so they
haven't noticed yet.


Bjørn


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Stephen Satchell
On 3/5/19 2:54 AM, Thomas Bellman wrote:
> Out of curiosity, which operating systems put anything useful (for use
> in ECMP) into the flow label of IPv6 packets?  At the moment, I only
> have access to CentOS 6 and CentOS 7 machines, and both of them set the
> flow label to zero for all traffic.

Did you submit a bug report?


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread sthaug
> Out of curiosity, which operating systems put anything useful (for use
> in ECMP) into the flow label of IPv6 packets?  At the moment, I only
> have access to CentOS 6 and CentOS 7 machines, and both of them set the
> flow label to zero for all traffic.

FreeBSD 11.2-STABLE.

Steinar Haug, Nethelp consulting, sth...@nethelp.no


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Saku Ytti
On Tue, Mar 5, 2019 at 12:09 PM Joel Jaeggli  wrote:

> Parsing the ICMP payload was something we considered in RFC 7690 but wasn’t
> one of the approaches we pursued (we broadcasted the PTB to all hosts on the
> segment(s) behind the load balancers in our original implementation).
>
> It actually seems like it is becoming feasible to do in an Ethernet switch 
> ASIC like tofino if that is what you want to burn real estate on. Being 
> worthwhile is another matter.

It is definitely possible in all relevant existing NPUs like Trio,
Solar, FP, EZChip, Lightspeed et al., as it is within visibility of the
lookup engine and it is at a fixed offset. So not only possible but also
cheap.

-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Thomas Bellman
On 2019-03-05 07:26 CET, Mark Andrews wrote:

> It does work as designed except when crap middleware is added.  ECMP
> should be using the flow label with IPv6.  It has the advantage that
> it works for non-0-offset fragments as well as 0-offset fragments and
> also works for transports other than TCP and UDP.  This isn’t a protocol
> failure.  It is shitty implementations.

Out of curiosity, which operating systems put anything useful (for use
in ECMP) into the flow label of IPv6 packets?  At the moment, I only
have access to CentOS 6 and CentOS 7 machines, and both of them set the
flow label to zero for all traffic.

There is also the problem that the device generating the Packet Too
Big ICMP, is not the same as the end host that the big packet was
destined for, and does not know what flow label the end host would
have set in its TCP responses.  RFC 6437 is also explicit that:

   o  Forwarding nodes such as routers and load distributors MUST NOT
      depend only on Flow Label values being uniformly distributed.  In
      any usage such as a hash key for load distribution, the Flow Label
      bits MUST be combined at least with bits from other sources within
      the packet, so as to produce a constant hash value for each flow

In practice, using at least the source and destination IP(v6) addresses
in addition to the flow label.  But the ICMP packet has a different
source address than TCP responses from the end host.
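
To make the mismatch concrete, here is a small Python sketch of a hash key
built the way the RFC text asks, from the flow label combined with both
addresses. The function name, the addresses, the label value and the use of
SHA-256 as a stand-in for a hardware hash are all assumptions for
illustration.

```python
import hashlib
import ipaddress


def ecmp_key(src, dst, flow_label):
    """Hash the 20-bit flow label together with both addresses (hypothetical)."""
    data = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + (flow_label & 0xFFFFF).to_bytes(3, "big")
    )
    return hashlib.sha256(data).digest()  # stand-in for a hardware hash


client, server, router = "2001:db8:c::1", "2001:db8:a::1", "2001:db8:ffff::1"
label = 0x12345  # label the client is assumed to put on its TCP segments

tcp_key = ecmp_key(client, server, label)
# Even if the router somehow used the same label, the ICMP error carries the
# router's own address as its source, so the hash input is different.
ptb_key = ecmp_key(router, server, label)

print("TCP and PTB share a hash input:", tcp_key == ptb_key)
```

The key is constant for the client's own packets, but the Packet Too Big
message hashes over a different source address and is not tied to the
same path.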

A further problem is that the TCP responses from the destination end host
might not even be *passing* the router that generates a Packet Too Big
ICMP error.  In an anycast scenario, that router might have a route to
the sending IPv6 address that goes to a different datacenter than the
host that sent the large packet.  E.g., consider the following network:

         A1      A2
         |       |
        DC1     DC2
        / \     /
       /   \   /
      /     \ /
     R1      R2
      \     /
       \   /
        \ /
         R3
         |
         B

A1 and A2 are hosts in different datacenters, using the same anycast
address A.  Host B initiates a TCP session with address A, R3 selects
the route via R1, and thus reaches A1 in datacenter DC1.  A1 sends a
large packet towards B, but the router in DC1 elects to send that via
R2.  R2 generates a PTB ICMP, but has its best route to address A
towards DC2...


/Bellman





Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Joel Jaeggli



Sent from my iPhone

> On Mar 5, 2019, at 01:31, Saku Ytti  wrote:
> 
>> On Tue, Mar 5, 2019 at 12:26 AM Mark Andrews  wrote:
>> 
>> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
>> they have installed broken ECMP devices.  The simplest way to do that
> 
> Out of curiosity does that imply you are aware of non-broken ECMP
> devices, which are able to hash on the embedded original packet?

Parsing the ICMP payload was something we considered in RFC 7690 but wasn’t
one of the approaches we pursued (we broadcasted the PTB to all hosts on the
segment(s) behind the load balancers in our original implementation).

It actually seems to be becoming feasible to do in an Ethernet switch ASIC
like Tofino, if that is what you want to burn real estate on. Whether it is
worthwhile is another matter.
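
For illustration, here is roughly what "parsing the ICMP payload" amounts
to, written as a software sketch rather than anything an ASIC would run.
It assumes the embedded packet has no extension headers and carries TCP or
UDP directly; the function name and the example addresses and ports are
hypothetical.  The field offsets follow the ICMPv6 (RFC 4443) and IPv6
header layouts.

```python
import ipaddress
import struct

ICMPV6_PACKET_TOO_BIG = 2
IPV6_HEADER_LEN = 40


def inbound_key_from_ptb(icmp6: bytes):
    """Return (mtu, key); key matches the flow's client-to-server packets."""
    icmp_type, _code, _checksum, mtu = struct.unpack_from("!BBHI", icmp6, 0)
    if icmp_type != ICMPV6_PACKET_TOO_BIG:
        raise ValueError("not a Packet Too Big message")
    inner = icmp6[8:]  # as much of the offending packet as the router included
    if len(inner) < IPV6_HEADER_LEN + 4:
        raise ValueError("embedded packet too short to contain ports")
    _word, _plen, next_header, _hlim = struct.unpack_from("!IHBB", inner, 0)
    if next_header not in (6, 17):  # assumption: TCP or UDP, no ext headers
        raise ValueError("extension header or unsupported transport")
    inner_src = ipaddress.IPv6Address(inner[8:24])
    inner_dst = ipaddress.IPv6Address(inner[24:40])
    sport, dport = struct.unpack_from("!HH", inner, IPV6_HEADER_LEN)
    # The embedded packet travelled server -> client, so swap both pairs to
    # get the key that the client's own packets hash on.
    return mtu, (str(inner_dst), str(inner_src), dport, sport)


# Hand-built example: server 2001:db8:a::1 port 443, client 2001:db8:b::1.
server = ipaddress.IPv6Address("2001:db8:a::1")
client = ipaddress.IPv6Address("2001:db8:b::1")
embedded = (
    struct.pack("!IHBB", (6 << 28) | 0xABCDE, 1460, 6, 64)
    + server.packed
    + client.packed
    + struct.pack("!HH", 443, 51000)
)
ptb = struct.pack("!BBHI", ICMPV6_PACKET_TOO_BIG, 0, 0, 1280) + embedded
mtu, key = inbound_key_from_ptb(ptb)
print(mtu, key)
```

A load balancer that hashed on this key instead of the outer header would
send the PTB to the same server as the rest of the flow.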


> 
> -- 
>  ++ytti
> 



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Saku Ytti
On Tue, Mar 5, 2019 at 12:26 AM Mark Andrews  wrote:

> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
> they have installed broken ECMP devices.  The simplest way to do that

Out of curiosity does that imply you are aware of non-broken ECMP
devices, which are able to hash on the embedded original packet?

-- 
  ++ytti


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-05 Thread Joel Jaeggli




> On Mar 4, 2019, at 22:26, Mark Andrews  wrote:
> 
> 
> 
>> On 5 Mar 2019, at 5:18 pm, Mark Tinka  wrote:
>> 
>> 
>> 
>>> On 5/Mar/19 00:25, Mark Andrews wrote:
>>> 
>>> 
>>> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
>>> they have installed broken ECMP devices.  The simplest way to do that
>>> is to set the interface MTUs to 1280 on all the servers.  Why should
>>> the rest of the world have to put up with their inability to purchase
>>> devices that work with RFC compliant data streams.
>> 
>> I've had this issue with cdnjs.cloudflare.com for the longest time at my
>> house. But as some of you may recall, my little unwanted TCP MSS hack
>> for IPv6 last weekend fixed that issue for me.
>> 
>> Not ideal, and I so wish IPv6 would work as designed, but…
> 
> It does work as designed except when crap middleware is added.  ECMP
> should be using the flow label with IPv6.  It has the advantage that
> it works for non-0-offset fragments as well as 0-offset fragments and
> also works for transports other than TCP and UDP.  This isn’t a protocol
> failure.  It is shitty implementations.

Your mobile carrier’s stateless TCP accelerator should stop sending ACKs with
a zero flow label so we can actually identify them as part of the same flow...

There is a lot of headwind in the real world against using the flow label as
a hash component.

> 
>> Mark.
> 
> -- 
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org
> 
> 



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Mark Tinka



On 5/Mar/19 08:26, Mark Andrews wrote:
> It does work as designed except when crap middleware is added.  ECMP
> should be using the flow label with IPv6.  It has the advantage that
> it works for non-0-offset fragments as well as 0-offset fragments and
> also works for transports other than TCP and UDP.  This isn’t a protocol
> failure.  It is shitty implementations.

That's what I mean... we find ways to break protocols ourselves.

Mark.


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Mark Andrews



> On 5 Mar 2019, at 5:18 pm, Mark Tinka  wrote:
> 
> 
> 
> On 5/Mar/19 00:25, Mark Andrews wrote:
> 
>> 
>> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
>> they have installed broken ECMP devices.  The simplest way to do that
>> is to set the interface MTUs to 1280 on all the servers.  Why should
>> the rest of the world have to put up with their inability to purchase
>> devices that work with RFC compliant data streams.
> 
> I've had this issue with cdnjs.cloudflare.com for the longest time at my
> house. But as some of you may recall, my little unwanted TCP MSS hack
> for IPv6 last weekend fixed that issue for me.
> 
> Not ideal, and I so wish IPv6 would work as designed, but…

It does work as designed except when crap middleware is added.  ECMP
should be using the flow label with IPv6.  It has the advantage that
it works for non-0-offset fragments as well as 0-offset fragments and
also works for transports other than TCP and UDP.  This isn’t a protocol
failure.  It is shitty implementations.
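
The reason the flow label works for fragments and for other transports is
that it sits in the fixed IPv6 header, so no extension-header or L4 parsing
is needed to read it.  A minimal sketch, with a hypothetical fixed_header()
helper that builds just enough of a packet to show this:

```python
import struct


def flow_label(ipv6_packet: bytes) -> int:
    """Return the 20-bit flow label from the first word of the IPv6 header."""
    (first_word,) = struct.unpack_from("!I", ipv6_packet, 0)
    if first_word >> 28 != 6:
        raise ValueError("not an IPv6 packet")
    return first_word & 0x000FFFFF


def fixed_header(label: int, next_header: int) -> bytes:
    """Build a 40-byte IPv6 header with zeroed addresses (illustration only)."""
    return struct.pack("!IHBB", (6 << 28) | label, 0, next_header, 64) + bytes(32)


tcp_segment = fixed_header(0x12345, 6)    # next header: TCP
fragment = fixed_header(0x12345, 44)      # next header: Fragment
print(hex(flow_label(tcp_segment)), hex(flow_label(fragment)))
```

Both packets yield the same label, even though the second one gives the
hashing device no ports to look at.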

> Mark.

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Mark Tinka



On 5/Mar/19 00:25, Mark Andrews wrote:

>
> Then Cloudflare should negotiate MSS’s that don’t generate PTB’s if
> they have installed broken ECMP devices.  The simplest way to do that
> is to set the interface MTUs to 1280 on all the servers.  Why should
> the rest of the world have to put up with their inability to purchase
> devices that work with RFC compliant data streams.

I've had this issue with cdnjs.cloudflare.com for the longest time at my
house. But as some of you may recall, my little unwanted TCP MSS hack
for IPv6 last weekend fixed that issue for me.

Not ideal, and I so wish IPv6 would work as designed, but...

Mark.


Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Mark Andrews


> On 5 Mar 2019, at 6:06 am, Saku Ytti  wrote:
> 
> Hey Jean,
> 
>>I confess using IPv6 behind a 6in4 tunnel because the "Business-Class" 
>> service
>>of the concerned operator doesn't handle IPv6 yet.
>> 
>>as such, I realised that, as far as I can figure, ICMPv6 packet "too-big" 
>> (rfc 4443)
>>seem to be ignored or filtered at ~60% of ClouFlare's http farms
> 
> Might be related to this:
> https://blog.cloudflare.com/path-mtu-discovery-in-practice/
> 
> If you run ECMP then the hash algorithms make no guarantees ICMP
> messages generated by transit devices reach the correct host.


Then Cloudflare should negotiate MSSes that don’t generate PTBs if
they have installed broken ECMP devices.  The simplest way to do that
is to set the interface MTUs to 1280 on all the servers.  Why should
the rest of the world have to put up with their inability to purchase
devices that work with RFC-compliant data streams?
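
As a worked example of the arithmetic involved: the MSS a host advertises
is normally its interface MTU minus the 40-byte IPv6 header and the
20-byte TCP header without options.  The sketch below assumes exactly
that; the tcp_mss() helper is hypothetical, and the 1480 case assumes a
6in4 tunnel over a 1500-byte IPv4 path (20 bytes of IPv4 encapsulation).

```python
IPV6_HEADER = 40      # fixed IPv6 header, no extension headers
TCP_HEADER = 20       # TCP header without options
IPV6_MIN_MTU = 1280   # minimum link MTU every IPv6 path must support


def tcp_mss(mtu: int) -> int:
    """MSS advertised by a host whose interface MTU is mtu."""
    if mtu < IPV6_MIN_MTU:
        raise ValueError("IPv6 requires an MTU of at least 1280")
    return mtu - IPV6_HEADER - TCP_HEADER


print(tcp_mss(1500))  # plain Ethernet: 1440
print(tcp_mss(1480))  # 6in4 tunnel over a 1500-byte path: 1420
print(tcp_mss(1280))  # servers clamped to the IPv6 minimum: 1220
```

With the servers at MTU 1280, no packet they send can exceed the minimum
MTU every IPv6 link must carry, so no PTB is ever needed.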

Mark

> -- 
>  ++ytti

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org



Re: ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Saku Ytti
Hey Jean,

> I confess using IPv6 behind a 6in4 tunnel because the "Business-Class" 
> service
> of the concerned operator doesn't handle IPv6 yet.
>
> as such, I realised that, as far as I can figure, ICMPv6 packet "too-big" 
> (rfc 4443)
> seem to be ignored or filtered at ~60% of ClouFlare's http farms

Might be related to this:
https://blog.cloudflare.com/path-mtu-discovery-in-practice/

If you run ECMP, the hash algorithms make no guarantee that ICMP
messages generated by transit devices reach the correct host.

-- 
  ++ytti


ICMPv6 "too-big" packets ignored (filtered ?) by Cloudflare farms

2019-03-04 Thread Jean-Daniel Pauget
Hello,

I confess to using IPv6 behind a 6in4 tunnel because the "Business-Class"
service of the operator concerned doesn't handle IPv6 yet.

As such, I realised that, as far as I can figure, ICMPv6 "too-big" packets
(RFC 4443) seem to be ignored or filtered at ~60% of Cloudflare's HTTP farms.

As a result, random sites such as http://nanog.org/ or
https://www.ansible.com/ are poorly reachable whenever small MTUs are
involved ...

support@cloudflare answered me that because I'm not the owner of the site
concerned, and for security reasons, they wouldn't investigate further.

Are there security concerns with ICMP-too-big?

Regards,

-- 
Jean-Daniel Pauget http://rezopole.net/
Rezopole/LyonIX+33 (0)4 27 46 00 50