Re: [IPsec] draft-liu-ipsecme-ikev2-mtu-dect early TSVAREA review

2023-01-13 Thread to...@strayalpha.com
Daniel,

> On Jan 13, 2023, at 8:33 PM, Daniel Migault  wrote:
> 
> If not, to better understand, do we have an example of a packet that cannot be 
> processed if the MTU is set to tunMAP but that can be processed if the MTU is 
> set to EMTU_R?


As per intarea-tunnels, MTU is a highly overloaded term.

Tunnels relay packets that exceed their tunMAP but not their tunMTU (EMTU_R - 
headers) using source fragmentation all the time.

However, that’s the issue. The reasons why what you’re trying to do isn’t 
useful are already covered in detail in intarea-tunnels - and why not to do it 
using IPv4 DF=0 in RFC 6864.

It’s not useful to have an email exchange rehashing that content message by 
message.

Joe


Re: [IPsec] draft-liu-ipsecme-ikev2-mtu-dect early TSVAREA review

2023-01-13 Thread Daniel Migault
Hi,

Thanks for the feedback, please find my comments and questions inline.

Yours,
Daniel

On Fri, Jan 13, 2023 at 8:41 PM to...@strayalpha.com 
wrote:

> Hi, Daniel,
>
> On Jan 13, 2023, at 2:12 PM, Daniel Migault  wrote:
>
> Hi Joe,
>
> Thanks for the comment. There are some terminologies we were not using
> properly, so thank you for the clarification. Please find inline our
> clarification and implementation of your concerns.
>
> Yours,
> Daniel
>
> On Sun, Jan 8, 2023 at 11:45 AM to...@strayalpha.com 
> wrote:
>
>> Hi, Daniel,
>>
>> The abstract clearly states a goal that is not achievable (of avoiding
>> reassembly). The best way to avoid the impact of mid-tunnel fragmentation
>> is to use IPv4 as a tunnel header the way that IPv6 would be - with DF=1.
>> However, even so, the egress always needs to handle reassembly as long as
>> there is even source fragmentation.
>>
>
> I understand the comment as reading our goal as preventing reassembly
> from happening at all. This would mean that
> reassembly need not even be implemented.
> This is not our intention. Reassembly happens the same way it happens
> today. The only thing we do is have the egress node notify the ingress node
> that reassembly is happening. The ingress node may or may not take any
> action to prevent reassembly from happening for subsequent packets tunneled
> over the IPsec tunnel. In that sense "avoid" needs to be understood as
> reducing how often reassembly happens.
>
> We may agree the best way to avoid mid tunnel fragmentation is to set
> DF=1. But in our case we cannot meet this condition.
> The current text in the abstract is
>
> OLD:
> This document considers an ingress and an egress security gateway
> connected over an IPv4 network.
> The Tunnel Link Packet have their Don't Fragment (DF) set to 0.
>
> Is the text below clearer?
>
> NEW:
> This document considers an ingress and an egress security gateway
> connected over a IPv4 network with the Tunnel Link Packet Don't Fragment
> (DF) set to 0.
>
>
> That is better English, but still technically ill-advised. Any solution
> that requires IPv4 DF=0 then requires generation of unique IDs that don’t
> wrap in ways that could cause mis-reassembly per RFC 6864.
>
> The introduction gives the rationale for why we cannot rely on setting
> DF=1. Typically some routers do not check the MTU and discard the packet
> without returning an ICMP PTB error, and in many deployments the ICMP PTB, if
> sent, is blocked and is not received. This prohibits the use of DF=1 with
> IPv4.
>
>
> You have described the reason why PLPMTUD exists, which is not a rationale
> for continuing to use on-path IPv4 fragmentation.
>

>
>> I appreciate what you WANT to do - but again, it is not possible. You
>> have two behaviors - either use inner fragmentation (which won’t work for
>> transit traffic where IPv4 DF=1 or any IPv6) or reduce the tunnel MTU.
>>
>> But the tunnel MTU is defined by EMTU_R of the tunnel egress, not EMTU_S
>> of the tunnel ingress. If you reduce the tunnel MTU, you’re just going to
>> end up black-holing packets arriving at the tunnel ingress.
>>
> OK. I misunderstood the tunnel MTU: the tunnel MTU is EMTU_R, and this is
> not what we are changing. What we had written might be confusing.
> When I said EMTU_R I was considering the router only, without any
> consideration of the tunnel.  From the terminology section of
> intarea-tunnels I did not read that EMTU_R applies to a tunnel environment,
> and considered it to be the MTU associated with the router interface for
> incoming packets.
>
> Here is what we actually meant:
>
>
> We are ensuring that packets that are encapsulated by the Ingress interface
> do not exceed the tunnel MAP.  My understanding is that the tunnel MAP is
> the largest IP packet the source can send that will not be fragmented by
> the network between the Ingress and egress interfaces. As it is not
> fragmented, there are no fragments to reassemble.
>
>
> Please review intarea-tunnels.
>
> Setting Ingress send size to MAP doesn’t avoid source fragmentation, which
> thus doesn’t avoid reassembly. It just sets the size of each fragment to
> avoid on-path fragmentation - which avoids the need for DF=0. So setting
> DF=0 is exactly what you don’t need.
>
> To do so, we set the MTU of the router associated with the Ingress
> interface to the tunnel MAP. This corresponds to setting tunMTU = tunMAP
> in Figure 11 of intarea-tunnels.
>
> Suppose an IP packet is sent by the source and meets that router.
> * The packet has DF=1. If it is larger than that MTU (= tunMAP), the
> packet is discarded and an ICMP PTB message is sent back to the source. The
> source will proceed to source fragmentation.
>
>
> When the IP packet gets to the router, the link should have an MTU of the
> tunnel EMTU_R, not MAP.
>
I agree that setting the MTU to EMTU_R is the largest possible value.
However, setting it to a smaller value may also  be 

Re: [IPsec] draft-liu-ipsecme-ikev2-mtu-dect early TSVAREA review

2023-01-13 Thread to...@strayalpha.com
Hi, Daniel,

> On Jan 13, 2023, at 2:12 PM, Daniel Migault  wrote:
> 
> Hi Joe,
> 
> Thanks for the comment. There are some terminologies we were not using 
> properly, so thank you for the clarification. Please find inline our 
> clarification and implementation of your concerns.
> 
> Yours, 
> Daniel  
> 
> On Sun, Jan 8, 2023 at 11:45 AM to...@strayalpha.com wrote:
>> Hi, Daniel,
>> 
>> The abstract clearly states a goal that is not achievable (of avoiding 
>> reassembly). The best way to avoid the impact of mid-tunnel fragmentation is 
>> to use IPv4 as a tunnel header the way that IPv6 would be - with DF=1. 
>> However, even so, the egress always needs to handle reassembly as long as 
>> there is even source fragmentation.
>  
> I understand the comment as reading our goal as preventing reassembly from 
> happening at all. This would mean that reassembly need not even be 
> implemented. 
> This is not our intention. Reassembly happens the same way it happens today. 
> The only thing we do is have the egress node notify the ingress node that 
> reassembly is happening. The ingress node may or may not take any action to 
> prevent reassembly from happening for subsequent packets tunneled over the 
> IPsec tunnel. In that sense "avoid" needs to be understood as reducing how 
> often reassembly happens.  
> 
> We may agree the best way to avoid mid tunnel fragmentation is to set DF=1. 
> But in our case we cannot meet this condition. 
> The current text in the abstract is
> 
> OLD:
> This document considers an ingress and an egress security gateway connected 
> over an IPv4 network.
> The Tunnel Link Packet have their Don't Fragment (DF) set to 0.
> 
> Is the text below clearer?
> 
> NEW:
> This document considers an ingress and an egress security gateway connected 
> over a IPv4 network with the Tunnel Link Packet Don't Fragment (DF) set to 0.

That is better English, but still technically ill-advised. Any solution that 
requires IPv4 DF=0 then requires generation of unique IDs that don’t wrap in 
ways that could cause mis-reassembly per RFC 6864.
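
To illustrate the ID-wrap concern, here is a back-of-the-envelope sketch in Python. The 16-bit width of the IPv4 Identification field is a fact; the 120-second datagram lifetime, the 1500-byte packet size, and the function names are assumptions chosen only for illustration, not values taken from the draft.

```python
# Illustrative arithmetic only. The lifetime and packet size used in the
# examples are assumptions, not figures from the draft under review.
ID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def max_packet_rate(lifetime_s: float) -> float:
    """Highest packets/s for one (src, dst, protocol) tuple such that no
    ID value is reused while an earlier datagram may still be in flight."""
    return ID_SPACE / lifetime_s


def max_throughput_bps(lifetime_s: float, packet_bytes: int) -> float:
    """Throughput ceiling implied by max_packet_rate for a given packet size."""
    return max_packet_rate(lifetime_s) * packet_bytes * 8


def wrap_time_s(packets_per_s: float) -> float:
    """Seconds until the ID space wraps at a given sending rate."""
    return ID_SPACE / packets_per_s
```

Under these assumed values, `max_throughput_bps(120, 1500)` comes to about 6.55 Mbit/s, and a tunnel sending 100,000 packets per second wraps the ID space in roughly 0.66 seconds, well inside any plausible reassembly timeout.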

> The introduction gives the rationale for why we cannot rely on setting 
> DF=1. Typically some routers do not check the MTU and discard the packet 
> without returning an ICMP PTB error, and in many deployments the ICMP PTB, if 
> sent, is blocked and is not received. This prohibits the use of DF=1 with 
> IPv4. 

You have described the reason why PLPMTUD exists, which is not a rationale for 
continuing to use on-path IPv4 fragmentation.

>> 
>> I appreciate what you WANT to do - but again, it is not possible. You have 
>> two behaviors - either use inner fragmentation (which won’t work for transit 
>> traffic where IPv4 DF=1 or any IPv6) or reduce the tunnel MTU.
>> 
>> But the tunnel MTU is defined by EMTU_R of the tunnel egress, not EMTU_S of 
>> the tunnel ingress. If you reduce the tunnel MTU, you’re just going to end 
>> up black-holing packets arriving at the tunnel ingress.
>> 
> OK. I misunderstood the tunnel MTU: the tunnel MTU is EMTU_R, and this is not 
> what we are changing. What we had written might be confusing. 
> When I said EMTU_R I was considering the router only, without any 
> consideration of the tunnel.  From the terminology section of intarea-tunnels 
> I did not read that EMTU_R applies to a tunnel environment, and considered it 
> to be the MTU associated with the router interface for incoming packets.
> 
> Here is what we actually meant:
> 
> 
> We are ensuring that packets that are encapsulated by the Ingress interface do 
> not exceed the tunnel MAP.  My understanding is that the tunnel MAP is the 
> largest IP packet the source can send that will not be fragmented by the 
> network between the Ingress and egress interfaces. As it is not fragmented, 
> there are no fragments to reassemble.

Please review intarea-tunnels.

Setting Ingress send size to MAP doesn’t avoid source fragmentation, which thus 
doesn’t avoid reassembly. It just sets the size of each fragment to avoid 
on-path fragmentation - which avoids the need for DF=0. So setting DF=0 is 
exactly what you don’t need.

> To do so, we set the MTU of the router associated with the Ingress interface 
> to the tunnel MAP. This corresponds to setting tunMTU = tunMAP in Figure 11 
> of intarea-tunnels. 
> 
> Suppose an IP packet is sent by the source and meets that router. 
> * The packet has DF=1. If it is larger than that MTU (= tunMAP), the packet 
> is discarded and an ICMP PTB message is sent back to the source. The source 
> will proceed to source fragmentation. 

When the IP packet gets to the router, the link should have an MTU of the 
tunnel EMTU_R, not MAP.

If the packet arrives with DF=1, then if it’s smaller than the tunnel EMTU_R, 
it will pass. If not, the router has no choice but to drop the packet (and try 
to send an ICMP PTB if that’s possible).
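
The DF=1 behaviour described above, combined with the earlier remark that tunnels relay packets exceeding tunMAP but not tunMTU (EMTU_R - headers) by source fragmentation, can be sketched roughly as follows. This is only an illustrative model: the `Tunnel` class, its field names, and the returned strings are hypothetical, and a real ingress involves far more detail than this.

```python
from dataclasses import dataclass


@dataclass
class Tunnel:
    emtu_r: int    # largest outer packet the egress can receive/reassemble
    tun_map: int   # largest outer packet that crosses the path unfragmented
    overhead: int  # bytes of encapsulation headers added by the ingress

    @property
    def tun_mtu(self) -> int:
        # tunMTU as described in the thread: EMTU_R minus headers
        return self.emtu_r - self.overhead


def handle_df1_packet(packet_len: int, tunnel: Tunnel) -> str:
    """What happens to a DF=1 packet arriving at the router hosting the ingress."""
    if packet_len > tunnel.tun_mtu:
        # The router has no choice but to drop, and tries to send a PTB.
        return "drop, attempt ICMP PTB"
    if packet_len + tunnel.overhead > tunnel.tun_map:
        # Fits the tunnel MTU but not the path: the ingress source-fragments
        # the OUTER packet, and the egress reassembles it.
        return "relay, outer packet source-fragmented"
    return "relay, no fragmentation"
```

With assumed values `Tunnel(emtu_r=1500, tun_map=1400, overhead=50)`, a 1420-byte DF=1 packet is an example of one that is relayed when the link MTU is the tunnel MTU but would be rejected if the MTU were set to tunMAP.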


Re: [IPsec] draft-liu-ipsecme-ikev2-mtu-dect early TSVAREA review

2023-01-13 Thread Daniel Migault
Hi Joe,

Thanks for the comment. There are some terminologies we were not using
properly, so thank you for the clarification. Please find inline our
clarification and implementation of your concerns.

Yours,
Daniel

On Sun, Jan 8, 2023 at 11:45 AM to...@strayalpha.com 
wrote:

> Hi, Daniel,
>
> The abstract clearly states a goal that is not achievable (of avoiding
> reassembly). The best way to avoid the impact of mid-tunnel fragmentation
> is to use IPv4 as a tunnel header the way that IPv6 would be - with DF=1.
> However, even so, the egress always needs to handle reassembly as long as
> there is even source fragmentation.
>

I understand the comment as reading our goal as preventing reassembly
from happening at all. This would mean that
reassembly need not even be implemented.
This is not our intention. Reassembly happens the same way it happens
today. The only thing we do is have the egress node notify the ingress node
that reassembly is happening. The ingress node may or may not take any
action to prevent reassembly from happening for subsequent packets tunneled
over the IPsec tunnel. In that sense "avoid" needs to be understood as
reducing how often reassembly happens.

We may agree the best way to avoid mid tunnel fragmentation is to set DF=1.
But in our case we cannot meet this condition.
The current text in the abstract is

OLD:
This document considers an ingress and an egress security gateway connected
over an IPv4 network.
The Tunnel Link Packet have their Don't Fragment (DF) set to 0.

Is the text below clearer?

NEW:
This document considers an ingress and an egress security gateway connected
over a IPv4 network with the Tunnel Link Packet Don't Fragment (DF) set to
0.

The introduction gives the rationale for why we cannot rely on setting
DF=1. Typically some routers do not check the MTU and discard the packet
without returning an ICMP PTB error, and in many deployments the ICMP PTB, if
sent, is blocked and is not received. This prohibits the use of DF=1 with
IPv4.



>
> I appreciate what you WANT to do - but again, it is not possible. You have
> two behaviors - either use inner fragmentation (which won’t work for
> transit traffic where IPv4 DF=1 or any IPv6) or reduce the tunnel MTU.
>
> But the tunnel MTU is defined by EMTU_R of the tunnel egress, not EMTU_S
> of the tunnel ingress. If you reduce the tunnel MTU, you’re just going to
> end up black-holing packets arriving at the tunnel ingress.
>
OK. I misunderstood the tunnel MTU: the tunnel MTU is EMTU_R, and this is not
what we are changing. What we had written might be confusing.
When I said EMTU_R I was considering the router only, without any
consideration of the tunnel.  From the terminology section of
intarea-tunnels I did not read that EMTU_R applies to a tunnel environment,
and considered it to be the MTU associated with the router interface for
incoming packets.

Here is what we actually meant:


We are ensuring that packets that are encapsulated by the Ingress interface
do not exceed the tunnel MAP.  My understanding is that the tunnel MAP is
the largest IP packet the source can send that will not be fragmented by
the network between the Ingress and egress interfaces. As it is not
fragmented, there are no fragments to reassemble.

To do so, we set the MTU of the router associated with the Ingress
interface to the tunnel MAP. This corresponds to setting tunMTU = tunMAP
in Figure 11 of intarea-tunnels.

Suppose an IP packet is sent by the source and meets that router.
* The packet has DF=1. If it is larger than that MTU (= tunMAP), the packet
is discarded and an ICMP PTB message is sent back to the source. The source
will proceed to source fragmentation.

* The packet has DF=0. If it is larger than that MTU, the router fragments
it into fragments that fit within tunMAP, thus performing inner fragmentation.

* Any packet smaller than the MTU = tunMAP is sent to the ingress interface
and encapsulated.
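
The three cases above can be sketched as follows. This is a minimal illustration of the proposal as stated (router MTU set equal to tunMAP), assuming IPv4 with a 20-byte header and no options; the function names and return values are hypothetical and are not taken from the draft.

```python
from typing import List, Tuple


def fragment_sizes(packet_len: int, mtu: int, header_len: int = 20) -> List[int]:
    """Sizes of the IPv4 fragments of one packet. Every fragment except the
    last carries a payload that is a multiple of 8 bytes."""
    payload = packet_len - header_len
    chunk = (mtu - header_len) // 8 * 8
    sizes = []
    while payload > 0:
        take = min(chunk, payload)
        sizes.append(header_len + take)
        payload -= take
    return sizes


def router_action(packet_len: int, df: bool, mtu: int) -> Tuple[str, List[int]]:
    """Decision at the router in front of the ingress, with mtu = tunMAP."""
    if packet_len <= mtu:
        # Fits: handed to the ingress interface and encapsulated as is.
        return ("encapsulate", [packet_len])
    if df:
        # DF=1 and too large: discarded, ICMP PTB sent back to the source.
        return ("drop_and_send_ptb", [])
    # DF=0 and too large: inner fragmentation before encapsulation.
    return ("fragment", fragment_sizes(packet_len, mtu))
```

For example, with an assumed tunMAP of 1400 bytes, a 1500-byte DF=0 packet becomes fragments of 1396 and 124 bytes, each of which is then encapsulated separately.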


I agree that we MUST ensure that ICMP PTB messages are received by the
source and lead to source fragmentation; otherwise, this will result in
black-holing traffic sized between the tunnel MAP and the original MTU of
the router.

Note that by setting DF=1 you are supposed to be able to handle this kind
of situation. So I do not see this as a major issue.

> Two important points: the tunnel ingress is NOT the one that should ever
> send PTB back; that’s the job of the router where/if that tunnel ingress
> resides; second, you cannot claim to get around an ICMP black hole
> situation by creating a new ICMP black hole situation.
>

With the mechanism we clarified, the ICMP PTB is sent by the router where
the ingress interface is.
Regarding the blackholing situation, in the first case, it results from the
transit network which is out of the control of the administrator of the
source. On the other hand, the administrator of the source is able to
ensure that ICMP packet sent by the ingress router will be