Re: [Softwires] Review of draft-tsou-softwire-bfd-ds-lite-04

ian.farrer Mon, 22 Apr 2013 08:58:46 -0700

Hi Tina,

A couple of comments and clarifications in line.

Cheers,
Ian

From: Tina TSOU <[email protected]>
Date: Wednesday, 17 April 2013 08:21
To: Ian Farrer <[email protected]>, "[email protected]" <[email protected]>
Cc: Softwires WG <[email protected]>
Subject: RE: Review of draft-tsou-softwire-bfd-ds-lite-04

Dear Ian,
Thanks a lot for your review. Replies are in line. I have also cc'd the list.

Thank you,
Tina

From: [email protected]
Sent: 9 April 2013 7:44
To: [email protected]
Subject: Review of draft-tsou-softwire-bfd-ds-lite-04

Hi,

I've reviewed the draft, comments below:

Cheers,
Ian

Abstract
The tunnel is stateless, but the CGN function is not. In other SW WG documents, 
DS-Lite is referred to as per-flow stateful, so it would be worth making the 
distinction between tunnel and CGN state here (or in the introduction).
[We can make the distinction between tunnel and CGN state in the abstract or 
introduction. We can also detect the NAT44 status, but BFD detection of CGN 
state should be per-user rather than per-flow.]

[ian] Agreed

Introduction
The draft makes the assumption that failure of the AFTR needs to be recognised 
by the B4. Has AFTR HA been considered? I'm not suggesting that this is a 
better approach, just that the document should state why the B4 needs to be 
responsible for failure detection / resolution.
[We have considered AFTR HA. Anycast could in theory solve the problem, but 
there are some problems in real deployments (see the slide attached).]

[ian] There's a more fundamental problem with anycast for stateful CGN/AFTR. If 
you have an asymmetric path through the CGNs, then return path traffic will be 
dropped as there is no corresponding state table entry in the AFTR.
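
The asymmetric-path problem can be illustrated with a toy model. This is only 
a sketch under stated assumptions: two AFTRs share an anycast address, each 
keeps its own NAT44 state table, and there is no state sync. The class, the 
method names and all addresses are hypothetical and are not taken from the 
draft.

```python
# Toy model: two AFTRs behind one anycast address, each with its own
# NAT44 state table and no state synchronisation (all names hypothetical).

class Aftr:
    def __init__(self, name):
        self.name = name
        # (public_ip, public_port) -> (b4_addr, private_ip, private_port)
        self.nat_table = {}

    def outbound(self, b4_addr, private_ip, private_port, public_ip, public_port):
        """Create a binding when a B4 sends traffic out through this AFTR."""
        self.nat_table[(public_ip, public_port)] = (b4_addr, private_ip, private_port)

    def inbound(self, public_ip, public_port):
        """Return the binding for return traffic, or None if it must be dropped."""
        return self.nat_table.get((public_ip, public_port))


aftr_a = Aftr("aftr-a")
aftr_b = Aftr("aftr-b")

# The outbound packet happens to be routed to aftr-a, which creates the binding.
aftr_a.outbound("2001:db8::b4", "192.168.1.10", 40000, "198.51.100.1", 5000)

symmetric = aftr_a.inbound("198.51.100.1", 5000)   # return path via the same AFTR
asymmetric = aftr_b.inbound("198.51.100.1", 5000)  # return path via the other AFTR

print("via aftr-a:", symmetric)
print("via aftr-b:", asymmetric)  # no state entry here, so the packet is dropped
```

In this model the return packet is only translated when it arrives at the 
AFTR that created the binding.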

3.1
The BFD approach to the tunnel endpoint only provides reachability information 
about the tunnel endpoint. It doesn't consider whether the NAT44 CGN is 
functioning in the AFTR. What about discussing an option to use a remote v4 
endpoint, located beyond the AFTR, so that both functions (tunnel & NAT) can be 
tested? This would mean that the vinokour-bfd-dhcp option (or something 
similar) would be needed to provision a suitable end-point address to establish 
the BFD session to.
I must confess that I've never tried running BFD through NAT, but from a quick 
look through RFC5880, I can't see any obvious reason why this wouldn't work.
[You are right. BFD could be used to detect the status of both the DS-Lite 
tunnel and NAT44, but the BFD detection for NAT44 should be per-user rather 
than per-flow. We will clarify this in the next version.]

[ian] OK

3.1.1
The well-known addresses for DS-Lite are from the 192.0.0.0/29 range (not 
192.0.2.0/29) - This needs to be updated throughout.
[Will do. Thanks.]
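
A quick way to sanity-check the ranges, sketched with Python's standard 
ipaddress module. The well-known addresses used here (192.0.0.1 for the AFTR, 
192.0.0.2 for the B4) are the ones reserved in RFC 6333, and 192.0.2.0/24 is 
the TEST-NET-1 documentation range from RFC 5737. The variable names are 
illustrative only.

```python
import ipaddress

ds_lite_range = ipaddress.ip_network("192.0.0.0/29")  # RFC 6333 well-known range
flagged_range = ipaddress.ip_network("192.0.2.0/29")  # range flagged in the review
test_net_1 = ipaddress.ip_network("192.0.2.0/24")     # RFC 5737 documentation range

aftr_addr = ipaddress.ip_address("192.0.0.1")  # AFTR well-known address
b4_addr = ipaddress.ip_address("192.0.0.2")    # B4 well-known address

# Both well-known addresses sit inside 192.0.0.0/29 ...
print(aftr_addr in ds_lite_range, b4_addr in ds_lite_range)
# ... and neither sits inside 192.0.2.0/29.
print(aftr_addr in flagged_range, b4_addr in flagged_range)
# 192.0.2.0/29 is part of the documentation-only TEST-NET-1 block.
print(flagged_range.subnet_of(test_net_1))
```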

3.3.
As for section 3.1

4.
Can it be assumed that both encap/decap and BFD functions can be carried out in 
hardware?
[Yes, both can be carried out in the CGN's hardware.]

(I don't know the answer to this, it's just that I've seen hardware 
implementations that will do one function, but pass additional functions to the 
CPU). Also, the tunnel endpoint v4 address is often a virtual interface - are 
these based on ASICs/FPGAs?

What about putting more direct comparisons between the different approaches 
here? Some other things that could be compared:
[Agree, and we provide some texts below.]

How widely available the mechanism already is (e.g. ICMP ubiquitous, PCP/BFD 
less so)
[ICMP is more widely used than PCP/BFD. BFD is widely used on the router and 
CGN side, but less so on the terminal side. Not sure about PCP. However, for 
failure detection, BFD has an explicit capability for bidirectional status 
synchronization, which guarantees that both sides have a consistent view of 
the failure status, while ICMP has no such capability.]

Additional functionality on top of keepalives (mentioned only for PCP at the 
moment)
[BFD has an explicit capability for bidirectional status synchronization to 
guarantee the consistency of the failure status on both sides. ICMP can 
actively initiate status detection from the network side or CGN side, while PCP 
cannot. PCP has no capability for bidirectional detection.]

Configuration/provisioning overheads for each approach
[There is normally a TR-069 server on the network management side, so the 
overhead is similar for each approach.]

[From the above analysis, we choose BFD as the failure detection approach in 
this document.]

5.
I don't think that the paragraph about anycast really belongs here as it refers 
to a completely different approach to HA than the rest of the document. What 
about describing the anycast approach and the VRRP approach as a section at the 
beginning of the document, so that you would have AFTR based failover 
mechanisms and B4 based failover mechanisms? Wouldn't that give a more complete 
overview of all of the possible failover mechanisms that could be used?
[Would you please clarify the VRRP approach a bit more? In our understanding, 
the BRAS is the gateway for broadband user access. Protection of the gateway 
is normally achieved by cold standby and hot standby, and no VRRP technology 
is used.]

[ian] For active/passive HA in NAT gateways, it's quite common to have a single 
virtual address offered by VRRP (or a proprietary equivalent) that the upstream 
routers will use as their next hop. In the event that the master CGN fails, the 
standby takes over the virtual L3 address. If you were to use a VRRP based 
virtual address as the tunnel endpoint, then the clients wouldn't need to be 
aware of the failover. This is currently done for IPSec tunnel HA.

Also, using anycast for a per-flow stateful solution such as DS-Lite sounds 
like it's going to have a lot of problems. I certainly wouldn't
[Agreed.]

The document doesn't describe session re-establishment in the event of an AFTR 
failure, as any existing / persistent sessions would need to be re-created in 
the backup AFTR CGN's state table.
[We suggest setting up BFD links to both the active AFTR and the backup AFTR 
in the initial state. When a failure of the active AFTR is detected, the 
service will be shifted to the backup AFTR. If a failure of the backup AFTR is 
detected, a warning will be raised to remind the network management server to 
fix the failure.]
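
The selection logic suggested above could be sketched roughly as follows. 
This is a minimal illustration, not the draft's procedure: the function name 
select_aftr and the warning strings are hypothetical, the two booleans stand 
in for the state of the B4's BFD sessions to each AFTR, and the handling of 
the case where both AFTRs are down is an assumption, since the reply does not 
cover it.

```python
def select_aftr(active_up, backup_up):
    """Pick the AFTR to use from the state of the two BFD sessions.

    Returns (choice, warnings), where choice is "active", "backup" or None.
    """
    warnings = []
    if not backup_up:
        # Backup failure does not move traffic, it only raises a warning.
        warnings.append("backup AFTR down: notify network management")
    if active_up:
        return "active", warnings
    if backup_up:
        # Active AFTR failed: shift the service to the backup AFTR.
        return "backup", warnings
    # Assumption: with both AFTRs down there is nothing to fail over to.
    warnings.append("active AFTR down and no backup available")
    return None, warnings


print(select_aftr(True, True))
print(select_aftr(False, True))
print(select_aftr(True, False))
print(select_aftr(False, False))
```

Note that this only chooses a tunnel endpoint. It does not move any NAT44 
bindings from the failed AFTR to the backup.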

[ian] What I mean is that if there is an active TCP session through the CGN 
function of an AFTR, and this AFTR fails, then the TCP session will need to be 
re-established by the client as it is not present in the state table of the 
backup AFTR/CGN.

I think there should be a section for this towards the end of the document as 
it's a consideration with any stateful HA model that doesn't have state sync.
[Agree that there should be a state sync mechanism between active AFTR and 
backup AFTR, to synchronize the state of each user between the two AFTRs. This 
mechanism is to guarantee that the traffic coming back to the B4 is from the 
backup AFTR, if the service is shifted to backup AFTR.]

_______________________________________________
Softwires mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/softwires
