Peter:

"the arguments you use against existing IGPs would be valid 20 years ago, 
but not today."
Whichever year the arguments are applied to, the difference remains 
significant, because it comes from two different design concepts.

"First, links have high bandwidth, CPUs are fast and any serious IGP 
implementation has addressed the bottlenecks you are talking about."
As the number of nodes increases and higher bandwidth is required, the 
number of protocol packets that must be transmitted and processed grows 
exponentially. But the CPU of a switch can process only a few hundred 
packets per second, so its processing capacity limits how many nodes can 
be added. I have tried tuning this capacity in actual commercial systems: 
it can be raised to thousands of packets per second in some ways, but only 
at the expense of the processing performance of other protocols. 
Therefore, in a large-scale fat-tree system, the CPUs in those commercial 
systems cannot cope with the large number of OSPF protocol packets.
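
As a rough illustration of the capacity argument above, here is a minimal 
back-of-envelope sketch in Python. It assumes the 1800-second LSA refresh 
interval mentioned in the earlier message quoted below; the LSA counts, 
the copies_per_lsa factor and the function names are hypothetical and are 
not measurements from any system.

```python
# Back-of-envelope estimate of the steady-state OSPF LSA refresh load.
# Every input except the 1800-second refresh interval is hypothetical.

LS_REFRESH_INTERVAL_S = 1800  # refresh interval cited in this thread


def refresh_load_per_second(num_lsas, copies_per_lsa=1,
                            refresh_interval_s=LS_REFRESH_INTERVAL_S):
    """Average number of LSA copies one router receives per second.

    num_lsas:        LSAs in the area's link state database (assumed)
    copies_per_lsa:  how many times each refreshed LSA reaches the router,
                     e.g. once per adjacency it is flooded over (assumed)
    """
    if refresh_interval_s <= 0:
        raise ValueError("refresh_interval_s must be positive")
    return num_lsas * copies_per_lsa / refresh_interval_s


def fits_budget(num_lsas, copies_per_lsa, budget_per_second):
    """True if the average refresh load stays within the CPU budget."""
    return refresh_load_per_second(num_lsas, copies_per_lsa) <= budget_per_second


if __name__ == "__main__":
    # Hypothetical database sizes, each LSA arriving over 4 adjacencies.
    for lsas in (2_000, 20_000, 200_000):
        load = refresh_load_per_second(lsas, copies_per_lsa=4)
        print(f"{lsas:>7} LSAs -> {load:8.1f} LSA copies/s")
```

The sketch only averages refresh traffic over the interval. It ignores 
bursts after a topology change, the packing of several LSAs into one Link 
State Update packet, and acknowledgements, so it should be read as a way 
to frame the estimate rather than as a result.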

'These days IGPs can support thousands of nodes in an area without any 
problem, and converge sub-second, with precomputed backups, even within a 
few tens of milliseconds.'
A few thousand nodes is still a small scale. Once the network grows to 
tens of thousands of nodes, an IGP will not do the job.

'There are real deployments that clearly prove it, it's not an academic 
statement.'
I have developed and implemented many commercial routing and switching 
systems. These issues are not academic statements but lessons from 
practice.

'Even the periodic flooding can be avoided completely using RFC 4136.'
RFC 4136 is applicable only to medium-sized networks. For the tens of 
thousands of nodes in today's data centers, it cannot do the job.

We need to stay open-minded in order to adapt to the growing scale of 
data centers.

Regards,

Richard Bin Liu




Peter Psenak <[email protected]> 
2014/05/15 14:41

To
[email protected], Hannes Gredler <[email protected]>, "Alvaro Retana 
(aretana)" <[email protected]>
Cc
[email protected], ytsun <[email protected]>, 
[email protected], [email protected]
Subject
Re: Reply to comments -- Comparison between FAR and OSPF






Liu,

the arguments you use against existing IGPs would be valid 20 years ago, 
but not today. First, links have high bandwidth, CPUs are fast and any 
serious IGP implementation has addressed the bottlenecks you are talking 
about. These days IGPs can support thousands of nodes in an area without 
any problem, and converge sub-second, with precomputed backups, even 
within a few tens of milliseconds. There are real deployments that clearly 
prove it, it's not an academic statement. Even the periodic flooding can 
be avoided completely using RFC 4136.

regards,
Peter


On 5/15/14 07:08 , [email protected] wrote:
>
> Greetings all:
>
> I have read the comments on our draft; thank you for them.
>
> Some of the questions were already answered in our latest FAR
> presentation material (not presented at the meeting because of the
> hard deadline):
>
> --------"Draft is highly subjective. Data Centers are using existing
> protocols without problems."
> Why do OSPF and other conventional routing methods not work well in a
> large-scale network with several thousand routers?
> As everyone knows, the OSPF protocol uses multiple databases, exchanges
> more topology information (as seen in the following example) and runs a
> complicated algorithm. It therefore requires routers to spend more
> memory and CPU processing capacity. But the rate at which a CPU can
> process protocol messages is very limited. As the network expands, the
> CPU quickly approaches its processing limit, and at that point OSPF
> cannot scale to manage a larger network. The SPF algorithm itself does
> not thoroughly solve these problems.
>
> By contrast, FAR has neither the convergence delay nor the additional
> CPU overhead that SPF requires, because FAR already knows the regular
> structure of the whole network topology at the initial stage and does
> not need to run SPF periodically.
>
> One example of "more topological exchange information":
> In the OSPF protocol, each LSA is flooded every 1800 seconds. In larger
> networks especially, the CPU and bandwidth consumed will soon reach the
> router’s performance bottleneck.
> To reduce these adverse effects, OSPF introduced the concept of the
> Area (which still has not solved the problem thoroughly). When the OSPF
> domain is divided into several areas, the routers in one area do not
> need to know the topological details outside their area. (Compared
> with FAR, once OSPF is divided into Areas, the equivalent paths can no
> longer be selected across the whole network.)
>
> With Areas, OSPF achieves the following:
> 1) Routers only need to maintain the same link state database as the
> other routers within the same Area, rather than the same link state
> database as all routers in the whole OSPF domain.
> 2) A smaller link state database means fewer LSAs to process, which
> reduces the CPU consumption of routers.
> 3) The large number of LSAs is flooded only within the same Area.
> The negative effect, however, is that each OSPF area can manage only a
> smaller number of routers.
> By contrast, because FAR does not have the above disadvantages, it can
> manage a large-scale network even without dividing it into Areas.
>
> The aging time of OSPF exists in order to adapt to the frequent route
> changes and protocol message exchanges of an irregular topology. Its
> negative effect is that, even when the network does not change, each
> LSA needs to be refreshed every 1800 seconds to reset the aging time.
> In a regular topology, where the routes are fixed, complex protocol
> message exchange and aging rules are not needed to reflect route
> changes; the LFA mechanism in FAR is enough.
>
> Therefore, in FAR, we can omit much unnecessary processing and packet
> exchange. The benefits are fast convergence and a much larger network
> scale than other dynamic routing protocols support.
> There are already some successful implementations of simplified
> routing in regular topologies in HPC environments.
> Conclusion: As FAR needs few routing entries and the topology is
> regular, the database does not need to be updated regularly. Without
> the need for aging, there is none of the CPU and bandwidth overhead
> caused by LSA flooding every 30 minutes, so expanding the network has
> no obvious effect on the performance of FAR, in contrast to OSPF.
>
> --------"Network convergence doesn't follow link state
>            dynamics - Fast reroute exists. "
>
> Comparison of convergence time:
> The OSPF spf_delay and spf_hold_time settings affect the convergence
> time. The convergence time of a network with 2480 nodes is about 15-20
> seconds (as seen in the following pages), whereas FAR does not need to
> run the SPF calculation, so there is no such convergence time.
> These issues *still exist* in the fast convergence techniques of OSPF
> and ISIS (such as I-SPF). Convergence speed and network scale constrain
> each other. FAR does not have the above problems, and its convergence
> time is almost negligible.
>
> The test data is included in another pptx file named OSPF in
> DCN(2).pptx, which can be downloaded from the IETF.
>
> Looking forward to further discussion.
>
> Best.
>
> Richard Bin Liu
>
>
> _______________________________________________
> rtgwg mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/rtgwg
>



