Greetings all,
I have read the comments on our draft; thank you for them.
Some of the questions were already answered in our latest FAR presentation
material (not presented at the meeting because of the hard deadline):
--------"Draft is highly subjective. Data Centers are using existing
protocols without problems."
Why do OSPF and other conventional routing methods not work well in a
large-scale network with several thousand routers?
As is well known, OSPF maintains multiple databases, exchanges more
topology information (see the example below), and uses a complicated
algorithm. This requires routers to spend more memory and CPU capacity.
However, the number of protocol messages a CPU can process per second is
very limited. As the network grows, the CPU quickly approaches its
processing limit, and at that point OSPF cannot scale to manage a larger
network. The SPF algorithm by itself does not solve these problems.
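To make the per-router cost concrete, here is a minimal Python sketch of
the shortest-path-first (Dijkstra) calculation that a link-state router
runs over its link state database. It is an illustration only and is not
taken from any OSPF implementation; the function name spf and the
dictionary layout of the graph are assumptions made for this example.

```python
import heapq


def spf(graph, root):
    """Return the lowest path cost from root to every reachable node.

    graph maps each node to a dict of {neighbor: link_cost}.
    """
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical three-router topology with symmetric link costs.
example = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2},
}
print(spf(example, "A"))
```

With a binary heap, the running time is on the order of
(nodes + links) * log(nodes), so the cost of each full run grows with the
size of the database the router has to hold.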
In contrast, FAR does not incur the convergence delay or the additional
CPU overhead that SPF requires, because FAR already knows the regular
structure of the whole network topology at the initial stage and does not
need to run SPF periodically.
One example of "more topological exchange information":
In OSPF, each LSA is flooded again every 1800 seconds. Especially in a
larger network, the resulting CPU and bandwidth consumption will soon
reach the router’s performance bottleneck.
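As rough arithmetic for this refresh load, the sketch below estimates the
average number of LSA refreshes per second in steady state. It assumes
that refreshes are spread evenly over the 1800-second interval; the
function name and the LSA counts are hypothetical and are not measurements
from our tests.

```python
LS_REFRESH_TIME_S = 1800  # LSA refresh interval in seconds (30 minutes)


def lsa_refreshes_per_second(num_lsas, refresh_interval_s=LS_REFRESH_TIME_S):
    """Average number of LSA refreshes originated per second in the domain."""
    if refresh_interval_s <= 0:
        raise ValueError("refresh_interval_s must be positive")
    return num_lsas / refresh_interval_s


# Hypothetical LSA counts, chosen only to show the linear growth.
for num_lsas in (1_800, 18_000, 180_000):
    print(num_lsas, lsa_refreshes_per_second(num_lsas))
```

The rate grows linearly with the number of LSAs, and every router in the
flooding scope has to receive and process each refreshed LSA.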
To reduce these adverse effects, OSPF introduced the concept of the Area
(which still has not solved the problem thoroughly). By dividing the
OSPF domain into several areas, the routers in one area do not need to
know the topological details outside their area. (Compared with FAR, once
OSPF is divided into areas, equivalent (equal-cost) paths can no longer
be selected across the whole network.)
With areas, OSPF achieves the following:
1) Routers only need to maintain the same link state database as the other
routers within their own Area, rather than the same link state database
as all routers in the whole OSPF domain.
2) The smaller link state database means relatively fewer LSAs to
process, which reduces the CPU consumption of routers.
3) The large number of LSAs is flooded only within the same Area.
The negative effect, however, is that each OSPF area can manage only a
smaller number of routers.
In contrast, because FAR does not have the above disadvantages, it can
manage a large-scale network even without dividing it into areas.
The OSPF aging time exists so that the protocol can adapt to the frequent
routing changes and protocol message exchanges of an irregular topology.
Its negative effect is that even when the network does not change, each
LSA must be refreshed every 1800 seconds to reset its age. In a regular
topology the routes are fixed, so the complex protocol message exchange
and aging rules are not needed to reflect routing changes; the LFA
mechanism in FAR is enough.
Therefore, FAR can omit much unnecessary processing and packet exchange.
The benefits are fast convergence and a much larger network scale than
other dynamic routing protocols can support.
There are already some successful implementations of simplified routing
on regular topologies in HPC environments.
Conclusion: Because FAR needs few routing entries and the topology is
regular, the database does not need to be updated regularly. Without
aging, there is none of the CPU and bandwidth overhead caused by LSA
flooding every 30 minutes, so network growth has no obvious effect on the
performance of FAR, in contrast to OSPF.
--------"Network convergence doesn't follow link state
dynamics - Fast reroute exists. "
Comparison of convergence time:
The OSPF spf_delay and spf_hold_time settings affect the convergence
time. The convergence time of a network with 2480 nodes is about 15-20
seconds (see the following pages), whereas FAR does not need to run the
SPF calculation, so it has no such convergence time.
These issues still exist in the fast-convergence techniques of OSPF and
IS-IS (such as I-SPF): convergence speed and network scale constrain each
other. FAR does not have these problems, and its convergence time is
almost negligible.
The test data is included in another slide deck named OSPF in
DCN(2).pptx, which can be downloaded from the IETF.
Looking forward to further discussion.
Best.
Richard Bin Liu
_______________________________________________
rtgwg mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/rtgwg