Re: link monitoring

2021-05-01 Thread Saku Ytti
On Fri, 30 Apr 2021 at 00:35, Eric Kuhnke  wrote:

> The Junipers on both sides should have discrete SNMP OIDs that respond with a 
> FEC stress value, or FEC error value. See blue highlighted part here about 
> FEC. Depending on what version of JunOS you're running the MIB for it may or 
> may not exist.

This feature will be introduced by ER-079886 at some future date. You
may be thinking of OTN FEC, which is available via MIB but unrelated
to the topic.

I did plan to open a feature request with other vendors too, but I've
been lazy. It is broadly missing. We are doing very little as a
community to address problems before they become symptomatic, and we
are under-utilising the information we already have from DDM and RS-FEC.


Only slightly on-topic: people who interact with optical vendors might
want to ask about propagating RS-FEC correctable errors. RS-FEC is of
course point-to-point, so in an active optical system it terminates at
the first hop. But technically nothing stops the far end of the optical
link from inducing an RS-FEC correctable error when it saw one, and
perhaps there could even be a standard way to discriminate an organic
near-hop RS-FEC correctable error from an induced one.
We have a sort of precedent for this: some cut-through switches can
discriminate a near-hop FCS error from other types of FCS error. The
sender of course only learns about a bad FCS after it has already
started forwarding the frame, but it can append a symbol in that case
to let the receiver know the error is not near-end. This allows the
receiver to keep two FCS counters.
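
A toy sketch of that two-counter logic on the receiver side (the frame
fields and the 'stomp' marker are purely illustrative, not any vendor's
actual format):

```python
# Receiver-side classification: a cut-through sender that detects a bad FCS
# after it has already started forwarding can "stomp" the frame with a marker
# symbol, so the receiver can tell a propagated error from one on its own hop.
from collections import Counter

counters = Counter()

def classify_bad_frame(frame):
    """frame: dict with 'fcs_ok' (bool) and 'stomped' (bool) flags."""
    if frame["fcs_ok"]:
        return
    if frame["stomped"]:
        counters["fcs_errors_propagated"] += 1   # error happened upstream
    else:
        counters["fcs_errors_near_end"] += 1     # error on our own hop
```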

-- 
  ++ytti


Re: link monitoring

2021-04-30 Thread Michel Blais
Y.1731 or TWAMP if available on those devices.

On Fri, Apr 30, 2021 at 17:57, Colton Conor  wrote:

> What NMS is everyone using to graph and alert on this data?
>
> On Fri, Apr 30, 2021 at 7:49 AM Alain Hebert  wrote:
>
>> Yes, the JNP DOM MIB is what you are looking for.
>>
>> It also has traps for the warning and alarm thresholds you can use,
>> which are driven by the optic's own parameters.
>> ( Human Interface: [ show interfaces diagnostics optics ... ] )
>>
>> TLDR:
>>
>> Realtime: Traps;
>> Monitoring: DOM MIB;
>>
>> PS: I suggest you join [ juniper-...@puck.nether.net ] mailing list.
>>
>> -
>> Alain Hebert    aheb...@pubnix.net
>> PubNIX Inc.
>> 50 boul. St-Charles
>> P.O. Box 26770 Beaconsfield, Quebec H9W 6G7
>> Tel: 514-990-5911   http://www.pubnix.net   Fax: 514-990-9443
>>
>> On 4/29/21 5:32 PM, Eric Kuhnke wrote:
>>
>> The Junipers on both sides should have discrete SNMP OIDs that respond
>> with a FEC stress value, or FEC error value. See blue highlighted part here
>> about FEC. Depending on what version of JunOS you're running the MIB for it
>> may or may not exist.
>>
>>
>> https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST
>>
>> In other equipment sometimes it's found in a sub-tree of SNMP adjacent to
>> optical DOM values. Once you can acquire and poll that value, set it up as
>> a custom thing to graph and alert upon certain threshold values in your
>> choice of NMS.
>>
>> Additionally signs of a failing optic may show up in some of the optical
>> DOM MIB items you can poll:
>> https://mibs.observium.org/mib/JUNIPER-DOM-MIB/
>>
>> It helps if you have some non-misbehaving similar linecards and optics
>> which can be polled during custom graph/OID configuration, to establish a
>> baseline 'no problem' value, which if exceeded will trigger whatever
>> threshold value you set in your monitoring system.
>>
>> On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl <
>> baldur.nordd...@gmail.com> wrote:
>>
>>> Hello
>>>
>>> We had a 100G link that started to misbehave and caused the customers to
>>> notice bad packet loss. The optical values are just fine but we had packet
>>> loss and latency. Interface shows FEC errors on one end and carrier
>>> transitions on the other end. But otherwise the link would stay up and our
>>> monitor system completely failed to warn about the failure. Had to find the
>>> bad link by traceroute (mtr) and observe where packet loss started.
>>>
>>> The link was between a Juniper MX204 and Juniper ACX5448. Link length 2
>>> meters using 2 km single mode SFP modules.
>>>
>>> What is the best practice to monitor links to avoid this scenario? What
>>> options do we have to do link monitoring? I am investigating BFD but I am
>>> unsure if that would have helped the situation.
>>>
>>> Thanks,
>>>
>>> Baldur
>>>
>>>
>>>
>>


Re: link monitoring

2021-04-30 Thread Colton Conor
What NMS is everyone using to graph and alert on this data?

On Fri, Apr 30, 2021 at 7:49 AM Alain Hebert  wrote:

> Yes, the JNP DOM MIB is what you are looking for.
>
> It also has traps for the warning and alarm thresholds you can use,
> which are driven by the optic's own parameters.
> ( Human Interface: [ show interfaces diagnostics optics ... ] )
>
> TLDR:
>
> Realtime: Traps;
> Monitoring: DOM MIB;
>
> PS: I suggest you join [ juniper-...@puck.nether.net ] mailing list.
>
> -
> Alain Hebert    aheb...@pubnix.net
> PubNIX Inc.
> 50 boul. St-Charles
> P.O. Box 26770 Beaconsfield, Quebec H9W 6G7
> Tel: 514-990-5911   http://www.pubnix.net   Fax: 514-990-9443
>
> On 4/29/21 5:32 PM, Eric Kuhnke wrote:
>
> The Junipers on both sides should have discrete SNMP OIDs that respond
> with a FEC stress value, or FEC error value. See blue highlighted part here
> about FEC. Depending on what version of JunOS you're running the MIB for it
> may or may not exist.
>
>
> https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST
>
> In other equipment sometimes it's found in a sub-tree of SNMP adjacent to
> optical DOM values. Once you can acquire and poll that value, set it up as
> a custom thing to graph and alert upon certain threshold values in your
> choice of NMS.
>
> Additionally signs of a failing optic may show up in some of the optical
> DOM MIB items you can poll:
> https://mibs.observium.org/mib/JUNIPER-DOM-MIB/
>
> It helps if you have some non-misbehaving similar linecards and optics
> which can be polled during custom graph/OID configuration, to establish a
> baseline 'no problem' value, which if exceeded will trigger whatever
> threshold value you set in your monitoring system.
>
> On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl 
> wrote:
>
>> Hello
>>
>> We had a 100G link that started to misbehave and caused the customers to
>> notice bad packet loss. The optical values are just fine but we had packet
>> loss and latency. Interface shows FEC errors on one end and carrier
>> transitions on the other end. But otherwise the link would stay up and our
>> monitor system completely failed to warn about the failure. Had to find the
>> bad link by traceroute (mtr) and observe where packet loss started.
>>
>> The link was between a Juniper MX204 and Juniper ACX5448. Link length 2
>> meters using 2 km single mode SFP modules.
>>
> What is the best practice to monitor links to avoid this scenario? What
>> options do we have to do link monitoring? I am investigating BFD but I am
>> unsure if that would have helped the situation.
>>
>> Thanks,
>>
>> Baldur
>>
>>
>>
>


Re: link monitoring

2021-04-30 Thread Alain Hebert

    Yes, the JNP DOM MIB is what you are looking for.

    It also has traps for the warning and alarm thresholds you can use,
which are driven by the optic's own parameters.

    ( Human Interface: [ show interfaces diagnostics optics ... ] )

    TLDR:

        Realtime: Traps;
        Monitoring: DOM MIB;

    PS: I suggest you join [ juniper-...@puck.nether.net ] mailing list.
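
To make the trap side concrete, a minimal pysnmp trap-listener sketch
(community string, listen address and port are illustrative assumptions;
the routers would need to be configured to send their DOM warning/alarm
traps to it):

```python
# Minimal SNMP trap listener using pysnmp: print whatever varbinds arrive,
# e.g. the DOM warning/alarm notifications mentioned above. Community string,
# listen address and port are assumptions; port 162 usually needs privileges.
from pysnmp.entity import engine, config
from pysnmp.carrier.asyncore.dgram import udp
from pysnmp.entity.rfc3413 import ntfrcv

snmp_engine = engine.SnmpEngine()
config.addTransport(snmp_engine, udp.domainName,
                    udp.UdpTransport().openServerMode(("0.0.0.0", 162)))
config.addV1System(snmp_engine, "my-area", "public")   # v1/v2c community

def on_trap(snmp_engine, state_ref, context_engine_id, context_name,
            var_binds, cb_ctx):
    # In a real deployment this would feed the NMS / alerting pipeline.
    for name, value in var_binds:
        print(f"{name.prettyPrint()} = {value.prettyPrint()}")

ntfrcv.NotificationReceiver(snmp_engine, on_trap)
snmp_engine.transportDispatcher.jobStarted(1)
try:
    snmp_engine.transportDispatcher.runDispatcher()
finally:
    snmp_engine.transportDispatcher.closeDispatcher()
```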

-
Alain Hebert    aheb...@pubnix.net
PubNIX Inc.
50 boul. St-Charles
P.O. Box 26770 Beaconsfield, Quebec H9W 6G7
Tel: 514-990-5911   http://www.pubnix.net   Fax: 514-990-9443

On 4/29/21 5:32 PM, Eric Kuhnke wrote:
The Junipers on both sides should have discrete SNMP OIDs that respond 
with a FEC stress value, or FEC error value. See blue highlighted part 
here about FEC. Depending on what version of JunOS you're running the 
MIB for it may or may not exist.


https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST 


In other equipment sometimes it's found in a sub-tree of SNMP adjacent 
to optical DOM values. Once you can acquire and poll that value, set 
it up as a custom thing to graph and alert upon certain threshold 
values in your choice of NMS.


Additionally signs of a failing optic may show up in some of the 
optical DOM MIB items you can poll: 
https://mibs.observium.org/mib/JUNIPER-DOM-MIB/ 


It helps if you have some non-misbehaving similar linecards and optics 
which can be polled during custom graph/OID configuration, to 
establish a baseline 'no problem' value, which if exceeded will 
trigger whatever threshold value you set in your monitoring system.


On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl
<baldur.nordd...@gmail.com> wrote:


Hello

We had a 100G link that started to misbehave and caused the
customers to notice bad packet loss. The optical values are just
fine but we had packet loss and latency. Interface shows FEC
errors on one end and carrier transitions on the other end. But
otherwise the link would stay up and our monitor system completely
failed to warn about the failure. Had to find the bad link by
traceroute (mtr) and observe where packet loss started.

The link was between a Juniper MX204 and Juniper ACX5448. Link
length 2 meters using 2 km single mode SFP modules.

What is the best practice to monitor links to avoid this
scenario? What options do we have to do link monitoring? I am
investigating BFD but I am unsure if that would have helped the
situation.

Thanks,

Baldur






RE: link monitoring

2021-04-29 Thread Travis Garrison
We use LibreNMS and smokeping to monitor latency and dropped packets on all our
links and set up alerts if they go over a certain threshold. We are working on a
script that automatically reroutes traffic based on the alerts, routing around the
bad link to give us time to fix it.
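
As a rough, smokeping-style illustration of that kind of loss/latency
threshold check (hypothetical target and thresholds, Linux ping assumed;
a real setup would feed the NMS or the reroute script instead of printing):

```python
# Rough sketch: send a burst of single pings across the link, compute loss
# percentage and median RTT, and "alert" (here, just print) when either
# exceeds a threshold. Assumes a Linux ping binary (-c/-W flags); the RTT is
# coarse because it includes process start-up overhead.
import statistics
import subprocess
import time

TARGET = "192.0.2.2"        # far-end interface address (example)
PROBES = 20
LOSS_ALERT_PCT = 2.0        # alert on more than 2% loss (illustrative)
RTT_ALERT_MS = 10.0         # alert when median RTT exceeds this (illustrative)


def probe_once(target):
    """Return approximate RTT in ms for one ping, or None if it was lost."""
    start = time.monotonic()
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", target],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        return None
    return (time.monotonic() - start) * 1000.0


def run_cycle():
    rtts = [probe_once(TARGET) for _ in range(PROBES)]
    received = [r for r in rtts if r is not None]
    loss_pct = 100.0 * (PROBES - len(received)) / PROBES
    median_rtt = statistics.median(received) if received else float("inf")
    if loss_pct > LOSS_ALERT_PCT or median_rtt > RTT_ALERT_MS:
        print(f"ALERT: loss={loss_pct:.1f}% median_rtt={median_rtt:.1f} ms")
    else:
        print(f"OK: loss={loss_pct:.1f}% median_rtt={median_rtt:.1f} ms")


if __name__ == "__main__":
    run_cycle()
```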

Thanks
Travis

From: NANOG  On Behalf Of 
Baldur Norddahl
Sent: Thursday, April 29, 2021 3:39 PM
To: nanog@nanog.org
Subject: link monitoring

Hello

We had a 100G link that started to misbehave and caused the customers to notice 
bad packet loss. The optical values are just fine but we had packet loss and 
latency. Interface shows FEC errors on one end and carrier transitions on the 
other end. But otherwise the link would stay up and our monitor system 
completely failed to warn about the failure. Had to find the bad link by 
traceroute (mtr) and observe where packet loss started.

The link was between a Juniper MX204 and Juniper ACX5448. Link length 2 meters 
using 2 km single mode SFP modules.

What is the best practice to monitor links to avoid this scenario? What
options do we have to do link monitoring? I am investigating BFD but I am 
unsure if that would have helped the situation.

Thanks,

Baldur




Re: link monitoring

2021-04-29 Thread Eric Kuhnke
If I may add one thing I forgot - this post reminded me. In the original
question I think it was probably a 100G CWDM4 short-distance link. When
monitoring a 100G coherent (QPSK, 16QAM, whatever) longer-distance link, be
absolutely sure to poll all of the SNMP OIDs for it, the same as if it were
a point-to-point microwave link.

Depending on exactly what line card and optic it is, it may behave somewhat
similarly to a faded or misaligned radio link under conditions related to
degradation of the fiber or the lasers. In particular I'm thinking of
coherent 100G linecards that can switch on the fly between 'low FEC' and
'high FEC' payload-vs-FEC ratios (much as an ACM-capable 18 or 23 GHz
band radio would), which should absolutely trigger an alarm. Also poll the
data for FEC decode stress percentage level, etc.
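
A sketch of the alarm logic that implies, with hypothetical poll functions
standing in for whatever OIDs the platform actually exposes:

```python
# Sketch of the alarm logic described above: raise an alarm whenever the
# polled FEC mode changes (e.g. 'low FEC' -> 'high FEC'), or whenever the
# FEC decode stress percentage crosses a threshold. poll_fec_mode() and
# poll_fec_stress_pct() are hypothetical stand-ins, not real OIDs.
import time

STRESS_ALARM_PCT = 20.0     # illustrative threshold, tune per platform


def watch_fec(poll_fec_mode, poll_fec_stress_pct, alarm, interval_s=60):
    last_mode = poll_fec_mode()
    while True:
        mode = poll_fec_mode()
        stress = poll_fec_stress_pct()
        if mode != last_mode:
            alarm(f"FEC mode changed from {last_mode} to {mode}")
            last_mode = mode
        if stress > STRESS_ALARM_PCT:
            alarm(f"FEC decode stress {stress:.1f}% over {STRESS_ALARM_PCT}%")
        time.sleep(interval_s)
```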

On Thu, Apr 29, 2021 at 2:37 PM Lady Benjamin Cannon of Glencoe, ASCE <
l...@6by7.net> wrote:

> We monitor light levels and FEC values on all links and have thresholds
> for early-warning and pre-failure analysis.
>
> Short answer is yes we see links lose packets before completely failing
> and for dozens of reasons that’s still a good thing, but you need to
> monitor every part of a resilient network.
>
> Ms. Lady Benjamin PD Cannon of Glencoe, ASCE
> 6x7 Networks & 6x7 Telecom, LLC
> CEO
> l...@6by7.net
> "The only fully end-to-end encrypted global telecommunications company
> in the world.”
>
> FCC License KJ6FJJ
>
> Sent from my iPhone via RFC1149.
>
> On Apr 29, 2021, at 2:32 PM, Eric Kuhnke  wrote:
>
> 
> The Junipers on both sides should have discrete SNMP OIDs that respond
> with a FEC stress value, or FEC error value. See blue highlighted part here
> about FEC. Depending on what version of JunOS you're running the MIB for it
> may or may not exist.
>
>
> https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST
>
> In other equipment sometimes it's found in a sub-tree of SNMP adjacent to
> optical DOM values. Once you can acquire and poll that value, set it up as
> a custom thing to graph and alert upon certain threshold values in your
> choice of NMS.
>
> Additionally signs of a failing optic may show up in some of the optical
> DOM MIB items you can poll:
> https://mibs.observium.org/mib/JUNIPER-DOM-MIB/
>
> It helps if you have some non-misbehaving similar linecards and optics
> which can be polled during custom graph/OID configuration, to establish a
> baseline 'no problem' value, which if exceeded will trigger whatever
> threshold value you set in your monitoring system.
>
> On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl 
> wrote:
>
>> Hello
>>
>> We had a 100G link that started to misbehave and caused the customers to
>> notice bad packet loss. The optical values are just fine but we had packet
>> loss and latency. Interface shows FEC errors on one end and carrier
>> transitions on the other end. But otherwise the link would stay up and our
>> monitor system completely failed to warn about the failure. Had to find the
>> bad link by traceroute (mtr) and observe where packet loss started.
>>
>> The link was between a Juniper MX204 and Juniper ACX5448. Link length 2
>> meters using 2 km single mode SFP modules.
>>
>> What is the best practice to monitor links to avoid this scenario? What
>> options do we have to do link monitoring? I am investigating BFD but I am
>> unsure if that would have helped the situation.
>>
>> Thanks,
>>
>> Baldur
>>
>>
>>


Re: link monitoring

2021-04-29 Thread Lady Benjamin Cannon of Glencoe, ASCE
We monitor light levels and FEC values on all links and have thresholds for
early-warning and pre-failure analysis.

Short answer is yes, we see links lose packets before completely failing, and
for dozens of reasons that's still a good thing, but you need to monitor every
part of a resilient network.

Ms. Lady Benjamin PD Cannon of Glencoe, ASCE
6x7 Networks & 6x7 Telecom, LLC 
CEO 
l...@6by7.net
"The only fully end-to-end encrypted global telecommunications company in the 
world.”

FCC License KJ6FJJ

Sent from my iPhone via RFC1149.

> On Apr 29, 2021, at 2:32 PM, Eric Kuhnke  wrote:
> 
> 
> The Junipers on both sides should have discrete SNMP OIDs that respond with a 
> FEC stress value, or FEC error value. See blue highlighted part here about 
> FEC. Depending on what version of JunOS you're running the MIB for it may or 
> may not exist.
> 
> https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST
> 
> In other equipment sometimes it's found in a sub-tree of SNMP adjacent to 
> optical DOM values. Once you can acquire and poll that value, set it up as a 
> custom thing to graph and alert upon certain threshold values in your choice 
> of NMS. 
> 
> Additionally signs of a failing optic may show up in some of the optical DOM 
> MIB items you can poll: https://mibs.observium.org/mib/JUNIPER-DOM-MIB/
> 
> It helps if you have some non-misbehaving similar linecards and optics which 
> can be polled during custom graph/OID configuration, to establish a baseline 
> 'no problem' value, which if exceeded will trigger whatever threshold value 
> you set in your monitoring system.
> 
>> On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl  
>> wrote:
>> Hello
>> 
>> We had a 100G link that started to misbehave and caused the customers to 
>> notice bad packet loss. The optical values are just fine but we had packet 
>> loss and latency. Interface shows FEC errors on one end and carrier 
>> transitions on the other end. But otherwise the link would stay up and our 
>> monitor system completely failed to warn about the failure. Had to find the 
>> bad link by traceroute (mtr) and observe where packet loss started.
>> 
>> The link was between a Juniper MX204 and Juniper ACX5448. Link length 2 
>> meters using 2 km single mode SFP modules.
>> 
>> What is the best practice to monitor links to avoid this scenario? What
>> options do we have to do link monitoring? I am investigating BFD but I am 
>> unsure if that would have helped the situation.
>> 
>> Thanks,
>> 
>> Baldur
>> 
>> 


Re: link monitoring

2021-04-29 Thread Eric Kuhnke
The Junipers on both sides should have discrete SNMP OIDs that respond with
a FEC stress value or FEC error value. See the blue highlighted part here
about FEC. Depending on what version of JunOS you're running, the MIB for it
may or may not exist.

https://kb.juniper.net/InfoCenter/index?page=content=KB36074=MX2008=LIST

In other equipment it's sometimes found in an SNMP sub-tree adjacent to the
optical DOM values. Once you can acquire and poll that value, set it up as a
custom metric to graph, and alert on it at certain threshold values in your
choice of NMS.

Additionally, signs of a failing optic may show up in some of the optical
DOM MIB items you can poll: https://mibs.observium.org/mib/JUNIPER-DOM-MIB/

It helps if you have some similar, non-misbehaving linecards and optics
which can be polled during custom graph/OID configuration, to establish a
baseline 'no problem' value which, if exceeded, will trigger whatever
threshold alert you set in your monitoring system.
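
As a minimal sketch of that poll-and-compare-to-baseline idea, using pysnmp;
the host, community, ifIndex, threshold and even the units are illustrative
assumptions, and the JUNIPER-DOM-MIB module has to be available to pysnmp's
MIB resolver (otherwise substitute the numeric OID):

```python
# Minimal sketch: poll one JUNIPER-DOM-MIB value via SNMPv2c and compare it
# against a baseline taken from a known-good optic. Host, community, ifIndex
# and the thresholds are illustrative assumptions, not values from the thread.
from pysnmp.hlapi import (
    getCmd, SnmpEngine, CommunityData, UdpTransportTarget,
    ContextData, ObjectType, ObjectIdentity,
)

HOST = "192.0.2.1"          # router to poll (example address)
COMMUNITY = "public"        # read-only community (assumption)
IF_INDEX = 535              # ifIndex of the 100G interface (assumption)
BASELINE_RX_DBM = -2.0      # 'no problem' value from a healthy optic
ALARM_MARGIN_DB = 3.0       # alert if we drift this far below baseline


def poll_rx_power_dbm():
    """Fetch jnxDomCurrentRxLaserPower for one ifIndex.

    The MIB reports power in hundredths of a dBm on the platforms I have
    seen; check the MIB description for your release.
    """
    error_indication, error_status, _, var_binds = next(
        getCmd(
            SnmpEngine(),
            CommunityData(COMMUNITY, mpModel=1),      # SNMPv2c
            UdpTransportTarget((HOST, 161), timeout=2, retries=1),
            ContextData(),
            ObjectType(ObjectIdentity("JUNIPER-DOM-MIB",
                                      "jnxDomCurrentRxLaserPower",
                                      IF_INDEX)),
        )
    )
    if error_indication or error_status:
        raise RuntimeError(f"SNMP poll failed: {error_indication or error_status}")
    return int(var_binds[0][1]) / 100.0


if __name__ == "__main__":
    rx = poll_rx_power_dbm()
    if rx < BASELINE_RX_DBM - ALARM_MARGIN_DB:
        print(f"ALERT: rx power {rx:.2f} dBm below baseline {BASELINE_RX_DBM:.2f} dBm")
    else:
        print(f"OK: rx power {rx:.2f} dBm")
```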

On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl 
wrote:

> Hello
>
> We had a 100G link that started to misbehave and caused the customers to
> notice bad packet loss. The optical values are just fine but we had packet
> loss and latency. Interface shows FEC errors on one end and carrier
> transitions on the other end. But otherwise the link would stay up and our
> monitor system completely failed to warn about the failure. Had to find the
> bad link by traceroute (mtr) and observe where packet loss started.
>
> The link was between a Juniper MX204 and Juniper ACX5448. Link length 2
> meters using 2 km single mode SFP modules.
>
> What is the best practice to monitor links to avoid this scenario? What
> options do we have to do link monitoring? I am investigating BFD but I am
> unsure if that would have helped the situation.
>
> Thanks,
>
> Baldur
>
>
>


Re: link monitoring

2021-04-29 Thread Pete Rohrman

I'll sell you my Solar Winds license - cheap!

Pete Rohrman
Stage2 Support
212 497 8000, Opt. 2

On 4/29/21 4:39 PM, Baldur Norddahl wrote:

Hello

We had a 100G link that started to misbehave and caused the customers 
to notice bad packet loss. The optical values are just fine but we had 
packet loss and latency. Interface shows FEC errors on one end and 
carrier transitions on the other end. But otherwise the link would 
stay up and our monitor system completely failed to warn about the 
failure. Had to find the bad link by traceroute (mtr) and observe 
where packet loss started.


The link was between a Juniper MX204 and Juniper ACX5448. Link length 
2 meters using 2 km single mode SFP modules.


What is the best practice to monitor links to avoid this scenario? 
What options do we have to do link monitoring? I am investigating BFD 
but I am unsure if that would have helped the situation.


Thanks,

Baldur




link monitoring

2021-04-29 Thread Baldur Norddahl
Hello

We had a 100G link that started to misbehave and caused the customers to
notice bad packet loss. The optical values are just fine but we had packet
loss and latency. Interface shows FEC errors on one end and carrier
transitions on the other end. But otherwise the link would stay up and our
monitor system completely failed to warn about the failure. Had to find the
bad link by traceroute (mtr) and observe where packet loss started.

The link was between a Juniper MX204 and Juniper ACX5448. Link length 2
meters using 2 km single mode SFP modules.

What is the best practice to monitor links to avoid this scenario? What
options do we have to do link monitoring? I am investigating BFD but I am
unsure if that would have helped the situation.

Thanks,

Baldur


Re: link monitoring and BFD in SDN networks

2015-01-22 Thread Sudeep Khuraijam
Gents, 

We need to separate the context of fast reroute via a control-plane topology
map from local link protection with OAM at the MAC/PHY sub-layer, and the
time frames at which each is relevant.
There are efforts going on at the media level, but there are also current
solutions that are media- and encapsulation-independent which need to be
juxtaposed with the SDN paradigm.

Going back to the original question that Glen posed, it is more a question
of implementation complexity.
The more state machines that are pushed down to the nodes in an SDN network,
away from the control plane, the more cost and barriers to entry for OEM
products, interop issues, etc.
Now looking squarely at BFD, the popular application is bootstrapping BFD
link state into the routing topology and peer pathway, which may traverse
multiple nodes/switches/media and encapsulations.
BFD is a next-hop communication failure detection mechanism which may itself
rely (bootstrap) on the routing topology to find alternate paths; it is
therefore a larger time-frame event than PHY/MAC sub-layer protection, and is
media/encapsulation independent. And the fact that such a state change has a
high probability of triggering a topology-wide or network-wide event (if not,
there is less need to run BFD) makes it controller-centric state on which the
controller needs to bootstrap its routing services. Link-layer OAM, on the
other hand, may be a mechanism that prevents the BFD event from being
triggered at all.

Further, BFD enables faster end-to-end connectivity/reachability detection
than hold-down timers allow on hardware that does not support OAM features.
Finally, the scale at which BFD is used is far smaller than the number of
links. I.e. if you have a 10K-port network, you are likely using BFD on a few
tens of links maybe (for datacenters), and the timescale is typically in the
hundreds of milliseconds, which any control-plane software module can handle
at large scale; it should be run just like any other hello protocol for
routing services. Link-layer state machines on the nodes, on the other hand,
operate in the sub-1ms timeframe. It is an overhead, but an insignificantly
small tax.
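
A back-of-the-envelope version of that scale argument (session count and
interval are illustrative assumptions):

```python
# Even a generous number of BFD sessions at a few-hundred-millisecond
# transmit interval is a trivial packet rate for control-plane software.
def bfd_tx_packets_per_second(sessions, tx_interval_ms):
    """Control packets a node transmits per second across all its BFD sessions."""
    return sessions * (1000.0 / tx_interval_ms)

print(bfd_tx_packets_per_second(50, 300))   # 50 sessions @ 300 ms -> ~167 pps sent
                                            # (and a similar rate received)
```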

Cheers,

Sudeep Khuraijam

On 1/21/15, 3:14 PM, Nitin Sharma nitin...@gmail.com wrote:

On Wed, Jan 21, 2015 at 12:22 PM, Ronald van der Pol 
ronald.vander...@rvdp.org wrote:

 On Mon, Jan 19, 2015 at 22:55:04 +, Dave Bell wrote:

  http://www.rvdp.org/presentations/SC11-SRS-8021ag.pdf;

 The 802.1ag code used is open source and available on:
 https://svn.surfnet.nl/trac/dot1ag-utils/

  Of course if you want fast failover, you need to send packets very
  rapidly. Every 250ms is not unreasonable. This is going to cause the
  control plane to get very chatty. Typically on high end routers,
  processes such as BFD are actually ran on line cards as opposed to on
  the routing engine. When a failure is detected this reports up into
  the control plane to trigger a reconvergence event. I see no reason
  why this couldn't occur using SDN.

 Exactly. This is something you want to do in hardware, especially
 if you want to do fast reroute with the OpenFlow group table.
 Problem is that many 1U OpenFlow switches do not support 802.1ag.
 We made the prototype mentioned above to show and investigate the
 benefits of OAM. The closed open networking foundation is supposed
 to be working on this, but I don't know the status because their
 mailing lists are closed.

 In SDN/OpenFlow I think a couple of things are needed:
 - configure 802.1ag on the interfaces (via ofconfig?)
 - configure OpenFlow paths (e.g. primary and backup) and also create
   forwarding entries for 802.1ag datagrams along those paths
 - configure fast reroute with the group table (ofconfig?)


Fast reroute (in the form of fast failover) is supported in the OF spec
(1.3+), using Group Tables.


 By doing this detection and failover are handled in hardware.

 rvdp


Data plane reachability could be performed in SDN/OpenFlow networks using
BFD/ Ethernet CFM (802.1ag), Y.1731, preferably on silicon if there is
support (which i believe every silicon vendor should work on). It would
not
be ideal if these OAM frames are forwarded to a central controller. Today
-
I think it is done on some form of software layer (ovs, sdks) that reside
on these OF switches.



Re: link monitoring and BFD in SDN networks

2015-01-21 Thread Ronald van der Pol
On Mon, Jan 19, 2015 at 22:55:04 +, Dave Bell wrote:

 http://www.rvdp.org/presentations/SC11-SRS-8021ag.pdf;

The 802.1ag code used is open source and available on:
https://svn.surfnet.nl/trac/dot1ag-utils/

 Of course if you want fast failover, you need to send packets very
 rapidly. Every 250ms is not unreasonable. This is going to cause the
 control plane to get very chatty. Typically on high end routers,
 processes such as BFD are actually ran on line cards as opposed to on
 the routing engine. When a failure is detected this reports up into
 the control plane to trigger a reconvergence event. I see no reason
 why this couldn't occur using SDN.

Exactly. This is something you want to do in hardware, especially
if you want to do fast reroute with the OpenFlow group table.
Problem is that many 1U OpenFlow switches do not support 802.1ag.
We made the prototype mentioned above to show and investigate the
benefits of OAM. The closed open networking foundation is supposed
to be working on this, but I don't know the status because their
mailing lists are closed.

In SDN/OpenFlow I think a couple of things are needed:
- configure 802.1ag on the interfaces (via ofconfig?)
- configure OpenFlow paths (e.g. primary and backup) and also create
  forwarding entries for 802.1ag datagrams along those paths
- configure fast reroute with the group table (ofconfig?)
By doing this, detection and failover are handled in hardware.

rvdp


Re: link monitoring and BFD in SDN networks

2015-01-21 Thread Nitin Sharma
On Wed, Jan 21, 2015 at 12:22 PM, Ronald van der Pol 
ronald.vander...@rvdp.org wrote:

 On Mon, Jan 19, 2015 at 22:55:04 +, Dave Bell wrote:

  http://www.rvdp.org/presentations/SC11-SRS-8021ag.pdf;

 The 802.1ag code used is open source and available on:
 https://svn.surfnet.nl/trac/dot1ag-utils/

  Of course if you want fast failover, you need to send packets very
  rapidly. Every 250ms is not unreasonable. This is going to cause the
  control plane to get very chatty. Typically on high end routers,
  processes such as BFD are actually ran on line cards as opposed to on
  the routing engine. When a failure is detected this reports up into
  the control plane to trigger a reconvergence event. I see no reason
  why this couldn't occur using SDN.

 Exactly. This is something you want to do in hardware, especially
 if you want to do fast reroute with the OpenFlow group table.
 Problem is that many 1U OpenFlow switches do not support 802.1ag.
 We made the prototype mentioned above to show and investigate the
 benefits of OAM. The closed open networking foundation is supposed
 to be working on this, but I don't know the status because their
 mailing lists are closed.

 In SDN/OpenFlow I think a couple of things are needed:
 - configure 802.1ag on the interfaces (via ofconfig?)
 - configure OpenFlow paths (e.g. primary and backup) and also create
   forwarding entries for 802.1ag datagrams along those paths
 - configure fast reroute with the group table (ofconfig?)


Fast reroute (in the form of fast failover) is supported in the OF spec
(1.3+), using Group Tables.


 By doing this detection and failover are handled in hardware.

 rvdp


Data-plane reachability checks could be performed in SDN/OpenFlow networks
using BFD / Ethernet CFM (802.1ag) / Y.1731, preferably in silicon if there is
support (which I believe every silicon vendor should work on). It would not
be ideal if these OAM frames were forwarded to a central controller. Today,
I think it is done in some form of software layer (OVS, SDKs) that resides
on these OF switches.
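
As a sketch of what installing such a fast-failover group could look like,
assuming the Ryu controller framework and OpenFlow 1.3 (port numbers and
group id are illustrative, not from the thread):

```python
# Sketch of OpenFlow 1.3 fast-failover using a group-table entry. The switch
# itself (not the controller) shifts traffic from the primary to the backup
# port when the primary's liveness (link state / OAM) goes down.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class FastFailoverSketch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        primary_port, backup_port, group_id = 1, 2, 1   # assumptions

        # One bucket per path; a bucket is only used while its watch_port is
        # considered live, so the failover happens in the dataplane.
        buckets = [
            parser.OFPBucket(watch_port=primary_port,
                             actions=[parser.OFPActionOutput(primary_port)]),
            parser.OFPBucket(watch_port=backup_port,
                             actions=[parser.OFPActionOutput(backup_port)]),
        ]
        dp.send_msg(parser.OFPGroupMod(dp, ofp.OFPGC_ADD, ofp.OFPGT_FF,
                                       group_id, buckets))

        # Point traffic arriving on port 3 at the fast-failover group.
        match = parser.OFPMatch(in_port=3)
        inst = [parser.OFPInstructionActions(
            ofp.OFPIT_APPLY_ACTIONS, [parser.OFPActionGroup(group_id)])]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=10,
                                      match=match, instructions=inst))
```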


Re: link monitoring and BFD in SDN networks

2015-01-19 Thread Dave Bell
BFD etc. aim to prove there is end-to-end connectivity between two
points, not just that all links along the path are up. All ports could
be up but end-to-end connectivity broken, for example due to a misconfigured
VLAN across an L2 network. Sending some kind of packet across the
network is pretty much the only way to guarantee reachability.

The OpenFlow protocol in particular has a way to instruct a switch to
send a frame out of an interface. By default, an OpenFlow switch will
forward any frame it has received and doesn't know what to do with back
to the controller. This means someone could write an OAM protocol that
will work via OpenFlow. A quick google for 'OpenFlow OAM' brought me this
link, which has someone who has done just that:
http://www.rvdp.org/presentations/SC11-SRS-8021ag.pdf

Of course if you want fast failover, you need to send packets very
rapidly. Every 250ms is not unreasonable. This is going to cause the
control plane to get very chatty. Typically on high-end routers,
processes such as BFD are actually run on line cards as opposed to on
the routing engine. When a failure is detected, this is reported up into
the control plane to trigger a reconvergence event. I see no reason
why this couldn't occur using SDN.
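
For a feel of the numbers, detection time for hello-style probing is
roughly the probe interval times the number of misses tolerated
(values below are illustrative):

```python
# Hello-style protocols (BFD, 802.1ag CCMs) declare a failure after some
# number of consecutive probes are missed, so detection time is roughly
# interval x multiplier.
def detection_time_ms(interval_ms, missed_before_down):
    return interval_ms * missed_before_down

print(detection_time_ms(250, 3))   # 250 ms probes, 3 misses -> 750 ms to detect
print(detection_time_ms(50, 3))    #  50 ms probes, 3 misses -> 150 ms to detect
```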

Regards,
Dave

On 19 January 2015 at 22:01, Glen Kent glen.k...@gmail.com wrote:
 Hi,

 Routers connected back to back often rely on BFD for link failures. It's
 certainly possible that there is a switch between two routers and hence a
 link down event on one side is not visible to the other side. So, you run
 some sort of an OAM protocol on the two routers so that they can detect
 link flaps/failures.

 How will this happen in SDN networks where there is no control plane on the
 routers. Will the routers be sending a state of all their links to a
 central controller who will then detect that a link has gone down. This
 just doesn't sound good. I am presuming that some sort of control plane
 will always be required.

 Any pointers here?

 Is there any other reason other than link events for which we would need a
 control plane on the routers in SDN?

 Thanks,
 Glen


link monitoring and BFD in SDN networks

2015-01-19 Thread Glen Kent
Hi,

Routers connected back to back often rely on BFD for link failures. It's
certainly possible that there is a switch between the two routers, and hence a
link-down event on one side is not visible to the other side. So you run
some sort of OAM protocol on the two routers so that they can detect
link flaps/failures.

How will this happen in SDN networks where there is no control plane on the
routers? Will the routers be sending the state of all their links to a
central controller, which will then detect that a link has gone down? This
just doesn't sound good. I am presuming that some sort of control plane
will always be required.

Any pointers here?

Is there any reason other than link events for which we would need a
control plane on the routers in SDN?

Thanks,
Glen