Hi Raj,

Could you please review the blueprint below and provide your feedback?

# 1. Introduction
Currently, the vRouter, VMI and Virtual Network UVEs provide packet loss
details (since Contrail 3.2). Because no Python ruleset exists for them,
no packet loss alarm is triggered. We propose to create new packet loss
alarms in OpServer for Virtual Machine Interface (VMI), vRouter and
Virtual Network (VN) using dropstats statistics. The Contrail GUI can use
the OpServer API to pull packet loss statistics.

# 2. Problem statement
Currently, vRouter exposes packet loss counters through the UVEs listed
below, but because no Python ruleset exists for them, no packet loss alarm
is triggered. The agent structures for each UVE are:
1. VrouterStatsAgent for vRouter
2. UveVirtualNetworkAgent for Virtual Network
3. UveVMInterfaceAgent for Virtual Machine Interface
# 3. Proposed solution
Create new alarms to detect packet loss. This will help detect failures
early, so that they can be rectified before they cause major issues.
The dropstats percentage will be calculated using the formula:
drop_pkts_percent = drop_pkts / (in_pkts + out_pkts), expressed as a
percentage.
If drop_pkts_percent > 1%, a notification will be raised and a response
will be triggered. The notification and the response are still to be
defined.
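As a minimal sketch of the calculation above, assuming the three counters are available as plain integers (the function and constant names are illustrative and not part of Contrail, and returning 0% when there is no traffic is an assumption):

```python
DROP_PERCENT_THRESHOLD = 1.0  # alarm threshold from the text: 1%


def drop_pkts_percent(drop_pkts, in_pkts, out_pkts):
    """Return dropped packets as a percentage of total traffic."""
    total = in_pkts + out_pkts
    if total == 0:
        # No traffic seen: report 0% rather than dividing by zero (assumption).
        return 0.0
    return 100.0 * drop_pkts / total


def packet_loss_exceeded(drop_pkts, in_pkts, out_pkts):
    """True when the drop percentage is strictly above the 1% threshold."""
    return drop_pkts_percent(drop_pkts, in_pkts, out_pkts) > DROP_PERCENT_THRESHOLD
```

With 200 packets in and 300 out, 5 drops give exactly 1% and do not raise the notification, while 6 drops (1.2%) do.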
Use cases: we identified 7 fields in the dropstats output for which alarms
need to be created, for Virtual Machine Interface, vRouter and Physical
Interface.
For vRouter:
VrouterStatsAgent.exception_packets > 200

For Virtual Machine Interface, vRouter and Physical Interface:
1. Trap No IF > 50
   drop_stats_1h.ds_trap_no_if > 50
   This counter is incremented when vRouter cannot find the interface on
   which to trap packets to the vRouter agent. It should not happen in a
   working system.
2. IF Drop > 50
   drop_stats_1h.ds_interface_drop > 50
   This counter indicates packets that are dropped in the interface layer.
   It typically increases when interface settings are wrong.
3. Flow No Memory > 50
   drop_stats_1h.ds_flow_no_memory > 50
   This counter is incremented when the flow block does not have enough
   memory to perform internal operations.
4. Discard > 50 (or 100)
   drop_stats_1h.ds_discard > 50
   This counter tracks packets that hit a discard next hop. For various
   reasons interpreted by the agent, and during some transient conditions,
   a route can point to a discard next hop. Packets that hit that route
   are dropped.
5. Mcast Clone Fail > 30
   drop_stats_1h.ds_mcast_clone_fail > 30
   This counter is incremented when vRouter is not able to replicate a
   packet for flooding.
6. Invalid NH > 30
   drop_stats_1h.ds_invalid_nh > 30
   This counter tracks the number of packets that hit a next hop that was
   not in a usable state (usually in transient conditions), hit a next hop
   that was not expected, or found no next hop where one was expected.
   Such increments should be rare and should not be continuous.
7. Rewrite Fail > 30
   drop_stats_1h.ds_rewrite_fail > 30
   This counter tracks the number of times vRouter was not able to write
   next hop rewrite data to the packet.
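
The per-counter thresholds listed above could be evaluated with a simple lookup table. This is only a sketch: the dictionary layout, the function name and the sample values are assumptions, the ds_discard limit uses the lower of the two proposed values, and the ds_rewrite_fail limit is taken from its heading.

```python
# Thresholds from the list above, keyed by drop_stats_1h field name.
DROP_STATS_THRESHOLDS = {
    "ds_trap_no_if": 50,
    "ds_interface_drop": 50,
    "ds_flow_no_memory": 50,
    "ds_discard": 50,
    "ds_mcast_clone_fail": 30,
    "ds_invalid_nh": 30,
    "ds_rewrite_fail": 30,
}


def exceeded_counters(drop_stats_1h):
    """Return sorted (field, value, threshold) tuples for each breached counter."""
    breaches = []
    for field, threshold in DROP_STATS_THRESHOLDS.items():
        value = drop_stats_1h.get(field, 0)  # a missing counter is treated as 0
        if value > threshold:
            breaches.append((field, value, threshold))
    return sorted(breaches)


# Hypothetical sample values, not real dropstats output.
sample = {"ds_trap_no_if": 0, "ds_discard": 75, "ds_invalid_nh": 30}
print(exceeded_counters(sample))
```

Because the comparison is strict, a counter sitting exactly at its limit (ds_invalid_nh = 30 in the sample) is not reported; only ds_discard is.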

## 3.1 Alternatives considered
Users can run the dropstats command manually and inspect the output, or
check the values returned by the Analytics APIs manually.
## 3.2 API schema changes
No schema changes.
## 3.3 User workflow impact
Alarms are raised automatically when the Python-coded rules are satisfied.
## 3.4 UI changes
No UI changes.
## 3.5 Notification impact
TBD 
# 4. Implementation
## 4.1 Work items
Changes in OpServer:
1. Add new plugins to OpServer for the packet loss alarms.
2. Install the Python plugin for each alarm on the Analytics node.
3. Place the plugins in the folder controller/src/opserver/plugins/.
4. Implement each plugin in its main.py (for example
   alarm_packet_loss/main.py); the Python-coded rules are added to the
   main.py file of the corresponding plugin.
5. Add the new plugins to the SConscript.
6. The Python-coded rules check the dropstats values against the values in
   the message table of the corresponding UVE for Virtual Machine
   Interface, vRouter and Virtual Network.
7. Create entry points in setup.py for the corresponding plugins.
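
A plugin's main.py could look roughly like the sketch below. The real plugin would derive from the alarm base class shipped with OpServer; since that interface is not described here, a stand-in class AlarmBaseStub is used. The class names, the UVE layout (drop_stats_1h nested under VrouterStatsAgent), the return format and the entry-point group and entry names are all assumptions for illustration.

```python
class AlarmBaseStub(object):
    """Stand-in for the OpServer alarm base class (hypothetical interface)."""

    def __init__(self, description):
        self.description = description


class PacketLossVrouter(AlarmBaseStub):
    """Hypothetical rule: raise when a vRouter drop counter passes its limit."""

    # Subset of the thresholds proposed in section 3.
    THRESHOLDS = {"ds_trap_no_if": 50, "ds_interface_drop": 50, "ds_invalid_nh": 30}

    def __init__(self):
        AlarmBaseStub.__init__(self, "Packet loss detected on vRouter")

    def __call__(self, uve_key, uve_data):
        """Return a list of breached conditions, or None when no alarm applies."""
        stats = uve_data.get("VrouterStatsAgent", {}).get("drop_stats_1h", {})
        breaches = [
            "%s: drop_stats_1h.%s = %d > %d" % (uve_key, field, stats[field], limit)
            for field, limit in sorted(self.THRESHOLDS.items())
            if stats.get(field, 0) > limit
        ]
        return breaches or None


# Entry-point mapping that setup.py would pass to setup(entry_points=...).
# The group and entry names are assumptions for illustration.
ENTRY_POINTS = {
    "contrail.analytics.alarms": [
        "ObjectVRouter = opserver.plugins.alarm_packet_loss.main:PacketLossVrouter",
    ],
}
```

Keeping the rule as a callable that receives the UVE key and its data keeps each plugin small, so adding an alarm for another UVE type means adding another class and another entry point.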
# 5. Performance and scaling impact
No impact on performance.
## 5.2 Forwarding performance
# 6. Upgrade
We can add more alarms depending on the requirement.
# 7. Deprecations
####If this feature deprecates any older feature or API then list it here.

# 8. Dependencies
####Describe dependent features or components.

# 9. Testing
## 9.1 Unit tests
New tests will be added.
## 9.2 Dev tests
## 9.3 System tests

# 10. Documentation Impact

# 11. References


Thanks and Regards,
Pavani Addanki