Hello openflowplugin-dev,

I found a bug and opened it here: https://bugs.opendaylight.org/show_bug.cgi?id=6625

I raised it as critical because it looks alarming for any large-scale production environment, where flapping networks are unfortunately common.
When an RPC is called to add, remove, or update a flow on a switch, and the switch goes down while the request is in flight, the RPC never returns a result, because it is waiting for the underlying asynchronous request to complete. Since that request never completes once the device is gone, the thread is leaked, along with the file descriptor (FD) of the RESTCONF request that triggered the RPC.

The current fix [1] sets a failed future once a timeout of 2000 milliseconds is reached. This way the RPC returns and the resources are freed. Regarding that timeout: I saw that a timeout can be set on the RequestContext, but it had no effect.

I think this issue hides a deeper problem with resource management and with the global tracking/lifecycle of the requests in flight for a given switch: when the switch goes down, all ongoing requests should be closed.

Please provide feedback on this.

[1]: https://git.opendaylight.org/gerrit/#/c/45112/
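To make the two ideas above concrete (failing a pending request after a timeout, and closing every in-flight request when the switch disappears), here is a minimal sketch. It is not the patch in [1] and does not use the plugin's own classes: it relies only on java.util.concurrent, and the names PendingRequestsSketch, track, and onDeviceDisconnected are hypothetical, chosen for illustration.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical per-switch tracker; none of these names come from openflowplugin.
final class PendingRequestsSketch {
    private final Set<CompletableFuture<?>> pending = ConcurrentHashMap.newKeySet();
    private final ScheduledExecutorService scheduler;

    PendingRequestsSketch(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /** Registers a request and fails it if no reply arrives within timeoutMs. */
    <T> CompletableFuture<T> track(CompletableFuture<T> request, long timeoutMs) {
        pending.add(request);
        ScheduledFuture<?> watchdog = scheduler.schedule(() -> {
            request.completeExceptionally(
                new TimeoutException("no reply after " + timeoutMs + " ms"));
        }, timeoutMs, TimeUnit.MILLISECONDS);
        // Whatever completes the request first, stop the watchdog and forget the request.
        request.whenComplete((result, error) -> {
            watchdog.cancel(false);
            pending.remove(request);
        });
        return request;
    }

    /** Fails every request still in flight, so that no caller waits forever. */
    void onDeviceDisconnected() {
        for (CompletableFuture<?> request : pending) {
            request.completeExceptionally(new IllegalStateException("device disconnected"));
        }
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        PendingRequestsSketch tracker = new PendingRequestsSketch(scheduler);
        try {
            // Case 1: the switch goes down while the request is in flight.
            CompletableFuture<String> addFlow =
                tracker.track(new CompletableFuture<String>(), 2000);
            tracker.onDeviceDisconnected();
            System.out.println("failed on disconnect: " + addFlow.isCompletedExceptionally());

            // Case 2: no disconnect event is seen, so the 2000 ms timeout fires.
            CompletableFuture<String> removeFlow =
                tracker.track(new CompletableFuture<String>(), 2000);
            try {
                removeFlow.join();
            } catch (CompletionException e) {
                System.out.println("failed on timeout: " + e.getCause());
            }
        } finally {
            scheduler.shutdown();
        }
    }
}
```

In this sketch the timeout is only a safety net. The disconnect hook is what releases the waiting callers immediately, which is the lifecycle tracking the paragraph above asks for.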
_______________________________________________ openflowplugin-dev mailing list [email protected] https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
