Two new findings:
1) MDS does not set an explicit RCVBUF size for its TIPC data socket. This
means it is up to the deployment to tune the kernel parameter
net.core.rmem_default so that it suits MDS. Problems have been found on systems
where net.core.rmem_default is 64KB. With TIPC segmentation/reassembly at 66000
bytes, it is probably wise to raise the receive buffer of the MDS TIPC data
socket to roughly twice the TIPC segmentation/reassembly limit, i.e. 128KB.
If the system default is higher, it should be used.
2) The MDS receive thread grabs the global lock after poll() but before
recvfrom(). This means the receive thread can be blocked by a sender if there
is outgoing link congestion. As far as I can tell there is no need to hold the
lock between poll() and recvfrom(); it can be taken after recvfrom() instead.
That shortens the time the MDS receive thread is blocked from emptying the TIPC
socket buffer, and thus reduces the risk of overloading it.
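A minimal sketch of both findings, not the actual MDS code. It assumes a plain
datagram socket descriptor; `global_lock`, `handle_message()`,
`ensure_min_rcvbuf()` and `recv_then_lock()` are hypothetical names used only
for illustration. The first helper keeps a larger system default and otherwise
raises SO_RCVBUF to the requested floor; the second reads the datagram before
taking the lock.

```c
#include <poll.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumed floor: about twice the 66000-byte TIPC reassembly limit. */
#define MIN_RCVBUF_BYTES (128 * 1024)

/* Hypothetical stand-in for the MDS global lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for MDS message processing: just counts messages. */
static unsigned long messages_handled;

static void handle_message(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    messages_handled++;
}

/* Raise SO_RCVBUF to at least min_bytes, keeping a larger system default.
 * Returns the size the kernel reports afterwards, or -1 on error.
 * On Linux the request is capped by net.core.rmem_max, so callers
 * should check the returned value. */
static int ensure_min_rcvbuf(int fd, int min_bytes)
{
    int cur = 0;
    socklen_t len = sizeof(cur);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &cur, &len) < 0)
        return -1;
    if (cur >= min_bytes)
        return cur;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &min_bytes, sizeof(min_bytes)) < 0)
        return -1;
    len = sizeof(cur);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &cur, &len) < 0)
        return -1;
    return cur;
}

/* Wait for one datagram and read it WITHOUT holding the lock, then take
 * the lock only while processing it.
 * Returns bytes read, 0 on timeout, -1 on error. */
static ssize_t recv_then_lock(int fd, void *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready <= 0)
        return ready;

    ssize_t n = recvfrom(fd, buf, cap, 0, NULL, NULL);
    if (n < 0)
        return -1;

    pthread_mutex_lock(&global_lock);
    handle_message(buf, (size_t)n);
    pthread_mutex_unlock(&global_lock);
    return n;
}
```

With this ordering a sender that holds the lock during link congestion delays
only the processing step; the kernel socket buffer is still drained by
recvfrom().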
---
**[tickets:#654] MDS improvements**
**Status:** review
**Created:** Wed Dec 11, 2013 10:26 AM UTC by Hans Feldt
**Last Updated:** Fri Dec 13, 2013 12:42 PM UTC
**Owner:** Hans Feldt
Identified shortcomings:
- MDS does not use the segmentation/reassembly support in the underlying
transport protocol. For example, TIPC accepts messages up to 66000 bytes.
- The built-in segmentation/reassembly is unreliable: lost fragments are
not retransmitted and the complete message is dropped (without the user's
knowledge).
- In TIPC mode DEST_DROPPABLE is not used at all. This means that messages can
be silently dropped at a receiving node under congestion.
Suggested improvements:
1) Introduce a variable fragmentation limit when sending messages. This needs
to be based on information received at service discovery. If the sender is
"old", use the classic 1400 byte limit. If the sender is new, first use TIPC's
segmentation/reassembly and then the MDS one.
2) Configure DEST_DROPPABLE=False and use returned messages for diagnostics (as
a first step). That means only for logging purposes and not for retransmission
(which is not possible since messages are not stored in a send queue).
I have working prototype patches for both. Will send them out shortly.
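To illustrate suggestion 1, here is a rough sketch of a per-peer fragmentation
limit. It is not taken from the prototype patches: `struct peer_info`, its
`supports_large_frags` flag, and both function names are hypothetical, and the
sketch assumes that service discovery tells us whether the other side is "old"
or new. The two limits are the 1400 and 66000 byte figures from this ticket,
and per-fragment header overhead is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

#define MDS_CLASSIC_FRAG_LIMIT 1400u  /* classic MDS limit, from the ticket */
#define TIPC_MAX_MSG_BYTES     66000u /* TIPC message limit, from the ticket */

/* Hypothetical per-peer data learned at service discovery. */
struct peer_info {
    bool supports_large_frags; /* false for an "old" peer */
};

/* Pick the MDS fragmentation limit for messages sent to this peer. */
static size_t frag_limit_for(const struct peer_info *peer)
{
    return peer->supports_large_frags ? TIPC_MAX_MSG_BYTES
                                      : MDS_CLASSIC_FRAG_LIMIT;
}

/* Number of MDS fragments (separate sends) needed for len payload bytes.
 * Ceiling division; an empty message still takes one send. */
static size_t mds_fragment_count(size_t len, size_t limit)
{
    if (len == 0)
        return 1;
    return (len + limit - 1) / limit;
}
```

Under these assumptions a 100000-byte message takes 72 MDS fragments towards
an old peer and 2 towards a new one; in the second case TIPC does the rest of
the segmentation.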
Using TIPC segmentation/reassembly gives a number of advantages:
- reduced risk of link congestion on sending node since TIPC counts messages
not bytes
- reliable transport of large messages, since TIPC handles retransmission
- possibly improved characteristics due to implicit use of Ethernet jumbo frames
Long term we should consider if MDS segmentation/reassembly can be removed.
Sending large messages should really be using stream sockets.