Re: syncrepl multicast MMR

2015-02-09 Thread Howard Chu

Andrew Findlay wrote:
> On Sun, Feb 08, 2015 at 12:52:40PM +, Howard Chu wrote:
>
>> Been thinking this would be worth trying for a while now. Set a
>> config option for syncprov to send Persist messages to a multicast
>> group instead of the original TCP session. All the consumers would
>> also join the group and listen for updates. This would also exercise
>> the cldap:// support in libldap.
>
> It certainly makes sense to have the network do more of the work
> where it can. This would be particularly valuable in the high fan-out
> case.
>
> Encryption and message signing need some thought - this is usually
> harder to get right in datagram protocols than in streams.
>
> While we are talking datagrams and multicast, have you looked at Fountain Codes?
> It seems to me that they would be an ideal way to initialise a large set
> of replica servers. They could also be used in the persist update case,
> avoiding the need for any sort of back-channel.

Interesting reading. Seems a bit of overkill to me though; that's
designed for multicast to millions of subscribers where a back-channel
isn't feasible. Syncrepl would never be used with such a high fanout,
and we already have the back-channel anyway, so why not keep using it?

> For those who have not met them, Fountain Codes allow you to broadcast
> large datasets to an unknown number of receivers over lossy channels.
> If well designed, each receiver needs to collect any randomly-chosen
> subset of datagrams adding up to a few percent more bytes than the
> source data. One such code is described in RFC5053, though there
> appear to be patent issues to be considered.
>
> Andrew




--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/



Re: syncrepl multicast MMR

2015-02-09 Thread Andrew Findlay
On Sun, Feb 08, 2015 at 12:52:40PM +, Howard Chu wrote:

> Been thinking this would be worth trying for a while now. Set a
> config option for syncprov to send Persist messages to a multicast
> group instead of the original TCP session. All the consumers would
> also join the group and listen for updates. This would also exercise
> the cldap:// support in libldap.

It certainly makes sense to have the network do more of the work
where it can. This would be particularly valuable in the high fan-out
case.
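
To make the consumer side of the proposal concrete, here is a minimal sketch of joining a multicast group and listening for datagrams, using plain Python sockets rather than libldap. The group address 239.1.2.3, the port and the helper names are assumptions for illustration only; nothing here reflects an actual syncprov option or the cldap:// code path.

```python
import socket
import struct


def build_mreq(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq: multicast group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))


def open_listener(group: str, port: int) -> socket.socket:
    """Open a UDP socket bound to `port` and join `group` (both hypothetical values)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, build_mreq(group))
    return sock


if __name__ == "__main__":
    # Every consumer would run something like this and read updates with recvfrom().
    listener = open_listener("239.1.2.3", 3890)
    data, sender = listener.recvfrom(65535)
    print(len(data), "bytes from", sender)
```

With this shape the provider sends each update once and the network delivers it to every group member, which is where the fan-out saving comes from.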

Encryption and message signing need some thought - this is usually
harder to get right in datagram protocols than in streams.

While we are talking datagrams and multicast, have you looked at Fountain Codes?
It seems to me that they would be an ideal way to initialise a large set
of replica servers. They could also be used in the persist update case,
avoiding the need for any sort of back-channel.

For those who have not met them, Fountain Codes allow you to broadcast
large datasets to an unknown number of receivers over lossy channels.
If well designed, each receiver needs to collect any randomly-chosen
subset of datagrams adding up to a few percent more bytes than the
source data. One such code is described in RFC5053, though there
appear to be patent issues to be considered.
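
For readers who want to see the idea in miniature, the following is a toy LT-style fountain code, assuming a uniform degree distribution chosen purely for brevity. It is not the Raptor code specified in RFC 5053, and all function names are made up for this sketch. Each datagram carries a seed plus the XOR of a seed-selected subset of source blocks; the receiver replays the seed to learn the subset and recovers blocks by peeling.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data: bytes, size: int) -> list:
    """Split data into equal-sized blocks, zero-padding the last one."""
    padded = data.ljust(-(-len(data) // size) * size, b"\0")
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def chosen_indices(k: int, seed: int) -> set:
    """Derive the block subset for a datagram from its seed (same on both sides)."""
    rng = random.Random(seed)
    degree = rng.randint(1, k)  # toy degree distribution (assumption)
    return set(rng.sample(range(k), degree))


def encode_symbol(blocks: list, seed: int) -> tuple:
    """Build one datagram: (seed, XOR of the blocks selected by that seed)."""
    payload = bytes(len(blocks[0]))
    for i in chosen_indices(len(blocks), seed):
        payload = xor_bytes(payload, blocks[i])
    return seed, payload


def decode(k: int, symbols: list):
    """Peeling decoder: return the k source blocks, or None if more datagrams are needed."""
    known = {}
    pending = [(chosen_indices(k, seed), payload) for seed, payload in symbols]
    progress = True
    while progress and len(known) < k:
        progress = False
        still_pending = []
        for idxs, payload in pending:
            # Remove the contribution of blocks that are already recovered.
            for i in [i for i in idxs if i in known]:
                payload = xor_bytes(payload, known[i])
            idxs = {i for i in idxs if i not in known}
            if len(idxs) == 1:
                known[idxs.pop()] = payload
                progress = True
            elif idxs:
                still_pending.append((idxs, payload))
        pending = still_pending
    return [known[i] for i in range(k)] if len(known) == k else None
```

A receiver simply keeps collecting datagrams, in any order and with any losses, and retries `decode` until it returns the blocks. A real code uses a carefully tuned degree distribution so that the overhead stays at a few percent; the uniform choice here needs noticeably more.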

Andrew
-- 
---
| From Andrew Findlay, Skills 1st Ltd |
| Consultant in large-scale systems, networks, and directory services |
| http://www.skills-1st.co.uk/    +44 1628 782565 |
---



Re: syncrepl multicast MMR

2015-02-08 Thread Emmanuel Lécharny
On 09/02/15 05:15, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 08/02/15 13:52, Howard Chu wrote:
>>> Been thinking this would be worth trying for a while now. Set a config
>>> option for syncprov to send Persist messages to a multicast group
>>> instead of the original TCP session. All the consumers would also join
>>> the group and listen for updates. This would also exercise the
>>> cldap:// support in libldap.
>>>
>>> Implementation details: since datagrams are unreliable, we need to
>>> include sequence numbers on each message, which the consumer can check
>>> to make sure it hasn't missed an update. Moreover, it should be able
>>> to send a request to the provider to resend (over the TCP session) the
>>> message corresponding to a given sequence number.
>>
>> Ok, but how do you detect that a consumer has missed an update if no
>> other update occurs? You may have a desynchronized server for quite
>> a long period of time if you don't have a mechanism for the consumer to
>> regularly check whether it is up to date.
>
> Good point, but easily solved with a periodic keepalive msg.

One more thing: you will have to deal with TLS at some point. There is
an RFC draft
(https://tools.ietf.org/html/draft-keoh-tls-multicast-security-00) that
proposes something, but it seems to be 3 years old and no longer active.
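
As an aside on the sequence-number scheme quoted above, the consumer-side bookkeeping could look roughly like this. It is only a sketch: the `GapDetector` class and its method are hypothetical names, and sequence numbers are assumed to start at 1 and increase by one per message.

```python
from dataclasses import dataclass, field


@dataclass
class GapDetector:
    """Tracks the highest sequence number seen and any holes below it."""
    last_seq: int = 0
    missing: set = field(default_factory=set)

    def on_datagram(self, seq: int) -> list:
        """Record one datagram; return the sequence numbers to re-request over TCP."""
        if seq in self.missing:
            # A late arrival or a retransmission fills a known hole.
            self.missing.discard(seq)
            return []
        if seq <= self.last_seq:
            # Duplicate of something already processed.
            return []
        new_gaps = list(range(self.last_seq + 1, seq))
        self.missing.update(new_gaps)
        self.last_seq = seq
        return new_gaps
```

For example, receiving 1, 2, 5 would yield a resend request for 3 and 4 when 5 arrives. A gap at the tail of the stream is only noticed when the next datagram arrives, which is the case the keepalive is meant to cover.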





Re: syncrepl multicast MMR

2015-02-08 Thread Emmanuel Lécharny
On 09/02/15 05:15, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> On 08/02/15 13:52, Howard Chu wrote:
>>> Been thinking this would be worth trying for a while now. Set a config
>>> option for syncprov to send Persist messages to a multicast group
>>> instead of the original TCP session. All the consumers would also join
>>> the group and listen for updates. This would also exercise the
>>> cldap:// support in libldap.
>>>
>>> Implementation details: since datagrams are unreliable, we need to
>>> include sequence numbers on each message, which the consumer can check
>>> to make sure it hasn't missed an update. Moreover, it should be able
>>> to send a request to the provider to resend (over the TCP session) the
>>> message corresponding to a given sequence number.
>>
>> Ok, but how do you detect that a consumer has missed an update if no
>> other update occurs? You may have a desynchronized server for quite
>> a long period of time if you don't have a mechanism for the consumer to
>> regularly check whether it is up to date.
>
> Good point, but easily solved with a periodic keepalive msg.

A heartbeat would be good to have: the provider would periodically
multicast the latest CSN, allowing desynchronized servers to catch up.
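
A rough sketch of what the consumer could do with such a heartbeat, assuming the heartbeat carries one CSN per server ID and that CSNs have the fixed-width form timestamp#count#sid#mod, so that two CSNs with the same SID compare correctly as strings. The function names and the sample values are invented for illustration.

```python
def sid_of(csn: str) -> str:
    """Extract the server ID field from a CSN string (assumed layout)."""
    return csn.split("#")[2]


def stale_sids(local_csns: list, heartbeat_csns: list) -> list:
    """Return the SIDs for which the heartbeat is ahead of the local state."""
    local = {sid_of(c): c for c in local_csns}
    behind = []
    for csn in heartbeat_csns:
        sid = sid_of(csn)
        if sid not in local or local[sid] < csn:
            behind.append(sid)
    return sorted(behind)


local = [
    "20150208125240.000000Z#000000#001#000000",
    "20150208120000.000000Z#000000#002#000000",
]
heartbeat = [
    "20150208125240.000000Z#000000#001#000000",
    "20150208130000.000000Z#000000#002#000000",
]
print(stale_sids(local, heartbeat))
```

A non-empty result would be the consumer's cue to ask the provider for the missing changes over the existing TCP session.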

Another problem is that datagrams are limited in size, which means big
entries will have to be split into many parts.
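
One possible shape for that splitting, as a sketch only: each fragment is prefixed with a small header giving the message sequence number, the fragment index and the fragment count. The 8-byte header layout and the function names are assumptions, not anything defined for syncrepl.

```python
import struct

# Hypothetical header: message sequence number, fragment index, fragment count.
HEADER = struct.Struct("!IHH")


def fragment(seq: int, message: bytes, max_payload: int) -> list:
    """Split one message into datagrams of at most max_payload bytes plus header."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [HEADER.pack(seq, i, len(chunks)) + chunk
            for i, chunk in enumerate(chunks)]


def reassemble(datagrams):
    """Rebuild a message from its fragments (any order); None if some are missing.

    Assumes all datagrams passed in belong to the same message.
    """
    parts = {}
    total = None
    for dgram in datagrams:
        _seq, index, count = HEADER.unpack_from(dgram)
        parts[index] = dgram[HEADER.size:]
        total = count
    if total is None or len(parts) != total:
        return None
    return b"".join(parts[i] for i in range(total))
```

Losing a single fragment means the whole entry has to be fetched again, so large entries make the resend path over TCP more important.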