[jira] [Commented] (ROCKETMQ-193) Develop rocketmq-redis-replicator component

ASF GitHub Bot (JIRA) Wed, 06 Sep 2017 23:18:44 -0700

    [ 
https://issues.apache.org/jira/browse/ROCKETMQ-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156531#comment-16156531
 ]


ASF GitHub Bot commented on ROCKETMQ-193:
-----------------------------------------

Github user Zhang-Ke closed the pull request at:

    https://github.com/apache/incubator-rocketmq-externals/pull/23


> Develop rocketmq-redis-replicator component
> -------------------------------------------
>
>                 Key: ROCKETMQ-193
>                 URL: https://issues.apache.org/jira/browse/ROCKETMQ-193
>             Project: Apache RocketMQ
>          Issue Type: Task
>            Reporter: Rich Zhang
>            Assignee: Rich Zhang
>            Priority: Minor
>             Fix For: 4.2.0-incubating
>
>
> Design:
>   Redis provides an official replication mechanism in which a slave
> communicates with its master over the RESP protocol. A natural design for
> the rocketmq-redis-replicator component is therefore to act as a slave: it
> sends commands to the master, receives data from the master in a timely
> manner, and forwards that data to the RocketMQ broker.
>   If you are not familiar with the Redis replication mechanism, please read
> this section first [1]. Some key points follow.
> 1. To let a slave resume from the point where it left off when it
> reconnects, the slave and the master must agree on a master runId and a
> replication offset. The slave acknowledges this offset to the master
> periodically. In other words, the slave may receive duplicate commands
> after a reconnect, and consequently the rocketmq-redis-replicator component
> may send duplicate messages too. A good way to shrink the duplicate time
> window is to reduce the "ack period", for example to 100 ms.
> 2. If a slave stays offline for some time, it can easily exhaust the
> backlog, whose default size is only 1 MB, especially on a high-traffic
> Redis instance. Unfortunately, if the slave's replication offset has
> already been overwritten in the master's backlog, a full synchronization
> must be executed. That is unacceptable for the rocketmq-redis-replicator
> component, because a large number of messages would be sent out in a burst.
> 3. During a full synchronization, the master generates a new RDB file (see
> the RDB file format [2]); the slave receives this file, stores it on disk,
> and finally loads it into memory. This strategy lets the slave reach a
> state consistent with the master as soon as possible, and it rarely fails.
> For the rocketmq-redis-replicator component, it is also a good way to keep
> the synchronization of the initial RDB file from failing halfway.
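The reconnection rule in points 1 and 2 can be sketched as follows. This is a simplified model under stated assumptions, not Redis source code: the function name and its parameters are hypothetical, the backlog is modeled as a contiguous window of byte offsets, and `next_offset` stands for the first byte the slave still needs.

```python
def can_partial_resync(master_run_id, slave_run_id, next_offset,
                       backlog_start, backlog_len):
    """Decide between partial and full resynchronization (simplified model).

    backlog_start: replication offset of the oldest byte still in the backlog.
    backlog_len:   number of bytes currently held in the backlog.
    next_offset:   first byte the reconnecting slave still needs.
    """
    # A different run id means the slave was following another master history.
    if slave_run_id != master_run_id:
        return False
    # The requested byte must still be inside the backlog window
    # (or exactly at its end, when the slave is fully caught up).
    return backlog_start <= next_offset <= backlog_start + backlog_len


ONE_MB = 1024 * 1024  # the default backlog size mentioned above

# Slave briefly offline: its offset is still covered, so a partial resync works.
short_outage = can_partial_resync("abc", "abc", 1500, 1000, ONE_MB)
# Slave offline too long: its offset was overwritten, so a full resync is needed.
long_outage = can_partial_resync("abc", "abc", 500, 1000, ONE_MB)
```

The second case is the one the component should try to avoid, since it forces the whole data set to be sent again.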
>   There is already an open source project [3] that focuses on replicating
> Redis data and provides an API for handling the received data [4]. Its
> main ideas are to act as a slave, follow the official replication
> procedure, communicate with the master over RESP, and acknowledge the
> replication offset to the master. Building on this project is a good idea,
> but some aspects should be enhanced and made more robust. Here are some
> points:
>   [High Availability]
>    Keeping the replication component highly available is not difficult but
> it is important, and not only for providing an uninterrupted service: if
> the component goes offline for some time, an unacceptable full
> synchronization may be triggered.
>    High availability is easy to achieve: run the component in master/slave
> mode, use ZooKeeper to coordinate and switch between master and slave, and
> store state in ZooKeeper so that the component itself stays stateless.
>   [Data Loss]
>    Data loss should be avoided as far as possible. The key point is that
> the slave acknowledges a replication offset to the master only after the
> corresponding command has been sent to the RocketMQ broker successfully.
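A minimal sketch of this "ack only after a successful send" rule is shown below. The names `replicate` and `send` are hypothetical: `send` stands in for whatever producer call the component would use and is assumed to return True only when the broker has confirmed the message.

```python
def replicate(commands, send, acked_offset=0):
    """Forward commands in order and return the highest offset safe to ack.

    commands: iterable of (end_offset, payload) pairs in replication order.
    send:     hypothetical callable; True means the broker confirmed the send.
    """
    for end_offset, payload in commands:
        if not send(payload):
            # Stop at the first failure so that the ack offset never moves
            # past a command that did not reach the broker.
            break
        acked_offset = end_offset
    return acked_offset


commands = [(10, "set k=a"), (20, "set k=b"), (30, "set k=c")]

# Simulate a broker failure on the third command.
safe_offset = replicate(commands, lambda payload: payload != "set k=c")
```

Here `safe_offset` stays at 20, so after a reconnect the master would resend "set k=c" instead of it being lost. The cost is possible duplicates, which matches the trade-off described in point 1.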
>   [Stale Data]
>    Stale data can also appear when the slave reconnects. Consider the case
> below:
>    `time1`         `time2`          `time3`
>    set k=a         set k=b         set k=c
>    If the slave went offline at time3 but the latest replication offset it
> reported to the master is only that of time1, then on reconnecting it
> re-applies the commands "set k=b … set k=c". For a short time window, "k"
> equals the stale value "b" until the "set k=c" command is applied. So the
> shorter the slave's offline time, the better.
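The case above can be replayed in a few lines. This is only an illustration: the `replay` helper and the integer offsets 1 to 3 (standing for time1 to time3) are assumptions made for the example, and the dictionary plays the role of the downstream view of the data.

```python
def replay(store, log, acked_offset):
    """Re-apply every logged command after acked_offset.

    Returns the sequence of values written, in the order they become visible.
    """
    observed = []
    for offset, key, value in log:
        if offset > acked_offset:
            store[key] = value
            observed.append(value)
    return observed


log = [(1, "k", "a"), (2, "k", "b"), (3, "k", "c")]

# Before the disconnect everything up to time3 was applied, so k == "c",
# but only the offset of time1 had been acknowledged to the master.
store = {"k": "c"}
seen = replay(store, log, acked_offset=1)
```

After the replay `seen` is `["b", "c"]`: the stale value "b" is visible for a moment before "c" is restored, which is exactly the window described above.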
>   [Message Order]
>    Redis executes commands in order on a single thread, which it can afford
> because of its high performance. Replicating data with a single thread in
> the slave is also fine, as that is likewise a purely in-memory operation.
> But is sending all data to RocketMQ in a global order a good choice? The
> producer should have no performance issue, but the consumer may not be
> able to keep up with the messages, especially when Redis is under high
> load.
>    Hashing the key to different RocketMQ queues is a good strategy.
> Operations on the same key are always routed to the same queue, which
> keeps them partially ordered while letting downstream consumers consume
> messages concurrently. Of course, keys that depend on each other may need
> to be hashed to the same queue as well. We should provide configuration or
> an API to support this customization.
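The routing idea can be sketched as below. The function `select_queue` and the `key_groups` mapping are hypothetical names for this example, and CRC32 is just one stable hash chosen for illustration; in the RocketMQ Java client this role would typically be played by a `MessageQueueSelector`.

```python
import zlib


def select_queue(key, queue_count, key_groups=None):
    """Map a Redis key to a queue index in [0, queue_count).

    key_groups: optional mapping from a key to a group name; keys that share
    a group are routed together (the "dependent keys" case).
    """
    routing_key = (key_groups or {}).get(key, key)
    # zlib.crc32 is stable across processes, unlike the built-in hash().
    return zlib.crc32(routing_key.encode("utf-8")) % queue_count


QUEUES = 8

# The same key always lands on the same queue, so its commands stay ordered.
first = select_queue("user:1", QUEUES)
second = select_queue("user:1", QUEUES)

# Dependent keys are pinned to one queue through a shared group name.
groups = {"order:1": "checkout-1", "stock:1": "checkout-1"}
order_queue = select_queue("order:1", QUEUES, groups)
stock_queue = select_queue("stock:1", QUEUES, groups)
```

Keys outside any group fall back to hashing the key itself, so the grouping only needs to be configured for the keys that really depend on each other.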
>   [Transaction]
>    Redis supports simple transactions. A transaction starts with a "MULTI"
> command; Redis queues the subsequent commands and executes them only when
> it receives an "EXEC" command. If one of the queued commands fails during
> execution, the commands already executed are not rolled back, and the
> remaining commands are still executed. So a Redis transaction does not
> guarantee all-or-nothing atomicity.
>    In RocketMQ, it is likewise impossible to group the consumption of
> multiple messages into one transaction. But the rocketmq-redis-replicator
> component only receives the commands of a transaction after the Redis
> server has received the "EXEC" command. From this point of view, the
> transaction semantics are neither strengthened nor weakened when the
> component forwards the messages.
>   [Avoid component switched to master]
>    With Sentinel or Redis Cluster, a master crash is detected
> automatically and one slave is promoted to master. The master has all the
> information about its slaves, and the candidate slave is picked
> automatically. Obviously, the rocketmq-redis-replicator component is not
> able to take on the master role. Configuring the component as a "read
> only" slave is a good way to keep it from being promoted to master.
>   [Support Redis Cluster]
>    Whether in Redis Cluster or in the earlier partitioning schemes, a slave
> only keeps track of one master. So the replication mechanism is the same
> for a single-node Redis instance and for a Redis Cluster.
> [1]: https://redis.io/topics/replication
> [2]: https://github.com/sripathikrishnan/redis-rdb-tools/wiki/Redis-RDB-Dump-File-Format
> [3]: https://github.com/leonchen83/redis-replicator
> [4]: https://github.com/leonchen83/redis-replicator#31-replication-via-socket
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
