[
https://issues.apache.org/jira/browse/ROCKETMQ-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053857#comment-16053857
]
ASF GitHub Bot commented on ROCKETMQ-193:
-----------------------------------------
Github user ranqiqiang commented on the issue:
https://github.com/apache/incubator-rocketmq-externals/pull/23
This project doesn't compile right after pulling it down...
> Develop rocketmq-redis-replicator component
> -------------------------------------------
>
> Key: ROCKETMQ-193
> URL: https://issues.apache.org/jira/browse/ROCKETMQ-193
> Project: Apache RocketMQ
> Issue Type: Task
> Reporter: Rich Zhang
> Assignee: Rich Zhang
> Priority: Minor
> Fix For: 4.2.0-incubating
>
>
> Design:
> Redis supplies an official replication mechanism, and a slave communicates
> with its master via the RESP protocol, so a natural way to design the
> rocketmq-redis-replicator component is to simulate a slave: send commands to
> the master, receive data from the master in a timely manner, and then resend
> that data to the rocketmq broker.
> If you are not familiar with the redis replication mechanism, please read
> this section first [1]. After that, I will highlight some key points.
> 1. To let the slave resume from the point where it left off when it
> reconnects, the slave and master agree on a master runId and a replication
> offset. The slave acknowledges this offset to the master periodically, so
> the slave may receive duplicate commands after a reconnect, and consequently
> the rocketmq-redis-replicator component may send duplicate messages too. A
> good way to minimize this duplication window is to reduce the "ack period"
> to a smaller value, such as 100ms.
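> A minimal sketch of that periodic ack, assuming a raw socket to the master
> and an illustrative 100ms period (the offset bookkeeping and reconnect
> handling are placeholders, not part of the proposal):
>
>     import java.io.OutputStream;
>     import java.net.Socket;
>     import java.nio.charset.StandardCharsets;
>     import java.util.concurrent.Executors;
>     import java.util.concurrent.ScheduledExecutorService;
>     import java.util.concurrent.TimeUnit;
>     import java.util.concurrent.atomic.AtomicLong;
>
>     // replOffset is advanced as replicated commands are processed.
>     AtomicLong replOffset = new AtomicLong(0);
>     Socket master = new Socket("127.0.0.1", 6379);
>     OutputStream out = master.getOutputStream();
>
>     // Report the current offset every 100ms; a smaller period narrows the
>     // window of commands that may be re-sent after a reconnect.
>     ScheduledExecutorService acker = Executors.newSingleThreadScheduledExecutor();
>     acker.scheduleAtFixedRate(() -> {
>         try {
>             // REPLCONF ACK <offset>, encoded as a RESP array
>             String off = Long.toString(replOffset.get());
>             out.write(("*3\r\n$8\r\nREPLCONF\r\n$3\r\nACK\r\n$"
>                     + off.length() + "\r\n" + off + "\r\n")
>                     .getBytes(StandardCharsets.US_ASCII));
>             out.flush();
>         } catch (Exception e) {
>             // reconnect / PSYNC handling omitted in this sketch
>         }
>     }, 0, 100, TimeUnit.MILLISECONDS);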
> 2. If the slave stays offline for some time, it is easy to exhaust the
> replication backlog, whose default size is only 1MB, especially on a
> high-traffic redis instance. Unfortunately, once the slave's replication
> offset has been overwritten in the master backlog, a full synchronization
> has to be executed, which is unacceptable for the rocketmq-redis-replicator
> component because a large number of messages would be sent out in a burst.
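> One mitigation, assuming the master's configuration may be adjusted, is to
> enlarge the replication backlog so the component can tolerate longer
> disconnections; the values below are purely illustrative (shown with Jedis):
>
>     import redis.clients.jedis.Jedis;
>
>     // Enlarge the master's replication backlog so a temporarily
>     // disconnected replicator can still resume with a partial resync.
>     try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
>         jedis.configSet("repl-backlog-size", "256mb");
>         // Keep the backlog allocated even while no slave is attached.
>         jedis.configSet("repl-backlog-ttl", "3600");
>     }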
> 3. During a full synchronization, the master generates a new rdb file (see
> the rdb file format [2]); the slave receives this file, stores it on disk,
> and finally applies it to memory. This strategy lets the slave reach a
> consistent state with the master as soon as possible, and rarely fails. For
> the rocketmq-redis-replicator component, persisting the rdb file first is
> also a good way to protect the initial rdb synchronization against failing
> halfway through.
> There is already an open source project [3] which focuses on replicating
> redis data and provides an api for handling the received data [4]. Its
> principal approach is to simulate a slave, follow the official replication
> procedure, communicate with the master via RESP, and ack the master with the
> replication offset, as roughly sketched below.
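> A rough sketch of how that library is typically wired up; the listener API
> below follows its documentation [4], but exact class names may differ
> between versions:
>
>     import com.moilioncircle.redis.replicator.RedisReplicator;
>     import com.moilioncircle.redis.replicator.Replicator;
>     import com.moilioncircle.redis.replicator.event.Event;
>     import com.moilioncircle.redis.replicator.event.EventListener;
>
>     Replicator replicator = new RedisReplicator("redis://127.0.0.1:6379");
>     // Key/value pairs parsed from the initial rdb snapshot and the
>     // subsequent write commands arrive through the same callback; this is
>     // the natural hook for converting them into rocketmq messages.
>     replicator.addEventListener(new EventListener() {
>         @Override
>         public void onEvent(Replicator r, Event event) {
>             // convert 'event' into a rocketmq message and send it (omitted)
>         }
>     });
>     replicator.open();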
> Building on this project is a good idea, but some aspects should be enhanced
> to make the component more robust. Here are some points:
> [High Availability]
> Keeping the replication component highly available is not difficult but
> important, and not only to provide an uninterrupted service: if the
> component stays offline for some time, an unacceptable full synchronization
> may be triggered.
> High availability is also easy to achieve, for example by adopting a
> master/slave model, using zookeeper to coordinate the master/slave switch,
> and storing state in zookeeper to keep the component itself stateless.
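> A minimal sketch of the coordination piece, assuming Apache Curator on top
> of zookeeper (the recipe and the zookeeper paths are illustrative, not part
> of the proposal):
>
>     import org.apache.curator.framework.CuratorFramework;
>     import org.apache.curator.framework.CuratorFrameworkFactory;
>     import org.apache.curator.framework.recipes.leader.LeaderLatch;
>     import org.apache.curator.retry.ExponentialBackoffRetry;
>
>     CuratorFramework client = CuratorFrameworkFactory.newClient(
>             "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
>     client.start();
>
>     // Only the elected instance opens the replication link to redis; the
>     // standby blocks here and takes over when the leader goes away.
>     LeaderLatch latch =
>             new LeaderLatch(client, "/rocketmq-redis-replicator/leader");
>     latch.start();
>     latch.await();
>
>     // The acknowledged replication offset lives in zookeeper, so a newly
>     // elected instance can resume close to where the previous one stopped.
>     byte[] offset =
>             client.getData().forPath("/rocketmq-redis-replicator/offset");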
> [Data Loss]
> Generally, data loss should be avoided as far as possible. The key point is
> that the slave only acks a replication offset to the master after the
> corresponding command has been sent to the rocketmq broker successfully.
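> A rough sketch of that ordering, assuming a synchronous rocketmq send; the
> variable and field names here are illustrative, not an existing api:
>
>     import org.apache.rocketmq.client.producer.SendResult;
>     import org.apache.rocketmq.client.producer.SendStatus;
>     import org.apache.rocketmq.common.message.Message;
>
>     // 'producer' is an already-started DefaultMQProducer; commandBytes and
>     // commandOffset come from the replication stream.
>     Message msg = new Message("redis-replica-topic", commandBytes);
>     SendResult result = producer.send(msg);       // synchronous send
>
>     // Only advance the offset that will later be reported to the master
>     // (via REPLCONF ACK) once the broker has accepted the message.
>     if (result.getSendStatus() == SendStatus.SEND_OK) {
>         ackableOffset.set(commandOffset);          // e.g. an AtomicLong
>     }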
> [Data Stale]
> Staleness can also happen when the slave reconnects. Consider the case
> below:
>     `time1`        `time2`        `time3`
>     set k=a        set k=b        set k=c
> If the slave went offline at time3, but the latest replication offset it
> reported to the master was only at time1, then on reconnection it will
> re-apply the commands "set k=b ... set k=c". Within a small time window, "k"
> will hold the stale value "b" until "set k=c" is applied again. So the
> shorter the slave's offline time, the better.
> [Message Order]
> Redis uses a single-threaded model, which keeps commands executing in order
> while still delivering high performance. Replicating data with a single
> thread on the slave side is also fine, since it is purely an in-memory
> operation. But is sending all data to rocketmq in one global order a good
> choice? The producer should have no performance issue, but the consumer may
> not be able to keep up with the messages, especially when redis is under
> high load.
> Hashing the "KEY" to different rocketmq queues is a good strategy: it
> guarantees that operations on the same key route to a single queue, which
> keeps them partially ordered, while downstream consumers can consume the
> queues concurrently. Of course, some mutually dependent "KEY"s may need to
> hash to the same queue too. We should supply configuration or an api to
> support this customization, as sketched below.
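> A minimal sketch of that routing, using rocketmq's MessageQueueSelector;
> the topic name and key extraction are illustrative:
>
>     import java.util.List;
>     import org.apache.rocketmq.client.producer.MessageQueueSelector;
>     import org.apache.rocketmq.client.producer.SendResult;
>     import org.apache.rocketmq.common.message.Message;
>     import org.apache.rocketmq.common.message.MessageQueue;
>
>     // Route every command on the same redis key to the same queue, so
>     // ordering is kept per key while queues are consumed concurrently.
>     SendResult result = producer.send(msg, new MessageQueueSelector() {
>         @Override
>         public MessageQueue select(List<MessageQueue> mqs, Message m, Object key) {
>             int index = (key.hashCode() & 0x7fffffff) % mqs.size();
>             return mqs.get(index);
>         }
>     }, redisKey);   // redisKey is the "KEY" extracted from the command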
> [Transaction]
> Redis supports a simple form of transaction. A transaction starts with a
> "MULTI" command; redis buffers the subsequent commands and executes them
> when it receives an "EXEC" command. But if one of the buffered commands
> fails during execution, the commands that already executed are not rolled
> back, and the remaining commands are still executed. So redis transactions
> cannot guarantee atomicity.
> In rocketmq, it is likewise impossible to gather the consumption of multiple
> messages into one transaction. But the rocketmq-redis-replicator component
> only receives the buffered commands after the redis server has processed the
> "EXEC" command. From this perspective, the transaction semantics are neither
> strengthened nor weakened when this component resends the messages.
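> For reference, the replicator only sees the propagated result of an already
> executed transaction; a small Jedis example of the client side (assuming a
> local redis instance):
>
>     import redis.clients.jedis.Jedis;
>     import redis.clients.jedis.Transaction;
>
>     try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
>         Transaction tx = jedis.multi();  // MULTI: commands below are buffered
>         tx.set("k1", "v1");
>         tx.set("k2", "v2");
>         tx.exec();                       // EXEC: buffered commands execute
>     }
>     // Only after EXEC does the replication stream (and so the replicator)
>     // see MULTI, SET k1 v1, SET k2 v2, EXEC propagated in order.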
> [Avoid the component being switched to master]
> In Sentinel or Redis Cluster, a master crash can be detected by the failover
> mechanism, and one of its slaves is switched to master. The master has full
> information about its slaves, and the candidate slave is picked
> automatically. Obviously, the rocketmq-redis-replicator component has no
> ability to take on the master role. Configuring this component as a "read
> only" slave is a good way to keep it from being switched to master.
> [Support Redis Cluster]
> Whether in Redis Cluster or in the earlier partitioning schemes, a slave
> only keeps track of one master, so the replication mechanism does not change
> between a single-node redis instance and a redis cluster.
> [1]: https://redis.io/topics/replication
> [2]: https://github.com/sripathikrishnan/redis-rdb-tools/wiki/Redis-RDB-Dump-File-Format
> [3]: https://github.com/leonchen83/redis-replicator
> [4]: https://github.com/leonchen83/redis-replicator#31-replication-via-socket
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)