[
https://issues.apache.org/jira/browse/KUDU-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428784#comment-17428784
]
wangningito commented on KUDU-3290:
-----------------------------------
Thanks for maintaining my previous work since i've leaved the work for months.
To explain a little more, this work took some idea from 'HTAP', which embedded
a analysis engine as a learner of raft.
But I get a little confused with the purpose of giving it to community.
# Analysis engine is not good at be a main force of a system, it's not good at
transaction.
# Kudu is always the downstream kafka, in a well designed architecture of
system. And also, kudu has very powerful backup ability, so dose kafka.
# If the code is merged to the mainstream, would the default behave of raft
changed? breaking change to the normal user is not a good thing.
# How difficult would it be if a new guy want to implement the 'duplicator' or
'syncer' interfaces?
> Implement Replicate table's data to Kafka(or other Storage System)
> ------------------------------------------------------------------
>
> Key: KUDU-3290
> URL: https://issues.apache.org/jira/browse/KUDU-3290
> Project: Kudu
> Issue Type: New Feature
> Components: tserver
> Reporter: shenxingwuying
> Priority: Critical
>
> h1. background & problem
> We use kudu to store the user profile data, because business requirements,
> exchange and share data from multi-tenant users, which is reasonable in our
> application scene, we need replicate data from one system to another. The
> destination storage system we pick kafka, because of our company's
> architecture at now.
> At this time, we have two ideas to solve it.
> h1. two replication scheme
> Generally, Raft group has three replicas, one is leader and the other two are
> followers. We’ll add a replica, its role is Learner. Learner only receive all
> the data, but not pariticipart in ther leadership election.
> The learner replica, its state machine will be a plugin system, eg:
> # We can support KuduEngine, which just a data backup like mongodb’s hidden
> replica.
> # We can write to the thirdparty store system, like kafka or any other
> system we need. Then we can replicate data to another system use its client.
> At Paxos has a learner role, which only receive data. we need such a role for
> new membership.
> But it Kudu Learner has been used for the copying(recovering) tablet replica.
> Maybe we need a new role name, at this, we still use Learner to represent the
> new role. (We should think over new role name)
> In our application scene, we will replicate data to kafka, and I will explain
> the method.
> h2. Learner replication
> # Add a new replica role, maybe we call it learner, because Paxos has a
> learner role, which only receive data. We need such a role for new
> membership. But at Kudu Learner has been used for the copying(recovering)
> tablet replica. Maybe we need a new role name, at this, we still use Learner
> to represent the new role. (We should think over new role name)
> # The voters's safepoint of clean obsoleted wal is min(leader’ max wal
> sequence number, followers max wal sequence number, learner’ max wal sequence
> number)
> # The learner not voter, not partitipant in elections
> # Raft can replication data to the learner
> # The process of learner applydb, just like raft followers, the logs before
> committed index will replicate to kafka, kafka’s response ok. the apply index
> will increase.
> # We need kafka client, it will be added to kudu as an option, maybe as an
> compile option
> # When a kudu-tserver decomission or corrupted, the learner must move to new
> kudu-tserver. So the leader should save learner apply OpId, and replicate to
> followers, when learner's failover when leader down.
> # The leader must save the learners apply OpId and replicate it to
> followers, when learner's recovery can make sure no data loss when leader
> down. If leader no save the applyIndex, learner maybe loss data
> # Followers save the learners applyindex and term, coz followers maybe
> become leader.
> # When load balancer running,we shoud support move learner another
> kudu-tserver
> # Table should add a switch option to determine whether raft group has
> learner, can support setting it when creating table.
> # Support altering table to add learners maybe an idea, but need solve the
> base data migrate problem.
> # Base data migrate. The simple but heavy cost, when learner's max_OpId <
> committed_OpId (maybe data loss, maybe we alter table add learner replication
> for a existing table), we can trigger a full scan at the timestamp and
> replicate data to learner, and then recover the appendEntries flow.
> # Kudu not support split and merge, we not discuss it now. If KuduSupport
> split or merge, we can implement it use 12, of course we can use more better
> method.
> # If we need the funtion, our cluster should at least 4 tservers.
> If kafka fail or topic not exist, the learner will stop replicate wal, that
> will occupt more disk space. if learner loss or corrupted, it can recover
> from the leader. We need make sure the safepoint.
> h2. Leader replication
> We can replication data to kafka or any other storage system from leader
> directly.
> # We need not set a role, but the dest is kafka, PeerManager's one peer is
> different from the others, that will make something complex.
> # Reuse the leader’s wal, save a output network bandwidth compare to above
> method.
> # All replica should maintenance a point to save the apply OpId at kafka.
> # Safepoint of clean obsoleted wal is min( voters’ max wal sequence number,
> applyIndex at kafka), which leader save it.
> # Any leader transfer must recover the apply OpId at kafka.
> # We need kafka client, it will be added to kudu as an option, may be as an
> compile option
> # If kafka's topic or kafka failure, print errorlog. and wal holds
> # Some process the same as Learner replication. such as base data replicate.
> # If we need the funtion, our cluster should at least 3 tservers.
> If kafka fail or topic not exist, the leader will stop replicate wal to
> kafka, that will occupt more disk space. We need make sure the safepoint.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)