[
https://issues.apache.org/jira/browse/HBASE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962351#comment-13962351
]
Jean-Daniel Cryans commented on HBASE-10504:
--------------------------------------------
It does seem that people want to be able to listen to row operations in HBase.
I'm not sure how we can fully support this use case if bulk loads and edits
that aren't hitting the WALs are mixed in. The current contract is that we
replicate everything that's sent to the WAL and that got sync'd.
Regarding the actual interfaces, here's how I see it (I'm sure I'm missing a
few things):
h5. Replication source
h6. Filtering WALEdits
We need to formalize what {{ReplicationSource#removeNonReplicableEdits}}
currently does; maybe it could be done as a chain à la {{FileCleanerDelegate}}.
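To make the chain idea concrete, here is a minimal sketch in plain Java. None
of these types exist in HBase: {{Edit}}, {{EditFilter}} and {{FilterChain}} are
hypothetical stand-ins, and the two example filters are illustrative only, not
a description of what {{removeNonReplicableEdits}} does today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WalEditFilterChainSketch {

    /** Hypothetical stand-in for one entry of a WALEdit; not an HBase class. */
    public static final class Edit {
        public final String table;
        public final String family;
        /** Stands in for "this family is scoped for replication". */
        public final boolean replicable;

        public Edit(String table, String family, boolean replicable) {
            this.table = table;
            this.family = family;
            this.replicable = replicable;
        }
    }

    /** Hypothetical delegate: return true to keep the edit, false to drop it. */
    public interface EditFilter {
        boolean accept(Edit edit);
    }

    /** Runs the delegates in order; an edit survives only if all accept it. */
    public static final class FilterChain {
        private final List<EditFilter> delegates;

        public FilterChain(List<EditFilter> delegates) {
            this.delegates = new ArrayList<>(delegates);
        }

        public List<Edit> filter(List<Edit> edits) {
            List<Edit> kept = new ArrayList<>();
            for (Edit edit : edits) {
                boolean accepted = true;
                for (EditFilter delegate : delegates) {
                    if (!delegate.accept(edit)) {
                        accepted = false;
                        break; // first rejection wins, later delegates are skipped
                    }
                }
                if (accepted) {
                    kept.add(edit);
                }
            }
            return kept;
        }
    }

    public static void main(String[] args) {
        // Illustrative filters only.
        EditFilter scopeFilter = e -> e.replicable;
        EditFilter systemTableFilter = e -> !e.table.startsWith("hbase:");
        FilterChain chain =
            new FilterChain(Arrays.asList(scopeFilter, systemTableFilter));

        List<Edit> edits = Arrays.asList(
            new Edit("users", "cf", true),
            new Edit("users", "local", false),
            new Edit("hbase:meta", "info", true));

        for (Edit kept : chain.filter(edits)) {
            System.out.println(kept.table + "/" + kept.family);
        }
    }
}
```

With a shape like this, a downstream consumer could add its own delegate to
the chain instead of overriding the source.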
h6. Replication management
Currently, enabling/disabling/adding/removing peers is all done via ZK, which
we're trying to not use as a permanent data store. This jira goes into more
detail: [HBASE-10295|https://issues.apache.org/jira/browse/HBASE-10295]. If the
master is going to be in charge of it, then it means we need to define a new
protobuf service that the RS will implement. It should be separate from
AdminProtos.
h5. Replication sink
h6. Re-creating a region server
Replication is currently done via our RPC mechanism, so you need to start
{{RpcServer}} in order to receive requests. The next part of the contract
that replication relies on is that the sinks are discoverable via ZooKeeper,
basically piggybacking on the RS discovery process. This means setting up a
{{ZooKeeperWatcher}}, crafting a server name and then creating the znode. A
good example of this can be found in SEP:
https://github.com/NGDATA/hbase-indexer/blob/master/hbase-sep/hbase-sep-impl-0.95/src/main/java/com/ngdata/sep/impl/SepConsumer.java
It may not seem like a whole lot of code, but it's code that can easily be broken
with a few signature changes since those interfaces aren't clearly marked.
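For illustration, the "crafting a server name and then creating the znode"
step boils down to string handling like the sketch below. It is plain Java
with no HBase or ZooKeeper dependency, and it rests on assumptions: the
{{host,port,startcode}} form of a server name and the default {{/hbase/rs}}
parent znode. The class and method names are made up for this example; a real
sink would go through {{ZooKeeperWatcher}} and create the node as ephemeral,
as the SEP code linked above does.

```java
public class SinkZnodeSketch {

    /**
     * Builds the "host,port,startcode" string form of a server name.
     * Assumption: this mirrors how HBase renders a server name.
     */
    public static String serverName(String host, int port, long startCode) {
        return host + "," + port + "," + startCode;
    }

    /**
     * Joins the region server parent znode and a server name.
     * Assumption: "/hbase/rs" is the default parent; a real deployment
     * should read it from configuration rather than hard-code it.
     */
    public static String rsZnode(String rsParent, String serverName) {
        return rsParent + "/" + serverName;
    }

    public static void main(String[] args) {
        // Hypothetical host and port; the start code is just a timestamp.
        String name = serverName("sink1.example.com", 60020, System.currentTimeMillis());
        System.out.println(rsZnode("/hbase/rs", name));
    }
}
```

Everything here depends on internal naming conventions, which is exactly why
it breaks easily when they change.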
h6. ReplicateWALEntry
{{ReplicateWALEntry}} is a service offered as part of AdminProtos so it needs
to move out. It should be a separate service from the previous one I described
in "Replication management". The unfortunate thing here is that {{Replay}}
relies on the same messages:
https://github.com/apache/hbase/blob/trunk/hbase-protocol/src/main/protobuf/Admin.proto#L260.
To extract {{ReplicateWALEntry}} in a compatible way we'll have to deprecate
it and maybe also deprecate {{Replay}}'s current signature to give it its own
appropriately-named messages (or not, not a big deal).
> Define Replication Interface
> ----------------------------
>
> Key: HBASE-10504
> URL: https://issues.apache.org/jira/browse/HBASE-10504
> Project: HBase
> Issue Type: Task
> Reporter: stack
> Assignee: stack
> Priority: Blocker
> Fix For: 0.99.0
>
>
> HBase has replication. Fellas have been hijacking the replication apis to do
> all kinds of perverse stuff like indexing hbase content (hbase-indexer
> https://github.com/NGDATA/hbase-indexer) and our [~toffer] just showed up w/
> overrides that replicate via an alternate channel (over a secure thrift
> channel between dcs over on HBASE-9360). This issue is about surfacing these
> APIs as public with guarantees to downstreamers similar to those we have on
> our public client-facing APIs (and so we don't break them for downstreamers).
> Any input [~phunt] or [~gabriel.reid] or [~toffer]?
> Thanks.
>