Mind putting the proposal below on HBASE-16415?

Thanks

On Thu, Jun 8, 2017 at 3:24 PM, Jan Kunigk <jan.kun...@gmail.com> wrote:

> Hi, with regard to the above JIRA, I would like to make the following
> contribution.
> I very much look forward to feedback and comments.
>
> ReplicationSourceWALReaderThread continuously reads the WALEntries to be
> replicated for a specified WAL via WAL.Reader's next() method and adds them
> to WALEntryBatches.
>
> As far as I can see, those WALEntries are copies of the entries in the
> originally persisted local WALs. In order to direct these entries to
> TableNames that differ from the source, I propose to intercept the copied
> WALEntries on the source cluster and probe whether they belong to a
> TableName that is to be rewritten.
>
> If such a probe is successful, then the WALKey of any such WALEntry needs
> to be changed accordingly. WALKey provides a getTableName() method, but
> currently not a setTableName() method, which would simply have to be added
> to change the private TableName member.
>
> I propose to intercept the entries via a new method redirectEntry(), which
> is invoked shortly before the entry is added to its WALEntryBatch and
> immediately after the entry has been filtered by filterEntry() like so:
>
>             Entry entry = entryStream.next();
>             if (updateSerialReplPos(batch, entry)) {
>               batch.lastWalPosition = entryStream.getPosition();
>               break;
>             }
>             entry = filterEntry(entry);
>             entry = redirectEntry(entry); // <--
>             if (entry != null) {
>               WALEdit edit = entry.getEdit();
>               if (edit != null && !edit.isEmpty()) {
>                 long entrySize = getEntrySize(entry);
>                 batch.addEntry(entry);
>
> redirectEntry() bases its decisions on a 'Map<TableName, TableName>
> redirections', where the keys are the source table names and the values
> are the destination table names. The Map would be included in the
> ReplicationPeerConfig, which can be obtained from within
> ReplicationSourceWALReaderThread via the instance of
> ReplicationSourceManager, which is in turn passed as an argument to both
> available constructors.
>
> When the TableName of a WALKey from the WALEntryStream matches the key of
> any entry in the redirections map, that WALKey's TableName is replaced by
> the value of that entry.
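
For discussion on the JIRA, the lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: Key and Entry are hypothetical stand-ins for WALKey and WAL.Entry (not the real HBase classes), table names are plain Strings instead of TableName objects, and setTableName() is the setter the proposal says would have to be added.

```java
import java.util.HashMap;
import java.util.Map;

final class RedirectSketch {
    // Hypothetical stand-in for WALKey; only the table name matters here.
    static final class Key {
        private String tableName;
        Key(String tableName) { this.tableName = tableName; }
        String getTableName() { return tableName; }
        // The setter the proposal would add to WALKey.
        void setTableName(String tableName) { this.tableName = tableName; }
    }

    // Hypothetical stand-in for a WAL entry that carries a key.
    static final class Entry {
        private final Key key;
        Entry(Key key) { this.key = key; }
        Key getKey() { return key; }
    }

    // Rewrites the entry's table name if the redirections map has a rule for it.
    // A null entry (already dropped by filterEntry()) is passed through unchanged.
    static Entry redirectEntry(Entry entry, Map<String, String> redirections) {
        if (entry == null) {
            return null;
        }
        String target = redirections.get(entry.getKey().getTableName());
        if (target != null) {
            entry.getKey().setTableName(target);
        }
        return entry;
    }

    public static void main(String[] args) {
        Map<String, String> redirections = new HashMap<>();
        redirections.put("ns_source1:table_source1", "ns_dest1:table_dest1");
        Entry e = redirectEntry(new Entry(new Key("ns_source1:table_source1")), redirections);
        System.out.println(e.getKey().getTableName()); // prints ns_dest1:table_dest1
    }
}
```

Passing null through keeps the existing 'if (entry != null)' check in the reader loop as the single place where filtered entries are skipped.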
>
> The rationale for intercepting on the sending side is that setup and peer
> management are already performed on the source today, and there is no
> mechanism I can see that would carry the redirection rules themselves
> across.
>
> Similar to the way the hbase shell allows specifying the tables and
> column families to be replicated (set_peer_tableCFs), I propose a new
> command (also on the sending side), 'set_peer_table_redirections', which
> accepts a map of Strings corresponding to the required final specification
> of the redirections as TableNames:
>
> set_peer_table_redirections['ns_source1:table_source1' : 'ns_dest1:table_dest1',
> 'ns_source2:table_source2' : 'ns_dest2:table_dest2', ...
> 'ns_sourcen:table_sourcen' : 'ns_destn:table_destn']
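
One way the map of Strings from the shell might be validated before it is stored could look like the sketch below. Everything here is an assumption for illustration: RedirectionParser, normalize() and parse() are hypothetical names, names are kept as 'namespace:qualifier' Strings rather than TableName objects, and a name without a namespace is assumed to belong to the "default" namespace.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RedirectionParser {
    // Normalizes a table name to 'namespace:qualifier'.
    // Assumption: a name without ':' lives in the "default" namespace.
    static String normalize(String name) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty table name");
        }
        int idx = trimmed.indexOf(':');
        if (idx < 0) {
            return "default:" + trimmed;
        }
        boolean emptyPart = idx == 0 || idx == trimmed.length() - 1;
        boolean secondColon = trimmed.indexOf(':', idx + 1) >= 0;
        if (emptyPart || secondColon) {
            throw new IllegalArgumentException("malformed table name: " + name);
        }
        return trimmed;
    }

    // Builds the source -> destination map, rejecting self-redirections
    // and sources that appear more than once after normalization.
    static Map<String, String> parse(Map<String, String> raw) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            String source = normalize(e.getKey());
            String dest = normalize(e.getValue());
            if (source.equals(dest)) {
                throw new IllegalArgumentException("source equals destination: " + source);
            }
            if (result.put(source, dest) != null) {
                throw new IllegalArgumentException("duplicate source table: " + source);
            }
        }
        return result;
    }
}
```

Rejecting bad input when the command is issued would keep malformed rules out of the peer configuration, so the reader thread never has to deal with them.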
>
> Thanks, best, J
>
