Mind putting the below proposal on HBASE-16415? Thanks
On Thu, Jun 8, 2017 at 3:24 PM, Jan Kunigk <jan.kun...@gmail.com> wrote:

> Hi,
>
> With regard to the above JIRA, I would like to make the following
> contribution. I very much look forward to feedback and comments.
>
> ReplicationSourceWALReaderThread continuously follows the WALEntries to be
> replicated for a specified WAL via WAL.Reader's next() method and adds them
> to WALEntryBatches.
>
> As far as I can see, those WALEntries are copies of the originally
> persisted local WALs. In order to direct these Entries to TableNames
> different from the source, I propose to intercept the copied WALEntries on
> the source cluster and probe whether they belong to a TableName which is to
> be rewritten.
>
> If such a probe is successful, then the WALKey of any such WALEntry needs
> to be changed accordingly. WALKey provides a getTableName() method, but
> currently no setTableName() method, which would simply have to be added
> to change the private TableName member.
>
> I propose to intercept the entries via a new method, redirectEntry(), which
> is invoked shortly before the entry is added to its WALEntryBatch and
> immediately after the entry has been filtered by filterEntry(), like so:
>
>   Entry entry = entryStream.next();
>   if (updateSerialReplPos(batch, entry)) {
>     batch.lastWalPosition = entryStream.getPosition();
>     break;
>   }
>   entry = filterEntry(entry);
>   entry = redirectEntry(entry); // <--
>   if (entry != null) {
>     WALEdit edit = entry.getEdit();
>     if (edit != null && !edit.isEmpty()) {
>       long entrySize = getEntrySize(entry);
>       batch.addEntry(entry);
>
> redirectEntry() bases its decisions on a 'Map<TableName, TableName>
> redirections', where the keys are the source table names and the values the
> destination table names.
> The Map would be included in the
> ReplicationPeerConfig, which can be obtained from within
> ReplicationSourceWALReaderThread via the instance of
> ReplicationSourceManager, which is in turn passed as an argument to both
> available constructors.
>
> When a TableName object from a WALKey from the WALEntryStream matches the
> key of any of the entries in the redirections map, that WALKey's TableName
> is replaced by the value of that entry.
>
> The rationale for intercepting on the sending side is that setup and peer
> management are already performed on the source today, and there is no
> mechanism I can see which would carry the redirection rules themselves
> across.
>
> Similarly to the way the hbase shell allows specifying the tables and
> column families to be replicated (set_peer_tableCFs), I propose a new
> command (also on the sending side), 'set_peer_table_redirections', which
> accepts a map of Strings corresponding to the required final specification
> of the redirections as TableNames:
>
>   set_peer_table_redirections['ns_source1:table_source1' : 'ns_dest1:table_dest1',
>                               'ns_source2:table_source2' : 'ns_dest2:table_dest2',
>                               ...
>                               'ns_sourcen:table_sourcen' : 'ns_destn:table_destn']
>
> Thanks,
> best, J
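
For illustration only, the lookup-and-rewrite step that redirectEntry() is described as performing might look roughly like the sketch below. It is not taken from the HBase code base: TableName, WALKey and Entry here are small hypothetical stand-ins so that the example compiles on its own, setTableName() is the setter the proposal says would have to be added, and passing the redirections map as a parameter is an assumption made for brevity (the proposal would read it from ReplicationPeerConfig).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, self-contained sketch; the nested types only mimic the shape
// of HBase's TableName, WALKey and WAL.Entry and are NOT the real classes.
final class RedirectSketch {

    static final class TableName {
        final String namespace;
        final String qualifier;

        TableName(String namespace, String qualifier) {
            this.namespace = namespace;
            this.qualifier = qualifier;
        }

        // Parses "namespace:table"; a bare name is placed in "default".
        static TableName valueOf(String s) {
            int i = s.indexOf(':');
            return i < 0 ? new TableName("default", s)
                         : new TableName(s.substring(0, i), s.substring(i + 1));
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof TableName)) return false;
            TableName t = (TableName) o;
            return namespace.equals(t.namespace) && qualifier.equals(t.qualifier);
        }

        @Override public int hashCode() { return Objects.hash(namespace, qualifier); }

        @Override public String toString() { return namespace + ":" + qualifier; }
    }

    static final class WALKey {
        private TableName tableName;

        WALKey(TableName tableName) { this.tableName = tableName; }

        TableName getTableName() { return tableName; }

        // Assumed addition: the setter the proposal asks for.
        void setTableName(TableName tableName) { this.tableName = tableName; }
    }

    static final class Entry {
        private final WALKey key;

        Entry(WALKey key) { this.key = key; }

        WALKey getKey() { return key; }
    }

    // Rewrites the entry's table name if the redirections map has a rule for it.
    // A null entry (already dropped by filtering) is passed through unchanged.
    static Entry redirectEntry(Entry entry, Map<TableName, TableName> redirections) {
        if (entry == null) {
            return null;
        }
        TableName target = redirections.get(entry.getKey().getTableName());
        if (target != null) {
            entry.getKey().setTableName(target);
        }
        return entry;
    }

    public static void main(String[] args) {
        Map<TableName, TableName> redirections = new HashMap<>();
        redirections.put(TableName.valueOf("ns_source1:table_source1"),
                         TableName.valueOf("ns_dest1:table_dest1"));

        Entry entry = new Entry(new WALKey(TableName.valueOf("ns_source1:table_source1")));
        System.out.println(redirectEntry(entry, redirections).getKey().getTableName());
    }
}
```

The sketch mutates the key in place, as the proposal suggests; whether the copied entries can be modified safely at that point in the reader thread would need to be confirmed against the actual code.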