[ 
https://issues.apache.org/jira/browse/HBASE-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-18846:
--------------------------
    Description: 
This is a follow-on from HBASE-10504, Define a Replication Interface. There we 
defined a new, flexible replication endpoint for others to implement but it did 
little to help the case of the lily hbase-indexer. This issue takes up the case 
of the hbase-indexer.

The hbase-indexer poses to hbase as a 'fake' peer cluster. (For why the 
hbase-indexer is implemented this way, including the advantages of having the 
indexing done in a separate process set that can be independently scaled and 
can participate in the same security realm, see the discussion in 
HBASE-10504.) The hbase-indexer starts up cut-down "RegionServer" processes, 
each just an instance of the hbase RpcServer hosting an AdminProtos Service. 
They make themselves 'appear' to the Replication Source by hoisting up an 
ephemeral znode 'registering' as a RegionServer. The source cluster then 
streams WALEdits to the AdminProtos method:

{code}
 public ReplicateWALEntryResponse replicateWALEntry(final RpcController controller,
       final ReplicateWALEntryRequest request) throws ServiceException {
{code}
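
To make the data flow above concrete, here is a minimal, self-contained sketch 
of a cut-down sink that answers only a replicate call and hands each edit to 
an indexer. Every type and method name in it (WalEntrySketch, 
ReplicationSinkSketch, IndexingSinkSketch, replicateWalEntries) is a 
hypothetical stand-in invented for illustration; none of them are hbase or 
hbase-indexer classes, and the real method takes a ReplicateWALEntryRequest 
over RPC rather than a plain list.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point of the sketch: builds a batch and pushes it at the fake peer.
final class FakePeerSketch {
    public static void main(String[] args) {
        IndexingSinkSketch sink = new IndexingSinkSketch();
        List<WalEntrySketch> batch = new ArrayList<>();
        batch.add(new WalEntrySketch("t1", "row1", "a"));
        batch.add(new WalEntrySketch("t1", "row2", "b"));
        int accepted = sink.replicateWalEntries(batch);
        System.out.println(accepted + " entries indexed: " + sink.indexed);
    }
}

// Hypothetical stand-in for one replicated edit; NOT an hbase type.
final class WalEntrySketch {
    final String table;
    final String row;
    final String value;

    WalEntrySketch(String table, String row, String value) {
        this.table = table;
        this.row = row;
        this.value = value;
    }
}

// Hypothetical stand-in for the single call the fake peer has to answer.
interface ReplicationSinkSketch {
    // Returns the number of entries accepted, standing in for the RPC response.
    int replicateWalEntries(List<WalEntrySketch> batch);
}

// The "cut-down server": all it knows is how to take a batch and index it.
final class IndexingSinkSketch implements ReplicationSinkSketch {
    final List<String> indexed = new ArrayList<>();

    @Override
    public int replicateWalEntries(List<WalEntrySketch> batch) {
        for (WalEntrySketch e : batch) {
            // A real indexer would update its index; here we only record the edit.
            indexed.add(e.table + "/" + e.row + "=" + e.value);
        }
        return batch.size();
    }
}
```

The point of the sketch is how little surface the indexer needs from the 
source cluster's point of view: one inbound call carrying a batch of edits.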

The hbase-indexer relies on other hbase internals like Server so it can get a 
ZooKeeperWatcher instance and know the 'name' to use for this cut-down server.

Thoughts on how to proceed include:
 
 * Better formalize its current digestion of hbase internals; make it so 
rpcserver is allowed to be used by others, etc. This would be hard to do given 
that the hbase-indexer uses basics like Server, the Protobuf serdes for WAL 
types, and the AdminProtos Service. Any change in this wide API breaks (again) 
the hbase-indexer. We have made a 'channel' for Coprocessor Endpoints so they 
continue to work though they use 'internal' types: they can use the protos in 
hbase-protocol. The hbase-protocol protos are currently in limbo, sort-of 
'public'; this is a TODO. Perhaps the hbase-indexer could do similarly, 
relying on the hbase-protocol (pb2.5) content, and we could do something to 
expose rpcserver and zk for safe use by the hbase-indexer.
 * Start an actual RegionServer but have it register only the AdminProtos 
Service, not ClientProtos or the Service that does Master interaction, etc. 
Then have the hbase-indexer implement an AdminCoprocessor to override the 
replicateWALEntry method (the Admin CP implementation may need work). This 
would narrow the hbase-indexer's exposure to that of the Admin Coprocessor 
Interface.
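
The second option amounts to a host server that owns the RPC plumbing and a 
single hook the indexer implements to take over the replicate call. The sketch 
below illustrates only that shape. AdminObserverSketch, HostServerSketch and 
preReplicate are hypothetical names made up for this example; they are not the 
hbase coprocessor API, whose Admin hooks "may need work" as noted above.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point of the sketch: installs an indexing hook and sends one batch.
final class AdminHookSketch {
    public static void main(String[] args) {
        List<String> indexed = new ArrayList<>();
        HostServerSketch host = new HostServerSketch(edits -> {
            indexed.addAll(edits); // the indexer consumes the edits itself
            return true;           // and tells the host to skip its default path
        });
        List<String> batch = new ArrayList<>();
        batch.add("row1");
        batch.add("row2");
        host.replicate(batch);
        System.out.println("indexed=" + indexed
            + " appliedLocally=" + host.appliedLocally);
    }
}

// Hypothetical hook: the ONLY surface the indexer would depend on.
interface AdminObserverSketch {
    // Return true if the edits were fully handled and the default must be bypassed.
    boolean preReplicate(List<String> edits);
}

// Hypothetical host: owns the server plumbing, exposes nothing but the hook.
final class HostServerSketch {
    private final AdminObserverSketch observer; // may be null: no hook installed
    final List<String> appliedLocally = new ArrayList<>();

    HostServerSketch(AdminObserverSketch observer) {
        this.observer = observer;
    }

    void replicate(List<String> edits) {
        if (observer != null && observer.preReplicate(edits)) {
            return; // the hook took over; nothing is applied locally
        }
        appliedLocally.addAll(edits); // default behaviour when nothing intercepts
    }
}
```

With this shape, changes to the host's internals would not reach the indexer 
as long as the one hook interface stays stable.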
 
Other crazy notions occur, including the setup of an Admin Interface 
Coprocessor Endpoint (CPEP). A new ReplicationEndpoint would feed the 
replication stream to the remote cluster via the CPEP-registered channel.

But time is short. Hopefully we can figure out something that will work in the 
2.0 timeframe w/o too much code movement.

  was:
This is a follow-on from HBASE-10504, Define a Replication Interface. There we 
defined a new, flexible replication endpoint for others to implement but it did 
little to help the case of the lily hbase-indexer. This issue takes up the case 
of the hbase-indexer.

The hbase-indexer poses to hbase as a 'fake' peer cluster. The hbase-indexer 
will start up cut-down "RegionServer" processes that are nought but an hbase 
RpcServer hosting an AdminProtos Service. They make themselves 'appear' to the 
Replication Source by hoisting up an ephemeral znode 'registering' as a 
RegionServer. The source cluster then streams WALEdits to the Admin Protos 
method:

{code}
 public ReplicateWALEntryResponse replicateWALEntry(final RpcController controller,
       final ReplicateWALEntryRequest request) throws ServiceException {
{code}

The hbase-indexer relies on other hbase internals like Server so it can get a 
ZooKeeperWatcher instance and know the 'name' to use for this cut-down server.

Thoughts on how to proceed include:
 
* Better formalize its current digestion of hbase internals; make it so 
rpcserver is allowed to be used by others, etc.
* Start an actual RegionServer only have it register the AdminProtos Service 
only -- not AdminProtos and ClientProtos, etc. Then have the hbase-indexer 
implement an AdminCoprocessor to override the replicateWALEntry method (the 
Admin CP implementation may need work).

I'll be back....


> Accommodate the hbase-indexer/lily/SEP consumer deploy-type
> -----------------------------------------------------------
>
>                 Key: HBASE-18846
>                 URL: https://issues.apache.org/jira/browse/HBASE-18846
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
