[
https://issues.apache.org/jira/browse/HBASE-18846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-18846:
--------------------------
Release Note:
Makes it so hbase-indexer/lily can rely on public APIs.
Adds the ability to disable nearly all HRegionServer services. Together with an
existing plugin mechanism that lets the RegionServer host an alternate
Connection implementation, this lets us put up a cluster of hollowed-out
HRegionServers whose only purpose is to pose as a Replication Sink for a source
HBase cluster. In the alternate Connection implementation it supplies,
hbase-indexer would install its own code to catch the replication stream.
Below and attached are a sample hbase-site.xml file and an alternate Connection
implementation. To start up an HRegionServer as a sink, first make sure there
is a ZooKeeper ensemble we can talk to. If there is none, just start one:
{code}
./bin/hbase-daemon.sh start zookeeper
{code}
To start up a single RegionServer, put the below sample hbase-site.xml and a
derivative of the below IndexerConnection on the CLASSPATH, and then start the
RegionServer:
{code}
./bin/hbase-daemon.sh start org.apache.hadoop.hbase.regionserver.HRegionServer
{code}
Stdout and stderr will go into files under the configured logs directory.
This patch adds configuration to stop RegionServer-internal Services,
Managers, Caches, etc., from starting up.
By default a RegionServer starts up an Admin and Client Service. To disable
either or both, use the below booleans:
{code}
hbase.regionserver.admin.service
hbase.regionserver.client.service
{code}
Both default to true.
To make an HRegionServer start up and stay up without expecting to communicate
with a master, set the below boolean to true:
{code}
hbase.masterless
{code}
Default is false.
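As a rough illustration of how the three flags above resolve when a key is absent, here is a minimal sketch. It uses java.util.Properties as a stand-in for the Hadoop Configuration that HBase actually reads (via Configuration.getBoolean), and the class and method names (ServiceFlagsSketch, resolve, describe) are hypothetical, not part of the patch. The defaults are the ones stated in this note.

```java
import java.util.Properties;

/** Sketch only: resolves the three documented flags against their stated defaults. */
final class ServiceFlagsSketch {
  static final String ADMIN = "hbase.regionserver.admin.service";
  static final String CLIENT = "hbase.regionserver.client.service";
  static final String MASTERLESS = "hbase.masterless";

  /** Returns the parsed boolean for key, or defaultValue when the key is unset. */
  static boolean resolve(Properties props, String key, boolean defaultValue) {
    String raw = props.getProperty(key);
    return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
  }

  /** One-line summary of the effective settings (admin/client default true, masterless default false). */
  static String describe(Properties props) {
    return "admin=" + resolve(props, ADMIN, true)
        + " client=" + resolve(props, CLIENT, true)
        + " masterless=" + resolve(props, MASTERLESS, false);
  }
}
```

With nothing set, the summary reflects a regular RegionServer; setting hbase.masterless to true and the client service to false moves it toward the hollowed-out shape described here.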
h3. Sample hbase-site.xml that disables internal HRegionServer Services
Below is an example hbase-site.xml that turns off most Services and then
installs an alternate Connection implementation, one that is nulled out in all
regards except that it can return a "Table" that catches the replication
stream in its {code}batch(List<? extends Row> actions, Object[] results){code}
method, i.e. what hbase-indexer wants. The example alternate Connection
implementation follows the configuration (both files are also attached to this
issue). The setup expects an up-and-running ZooKeeper ensemble.
{code}
<configuration>
<!-- This file is an example for hbase-indexer. It shuts down
facilities in the regionserver and interjects a special
Connection implementation, which is how hbase-indexer will
receive the replication stream from the source hbase cluster.
See the class referenced in the config.
Most of the config in here is booleans set to off and
values set to zero so services don't start. Some of
the flags are new via this patch.
-->
<!--Need this for the RegionServer to come up standalone-->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!--This is what you implement, a Connection that returns a Table that
overrides the batch call. It is at this point you do your indexer
inserts.
-->
<property>
<name>hbase.client.connection.impl</name>
<value>org.apache.hadoop.hbase.client.IndexerConnection</value>
<description>A custom connection implementation just so we can interject our
own Table class, one that has an override for the batch call which receives
the replication stream edits; i.e. it is called by the replication sink
#replicateEntries method.</description>
</property>
<!--Set hbase.regionserver.info.port to -1 for no webui-->
<!--Below are configs to shut down unused services in hregionserver-->
<property>
<name>hbase.regionserver.admin.service</name>
<value>false</value>
<description>Do NOT stand up an Admin Service Interface on RPC</description>
</property>
<property>
<name>hbase.regionserver.client.service</name>
<value>false</value>
<description>Do NOT stand up a client-facing Service on RPC</description>
</property>
<property>
<name>hbase.wal.provider</name>
<value>org.apache.hadoop.hbase.wal.DisabledWALProvider</value>
<description>Set WAL service to be the null WAL</description>
</property>
<property>
<name>hbase.regionserver.workers</name>
<value>false</value>
<description>Turn off all background workers, log splitters, executors,
etc.</description>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.0001</value>
<description>Shrink the block cache to a negligible fraction of heap so it is effectively off</description>
</property>
<property>
<name>hbase.mob.file.cache.size</name>
<value>0</value>
<description>Disable MOB cache.</description>
</property>
<property>
<name>hbase.masterless</name>
<value>true</value>
<description>Do not expect Master in cluster.</description>
</property>
<property>
<name>hbase.regionserver.metahandler.count</name>
<value>1</value>
<description>How many priority handlers to run; we probably need none.
Default is 20 which is too much on a server like this.</description>
</property>
<property>
<name>hbase.regionserver.replication.handler.count</name>
<value>1</value>
<description>How many replication handlers to run; we probably need none.
Default is 3 which is too much on a server like this.</description>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>10</value>
<description>How many default handlers to run; tie to # of CPUs.
Default is 30 which is too much on a server like this.</description>
</property>
<property>
<name>hbase.ipc.server.read.threadpool.size</name>
<value>3</value>
<description>How many Listener request readers to run; tie to a portion of the
# of CPUs (1/4?).
Default is 10 which is too much on a server like this.</description>
</property>
</configuration>
{code}
h3. Sample Connection Implementation
Has a call-out marking where hbase-indexer would insert its capture code.
{code}
package org.apache.hadoop.hbase.client;
import com.google.protobuf.Descriptors;
import com.google.protobuf.Message;
import com.google.protobuf.Service;
import com.google.protobuf.ServiceException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.coprocessor.Batch;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.ipc.CoprocessorRpcChannel;
import org.apache.hadoop.hbase.security.User;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
/**
* Sample class for hbase-indexer.
* DO NOT COMMIT TO HBASE CODEBASE!!!
* Overrides Connection just so we can return a Table that has the
* method that the replication sink calls, i.e. Table#batch.
* It is at this point that the hbase-indexer catches the replication
* stream so it can insert into the lucene index.
*/
public class IndexerConnection implements Connection {
private final Configuration conf;
private final User user;
private final ExecutorService pool;
private volatile boolean closed = false;
public IndexerConnection(Configuration conf, ExecutorService pool, User user)
throws IOException {
this.conf = conf;
this.user = user;
this.pool = pool;
}
@Override
public void abort(String why, Throwable e) {}
@Override
public boolean isAborted() {
return false;
}
@Override
public Configuration getConfiguration() {
return this.conf;
}
@Override
public BufferedMutator getBufferedMutator(TableName tableName) throws
IOException {
return null;
}
@Override
public BufferedMutator getBufferedMutator(BufferedMutatorParams params)
throws IOException {
return null;
}
@Override
public RegionLocator getRegionLocator(TableName tableName) throws IOException
{
return null;
}
@Override
public Admin getAdmin() throws IOException {
return null;
}
@Override
public void close() throws IOException {
if (!this.closed) this.closed = true;
}
@Override
public boolean isClosed() {
return this.closed;
}
@Override
public TableBuilder getTableBuilder(final TableName tn, ExecutorService pool)
{
if (isClosed()) {
throw new RuntimeException("IndexerConnection is closed.");
}
final Configuration passedInConfiguration = getConfiguration();
return new TableBuilder() {
@Override
public TableBuilder setOperationTimeout(int timeout) {
return null;
}
@Override
public TableBuilder setRpcTimeout(int timeout) {
return null;
}
@Override
public TableBuilder setReadRpcTimeout(int timeout) {
return null;
}
@Override
public TableBuilder setWriteRpcTimeout(int timeout) {
return null;
}
@Override
public Table build() {
return new Table() {
private final Configuration conf = passedInConfiguration;
private final TableName tableName = tn;
@Override
public TableName getName() {
return this.tableName;
}
@Override
public Configuration getConfiguration() {
return this.conf;
}
@Override
public void batch(List<? extends Row> actions, Object[] results)
throws IOException, InterruptedException {
// Implementation goes here.
}
@Override
public HTableDescriptor getTableDescriptor() throws IOException {
return null;
}
@Override
public TableDescriptor getDescriptor() throws IOException {
return null;
}
@Override
public boolean exists(Get get) throws IOException {
return false;
}
@Override
public boolean[] existsAll(List<Get> gets) throws IOException {
return new boolean[0];
}
@Override
public <R> void batchCallback(List<? extends Row> actions, Object[]
results, Batch.Callback<R> callback) throws IOException, InterruptedException {
}
@Override
public Result get(Get get) throws IOException {
return null;
}
@Override
public Result[] get(List<Get> gets) throws IOException {
return new Result[0];
}
@Override
public ResultScanner getScanner(Scan scan) throws IOException {
return null;
}
@Override
public ResultScanner getScanner(byte[] family) throws IOException {
return null;
}
@Override
public ResultScanner getScanner(byte[] family, byte[] qualifier)
throws IOException {
return null;
}
@Override
public void put(Put put) throws IOException {
}
@Override
public void put(List<Put> puts) throws IOException {
}
@Override
public boolean checkAndPut(byte[] row, byte[] family, byte[]
qualifier, byte[] value, Put put) throws IOException {
return false;
}
@Override
public boolean checkAndPut(byte[] row, byte[] family, byte[]
qualifier, CompareFilter.CompareOp compareOp, byte[] value, Put put) throws
IOException {
return false;
}
@Override
public boolean checkAndPut(byte[] row, byte[] family, byte[]
qualifier, CompareOperator op, byte[] value, Put put) throws IOException {
return false;
}
@Override
public void delete(Delete delete) throws IOException {
}
@Override
public void delete(List<Delete> deletes) throws IOException {
}
@Override
public boolean checkAndDelete(byte[] row, byte[] family, byte[]
qualifier, byte[] value, Delete delete) throws IOException {
return false;
}
@Override
public boolean checkAndDelete(byte[] row, byte[] family, byte[]
qualifier, CompareFilter.CompareOp compareOp, byte[] value, Delete delete)
throws IOException {
return false;
}
@Override
public boolean checkAndDelete(byte[] row, byte[] family, byte[]
qualifier, CompareOperator op, byte[] value, Delete delete) throws IOException {
return false;
}
@Override
public void mutateRow(RowMutations rm) throws IOException {
}
@Override
public Result append(Append append) throws IOException {
return null;
}
@Override
public Result increment(Increment increment) throws IOException {
return null;
}
@Override
public long incrementColumnValue(byte[] row, byte[] family, byte[]
qualifier, long amount) throws IOException {
return 0;
}
@Override
public long incrementColumnValue(byte[] row, byte[] family, byte[]
qualifier, long amount, Durability durability) throws IOException {
return 0;
}
@Override
public void close() throws IOException {
}
@Override
public CoprocessorRpcChannel coprocessorService(byte[] row) {
return null;
}
@Override
public <T extends Service, R> Map<byte[], R>
coprocessorService(Class<T> service, byte[] startKey, byte[] endKey,
Batch.Call<T, R> callable) throws ServiceException, Throwable {
return null;
}
@Override
public <T extends Service, R> void coprocessorService(Class<T>
service, byte[] startKey, byte[] endKey, Batch.Call<T, R> callable,
Batch.Callback<R> callback) throws ServiceException, Throwable {
}
@Override
public <R extends Message> Map<byte[], R>
batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message
request, byte[] startKey, byte[] endKey, R responsePrototype) throws
ServiceException, Throwable {
return null;
}
@Override
public <R extends Message> void
batchCoprocessorService(Descriptors.MethodDescriptor methodDescriptor, Message
request, byte[] startKey, byte[] endKey, R responsePrototype, Batch.Callback<R>
callback) throws ServiceException, Throwable {
}
@Override
public boolean checkAndMutate(byte[] row, byte[] family, byte[]
qualifier, CompareFilter.CompareOp compareOp, byte[] value, RowMutations
mutation) throws IOException {
return false;
}
@Override
public boolean checkAndMutate(byte[] row, byte[] family, byte[]
qualifier, CompareOperator op, byte[] value, RowMutations mutation) throws
IOException {
return false;
}
@Override
public void setOperationTimeout(int operationTimeout) {
}
@Override
public int getOperationTimeout() {
return 0;
}
@Override
public int getRpcTimeout() {
return 0;
}
@Override
public void setRpcTimeout(int rpcTimeout) {
}
@Override
public int getReadRpcTimeout() {
return 0;
}
@Override
public void setReadRpcTimeout(int readRpcTimeout) {
}
@Override
public int getWriteRpcTimeout() {
return 0;
}
@Override
public void setWriteRpcTimeout(int writeRpcTimeout) {
}
};
}
};
}
}
{code}
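To make the "Implementation goes here" call-out in batch a little more concrete, here is a minimal sketch of the kind of dispatch an indexer might do there. It deliberately does not use the HBase API: Mutation, PutOp, DeleteOp, IndexSink and RecordingSink are hypothetical stand-ins (for Row, Put, Delete and the indexer's own writer) so the example compiles on its own. Whether the caller passes a results array at all is an assumption, so the sketch tolerates null.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: stand-in types, not the HBase client API. */
final class BatchDispatchSketch {
  /** Hypothetical stand-in for an HBase Row carried in the replication stream. */
  interface Mutation { String rowKey(); }

  static final class PutOp implements Mutation {
    final String rowKey; final String value;
    PutOp(String rowKey, String value) { this.rowKey = rowKey; this.value = value; }
    public String rowKey() { return rowKey; }
  }

  static final class DeleteOp implements Mutation {
    final String rowKey;
    DeleteOp(String rowKey) { this.rowKey = rowKey; }
    public String rowKey() { return rowKey; }
  }

  /** Hypothetical indexer-side callback, e.g. a writer into a search index. */
  interface IndexSink {
    void index(String rowKey, String value);
    void remove(String rowKey);
  }

  /** Test-friendly sink that just records what it was asked to do. */
  static final class RecordingSink implements IndexSink {
    final List<String> events = new ArrayList<>();
    public void index(String rowKey, String value) { events.add("index " + rowKey + "=" + value); }
    public void remove(String rowKey) { events.add("remove " + rowKey); }
  }

  /** Mirrors the shape of Table#batch: walk the actions in order and hand each to the sink. */
  static void batch(List<? extends Mutation> actions, Object[] results, IndexSink sink) {
    for (int i = 0; i < actions.size(); i++) {
      Mutation m = actions.get(i);
      if (m instanceof PutOp) {
        sink.index(m.rowKey(), ((PutOp) m).value);
      } else if (m instanceof DeleteOp) {
        sink.remove(m.rowKey());
      } else {
        throw new IllegalArgumentException("Unsupported action for row " + m.rowKey());
      }
      if (results != null) {
        results[i] = Boolean.TRUE; // hypothetical "handled" marker, filled only if an array was supplied
      }
    }
  }
}
```

A real implementation would replace the stand-ins with the HBase Row subclasses and would need to decide how failures in the index write should surface to the replication source.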
was:
Makes it so hbase-indexer can use public APIs. Adds being able to start one or
more HRegionServer instances with all usual Services disabled. An existing
mechanism allows plugging in an alternate Connection implementation. Combining
these two features, we can put up a cluster of hollowed-out HRegionServers
whose only role is to pose as a Replication Sink. In the alternate Connection
implementation, hbase-indexer installs its own code to catch the Replication.
Adds configuration to disable RegionServer Services starting up.
By default a RegionServer starts up an Admin and Client Service. To disable
either or both, use the below booleans:
{code}hbase.regionserver.admin.service{code}
{code}hbase.regionserver.client.service{code}
Both default true.
To make an HRegionServer start up and stay up without expecting to communicate
with a master, set the below boolean to true:
{code}hbase.masterless{code}
> Accommodate the hbase-indexer/lily/SEP consumer deploy-type
> -----------------------------------------------------------
>
> Key: HBASE-18846
> URL: https://issues.apache.org/jira/browse/HBASE-18846
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18846.master.001.patch,
> HBASE-18846.master.002.patch, HBASE-18846.master.003.patch,
> IndexerConnection.java, hbase-site.xml, javadoc.txt
>
>
> This is a follow-on from HBASE-10504, Define a Replication Interface. There
> we defined a new, flexible replication endpoint for others to implement but
> it did little to help the case of the lily hbase-indexer. This issue takes up
> the case of the hbase-indexer.
> The hbase-indexer poses to hbase as a 'fake' peer cluster (For why
> hbase-indexer is implemented so, the advantage to having the indexing done in
> a separate process set that can be independently scaled, can participate in
> the same security realm, etc., see discussion in HBASE-10504). The
> hbase-indexer starts up cut-down "RegionServer" processes, each just
> an instance of hbase RpcServer hosting an AdminProtos Service. They make
> themselves 'appear' to the Replication Source by hoisting up an ephemeral
> znode 'registering' as a RegionServer. The source cluster then streams
> WALEdits to the Admin Protos method:
> {code}
> public ReplicateWALEntryResponse replicateWALEntry(final RpcController
> controller,
> final ReplicateWALEntryRequest request) throws ServiceException {
> {code}
> The hbase-indexer relies on other hbase internals like Server so it can get a
> ZooKeeperWatcher instance and know the 'name' to use for this cut-down server.
> Thoughts on how to proceed include:
>
> * Better formalize its current digestion of hbase internals; make it so
> rpcserver is allowed to be used by others, etc. This would be hard to do
> given they use basics like Server, Protobuf serdes for WAL types, and
> AdminProtos Service. Any change in this wide API breaks (again)
> hbase-indexer. We have made a 'channel' for Coprocessor Endpoints so they
> continue to work though they use 'internal' types. They can use protos in
> hbase-protocol. hbase-protocol protos are in a limbo currently where they are
> sort-of 'public'; a TODO. Perhaps the hbase-indexer could do similar relying
> on the hbase-protocol (pb2.5) content and we could do something to reveal
> rpcserver and zk for hbase-indexer safe use.
> * Start an actual RegionServer but have it register only the AdminProtos
> Service, not ClientProtos or the Service that does Master interaction, etc.
> [I checked, this is not as easy to do as I at first thought -- St.Ack] Then
> have the hbase-indexer implement an AdminCoprocessor to override the
> replicateWALEntry method (the Admin CP implementation may need work). This
> would narrow the hbase-indexer exposure to that of the Admin Coprocessor
> Interface
> * Over in HBASE-10504, [~enis] suggested "... if we want to provide
> isolation for the replication services in hbase, we can have a simple host as
> another daemon which hosts the ReplicationEndpoint implementation. RS's will
> use a built-in RE to send the edits to this layer, and the host will delegate
> it to the RE implementation. The flow would be something like: RS --> RE
> inside RS --> Host daemon for RE --> Actual RE implementation --> third party
> system..."
>
> Other crazy notions occur including the setup of an Admin Interface
> Coprocessor Endpoint. A new ReplicationEndpoint would feed the replication
> stream to the remote cluster via the CPEP registered channel.
> But time is short. Hopefully we can figure something that will work in 2.0
> timeframe w/o too much code movement.