[
https://issues.apache.org/jira/browse/HBASE-21196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nihal Jain updated HBASE-21196:
-------------------------------
Description:
Operations that use the {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction,
ServerName, MultiResponse, int)}} API with the table name set to null reset the
meta cache of the corresponding server after each call. One such operation is
the put operation of {{HTableMultiplexer}} (it might not be the only one). This
may severely impact system performance, because every new operation directed to
that server must first go to ZooKeeper to get the meta table address and then
read meta to get the location of the table region, since the cache becomes
empty after every {{HTableMultiplexer}} put.
From the logs below, one can see that after every other put the cached region
locations are cleared. As a side effect, before every put the client needs to
contact ZooKeeper to get the meta table location and then read meta to get the
region locations of the table.
{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5]
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811]
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get"
request_param: true priority: 0 timeout: 60000 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105):
callId: 218 service: ClientService methodName: Get size: 137 connection:
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356):
callId: 218 service: ClientService methodName: Get size: 137 connection:
127.0.0.1:42338 param: region=
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f.,
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result {
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0
totalTime: 0
2018-09-13 22:21:15,516 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811]
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0,
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get,
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan
table=hbase:meta,
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179):
Advancing internal small scanner to startKey at
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up
meta region location in ZK,
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}
From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that
the strings "Removed all cached region locations that map" and "Looking up meta
region location in ZK" each appear 800+ times for 1000 back-to-back puts.
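To make the cost concrete, here is a toy Java sketch of the failure mode. It is
a hypothetical simplification (the class, fields, and methods below are invented
for illustration and do not match the real
{{org.apache.hadoop.hbase.client.MetaCache}} code): whenever a multi-action
response is processed with a null table name, every cached region location for
that server is dropped, so the very next operation must do a fresh meta lookup.

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a client-side region-location cache, NOT the real HBase
// MetaCache. It only illustrates the reported pattern: clearing all
// locations for a server on a null table name forces a meta lookup
// (ZK + hbase:meta read) before each subsequent operation.
public class MetaCacheSketch {
    // serverName -> cached region locations (simplified to strings)
    private final Map<String, Set<String>> cache = new HashMap<>();
    int metaLookups = 0; // counts simulated trips to ZK + hbase:meta

    // Returns the region location, doing a simulated meta lookup on a miss.
    String locate(String server, String region) {
        Set<String> regions = cache.computeIfAbsent(server, s -> new HashSet<>());
        if (!regions.contains(region)) {
            metaLookups++;        // cache miss: simulated meta lookup
            regions.add(region);
        }
        return region;
    }

    // Mimics a multi-action response handled with tableName == null:
    // with no table to scope the update, all locations for the server go.
    void onMultiResponse(String server, String tableName) {
        if (tableName == null) {
            cache.remove(server); // "Removed all cached region locations..."
        }
    }

    public static void main(String[] args) {
        MetaCacheSketch c = new MetaCacheSketch();
        for (int i = 0; i < 1000; i++) {
            c.locate("rs1", "testTable,region1");
            c.onMultiResponse("rs1", null); // what the put path effectively does
        }
        // Every put re-triggers a lookup because the previous one cleared it.
        System.out.println("meta lookups for 1000 puts: " + c.metaLookups);
        // prints: meta lookups for 1000 puts: 1000
    }
}
{code}

In this model, 1000 back-to-back puts trigger 1000 simulated meta lookups,
consistent with the 800+ "Looking up meta region location in ZK" lines observed
in the attached log.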
was:
Operations which use {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction,
ServerName, MultiResponse, int)}} API with tablename set to null reset the meta
cache of the corresponding server after each call. One such operation is put
operation of HTableMultiplexer (Might not be the only one). This may impact the
performance of the system severely as all new ops directed to that server will
have to go to zk first to get the meta table address and then get the location
of the table region as it will become empty after every htablemultiplexer put.
From the logs below, one can see after every other put the cached region
locations are cleared. As a side effect of this, before every put the server
needs to contact zk and get meta table location and read meta to get region
locations of the table.
{noformat}
2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283): Removed
all cached region locations that map to root1-thinkpad-t440p,35811,1536857446588
2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5]
client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for
root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
2018-09-13 22:21:15,515 TRACE
[RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811]
ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get"
request_param: true priority: 0 timeout: 60000 totalRequestSize: 137 bytes
2018-09-13 22:21:15,515 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.CallRunner(105):
callId: 218 service: ClientService methodName: Get size: 137 connection:
127.0.0.1:42338 executing as root1
2018-09-13 22:21:15,515 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811] ipc.RpcServer(2356):
callId: 218 service: ClientService methodName: Get size: 137 connection:
127.0.0.1:42338 param: region=
testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f.,
row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result {
associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0
totalTime: 0
2018-09-13 22:21:15,516 TRACE
[RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811]
io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0, count=0,
allocations=1
2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get,
callTime: 2ms
2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan
table=hbase:meta,
startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999
2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179):
Advancing internal small scanner to startKey at
'testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999'
2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up
meta region location in ZK,
connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
{noformat}
> HTableMultiplexer clears the meta cache after every put operation
> -----------------------------------------------------------------
>
> Key: HBASE-21196
> URL: https://issues.apache.org/jira/browse/HBASE-21196
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Affects Versions: 3.0.0, 1.3.3, 2.2.0
> Reporter: Nihal Jain
> Assignee: Nihal Jain
> Priority: Critical
> Attachments: HTableMultiplexer1000Puts.UT.txt
>
>
> Operations that use the {{AsyncRequestFutureImpl.receiveMultiAction(MultiAction,
> ServerName, MultiResponse, int)}} API with the table name set to null reset the
> meta cache of the corresponding server after each call. One such operation is
> the put operation of {{HTableMultiplexer}} (it might not be the only one). This
> may severely impact system performance, because every new operation directed to
> that server must first go to ZooKeeper to get the meta table address and then
> read meta to get the location of the table region, since the cache becomes
> empty after every {{HTableMultiplexer}} put.
> From the logs below, one can see that after every other put the cached region
> locations are cleared. As a side effect, before every put the client needs to
> contact ZooKeeper to get the meta table location and then read meta to get the
> region locations of the table.
> {noformat}
> 2018-09-13 22:21:15,467 TRACE [htable-pool11-t1] client.MetaCache(283):
> Removed all cached region locations that map to
> root1-thinkpad-t440p,35811,1536857446588
> 2018-09-13 22:21:15,467 DEBUG [HTableFlushWorker-5]
> client.HTableMultiplexer$FlushWorker(632): Processed 1 put requests for
> root1-ThinkPad-T440p:35811 and 0 failed, latency for this send: 5
> 2018-09-13 22:21:15,515 TRACE
> [RpcServer.reader=1,bindAddress=root1-ThinkPad-T440p,port=35811]
> ipc.RpcServer$Connection(1954): RequestHeader call_id: 218 method_name: "Get"
> request_param: true priority: 0 timeout: 60000 totalRequestSize: 137 bytes
> 2018-09-13 22:21:15,515 TRACE
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811]
> ipc.CallRunner(105): callId: 218 service: ClientService methodName: Get size:
> 137 connection: 127.0.0.1:42338 executing as root1
> 2018-09-13 22:21:15,515 TRACE
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811]
> ipc.RpcServer(2356): callId: 218 service: ClientService methodName: Get size:
> 137 connection: 127.0.0.1:42338 param: region=
> testHTableMultiplexer_1,,1536857451720.304d914b641a738624937c7f9b4d684f.,
> row=\x00\x00\x00\xC4 connection: 127.0.0.1:42338, response result {
> associated_cell_count: 1 stale: false } queueTime: 0 processingTime: 0
> totalTime: 0
> 2018-09-13 22:21:15,516 TRACE
> [RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=35811]
> io.BoundedByteBufferPool(106): runningAverage=16384, totalCapacity=0,
> count=0, allocations=1
> 2018-09-13 22:21:15,516 TRACE [main] ipc.AbstractRpcClient(236): Call: Get,
> callTime: 2ms
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientScanner(122): Scan
> table=hbase:meta,
> startRow=testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999
> 2018-09-13 22:21:15,516 TRACE [main] client.ClientSmallReversedScanner(179):
> Advancing internal small scanner to startKey at
> 'testHTableMultiplexer_1,\x00\x00\x00\xC5,99999999999999'
> 2018-09-13 22:21:15,517 TRACE [main] client.ZooKeeperRegistry(59): Looking up
> meta region location in ZK,
> connection=org.apache.hadoop.hbase.client.ZooKeeperRegistry@599f571f
> {noformat}
> From the minicluster logs [^HTableMultiplexer1000Puts.UT.txt] one can see that
> the strings "Removed all cached region locations that map" and "Looking up meta
> region location in ZK" each appear 800+ times for 1000 back-to-back puts.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)