Hey Nick,
It looks like the lookup is missing your table in the metadata cache. On a cache miss, we end up rebuilding the metadata from both the SYSTEM.CATALOG and SYSTEM.STATS tables.

The rebuild is a scan over both of those tables, so we end up going to ZooKeeper to find the region and then executing the scan against it. That is an expensive operation, and it explains the calls you are seeing to a specific region and to ZooKeeper.
Something similar happens in PHOENIX-2607, though that issue is specifically about timestamps: if the client timestamp is less than or equal to the stats timestamp of the table, the rebuild process above ends up running repeatedly for every batch of inserts from PhoenixMapReduceUtil. I don't believe you have a timestamp issue, but from what you have described, the same repeated lookups and scans are happening.
James has suggested a workaround for the metadata cache refresh using UPDATE_CACHE_FREQUENCY, and it will probably help (we are trying it out for the timestamp issue).
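For reference, here is a minimal sketch of applying that workaround over JDBC, assuming a Phoenix version that supports the UPDATE_CACHE_FREQUENCY table property; the table name, JDBC URL, and the 15-minute interval below are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class UpdateCacheFrequencyExample {
    public static void main(String[] args) throws SQLException {
        // Placeholder JDBC URL; point this at your ZooKeeper quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement()) {
            // Let clients reuse cached metadata for up to 15 minutes (the value
            // is in milliseconds) instead of calling the MetaDataEndpoint on
            // every statement.
            stmt.execute("ALTER TABLE MY_TABLE SET UPDATE_CACHE_FREQUENCY = 900000");
        }
    }
}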
Hope this helps.
Thanks
Arun
Please look at the following lines in MetaDataEndpointImpl:
private PTable doGetTable(byte[] key, long clientTimeStamp, RowLock rowLock)
        throws IOException, SQLException {
    ImmutableBytesPtr cacheKey = new ImmutableBytesPtr(key);
    Cache<ImmutableBytesPtr, PMetaDataEntity> metaDataCache =
            GlobalCache.getInstance(this.env).getMetaDataCache();
    PTable table = (PTable) metaDataCache.getIfPresent(cacheKey);
    // We only cache the latest, so we'll end up building the table with every call if the
    // client connection has specified an SCN.
    // TODO: If we indicate to the client that we're returning an older version, but there's a
    // newer version available, the client
    // can safely not call this, since we only allow modifications to the latest.
    if (table != null && table.getTimeStamp() < clientTimeStamp) {
        // Table on client is up-to-date with table on server, so just return
        if (isTableDeleted(table)) {
            return null;
        }
        return table;
    }
    // Ask Lars about the expense of this call - if we don't take the lock, we still won't get
    // partial results
    // get the co-processor environment
    // TODO: check that key is within region.getStartKey() and region.getEndKey()
    // and return special code to force client to lookup region from meta.
    Region region = env.getRegion();
    /*
     * Lock directly on key, though it may be an index table. This will just prevent a table
     * from getting rebuilt too often.
     */
    final boolean wasLocked = (rowLock != null);
    if (!wasLocked) {
        rowLock = region.getRowLock(key, true);
        if (rowLock == null) {
            throw new IOException("Failed to acquire lock on " + Bytes.toStringBinary(key));
        }
    }
    try {
        // Try cache again in case we were waiting on a lock
        table = (PTable) metaDataCache.getIfPresent(cacheKey);
        // We only cache the latest, so we'll end up building the table with every call if the
        // client connection has specified an SCN.
        // TODO: If we indicate to the client that we're returning an older version, but there's
        // a newer version available, the client
        // can safely not call this, since we only allow modifications to the latest.
        if (table != null && table.getTimeStamp() < clientTimeStamp) {
            // Table on client is up-to-date with table on server, so just return
            if (isTableDeleted(table)) {
                return null;
            }
            return table;
        }
        // Query for the latest table first, since it's not cached
        table = buildTable(key, cacheKey, region, HConstants.LATEST_TIMESTAMP);
        if (table != null && table.getTimeStamp() < clientTimeStamp) {
            return table;
        }
        // Otherwise, query for an older version of the table - it won't be cached
        return buildTable(key, cacheKey, region, clientTimeStamp);
    } finally {
        if (!wasLocked) rowLock.release();
    }
}
From: Nick Dimiduk <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, February 16, 2016 at 5:58 PM
To: "[email protected]" <[email protected]>
Cc: James Taylor <[email protected]>, Lars Hofhansl <[email protected]>
Subject: Write path blocked by MetaDataEndpoint acquiring region lock
Hello,
I have a high-throughput ingest pipeline that's seized up. My ingest application ultimately crashes; the failure contains the following stack trace [0].

Independently, I noticed that the RPC call time on one of the machines was significantly higher than on the others (95th percentile at multiple seconds vs. tens of ms). I grabbed the RS log and a couple of jstacks from the process. In the logs I see handler threads creating ZK connections excessively (~50 INFO lines per second). The jstacks show handler threads parked while taking region row locks, calling HRegion.getRowLockInternal() via MetaDataEndpointImpl.doGetTable() [1]. The one handler thread I see that's in the same MetaDataEndpointImpl area but not under the lock appears to be making an RPC to read the statistics table [2].

I believe these two occurrences are related.
My working theory is that the metaDataCache object is performing poorly for some reason. This results in excessive metadata lookups, some of which appear to require making an RPC call while holding the lock.
What can I do to relieve pressure on this row lock? Looking at the code around the lock, it looks like it's populating a connection-level cache of schema metadata. The host with the high RPC call time is hosting the SYSTEM.CATALOG table region. I see some configuration settings that may be related (e.g., phoenix.coprocessor.maxMetaDataCacheSize), but I see no way to get debug information about cache size or evictions from this Guava cache instance.
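As an aside, Guava will track those numbers when a cache is built with recordStats(); the stand-alone sketch below illustrates that API. It is hypothetical, not how Phoenix's GlobalCache is currently constructed:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheStats;

public class MetaDataCacheStatsSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for the metadata cache. Phoenix's GlobalCache is
        // weighted by entry size (phoenix.coprocessor.maxMetaDataCacheSize); this
        // sketch bounds by entry count to keep the example small.
        Cache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(1000)
                .recordStats() // GlobalCache would need this to expose the counters
                .build();

        cache.put("SYSTEM.CATALOG", "ptable-bytes");
        cache.getIfPresent("SYSTEM.CATALOG"); // hit
        cache.getIfPresent("SYSTEM.STATS");   // miss

        CacheStats stats = cache.stats();
        System.out.printf("hits=%d misses=%d evictions=%d%n",
                stats.hitCount(), stats.missCount(), stats.evictionCount());
    }
}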
I'll be digging into this further, but I appreciate any pointers you may have.
BTW, this is Phoenix 4.6.0 + HBase 1.1.2.
Thanks a lot,
-n
[0]: client-side stack

Caused by: org.apache.phoenix.exception.PhoenixIOException: Interrupted calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row \x00<schema>\x00<user_table>
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1053)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1016)
        at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1289)
        at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:446)
        at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:389)
        at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:385)
        at org.apache.phoenix.execute.MutationState.validate(MutationState.java:369)
        at org.apache.phoenix.execute.MutationState.commit(MutationState.java:417)
        at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:482)
        at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:479)
        at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
        at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:479)
        at org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:84)
[1]: rs handlers blocked stack

"B.defaultRpcServer.handler=48,queue=3,port=16020" #91 daemon prio=5 os_prio=0 tid=0x00007ff4ce458000 nid=0xca0a waiting on condition [0x00007ff47a607000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000708ef59c0> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
        at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5047)
        at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5013)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2397)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2365)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:440)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:11609)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
        at java.lang.Thread.run(Thread.java:745)
[2]: rs handler stats rpc stack

"B.defaultRpcServer.handler=19,queue=4,port=16020" #62 daemon prio=5 os_prio=0 tid=0x00007ff4ce420000 nid=0xc9ea in Object.wait() [0x00007ff47c323000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:460)
        at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
        at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.poll(ResultBoundedCompletionService.java:155)
        - locked <0x00000007a3f5e030> (a [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:168)
        at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
        at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
        at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
        at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
        at org.apache.phoenix.schema.stats.StatisticsUtil.readStatistics(StatisticsUtil.java:99)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:836)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:472)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2418)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2365)
        at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:440)
        at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:11609)
        at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
        at java.lang.Thread.run(Thread.java:745)