Vikas Vishwakarma created PHOENIX-1215:
------------------------------------------
Summary: some transactions are hanging indefinitely on
Key: PHOENIX-1215
URL: https://issues.apache.org/jira/browse/PHOENIX-1215
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.0.0
Environment: Num Cores: 24, Model : Intel(R) Xeon(R) CPU E5-2420 0 @
1.90GHz
MemTotal: 65801264 kB, 1 Gbps Network
Red Hat Enterprise Linux Server release 6.2
Reporter: Vikas Vishwakarma
Test Setup:
==========
4 load client with 16 threads each running against a HBase cluster with 6
RegionServers.
Each load client spawns a thread and runs a upsert query in batch mode
Pseudo code:
==========
SQL_UPSERT_HBASE_DATA = UPSERT INTO TABLE_NAME (ROW_KEY_ID, ROW_VAL) VALUES
(?, ?)
for (int i=0; i<6400; i++) {
upsertSql.append(SQL_UPSERT_HBASE_DATA);
ps = conn.prepareStatement(preparedSqlTableStmt.toString());
ps.setString(1, rowKey);
ps.setString(2, rowVal);
ps.execute();
}
conn.commit();
tryClose(conn);
Observation:
==============
Observing some issues where a small number of transactions (approx 40 out of
80,000) are hanging indefinitely on preparedStatement.execute() call.
I tried to put a timeout on the transaction but looks like phoenix still does
not support preparedStatement.setQueryTimeout().
Log Analysis:
=============
On the client side there are no errors or exceptions except some threads in
hung state
On the service side, the observation is as follows:
Also after some log analysis, it looks like all the batches that got stuck,
are during the start of the test. From the RegionServer logs, the phoenix
co-processor loading and index updates is in progress during this time (which
involves some failures and retries). Once this phase is over, remaining write
batches that started later have succeeded.
RegionServer log snapshots during the time period this issue was observed on
the client side:
---------
2014-08-27 03:28:00,114 INFO
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Loaded coprocessor
org.apache.phoenix.coprocessor.ScanRegionObserver from HTD of HT_42
successfully.
2014-08-27 03:28:00,114 INFO
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Loaded coprocessor
org.apache.phoenix.coprocessor.ScanRegionObserver from HTD of HT_42
successfully.
2014-08-27 03:28:00,252 INFO org.apache.phoenix.hbase.index.Indexer: Found some
outstanding index updates that didn't succeed during WAL replay - attempting to
replay now.
2014-08-27 03:28:00,252 INFO org.apache.phoenix.hbase.index.Indexer: Found some
outstanding index updates that didn't succeed during WAL replay - attempting to
replay now.
2014-08-27 03:28:00,252 INFO org.apache.phoenix.hbase.index.Indexer: Found some
outstanding index updates that didn't succeed during WAL replay - attempting to
replay now.
2014-08-27 03:28:00,931 INFO
org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver: Starting
ungrouped coprocessor scan
{"timeRange":[0,1409110080691],"batch":-1,"startRow":"\\x0E","stopRow":"\\x0F","loadColumnFamiliesOnDemand":true,"totalColumns":1,"cacheBlocks":true,"families":{"0":["ALL"]},"maxResultSize":-1,"maxVersions":1,"caching":-1}
2014-08-27 03:28:00,931 INFO
org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver: Starting
ungrouped coprocessor scan
{"timeRange":[0,1409110080691],"batch":-1,"startRow":"\\x07","stopRow":"\\x08","loadColumnFamiliesOnDemand":true,"totalColumns":1,"cacheBlocks":true,"families":{"0":["ALL"]},"maxResultSize":-1,"maxVersions":1,"caching":-1}
....
2014-08-27 03:28:06,507 DEBUG
org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Loading coprocessor class
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl with path null and
priority 1
2014-08-27 03:28:06,507 INFO
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Loaded coprocessor
org.apache.phoenix.coprocessor.ServerCachingEndpointImpl from HTD of HT_44
successfully.
....
2014-08-27 03:28:06,508 DEBUG
org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Loading coprocessor class
org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver with path null
and priority 1
2014-08-27 03:28:06,508 INFO
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Loaded coprocessor
org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver from HTD of HT_44
successfully.
...
2014-08-27 03:28:19,525 INFO org.apache.phoenix.hbase.index.Indexer: Found some
outstanding index updates that didn't succeed during WAL replay - attempting to
replay now.
2014-08-27 03:28:19,547 INFO org.apache.phoenix.hbase.index.Indexer: Found some
outstanding index updates that didn't succeed during WAL replay - attempting to
replay now.
..
2014-08-27 03:28:20,660 INFO
org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver: Starting
ungrouped coprocessor scan
{"timeRange":[0,1409110100554],"batch":-1,"startRow":"\\x0C","stopRow":"\\x0D","loadColumnFamiliesOnDemand":true,"totalColumns":1,"cacheBlocks":true,"families":{"0":["ALL"]},"maxResultSize":-1,"maxVersions":1,"caching":-1}
2014-08-27 03:28:20,660 INFO
org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver: Finished
scanning 0 rows for ungrouped coprocessor scan
{"timeRange":[0,1409110100554],"batch":-1,"startRow":"\\x0C","stopRow":"\\x0D","loadColumnFamiliesOnDemand":true,"totalColumns":1,"cacheBlocks":true,"families":{"0":["ALL"]},"maxResultSize":-1,"maxVersions":1,"caching":-1}
--
This message was sent by Atlassian JIRA
(v6.2#6252)