zhangduo created HBASE-13011:
--------------------------------
Summary: TestLoadIncrementalHFiles is flaky when using
AsyncRpcClient as client implementation
Key: HBASE-13011
URL: https://issues.apache.org/jira/browse/HBASE-13011
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.0
Reporter: zhangduo
The test sometimes fails with a timeout.
https://builds.apache.org/job/PreCommit-HBASE-Build/12769/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/
Digging into it, I found this:
{noformat}
2015-02-11 02:01:47,304 INFO [LoadIncrementalHFiles-1]
mapreduce.LoadIncrementalHFiles(563): Trying to load
hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
first=ddd last=ooo
2015-02-11 02:01:47,308 INFO [LoadIncrementalHFiles-0]
mapreduce.LoadIncrementalHFiles(563): Trying to load
hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
first=aaaa last=cccc
2015-02-11 02:01:47,317 DEBUG [LoadIncrementalHFiles-2]
mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server
region=bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row with
hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0}]
2015-02-11 02:01:47,320 DEBUG [LoadIncrementalHFiles-3]
mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server
region=bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.,
hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row ddd
with hfile group
[{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1}]
{noformat}
There are two files to commit, but the region server log that follows shows three validation calls:
{noformat}
2015-02-11 02:01:47,327 INFO [B.defaultRpcServer.handler=3,queue=0,port=41003]
regionserver.HStore(690): Validating hfile at
hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
for inclusion in store myfam region
bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.
2015-02-11 02:01:47,330 INFO [B.defaultRpcServer.handler=1,queue=0,port=41003]
regionserver.HStore(690): Validating hfile at
hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region
bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
2015-02-11 02:01:47,330 INFO [B.defaultRpcServer.handler=4,queue=0,port=41003]
regionserver.HStore(690): Validating hfile at
hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
for inclusion in store myfam region
bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
{noformat}
We can see that hfile_1 has been committed twice; the second call fails
and causes the test to time out.
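To spot this kind of duplicate in a longer log, a small helper can count "Validating hfile" entries per file and region. This is only a sketch: the class name DuplicateValidationFinder, the method countValidations, and the sample lines in main are made up for illustration, and it assumes each log entry has been unwrapped onto a single line.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateValidationFinder {
    // Message format taken from the HStore log entries quoted in this report;
    // assumes one log entry per line (wrapped lines must be joined first).
    private static final Pattern VALIDATING = Pattern.compile(
        "Validating hfile at (\\S+) for inclusion in store (\\S+) region (\\S+)");

    // Returns how many times each (hfile, region) pair was validated.
    public static Map<String, Integer> countValidations(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = VALIDATING.matcher(line);
            if (m.find()) {
                String key = m.group(1) + " -> " + m.group(3);
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened sample lines (not real log output).
        List<String> lines = Arrays.asList(
            "INFO regionserver.HStore(690): Validating hfile at hdfs://example/myfam/hfile_0 for inclusion in store myfam region regionA",
            "INFO regionserver.HStore(690): Validating hfile at hdfs://example/myfam/hfile_1 for inclusion in store myfam region regionB",
            "INFO regionserver.HStore(690): Validating hfile at hdfs://example/myfam/hfile_1 for inclusion in store myfam region regionB");
        // Print only the pairs that were validated more than once.
        countValidations(lines).forEach((key, count) -> {
            if (count > 1) {
                System.out.println(count + "x " + key);
            }
        });
    }
}
```

Any pair reported with a count above one corresponds to a bulk load request that reached the region server more than once, which is the pattern seen for hfile_1 here.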
I'm not sure whether this is an issue in AsyncRpcClient, but if I use RpcClientImpl,
the test always passes.