[ 
https://issues.apache.org/jira/browse/HBASE-13011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318234#comment-14318234
 ] 

zhangduo commented on HBASE-13011:
----------------------------------

Oh, there is a patch already.
IMHO, the patch does not work. It certainly reduces the probability of 
writing one call twice, but it cannot prevent every case, because checking 
call.writeLock and setting it are two separate steps.
Consider this interleaving of two threads, t1 and t2:
t1 checks call.writeLock, it is false
t2 checks call.writeLock, it is still false
t1 sets call.writeLock to true and calls writeRequest
t2 sets call.writeLock to true and calls writeRequest
OK, the call is written twice...
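
The interleaving can be replayed deterministically on a single thread, which 
makes the failure easy to see. This is only a sketch: the Call class, its 
writes list and the replay method are hypothetical stand-ins, not the code 
from the patch or from AsyncRpcClient.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckThenActReplay {
    // Hypothetical stand-in for the call object; not the real HBase class.
    static final class Call {
        boolean writeLock = false;
        // Records who "wrote" the call; stands in for writeRequest.
        final List<String> writes = new ArrayList<>();
    }

    // Replays the t1/t2 interleaving step by step on one thread.
    static int replay() {
        Call call = new Call();
        boolean t1SawLocked = call.writeLock; // t1 checks: false
        boolean t2SawLocked = call.writeLock; // t2 checks: still false
        if (!t1SawLocked) {                   // t1 sets the flag and writes
            call.writeLock = true;
            call.writes.add("t1");
        }
        if (!t2SawLocked) {                   // t2 acts on its stale check
            call.writeLock = true;
            call.writes.add("t2");
        }
        return call.writes.size();
    }

    public static void main(String[] args) {
        System.out.println("writes = " + replay()); // prints "writes = 2"
    }
}
```

Both threads pass the check before either sets the flag, so the flag narrows 
the window but does not close it.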

Of course there are synchronization methods that could work without a lock, 
but I'd say these methods are all complicated...
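
For the narrow case of a single flag, one lock-free option is an atomic 
compare-and-set, which merges the check and the set into one step. The sketch 
below uses java.util.concurrent.atomic.AtomicBoolean; the Call class, 
writeOnce and race are hypothetical names for illustration, and this is not 
the fix that was applied for this issue. Real code that has to coordinate 
more state than one flag is where the complexity mentioned above comes in.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CasWriteGuard {
    // Hypothetical stand-in for the call object; not the real HBase class.
    static final class Call {
        private final AtomicBoolean written = new AtomicBoolean(false);
        private final AtomicInteger writeCount = new AtomicInteger();

        void writeOnce() {
            // compareAndSet checks and sets atomically: exactly one caller
            // sees the false -> true transition and performs the write.
            if (written.compareAndSet(false, true)) {
                writeCount.incrementAndGet(); // stands in for writeRequest
            }
        }

        int writeCount() {
            return writeCount.get();
        }
    }

    // Starts the given number of threads that all try to write the same call.
    static int race(int threads) throws InterruptedException {
        Call call = new Call();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                call.writeOnce();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread w : workers) {
            w.join();
        }
        return call.writeCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("writes = " + race(8));
    }
}
```

However the threads interleave, only one compareAndSet(false, true) can 
succeed, so the write happens once.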

> TestLoadIncrementalHFiles is flakey when using AsyncRpcClient as client 
> implementation
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-13011
>                 URL: https://issues.apache.org/jira/browse/HBASE-13011
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.1.0
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13011.patch, HBASE-13011_1.patch
>
>
> The test sometimes fails with a timeout.
> https://builds.apache.org/job/PreCommit-HBASE-Build/12769/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/
> Digging into it, I found this:
> {noformat}
> 2015-02-11 02:01:47,304 INFO  [LoadIncrementalHFiles-1] 
> mapreduce.LoadIncrementalHFiles(563): Trying to load 
> hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
>  first=ddd last=ooo
> 2015-02-11 02:01:47,308 INFO  [LoadIncrementalHFiles-0] 
> mapreduce.LoadIncrementalHFiles(563): Trying to load 
> hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
>  first=aaaa last=cccc
> 2015-02-11 02:01:47,317 DEBUG [LoadIncrementalHFiles-2] 
> mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server 
> region=bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.,
>  hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row  
> with hfile group 
> [{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0}]
> 2015-02-11 02:01:47,320 DEBUG [LoadIncrementalHFiles-3] 
> mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server 
> region=bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.,
>  hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row ddd 
> with hfile group 
> [{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1}]
> {noformat}
> There are two files to commit, but after this we see:
> {noformat}
> 2015-02-11 02:01:47,327 INFO  
> [B.defaultRpcServer.handler=3,queue=0,port=41003] regionserver.HStore(690): 
> Validating hfile at 
> hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0
>  for inclusion in store myfam region 
> bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.
> 2015-02-11 02:01:47,330 INFO  
> [B.defaultRpcServer.handler=1,queue=0,port=41003] regionserver.HStore(690): 
> Validating hfile at 
> hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
>  for inclusion in store myfam region 
> bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> 2015-02-11 02:01:47,330 INFO  
> [B.defaultRpcServer.handler=4,queue=0,port=41003] regionserver.HStore(690): 
> Validating hfile at 
> hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1
>  for inclusion in store myfam region 
> bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
> {noformat}
> We can see that hfile_1 has been committed twice; the second call fails 
> and causes the test to time out.
> I'm not sure whether this is an issue in AsyncRpcClient, but if I use 
> RpcClientImpl, the test always passes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
