[
https://issues.apache.org/jira/browse/HBASE-13308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhangduo updated HBASE-13308:
-----------------------------
Description:
https://builds.apache.org/job/HBase-TRUNK-jacoco/24/testReport/junit/org.apache.hadoop.hbase.regionserver/TestEndToEndSplitTransaction/testFromClientSideWhileSplitting/
First, we split 'e9eb97847340ea7c6b9616d63d62a784' to
'abe1973ea732066b12d8e33fce12a951' and '4940dad7ef9b4b699fd13eede5740d9d'.
And then, we try to split 'abe1973ea732066b12d8e33fce12a951'.
{noformat}
2015-03-21 03:58:46,970 INFO [Thread-191]
regionserver.TestEndToEndSplitTransaction(399): Initiating region split
for:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,976 INFO [PriorityRpcServer.handler=7,queue=1,port=54177]
regionserver.RSRpcServices(1596): Splitting
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,977 DEBUG [PriorityRpcServer.handler=7,queue=1,port=54177]
regionserver.CompactSplitThread(259): Split requested for
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951..
compaction_queue=(0:0), split_queue=1, merge_queue=0
2015-03-21 03:58:46,978 INFO [Thread-191]
regionserver.TestEndToEndSplitTransaction(399): blocking until region is
split:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,985 DEBUG [RS:0;priapus:54177-splits-1426910324832]
lock.ZKInterProcessLockBase(226): Acquired a lock for
/hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
2015-03-21 03:58:46,988 DEBUG [RS:0;priapus:54177-splits-1426910324832]
lock.ZKInterProcessLockBase(328): Released
/hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
2015-03-21 03:58:46,988 DEBUG [Thread-191] ipc.AsyncRpcClient(163): Use global
event loop group NioEventLoopGroup
2015-03-21 03:58:46,988 INFO [RS:0;priapus:54177-splits-1426910324832]
regionserver.SplitRequest(142): Split transaction journal:
STARTED at 1426910326977
{noformat}
We can see that it failed without any error message.
I think can only happen when the parent is not splittable or we can not find a
splitrow.
{noformat}
2015-03-21 03:58:47,019 INFO
[RS:0;priapus:54177-shortCompactions-1426910324308] regionserver.HStore(1334):
Completed major compaction of 2 (all) file(s) in family of
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
into e97bccdc4b014c15a52a579cd49ebb31(size=12.6 K), total size for store is
12.6 K. This selection was in queue for 0sec, and took 0sec to execute.
2015-03-21 03:58:47,019 INFO
[RS:0;priapus:54177-shortCompactions-1426910324308]
regionserver.CompactSplitThread$CompactionRunner(523): Completed compaction:
Request =
regionName=testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.,
storeName=family, fileCount=2, fileSize=25.5 K, priority=1,
time=14542808784655186; duration=0sec
2015-03-21 03:58:47,020 DEBUG
[RS:0;priapus:54177-shortCompactions-1426910324308]
regionserver.CompactSplitThread$CompactionRunner(546): CompactSplitThread
Status: compaction_queue=(0:0), split_queue=0, merge_queue=0
{noformat}
We can see that, the compaction was completed at 03:58:47,019, but split was
started at 03:58:46,970 which is earlier.
So we have a reference file and is not splittable.
I think the problem is 'compactAndBlockUntilDone' is not reliable, it may
return before the compaction complete.
Will try to prepare a patch.
was:
https://builds.apache.org/job/HBase-TRUNK-jacoco/24/testReport/junit/org.apache.hadoop.hbase.regionserver/TestEndToEndSplitTransaction/testFromClientSideWhileSplitting/
First, we split 'e9eb97847340ea7c6b9616d63d62a784' to
'abe1973ea732066b12d8e33fce12a951' and '4940dad7ef9b4b699fd13eede5740d9d'.
And then, we try to split 'abe1973ea732066b12d8e33fce12a951'.
{noformat}
2015-03-21 03:58:46,970 INFO [Thread-191]
regionserver.TestEndToEndSplitTransaction(399): Initiating region split
for:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,976 INFO [PriorityRpcServer.handler=7,queue=1,port=54177]
regionserver.RSRpcServices(1596): Splitting
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,977 DEBUG [PriorityRpcServer.handler=7,queue=1,port=54177]
regionserver.CompactSplitThread(259): Split requested for
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951..
compaction_queue=(0:0), split_queue=1, merge_queue=0
2015-03-21 03:58:46,978 INFO [Thread-191]
regionserver.TestEndToEndSplitTransaction(399): blocking until region is
split:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
2015-03-21 03:58:46,985 DEBUG [RS:0;priapus:54177-splits-1426910324832]
lock.ZKInterProcessLockBase(226): Acquired a lock for
/hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
2015-03-21 03:58:46,988 DEBUG [RS:0;priapus:54177-splits-1426910324832]
lock.ZKInterProcessLockBase(328): Released
/hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
2015-03-21 03:58:46,988 DEBUG [Thread-191] ipc.AsyncRpcClient(163): Use global
event loop group NioEventLoopGroup
2015-03-21 03:58:46,988 INFO [RS:0;priapus:54177-splits-1426910324832]
regionserver.SplitRequest(142): Split transaction journal:
STARTED at 1426910326977
{noformat}
We can see that it failed without any error message.
I think can only happen when the parent is not splittable or we can not find a
splitrow.
{noformat}
2015-03-21 03:58:47,019 INFO
[RS:0;priapus:54177-shortCompactions-1426910324308] regionserver.HStore(1334):
Completed major compaction of 2 (all) file(s) in family of
testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
into e97bccdc4b014c15a52a579cd49ebb31(size=12.6 K), total size for store is
12.6 K. This selection was in queue for 0sec, and took 0sec to execute.
2015-03-21 03:58:47,019 INFO
[RS:0;priapus:54177-shortCompactions-1426910324308]
regionserver.CompactSplitThread$CompactionRunner(523): Completed compaction:
Request =
regionName=testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.,
storeName=family, fileCount=2, fileSize=25.5 K, priority=1,
time=14542808784655186; duration=0sec
2015-03-21 03:58:47,020 DEBUG
[RS:0;priapus:54177-shortCompactions-1426910324308]
regionserver.CompactSplitThread$CompactionRunner(546): CompactSplitThread
Status: compaction_queue=(0:0), split_queue=0, merge_queue=0
{noformat}
We can see that, the compaction was completed at 03:58:47,019, but split is
started at 03:58:46,970 which is earlier.
So we have a reference file and is not splittable.
I think the problem is 'compactAndBlockUntilDone' is not reliable, it may
return before the compaction complete.
Will try to prepare a patch.
> Fix flaky TestEndToEndSplitTransaction
> --------------------------------------
>
> Key: HBASE-13308
> URL: https://issues.apache.org/jira/browse/HBASE-13308
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: zhangduo
> Assignee: zhangduo
>
> https://builds.apache.org/job/HBase-TRUNK-jacoco/24/testReport/junit/org.apache.hadoop.hbase.regionserver/TestEndToEndSplitTransaction/testFromClientSideWhileSplitting/
> First, we split 'e9eb97847340ea7c6b9616d63d62a784' to
> 'abe1973ea732066b12d8e33fce12a951' and '4940dad7ef9b4b699fd13eede5740d9d'.
> And then, we try to split 'abe1973ea732066b12d8e33fce12a951'.
> {noformat}
> 2015-03-21 03:58:46,970 INFO [Thread-191]
> regionserver.TestEndToEndSplitTransaction(399): Initiating region split
> for:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
> 2015-03-21 03:58:46,976 INFO
> [PriorityRpcServer.handler=7,queue=1,port=54177]
> regionserver.RSRpcServices(1596): Splitting
> testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
> 2015-03-21 03:58:46,977 DEBUG
> [PriorityRpcServer.handler=7,queue=1,port=54177]
> regionserver.CompactSplitThread(259): Split requested for
> testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951..
> compaction_queue=(0:0), split_queue=1, merge_queue=0
> 2015-03-21 03:58:46,978 INFO [Thread-191]
> regionserver.TestEndToEndSplitTransaction(399): blocking until region is
> split:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
> 2015-03-21 03:58:46,985 DEBUG [RS:0;priapus:54177-splits-1426910324832]
> lock.ZKInterProcessLockBase(226): Acquired a lock for
> /hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
> 2015-03-21 03:58:46,988 DEBUG [RS:0;priapus:54177-splits-1426910324832]
> lock.ZKInterProcessLockBase(328): Released
> /hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
> 2015-03-21 03:58:46,988 DEBUG [Thread-191] ipc.AsyncRpcClient(163): Use
> global event loop group NioEventLoopGroup
> 2015-03-21 03:58:46,988 INFO [RS:0;priapus:54177-splits-1426910324832]
> regionserver.SplitRequest(142): Split transaction journal:
> STARTED at 1426910326977
> {noformat}
> We can see that it failed without any error message.
> I think can only happen when the parent is not splittable or we can not find
> a splitrow.
> {noformat}
> 2015-03-21 03:58:47,019 INFO
> [RS:0;priapus:54177-shortCompactions-1426910324308]
> regionserver.HStore(1334): Completed major compaction of 2 (all) file(s) in
> family of
> testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
> into e97bccdc4b014c15a52a579cd49ebb31(size=12.6 K), total size for store is
> 12.6 K. This selection was in queue for 0sec, and took 0sec to execute.
> 2015-03-21 03:58:47,019 INFO
> [RS:0;priapus:54177-shortCompactions-1426910324308]
> regionserver.CompactSplitThread$CompactionRunner(523): Completed compaction:
> Request =
> regionName=testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.,
> storeName=family, fileCount=2, fileSize=25.5 K, priority=1,
> time=14542808784655186; duration=0sec
> 2015-03-21 03:58:47,020 DEBUG
> [RS:0;priapus:54177-shortCompactions-1426910324308]
> regionserver.CompactSplitThread$CompactionRunner(546): CompactSplitThread
> Status: compaction_queue=(0:0), split_queue=0, merge_queue=0
> {noformat}
> We can see that, the compaction was completed at 03:58:47,019, but split was
> started at 03:58:46,970 which is earlier.
> So we have a reference file and is not splittable.
> I think the problem is 'compactAndBlockUntilDone' is not reliable, it may
> return before the compaction complete.
> Will try to prepare a patch.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)