Josh Elser created ACCUMULO-4365:
------------------------------------
Summary: ShellServerIT#trace() failing intermittently due to missing "sendMutations" block
Key: ACCUMULO-4365
URL: https://issues.apache.org/jira/browse/ACCUMULO-4365
Project: Accumulo
Issue Type: Bug
Components: test
Reporter: Josh Elser
Fix For: 2.0.0
I noticed this on master; I have not checked whether other branches are affected.
{noformat}
trace(org.apache.accumulo.test.ShellServerIT) Time elapsed: 5.166 sec <<< FAILURE!
java.lang.AssertionError
at org.apache.accumulo.test.ShellServerIT.trace(ShellServerIT.java:1630)
{noformat}
This is the trace output from a run in which the test case failed:
{noformat}
Trace started at 2016/07/10 22:43:38.277
Time Start Service@Location Name
3446+0 shell@localhost shell:root
1+160 [email protected] beginFateOperation
4+167 [email protected] executeFateOperation
3+173 [email protected] CreateTable
2+176 [email protected] CreateTable
16+181 [email protected] SetupPermissions
4+200 [email protected] PopulateZookeeper
19+204 [email protected] PopulateZookeeper
1+694 [email protected] ChooseDir
1+709 [email protected] CreateDir
2+712 [email protected] PopulateMetadata
1+713 [email protected] update
1+713 [email protected] prep
5+716 [email protected] FinishCreateTable
563+172 [email protected] waitForFateOperation
2+736 [email protected] finishFateOperation
1513+745 shell@localhost close
13+746 shell@localhost BinMutations 1
5+746 shell@localhost binMutations
2+748 [email protected] startScan
1+748 [email protected] metadata tablets read ahead 5
3+2259 [email protected] getTableConfiguration
3+2263 [email protected] getTableConfiguration
3+2267 [email protected] getTableConfiguration
3+2270 [email protected] getTableConfiguration
3+2281 shell@localhost scan
2+2282 shell@localhost scan:location
2+2282 [email protected] startScan
2+2282 [email protected] tablet read ahead 6
7+2285 [email protected] beginFateOperation
2+2293 [email protected] executeFateOperation
3+2297 [email protected] DeleteTable
1+2300 [email protected] DeleteTable
4+2413 [email protected] CleanUp
2+2415 [email protected] scan
1+2415 [email protected] scan:location
1+2415 [email protected] startScan
1+2415 [email protected] metadata tablets read ahead 6
20+2417 [email protected] CleanUp
2+2417 [email protected] batch scanner 555- 1
1+2417 [email protected] client:startMultiScan
1+2418 [email protected] startMultiScan
1+2418 [email protected] metadata tablets read ahead 7
1+2420 [email protected] scan
1+2420 [email protected] scan:location
1+2420 [email protected] startScan
1+2420 [email protected] metadata tablets read ahead 1
2+2421 [email protected] close
1+2421 [email protected] BinMutations 1
1+2421 [email protected] binMutations
1+2423 [email protected] scan
1+2423 [email protected] scan:location
1+2423 [email protected] startScan
1+2423 [email protected] metadata tablets read ahead 8
145+2296 [email protected] waitForFateOperation
1+2441 [email protected] finishFateOperation
{noformat}
In another run where the test did not fail:
{noformat}
Trace started at 2016/07/10 22:48:06.432
Time Start Service@Location Name
3066+0 shell@localhost shell:root
5+210 [email protected] beginFateOperation
4+222 [email protected] executeFateOperation
2+228 [email protected] CreateTable
2+230 [email protected] CreateTable
15+235 [email protected] SetupPermissions
1+252 [email protected] PopulateZookeeper
10+253 [email protected] PopulateZookeeper
2+266 [email protected] ChooseDir
70+227 [email protected] waitForFateOperation
2+298 [email protected] finishFateOperation
1511+306 shell@localhost close
9+306 shell@localhost BinMutations 1
5+306 shell@localhost binMutations
2+308 [email protected] startScan
1+308 [email protected] metadata tablets read ahead 5
6+1818 [email protected] getTableConfiguration
3+1825 [email protected] getTableConfiguration
4+1828 [email protected] getTableConfiguration
3+1833 [email protected] getTableConfiguration
1+1836 shell@localhost client:getUserAuthorizations
3+1845 shell@localhost scan
2+1846 shell@localhost scan:location
2+1846 [email protected] startScan
1+1847 [email protected] tablet read ahead 8
7+1849 [email protected] beginFateOperation
3+1856 [email protected] executeFateOperation
2+1860 [email protected] DeleteTable
1+1862 [email protected] DeleteTable
5+2027 [email protected] CleanUp
4+2028 [email protected] scan
4+2028 [email protected] scan:location
2+2029 [email protected] startScan
1+2030 [email protected] metadata tablets read ahead 4
24+2032 [email protected] CleanUp
3+2032 [email protected] batch scanner 560- 1
1+2032 [email protected] client:startMultiScan
1+2033 [email protected] startMultiScan
1+2033 [email protected] metadata tablets read ahead 5
2+2035 [email protected] scan
2+2035 [email protected] scan:location
1+2036 [email protected] startScan
1+2036 [email protected] metadata tablets read ahead 6
2+2037 [email protected] close
2+2039 [email protected] scan
2+2039 [email protected] scan:location
1+2040 [email protected] startScan
1+2040 [email protected] metadata tablets read ahead 7
200+1859 [email protected] waitForFateOperation
1+2060 [email protected] finishFateOperation
The following spans are not rooted (probably due to a parent span of length 0ms):
2+273 [email protected] PopulateMetadata
1+274 [email protected] update
1+274 [email protected] wal
5+278 [email protected] FinishCreateTable
1+2038 [email protected] sendMutations
1+2038 [email protected] update
1+2038 [email protected] org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
1+2038 [email protected] prep
{noformat}
Note that in the passing run the only sendMutations span comes from the Master
(and is unrooted for some reason), not from the BatchWriter as we'd expect. In
the failing run there is no sendMutations span at all.
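To make the comparison between the two traces easier to repeat, here is a rough sketch of a helper that splits shell trace output into rooted and unrooted span names. It is not code from ShellServerIT. The class name TraceSpanCheck, the method spanNames, and the svc@host location in the sample are all made up for illustration, and the line format (duration+start, service@location, span name) is only inferred from the output pasted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Accumulo.
public class TraceSpanCheck {
  // Assumed span line shape: "<duration>+<start> <service@location> <span name>"
  private static final Pattern SPAN_LINE =
      Pattern.compile("^\\s*(\\d+)\\+(\\d+)\\s+(\\S+)\\s+(.+?)\\s*$");
  private static final String UNROOTED_MARKER = "The following spans are not rooted";

  // Returns span names listed before (unrooted == false) or after
  // (unrooted == true) the "not rooted" marker line.
  public static List<String> spanNames(String traceOutput, boolean unrooted) {
    List<String> names = new ArrayList<>();
    boolean inUnrooted = false;
    for (String line : traceOutput.split("\\R")) {
      if (line.contains(UNROOTED_MARKER)) {
        inUnrooted = true;
        continue;
      }
      Matcher m = SPAN_LINE.matcher(line);
      // Header and wrapped continuation lines do not match and are skipped.
      if (m.matches() && inUnrooted == unrooted) {
        names.add(m.group(4));
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Sample lines; "svc@host" is a placeholder location, not real output.
    String sample = String.join("\n",
        "Time Start Service@Location Name",
        "1511+306 shell@localhost close",
        "9+306 shell@localhost BinMutations 1",
        "The following spans are not rooted (probably due to a parent span of length",
        "0ms):",
        "1+2038 svc@host sendMutations");
    System.out.println("rooted:   " + spanNames(sample, false));
    System.out.println("unrooted: " + spanNames(sample, true));
  }
}
```

On the passing trace above, a check like this would report sendMutations only in the unrooted list; on the failing trace it would not report it in either list.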
[~ShawnWalker], maybe this is related to your changes in ACCUMULO-4191? Do you
have any time to look into this?