[
https://issues.apache.org/jira/browse/HDFS-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720511#comment-16720511
]
Ayush Saxena edited comment on HDFS-14145 at 12/13/18 8:02 PM:
---------------------------------------------------------------
[~elgoiri]
In the test, at:
{code:java}
bsmAttemptedItems.add(0L, 0L, 0L, blocksMap, 0);
{code}
two operations are performed:
(1)
{code:java}
synchronized (storageMovementAttemptedItems) {
  storageMovementAttemptedItems.add(itemInfo);
}
{code}
and (2)
{code:java}
synchronized (scheduledBlkLocs) {
scheduledBlkLocs.putAll(assignedBlocks);
}
{code}
In parallel, the BlocksStorageMovementAttemptMonitor thread is running. In its
blocksStorageMovementUnReportedItemsCheck() it does:
{code:java}
synchronized (storageMovementAttemptedItems) {
  Iterator<AttemptedItemInfo> iter = storageMovementAttemptedItems
      .iterator();
  // ...
  blockStorageMovementNeeded.add(candidate);
  iter.remove(); // takes the item out of the same list that (1) added to
  // ...
}
{code}
If this thread runs just after (1) is performed, it removes the item we added
and puts it in blockStorageMovementNeeded. That is why, when we check the
assertion:
{code:java}
assertEquals("Item doesn't exist in the attempted list", 1,
bsmAttemptedItems.getAttemptedItemsCount());
{code}
We get 0 instead of 1.
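The interleaving above can be sketched in isolation. This is only a minimal model, not the Hadoop code: the class and method names are hypothetical, the lists hold plain Long values, and the monitor pass is assumed to treat every item as a retry candidate.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the attempted-items bookkeeping; not the Hadoop class.
final class AttemptedItemsRaceSketch {
  private final List<Long> attemptedItems = new ArrayList<>();
  private final List<Long> movementNeeded = new ArrayList<>();

  // Models step (1): add under the list's own lock.
  void add(long item) {
    synchronized (attemptedItems) {
      attemptedItems.add(item);
    }
  }

  // Models one pass of the monitor check. Assumption: every item is
  // considered a candidate, so each one is moved to the retry list.
  void monitorPass() {
    synchronized (attemptedItems) {
      Iterator<Long> iter = attemptedItems.iterator();
      while (iter.hasNext()) {
        Long candidate = iter.next();
        movementNeeded.add(candidate);
        iter.remove();
      }
    }
  }

  int attemptedCount() {
    synchronized (attemptedItems) {
      return attemptedItems.size();
    }
  }

  int neededCount() {
    synchronized (attemptedItems) {
      return movementNeeded.size();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    AttemptedItemsRaceSketch sketch = new AttemptedItemsRaceSketch();
    sketch.add(0L);                       // step (1) has completed
    Thread monitor = new Thread(sketch::monitorPass);
    monitor.start();                      // monitor pass lands right after (1)
    monitor.join();
    // The item has moved from the attempted list to the retry list.
    System.out.println("attempted=" + sketch.attemptedCount()
        + " needed=" + sketch.neededCount());
  }
}
```

In this model the count of attempted items is 0 once the monitor pass has run, which matches the failing assertion.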
The monitor thread only comes back after an interval of 1 minute. To avoid the
race, I added a sleep before we add into storageMovementAttemptedItems, so that
the thread completes its first round and doesn't interfere with our add.
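The idea of letting the monitor finish its first round before adding can be sketched as below. Note that this sketch swaps the fixed sleep for a CountDownLatch so that it is deterministic; that is an assumption for illustration, not what the patch does. All names are hypothetical, and the monitor pass is simplified to clearing the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical model: wait for the monitor's first pass, then add.
final class WaitForFirstPassSketch {
  private final List<Long> attemptedItems = new ArrayList<>();
  private final CountDownLatch firstPassDone = new CountDownLatch(1);
  private Thread monitor;

  // Starts a monitor that runs one pass, then sleeps for the given interval.
  void start(long intervalMillis) {
    monitor = new Thread(() -> {
      try {
        while (true) {
          synchronized (attemptedItems) {
            // Stand-in for moving every item to the retry queue.
            attemptedItems.clear();
          }
          firstPassDone.countDown(); // no-op after the first pass
          Thread.sleep(intervalMillis);
        }
      } catch (InterruptedException e) {
        // stop() was called; let the thread exit.
      }
    });
    monitor.setDaemon(true);
    monitor.start();
  }

  void awaitFirstPass() throws InterruptedException {
    firstPassDone.await();
  }

  void add(long item) {
    synchronized (attemptedItems) {
      attemptedItems.add(item);
    }
  }

  int attemptedCount() {
    synchronized (attemptedItems) {
      return attemptedItems.size();
    }
  }

  void stop() throws InterruptedException {
    monitor.interrupt();
    monitor.join();
  }

  public static void main(String[] args) throws InterruptedException {
    WaitForFirstPassSketch sketch = new WaitForFirstPassSketch();
    sketch.start(60_000L);    // 1 minute between passes, as in the comment
    sketch.awaitFirstPass();  // the first pass ran on an empty list
    sketch.add(0L);           // the next pass is about a minute away
    System.out.println("attempted=" + sketch.attemptedCount());
    sketch.stop();
  }
}
```

Because the add happens after the first pass and well before the second, the item is still in the list when the count is read.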
If waiting doesn't seem acceptable, we can instead remove:
{code:java}
bsmAttemptedItems.start();
{code}
Then the monitor thread, which is the one interfering, is never started.
I observed this in the test just above, testAddReportedMoveAttemptFinishedBlocks():
it does something similar and doesn't have this call.
> TestBlockStorageMovementAttemptedItems#testNoBlockMovementAttemptFinishedReportAdded
> fails sporadically in Trunk
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-14145
> URL: https://issues.apache.org/jira/browse/HDFS-14145
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14145-01.patch
>
>
> Reference :
> https://builds.apache.org/job/PreCommit-HDFS-Build/25739/testReport/junit/org.apache.hadoop.hdfs.server.namenode.sps/TestBlockStorageMovementAttemptedItems/testNoBlockMovementAttemptFinishedReportAdded/
> https://builds.apache.org/job/PreCommit-HDFS-Build/25746/testReport/junit/org.apache.hadoop.hdfs.server.namenode.sps/TestBlockStorageMovementAttemptedItems/testNoBlockMovementAttemptFinishedReportAdded/
> https://builds.apache.org/job/PreCommit-HDFS-Build/25768/testReport/junit/org.apache.hadoop.hdfs.server.namenode.sps/TestBlockStorageMovementAttemptedItems/testNoBlockMovementAttemptFinishedReportAdded/