[
https://issues.apache.org/jira/browse/HDFS-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796438#comment-13796438
]
Junping Du commented on HDFS-4376:
----------------------------------
In the v3 patch:
- Fix a bug in the v2 patch.
- Identify a new and important race condition: bytesMoved.get() is not
synchronized. Although bytesMoved.inc() is already synchronized, the read in
get() must be synchronized as well, because the JVM is allowed to treat a read
or write of a non-volatile 64-bit long or double as two separate 32-bit
operations. Thus, unless get() is synchronized or the field is marked
volatile, a read that overlaps an update could return a torn value (half of
the old value mixed with half of the new one).
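
To illustrate the second point, here is a minimal sketch of a counter whose
read and write paths share one lock. It is not the code from the patch: the
class and method names (BytesMovedSketch, SynchronizedCounter, AtomicCounter,
concurrentTotal) are hypothetical, and the AtomicLong variant is only one
possible alternative to synchronizing get().

```java
import java.util.concurrent.atomic.AtomicLong;

class BytesMovedSketch {

    /** Hypothetical counter: both the write and the read hold the same lock. */
    static class SynchronizedCounter {
        private long value = 0L;

        synchronized void inc(long delta) {
            value += delta;
        }

        // Without "synchronized" here (or "volatile" on the field), the Java
        // memory model permits a reader to see a torn or stale 64-bit value.
        synchronized long get() {
            return value;
        }
    }

    /** Alternative sketch: AtomicLong gives atomic, visible reads and updates. */
    static class AtomicCounter {
        private final AtomicLong value = new AtomicLong();

        void inc(long delta) {
            value.addAndGet(delta);
        }

        long get() {
            return value.get();
        }
    }

    /** Runs several writer threads and returns the final counter value. */
    static long concurrentTotal(int threads, int incrementsPerThread, long delta) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.inc(delta);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // delta crosses the 32-bit boundary so both halves of the long change.
        long delta = (1L << 32) + 1;
        System.out.println(concurrentTotal(4, 1000, delta) == 4L * 1000 * delta);
    }
}
```

Because inc() is already synchronized, marking the field volatile would also
be enough to make get() safe; synchronizing get() simply keeps both paths
under the same lock.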
I have already run the test with the new patch for 100 iterations, and all passed!
> Intermittent timeout of TestBalancerWithNodeGroup
> -------------------------------------------------
>
> Key: HDFS-4376
> URL: https://issues.apache.org/jira/browse/HDFS-4376
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer, test
> Affects Versions: 2.0.3-alpha
> Reporter: Aaron T. Myers
> Assignee: Junping Du
> Attachments: BalancerTest-HDFS-4376-v1.tar.gz, HDFS-4376-v1.patch,
> HDFS-4376-v2.patch, HDFS-4376-v3.patch,
> test-balancer-with-node-group-timeout.txt
>
>
> HDFS-4261 fixed several issues with the balancer and balancer tests, and
> reduced the frequency with which TestBalancerWithNodeGroup times out. Despite
> this, occasional timeouts still occur in this test. This JIRA is to track and
> fix this problem.