[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated HDFS-12116:
------------------------------
    Target Version/s: 3.5.0  (was: 3.4.0)

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Zsolt Venczel
>            Priority: Major
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> HDFS-12116.03.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly
> higher in dist-test runs using CDH.
> In both _08 and _09 tests:
> # An attempt is made to put a replica in the {{TEMPORARY}} state, via
> {{waitForTempReplica}}.
> # Once that returns, the test goes on to verify that the block reports show
> the correct number of pending replication blocks.
> But there's a race condition. If the replica is replicated between steps #1
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how
> many replicas are replicated, hence failing the test.
> Failures are seen on both {{TestNNHandlesBlockReportPerStorage}} and
> {{TestNNHandlesCombinedBlockReport}}.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
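The race in the issue description can be illustrated with a small, deterministic toy model in Java. Nothing below is Hadoop code: `ToyNameNode`, `observedPending`, and the choice of two scheduled replications are assumptions made purely for illustration, and only the method name `getPendingReplicationBlocks` is borrowed from the description. The sketch shows how the value read in step 2 depends on how many replicas happen to finish in the window after step 1.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the race; hypothetical names, not the HDFS test code. */
final class PendingReplicationRaceSketch {

    /** Stand-in for the NameNode's pending-replication counter. */
    static final class ToyNameNode {
        private final AtomicInteger pending = new AtomicInteger();

        void scheduleReplication() { pending.incrementAndGet(); }

        void replicaFinalized() { pending.decrementAndGet(); }

        int getPendingReplicationBlocks() { return pending.get(); }
    }

    /**
     * Replays the two test steps. finalizedInWindow is the number of
     * replicas that complete between step 1 and step 2; it must not
     * exceed scheduled.
     */
    static int observedPending(int scheduled, int finalizedInWindow) {
        ToyNameNode nn = new ToyNameNode();
        for (int i = 0; i < scheduled; i++) {
            nn.scheduleReplication();
        }
        // Step 1: the wait for a TEMPORARY replica has just returned.
        // Race window: background replication may finish right here.
        for (int i = 0; i < finalizedInWindow; i++) {
            nn.replicaFinalized();
        }
        // Step 2: the test reads the counter and asserts on it.
        return nn.getPendingReplicationBlocks();
    }

    public static void main(String[] args) {
        int scheduled = 2; // assumed value, chosen only for the demo
        for (int done = 0; done <= scheduled; done++) {
            System.out.println("finalized in window=" + done
                + " -> pending=" + observedPending(scheduled, done));
        }
    }
}
```

In this model an assertion on a fixed pending count only holds when nothing finishes inside the window, which is the intermittent behaviour the report describes.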
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brahma Reddy Battula updated HDFS-12116:
----------------------------------------
    Target Version/s: 3.4.0  (was: 3.3.0)

Bulk update: moved all 3.3.0 non-blocker issues, please move back if it is a blocker.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Zsolt Venczel
>            Priority: Major
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> HDFS-12116.03.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Target Version/s: 3.0.0  (was: 2.9.0)

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> HDFS-12116.03.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun Suresh updated HDFS-12116:
-------------------------------

Is this still on target for 2.9.0? If not, can we push this out to the next major release?

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> HDFS-12116.03.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Attachment: HDFS-12116.03.patch

Will run dist-test again with patch 3 just in case.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> HDFS-12116.03.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Description:
This seems to be long-standing, but the failure rate (~10%) is slightly higher in dist-test runs using CDH.

In both _08 and _09 tests:
# An attempt is made to put a replica in the {{TEMPORARY}} state, via {{waitForTempReplica}}.
# Once that returns, the test goes on to verify that the block reports show the correct number of pending replication blocks.

But there's a race condition. If the replica is replicated between steps #1 and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how many replicas are replicated, hence failing the test.

Failures are seen on both {{TestNNHandlesBlockReportPerStorage}} and {{TestNNHandlesCombinedBlockReport}}.

  was:
This seems to be long-standing, but the failure rate (~10%) is slightly higher in dist-test runs using CDH.

In both _08 and _09 tests:
# An attempt is made to put a replica in the {{TEMPORARY}} state, via {{waitForTempReplica}}.
# Once that returns, the test goes on to verify that the block reports show the correct number of pending replication blocks.

But there's a race condition. If the replica is replicated between steps #1 and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how many replicas are replicated, hence failing the test.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Attachment: HDFS-12116.02.patch

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Status: Patch Available  (was: Open)

Ran the patch with dist-test 100 times and did not see any test failures.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Attachment: HDFS-12116.01.patch

Attaching a patch to fix the test.
The intervals have to be updated on the fly, because we need IBRs to trigger the replica to be {{TEMPORARY}}, but later we don't want the IBRs to update when that replica is actually replicated.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments: HDFS-12116.01.patch,
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
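The idea of changing the report interval on the fly can be sketched with a toy reporter driven by a simulated clock. This is not the content of the attached patch, which is not reproduced in this thread: `ToyReporter`, `setIntervalMs`, `tick`, and the 100 ms interval are hypothetical names and values chosen for illustration. The sketch only shows the general pattern of reporting frequently during a first phase and then raising the interval so that later state changes are no longer reported.

```java
/** Toy model of switching a report interval mid-test; not Hadoop code. */
final class IbrIntervalToggleSketch {

    /** Hypothetical reporter that sends a report once its interval has elapsed. */
    static final class ToyReporter {
        private long intervalMs;
        private long lastSentMs = 0;
        private int sent = 0;

        ToyReporter(long intervalMs) { this.intervalMs = intervalMs; }

        void setIntervalMs(long intervalMs) { this.intervalMs = intervalMs; }

        /** Called on each simulated clock tick. */
        void tick(long nowMs) {
            if (nowMs - lastSentMs >= intervalMs) {
                sent++;
                lastSentMs = nowMs;
            }
        }

        int reportsSent() { return sent; }
    }

    /** Returns {reports sent in phase 1, reports sent in phase 2}. */
    static int[] run() {
        ToyReporter reporter = new ToyReporter(100);
        long now = 0;

        // Phase 1: short interval, so reports flow while the test waits
        // for the replica to show up as TEMPORARY.
        for (int i = 0; i < 5; i++) {
            now += 100;
            reporter.tick(now);
        }
        int phase1 = reporter.reportsSent();

        // Phase 2: interval raised, so a later completion of the replica
        // is not reported before the test makes its assertion.
        reporter.setIntervalMs(Long.MAX_VALUE);
        for (int i = 0; i < 5; i++) {
            now += 100;
            reporter.tick(now);
        }
        int phase2 = reporter.reportsSent() - phase1;

        return new int[] {phase1, phase2};
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("phase 1 reports=" + counts[0]
            + ", phase 2 reports=" + counts[1]);
    }
}
```

A fixed interval could not serve both phases in this model: a short one keeps reporting after the switch point, and a long one never produces the reports needed in the first phase.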
[jira] [Updated] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
[ https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-12116:
-----------------------------
    Attachment: TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml

Attaching a failure log.

> BlockReportTestBase#blockReport_08 and #blockReport_08 intermittently fail
> --------------------------------------------------------------------------
>
>                 Key: HDFS-12116
>                 URL: https://issues.apache.org/jira/browse/HDFS-12116
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 0.22.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>         Attachments:
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml