[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-10-06 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16945301#comment-16945301
 ] 

Surendra Singh Lilhore edited comment on HDFS-14754 at 10/6/19 11:00 AM:
-

Tested it with and without the fix.

*With fix*
{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.686 s - in org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
{noformat}
*Without fix*
{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.605 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks
[ERROR] testProcessOverReplicatedAndRedudantBlock(org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks)  Time elapsed: 6.509 s  <<< FAILURE!
java.lang.AssertionError: expected:<5> but was:<4>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:834)
    at org.junit.Assert.assertEquals(Assert.java:645)
    at org.junit.Assert.assertEquals(Assert.java:631)
    at org.apache.hadoop.hdfs.server.namenode.TestRedudantBlocks.testProcessOverReplicatedAndRedudantBlock(TestRedudantBlocks.java:135)
{noformat}
 



> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14754-addendum.001.patch, 
> HDFS-14754-addendum.002.patch, HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch, 
> HDFS-14754.006.patch, HDFS-14754.007.patch, HDFS-14754.008.patch, 
> HDFS-14754.branch-3.1.patch
>
>
> Using EC RS-3-2 with 6 DNs.
> We came across a scenario in which, among the 5 blocks of an EC block group,
> the same block was replicated three times and two blocks went missing.
> The redundant replicas were not deleted, and the missing blocks could not be
> reconstructed.
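
To make the reported state concrete, here is a minimal, self-contained Java sketch. It is not HDFS code: the class `EcGroupState`, its methods, and the specific block indices are hypothetical and chosen only for illustration. Only the shape of the scenario (RS-3-2, so 5 internal blocks, one of them stored three times and two missing) comes from the report above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one erasure-coded block group; hypothetical, not HDFS code. */
final class EcGroupState {

    /** Counts how many stored copies exist for each internal block index. */
    private static int[] countCopies(int[] reported, int groupSize) {
        int[] copies = new int[groupSize];
        for (int idx : reported) {
            copies[idx]++;
        }
        return copies;
    }

    /** Returns the internal block indices that no DataNode reported. */
    static List<Integer> missingIndices(int[] reported, int groupSize) {
        int[] copies = countCopies(reported, groupSize);
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < groupSize; i++) {
            if (copies[i] == 0) {
                missing.add(i);
            }
        }
        return missing;
    }

    /** Returns the total number of copies beyond the first, over all indices. */
    static int redundantCopies(int[] reported, int groupSize) {
        int redundant = 0;
        for (int c : countCopies(reported, groupSize)) {
            if (c > 1) {
                redundant += c - 1;
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        int groupSize = 3 + 2;             // RS-3-2: 3 data + 2 parity blocks
        int[] reported = {0, 1, 2, 2, 2};  // assumed indices: index 2 stored three times
        System.out.println("missing=" + missingIndices(reported, groupSize));
        System.out.println("redundant=" + redundantCopies(reported, groupSize));
    }
}
```

In this toy state the raw number of stored replicas is 5, the same as the group size, even though only 3 distinct internal blocks exist; the sketch counts per index so that the 2 redundant copies and the 2 missing indices show up separately.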



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-30 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941376#comment-16941376
 ] 

Wei-Chiu Chuang edited comment on HDFS-14754 at 9/30/19 10:20 PM:
--

Too bad this one didn't land in 3.2.1 and 3.1.3. Let's make sure it gets added 
to lower releases.



> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-14754-addendum.001.patch, HDFS-14754.001.patch, 
> HDFS-14754.002.patch, HDFS-14754.003.patch, HDFS-14754.004.patch, 
> HDFS-14754.005.patch, HDFS-14754.006.patch, HDFS-14754.007.patch, 
> HDFS-14754.008.patch






[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-11 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928202#comment-16928202
 ] 

Surendra Singh Lilhore edited comment on HDFS-14754 at 9/12/19 5:06 AM:


[~hemanthboyina], please attach the patch again with proper comments in the test case.
{code:java}
// block report: NameNode updates blocksMap
cluster.triggerBlockReports();
// heartbeat: redundant block is added to invalidates
cluster.triggerHeartbeats();
// heartbeat: DataNode deletes the block
cluster.triggerHeartbeats();
// block report: NameNode updates blocksMap again
cluster.triggerBlockReports();
{code}
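
The following toy simulation is one way to read the four steps in the snippet above. It is a deliberately simplified sketch, not the `MiniDFSCluster` API and not the real NameNode/DataNode protocol: the class `ToyCluster`, its fields, and the rule that a heartbeat either schedules or delivers deletions are assumptions made to mirror the four comments.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the report/heartbeat sequence; hypothetical, not HDFS code. */
final class ToyCluster {
    final List<Integer> dnStored = new ArrayList<>();     // block indices held by DataNodes
    final List<Integer> nnBlocksMap = new ArrayList<>();  // NameNode's view of stored indices
    final List<Integer> invalidates = new ArrayList<>();  // extra copies scheduled for deletion

    ToyCluster(int... storedIndices) {
        for (int idx : storedIndices) {
            dnStored.add(idx);
        }
    }

    /** Block report: the NameNode's view is replaced by what the DataNodes hold. */
    void triggerBlockReports() {
        nnBlocksMap.clear();
        nnBlocksMap.addAll(dnStored);
    }

    /** Heartbeat: deliver pending deletions if any, otherwise schedule new ones. */
    void triggerHeartbeats() {
        if (!invalidates.isEmpty()) {
            for (Integer idx : invalidates) {
                dnStored.remove(idx);  // remove(Object): drops one stored copy
            }
            invalidates.clear();
            return;
        }
        List<Integer> seen = new ArrayList<>();
        for (Integer idx : nnBlocksMap) {
            if (seen.contains(idx)) {
                invalidates.add(idx);  // every copy after the first is redundant
            } else {
                seen.add(idx);
            }
        }
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(0, 1, 2, 2, 2); // assumed indices
        cluster.triggerBlockReports(); // NameNode learns about the duplicates
        cluster.triggerHeartbeats();   // duplicates are added to invalidates
        cluster.triggerHeartbeats();   // DataNodes delete the duplicates
        cluster.triggerBlockReports(); // NameNode's view reflects the deletion
        System.out.println(cluster.nnBlocksMap);
    }
}
```

Under these assumptions, the NameNode's view only drops the duplicates after the second block report, which is why each call appears twice in the sequence.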



> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch, 
> HDFS-14754.006.patch, HDFS-14754.007.patch






[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-08 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925157#comment-16925157
 ] 

Ayush Saxena edited comment on HDFS-14754 at 9/8/19 12:17 PM:
--

That seems like overkill to me. There is no need to prove that it also doesn't 
get corrected across multiple block reports (BRs). IMHO, just verifying that 
your newly introduced logic works should be enough.
Anyway, since there is only a single test, I don't think we need a separate 
test class for it.
It may be better to use an existing test class. Since the change is in 
BlockManager, you could try {{TestBlockManager}} or some other one.



> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch, 
> HDFS-14754.006.patch






[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-08 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925090#comment-16925090
 ] 

Surendra Singh Lilhore edited comment on HDFS-14754 at 9/8/19 7:01 AM:
---

[~ayushtkn]
{quote}HDFS-14699 seems to handle a quite similar scenario. Once that gets in, 
will this problem still be there? Will reconstruction still not happen without 
this change?
{quote}
The two issues are different. As I mentioned in my comment, HDFS-14699 handles 
the busy DN scenario, not the duplicate block scenario. I already checked this 
while reviewing.



> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch, 
> HDFS-14754.006.patch






[jira] [Comment Edited] (HDFS-14754) Erasure Coding : The number of Under-Replicated Blocks never reduced

2019-09-07 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924916#comment-16924916
 ] 

Ayush Saxena edited comment on HDFS-14754 at 9/7/19 4:02 PM:
-

Thanks [~hemanthboyina] for the patch and [~surendrasingh] for the review.
bq. Replicated block was not deleting and missing block is not able to ReConstruct.
HDFS-14699 seems to handle a quite similar scenario. Once that gets in, will 
this problem still be there? Will reconstruction still not happen without this 
change? [~hemanthboyina], can you check once?
Anyway, on a quick look at the UT: can you add some comments on what process 
you are actually trying to validate, for better understandability?

{code:java}
+cluster.triggerBlockReports();
+cluster.triggerHeartbeats();
+cluster.triggerHeartbeats();
+cluster.triggerBlockReports();
{code}

Can you add a line here explaining why each call is required twice?



was (Author: ayushtkn):
Thanks [~hemanthboyina] for the patch and [~surendrasingh] for the review.
bq. Replicated block was not deleting and missing block is not able to ReConstruct.
HDFS-14699 seems to handle a quite similar scenario. Once that gets in, will 
this problem still be there? Will reconstruction still not happen without this 
change? [~hemanthboyina], can you check once?
Anyway, on a quick look at the UT: can you add some comments on what process 
you are actually trying to validate, for better understandability?

{code:java}
+int i = 0;
+// one missing block
+for (; i < groupSize - 1; i++)
{code}

Any reason for declaring i outside the loop?


{code:java}
+cluster.triggerBlockReports();
+cluster.triggerHeartbeats();
+cluster.triggerHeartbeats();
+cluster.triggerBlockReports();
{code}

Can you add a line here explaining why each call is required twice?


> Erasure Coding :  The number of Under-Replicated Blocks never reduced
> -
>
> Key: HDFS-14754
> URL: https://issues.apache.org/jira/browse/HDFS-14754
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Critical
> Attachments: HDFS-14754.001.patch, HDFS-14754.002.patch, 
> HDFS-14754.003.patch, HDFS-14754.004.patch, HDFS-14754.005.patch, 
> HDFS-14754.006.patch


