[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2019-06-18 Thread Wei-Chiu Chuang (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866316#comment-16866316 ]

Wei-Chiu Chuang commented on HDFS-13770:


Test failure doesn't apply to me. I updated the 004 patch to address the 
checkstyle warning (adding a "." to keep it happy) and uploaded the result as 
the 005 patch. 

+1 on the 005 patch; I will commit it now.

> dfsadmin -report does not always decrease "missing blocks (with replication 
> factor 1)" metrics when file is deleted
> ---
>
> Key: HDFS-13770
> URL: https://issues.apache.org/jira/browse/HDFS-13770
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.7.7
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Major
> Attachments: HDFS-13770-branch-2-005.patch, 
> HDFS-13770-branch-2.001.patch, HDFS-13770-branch-2.002.patch, 
> HDFS-13770-branch-2.003.patch, HDFS-13770-branch-2.004.patch
>
>
> The "Missing blocks (with replication factor 1)" metric is not always 
> decreased when a file is deleted.
> When a file is deleted, the remove function of UnderReplicatedBlocks can be 
> called with the wrong priority (UnderReplicatedBlocks.LEVEL). In that case the 
> block is still removed from the priority queue that contains it, but the 
> corruptReplOneBlocks metric is not decreased.
> The corresponding code:
> {code:java}
> /** remove a block from a under replication queue */
> synchronized boolean remove(BlockInfo block,
>                             int oldReplicas,
>                             int oldReadOnlyReplicas,
>                             int decommissionedReplicas,
>                             int oldExpectedReplicas) {
>   final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
>       decommissionedReplicas, oldExpectedReplicas);
>   boolean removedBlock = remove(block, priLevel);
>   if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
>       oldExpectedReplicas == 1 &&
>       removedBlock) {
>     corruptReplOneBlocks--;
>     assert corruptReplOneBlocks >= 0 :
>         "Number of corrupt blocks with replication factor 1 " +
>         "should be non-negative";
>   }
>   return removedBlock;
> }
>
> /**
>  * Remove a block from the under replication queues.
>  *
>  * The priLevel parameter is a hint of which queue to query
>  * first: if negative or >= {@link #LEVEL} this shortcutting
>  * is not attempted.
>  *
>  * If the block is not found in the nominated queue, an attempt is made to
>  * remove it from all queues.
>  *
>  * Warning: This is not a synchronized method.
>  * @param block block to remove
>  * @param priLevel expected privilege level
>  * @return true if the block was found and removed from one of the priority
>  *         queues
>  */
> boolean remove(BlockInfo block, int priLevel) {
>   if (priLevel >= 0 && priLevel < LEVEL
>       && priorityQueues.get(priLevel).remove(block)) {
>     NameNode.blockStateChangeLog.debug(
>         "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
>         " from priority queue {}", block, priLevel);
>     return true;
>   } else {
>     // Try to remove the block from all queues if the block was
>     // not found in the queue for the given priority level.
>     for (int i = 0; i < LEVEL; i++) {
>       if (i != priLevel && priorityQueues.get(i).remove(block)) {
>         NameNode.blockStateChangeLog.debug(
>             "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
>             " {} from priority queue {}", block, i);
>         return true;
>       }
>     }
>   }
>   return false;
> }
> {code}
> It is already fixed on trunk by HDFS-10999, but that ticket introduces new 
> metrics, which I think shouldn't be backported to branch-2.
>  
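The behaviour described in the issue can be illustrated with a small, self-contained sketch. This is not the content of any attached patch: the class name `ReplOneQueuesSketch`, the use of plain strings as blocks, the three-argument `remove` signature and the constant values are all assumptions made for illustration. The idea it shows is to decrement the counter based on the queue the block was actually removed from, rather than on the priority hint passed by the caller.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified stand-in for UnderReplicatedBlocks. */
class ReplOneQueuesSketch {
  // Illustrative values; the real constants live in UnderReplicatedBlocks.
  static final int LEVEL = 5;
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;

  private final List<Set<String>> priorityQueues = new ArrayList<>();
  private int corruptReplOneBlocks = 0;

  ReplOneQueuesSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Add a block to the given queue and track corrupt blocks with replication 1. */
  synchronized boolean add(String block, int priLevel, int expectedReplicas) {
    boolean added = priorityQueues.get(priLevel).add(block);
    if (added && priLevel == QUEUE_WITH_CORRUPT_BLOCKS && expectedReplicas == 1) {
      corruptReplOneBlocks++;
    }
    return added;
  }

  /** Returns the index of the queue the block was removed from, or -1. */
  private int removeAndFindLevel(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return priLevel;
    }
    // The hint was wrong or out of range: search the remaining queues.
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return i;
      }
    }
    return -1;
  }

  /** Remove a block; the counter follows the queue that really held it. */
  synchronized boolean remove(String block, int priLevel, int oldExpectedReplicas) {
    int actualLevel = removeAndFindLevel(block, priLevel);
    if (actualLevel == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
    }
    return actualLevel >= 0;
  }

  synchronized int getCorruptReplOneBlocks() {
    return corruptReplOneBlocks;
  }
}
```

With this shape, calling `remove(block, LEVEL, 1)` for a block that sits in the corrupt queue still brings the counter back down, which is the case the issue reports as broken.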



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2019-06-17 Thread Hadoop QA (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866236#comment-16866236 ]

Hadoop QA commented on HDFS-13770:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 56s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 54s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  3s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 15s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} branch-2 passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 52s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 43 unchanged - 2 fixed = 44 total (was 45) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 48s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m  7s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:da67579 |
| JIRA Issue | HDFS-13770 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12972035/HDFS-13770-branch-2.004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 734e70f632d9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provi

[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2019-06-17 Thread Wei-Chiu Chuang (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16866188#comment-16866188 ]

Wei-Chiu Chuang commented on HDFS-13770:


+1 The patch still applies. Updated timeout to 60 seconds and uploaded patch to 
 trigger precommit.


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-08-14 Thread Xiao Chen (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580319#comment-16580319 ]

Xiao Chen commented on HDFS-13770:
--

Thanks Kitti for the new rev and Zsolt for reviewing!

+1 on patch 3 pending 1 final thing:

Sorry I didn't make it clear: in general, the test timeout is there to prevent 
a stuck test from blocking the Jenkins job. But because the Jenkins slaves can 
be slow, the timeout should be conservative so that we don't get false 
negatives. So I suggest we bump the timeout to 60 seconds.
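In JUnit 4 this kind of limit is normally written as `@Test(timeout = 60000)`, with the value in milliseconds. The sketch below shows the same idea using only the Java standard library, so that it runs without a test framework; the class and method names (`TimeoutGuardSketch`, `finishesWithin`) are hypothetical and not part of the patch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: guards a task with a wall-clock limit. */
class TimeoutGuardSketch {
  /**
   * Runs the task on a separate thread.
   * @return true if it finished within the limit, false if it was cut off
   */
  static boolean finishesWithin(Callable<?> task, long timeoutMillis) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<?> future = executor.submit(task);
      try {
        future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        // A stuck task is interrupted instead of blocking the caller forever.
        future.cancel(true);
        return false;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("task failed", e.getCause());
      }
    } finally {
      executor.shutdownNow();
    }
  }
}
```

A generous limit such as 60 seconds costs nothing when the task is quick, because the call returns as soon as the task completes; the limit only matters when something hangs.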

 

Since branch-2's pre-commit is pretty much broken... could you clarify what 
tests you have run for the latest patch?


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-08-13 Thread Zsolt Venczel (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578169#comment-16578169 ]

Zsolt Venczel commented on HDFS-13770:
--

Thanks for the update [~knanasi], patch v003 looks good, +1 (non-binding) from 
me.


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-08-13 Thread Kitti Nanasi (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16578073#comment-16578073 ]

Kitti Nanasi commented on HDFS-13770:
-

Thanks [~zvenczel] for the findings! I fixed them in patch v003.


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-08-10 Thread Zsolt Venczel (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576460#comment-16576460 ]

Zsolt Venczel commented on HDFS-13770:
--

Hi [~knanasi],

Thanks for working on this and for the latest patch. Your changes seem fine to 
me.

I've checked that the test fails without the fix and passes with the fix 
applied.

I found a few checkstyle issues in UnderReplicatedBlocks line 265 and 
TestUnderReplicatedBlocks lines 164, 175 and 186.

Best regards,
Zsolt


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-27 Thread genericqa (JIRA)



[ https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559547#comment-16559547 ]

genericqa commented on HDFS-13770:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 17m  6s{color} | {color:red} Docker failed to build yetus/hadoop:f667ef1. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13770 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933340/HDFS-13770-branch-2.002.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24667/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> dfsadmin -report does not always decrease "missing blocks (with replication 
> factor 1)" metrics when file is deleted
> ---
>
> Key: HDFS-13770
> URL: https://issues.apache.org/jira/browse/HDFS-13770
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.7.7
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13770-branch-2.001.patch, 
> HDFS-13770-branch-2.002.patch
>
>
> Missing blocks (with replication factor 1) metric is not always decreased 
> when file is deleted.
> If a file is deleted, the remove function of UnderReplicatedBlocks can be 
> called with the wrong priority (UnderReplicatedBlocks.LEVEL), if it is called 
> with the wrong priority the corruptReplOneBlocks metric is not decreased, 
> however the block is removed from the priority queue which contains it.
> The corresponding code:
> {code:java}
> /** remove a block from a under replication queue */
> synchronized boolean remove(BlockInfo block,
>  int oldReplicas,
>  int oldReadOnlyReplicas,
>  int decommissionedReplicas,
>  int oldExpectedReplicas) {
>  final int priLevel = getPriority(oldReplicas, oldReadOnlyReplicas,
>  decommissionedReplicas, oldExpectedReplicas);
>  boolean removedBlock = remove(block, priLevel);
>  if (priLevel == QUEUE_WITH_CORRUPT_BLOCKS &&
>  oldExpectedReplicas == 1 &&
>  removedBlock) {
>  corruptReplOneBlocks--;
>  assert corruptReplOneBlocks >= 0 :
>  "Number of corrupt blocks with replication factor 1 " +
>  "should be non-negative";
>  }
>  return removedBlock;
> }
> /**
>  * Remove a block from the under replication queues.
>  *
>  * The priLevel parameter is a hint of which queue to query
>  * first: if negative or >= \{@link #LEVEL} this shortcutting
>  * is not attmpted.
>  *
>  * If the block is not found in the nominated queue, an attempt is made to
>  * remove it from all queues.
>  *
>  * Warning: This is not a synchronized method.
>  * @param block block to remove
>  * @param priLevel expected privilege level
>  * @return true if the block was found and removed from one of the priority 
> queues
>  */
> boolean remove(BlockInfo block, int priLevel) {
>  if(priLevel >= 0 && priLevel < LEVEL
>  && priorityQueues.get(priLevel).remove(block)) {
>  NameNode.blockStateChangeLog.debug(
>  "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block {}" +
>  " from priority queue {}", block, priLevel);
>  return true;
>  } else {
>  // Try to remove the block from all queues if the block was
>  // not found in the queue for the given priority level.
>  for (int i = 0; i < LEVEL; i++) {
>  if (i != priLevel && priorityQueues.get(i).remove(block)) {
>  NameNode.blockStateChangeLog.debug(
>  "BLOCK* NameSystem.UnderReplicationBlock.remove: Removing block" +
>  " {} from priority queue {}", block, i);
>  return true;
>  }
>  }
>  }
>  return false;
> }
> {code}
> It is already fixed on trunk by this jira: HDFS-10999, but that ticket 
> introduces new metrics, which I think shouldn't be backported to branch-2.
>  
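
To make the failure mode in the quoted description concrete, here is a minimal sketch, not the Hadoop code itself. It models the queues as plain sets of block names. The class name, the helper addCorruptReplOne, and the values LEVEL = 5 and QUEUE_WITH_CORRUPT_BLOCKS = 4 are assumptions made for illustration; only the shape of the two-argument remove follows the quoted code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the queues; identifiers mirror the quoted code, the rest is assumed. */
class UnderReplicatedBlocksModel {
  static final int LEVEL = 5;                      // assumed number of queues
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;  // assumed index of the corrupt queue

  final List<Set<String>> priorityQueues = new ArrayList<>();
  int corruptReplOneBlocks = 0;

  UnderReplicatedBlocksModel() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Hypothetical helper: queue a corrupt block whose file has replication 1. */
  void addCorruptReplOne(String block) {
    if (priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).add(block)) {
      corruptReplOneBlocks++;
    }
  }

  /** Same shape as the quoted two-argument remove: priLevel is only a hint. */
  boolean remove(String block, int priLevel) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      return true;
    }
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        return true; // removed, but the counter is never told which queue it was
      }
    }
    return false;
  }

  public static void main(String[] args) {
    UnderReplicatedBlocksModel m = new UnderReplicatedBlocksModel();
    m.addCorruptReplOne("blk_1");
    // The delete path passes LEVEL as the hint, so the fallback loop does the removal.
    m.remove("blk_1", LEVEL);
    System.out.println("corrupt queue empty: "
        + m.priorityQueues.get(QUEUE_WITH_CORRUPT_BLOCKS).isEmpty());
    System.out.println("corruptReplOneBlocks: " + m.corruptReplOneBlocks);
  }
}
```

In this model the block leaves the corrupt queue while the counter stays at 1, which is the stale "missing blocks (with replication factor 1)" value the report describes.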



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-27 Thread Kitti Nanasi (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16559534#comment-16559534
 ] 

Kitti Nanasi commented on HDFS-13770:
-

Thanks for the comments [~xiaochen]! I fixed them in patch v002 and added some 
sanity tests to make sure that the counter is not incremented/decremented in 
cases when the condition does not apply.


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-26 Thread Xiao Chen (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558883#comment-16558883
 ] 

Xiao Chen commented on HDFS-13770:
--

Thanks Kitti for identifying this and providing a fix! Patch looks pretty good, 
some minor comments:
- We can extract a method {{decrementBlockStat}} in 
{{UnderReplicatedBlocks#remove}} for less duplication.
- We can tidy up the new 3-param {{remove}}: make it private, and point its 
javadoc to the 2-param one. Something like:
{code}* For details, see {@link #remove(BlockInfo, int)}  {code} and 
explain the difference only (i.e. how oldExpectedReplicas is used).
- Original javadoc had a typo: s/attmpted/attempted/g.
- Test should have a timeout.
- Do you think it's helpful to add a few other sanity tests in the same test 
case? For example, an oldExpectedReplicas of 2 doesn't trigger a counter decrease. 
From the code it's pretty clear, so this is really just adding some extra coverage. 
Up to you. :)
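
For readers without the patch at hand, the suggestions above could look roughly like the following. This is a sketch under assumptions, not the contents of the attached patch: the queues are modelled as sets of block names, the values of LEVEL and QUEUE_WITH_CORRUPT_BLOCKS are assumed, and the add method and getter are hypothetical helpers. The three-argument remove is left package-private here only so that it can be called from outside the class in a test.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative model only; not the UnderReplicatedBlocks class from the patch. */
class ReplOneCounterSketch {
  static final int LEVEL = 5;                      // assumed number of queues
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;  // assumed index of the corrupt queue

  private final List<Set<String>> priorityQueues = new ArrayList<>();
  private int corruptReplOneBlocks = 0;

  ReplOneCounterSketch() {
    for (int i = 0; i < LEVEL; i++) {
      priorityQueues.add(new HashSet<>());
    }
  }

  /** Hypothetical helper to populate a queue and keep the counter in step. */
  void add(String block, int priLevel, int expectedReplicas) {
    if (priorityQueues.get(priLevel).add(block)
        && priLevel == QUEUE_WITH_CORRUPT_BLOCKS && expectedReplicas == 1) {
      corruptReplOneBlocks++;
    }
  }

  /**
   * Searches the hinted queue first, then every other queue.
   * The difference from a hint-only remove: oldExpectedReplicas and the queue
   * the block was actually found in decide whether the counter is decreased.
   */
  boolean remove(String block, int priLevel, int oldExpectedReplicas) {
    if (priLevel >= 0 && priLevel < LEVEL
        && priorityQueues.get(priLevel).remove(block)) {
      decrementBlockStat(priLevel, oldExpectedReplicas);
      return true;
    }
    for (int i = 0; i < LEVEL; i++) {
      if (i != priLevel && priorityQueues.get(i).remove(block)) {
        decrementBlockStat(i, oldExpectedReplicas);
        return true;
      }
    }
    return false;
  }

  /** Single place where the replication-factor-1 counter is decreased. */
  private void decrementBlockStat(int queue, int oldExpectedReplicas) {
    if (queue == QUEUE_WITH_CORRUPT_BLOCKS && oldExpectedReplicas == 1) {
      corruptReplOneBlocks--;
      assert corruptReplOneBlocks >= 0
          : "Number of corrupt blocks with replication factor 1 should be non-negative";
    }
  }

  int getCorruptReplOneBlocks() {
    return corruptReplOneBlocks;
  }
}
```

A sanity check in the spirit of the last bullet: removing a corrupt block with oldExpectedReplicas of 2 leaves the counter alone, while removing one with oldExpectedReplicas of 1 decreases it even when the hint is LEVEL.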


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-26 Thread Kitti Nanasi (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558601#comment-16558601
 ] 

Kitti Nanasi commented on HDFS-13770:
-

[~jojochuang], I ran the same tests and 3.x does not have this bug.


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-26 Thread Wei-Chiu Chuang (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558509#comment-16558509
 ] 

Wei-Chiu Chuang commented on HDFS-13770:


Thanks [~knanasi], really good finding!
HDFS-10999 pertains to erasure coding, so there is no way to backport it to 
branch-2.

That said, because HDFS-10999 is a huge internal refactor, have you run the 
same test (without some modification) and verified the bug is not reproducible 
in 3.x?


[jira] [Commented] (HDFS-13770) dfsadmin -report does not always decrease "missing blocks (with replication factor 1)" metrics when file is deleted

2018-07-26 Thread genericqa (JIRA)



[ 
https://issues.apache.org/jira/browse/HDFS-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558483#comment-16558483
 ] 

genericqa commented on HDFS-13770:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 21m 6s | Docker failed to build yetus/hadoop:f667ef1. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13770 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12933221/HDFS-13770-branch-2.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24661/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |


This message was automatically generated.



