[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375756#comment-15375756
 ] 

Staffan Friberg commented on HDFS-10620:


To avoid all allocation, create the StringBuilder only when debug logging is enabled:

{noformat}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
index 1a76e09..349b018 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -1319,7 +1319,8 @@ private void addToInvalidates(BlockInfo storedBlock) {
     if (!isPopulatingReplQueues()) {
       return;
     }
-    StringBuilder datanodes = new StringBuilder();
+    StringBuilder datanodes = blockLog.isDebugEnabled()
+        ? new StringBuilder() : null;
     for (DatanodeStorageInfo storage : blocksMap.getStorages(storedBlock)) {
       if (storage.getState() != State.NORMAL) {
         continue;
@@ -1328,10 +1329,12 @@ private void addToInvalidates(BlockInfo storedBlock) {
       final Block b = getBlockOnStorage(storedBlock, storage);
       if (b != null) {
         invalidateBlocks.add(b, node, false);
-        datanodes.append(node).append(" ");
+        if (datanodes != null) {
+          datanodes.append(node).append(" ");
+        }
       }
     }
-    if (datanodes.length() != 0) {
+    if (datanodes != null && datanodes.length() != 0) {
       blockLog.debug("BLOCK* addToInvalidates: {} {}", storedBlock, datanodes);
     }
   }
{noformat}
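
The same guard can be tried in isolation. The following is only a minimal sketch, not the BlockManager code: it assumes java.util.logging as a stand-in for blockLog, and the class name LazyLogBuilderSketch and the methods processNodes and quietLogger are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-alone illustration; java.util.logging replaces blockLog.
class LazyLogBuilderSketch {

  /** Returns the text that was logged, or null if no StringBuilder was used. */
  static String processNodes(Logger log, List<String> nodes) {
    // Allocate the builder only if the message could actually be logged.
    StringBuilder names = log.isLoggable(Level.FINE) ? new StringBuilder() : null;
    for (String node : nodes) {
      // ... the real per-node work would happen here regardless of logging ...
      if (names != null) {
        names.append(node).append(' ');
      }
    }
    if (names != null && names.length() != 0) {
      log.log(Level.FINE, "processed nodes: {0}", names);
      return names.toString();
    }
    return null;
  }

  /** Builds an anonymous logger at the given level that prints nothing to the console. */
  static Logger quietLogger(Level level) {
    Logger log = Logger.getAnonymousLogger();
    log.setUseParentHandlers(false);
    log.setLevel(level);
    return log;
  }

  public static void main(String[] args) {
    List<String> nodes = Arrays.asList("dn1", "dn2");
    System.out.println(processNodes(quietLogger(Level.INFO), nodes));
    System.out.println(processNodes(quietLogger(Level.FINE), nodes));
  }
}
```

The cost of the guard is the extra null checks, which is the trade-off raised in the issue description.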


> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.6.4
> Reporter: Staffan Friberg
> Attachments: HDFS-10620.001.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Fix Version/s: (was: 3.0.0-alpha1)




[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Attachment: HDFS-10620.001.patch




[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Fix Version/s: 3.0.0-alpha1
   Status: Patch Available  (was: Open)




[jira] [Created] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-10620:
--

 Summary: StringBuilder created and appended even if logging is 
disabled
 Key: HDFS-10620
 URL: https://issues.apache.org/jira/browse/HDFS-10620
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.4
Reporter: Staffan Friberg


In BlockManager.addToInvalidates the StringBuilder is appended to during the 
delete even if logging isn't active.

Could avoid allocating the StringBuilder as well, but not sure if it is really 
worth it to add null handling in the code.






[jira] [Resolved] (HDFS-10563) Block reports could be silently dropped by NN

2016-06-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg resolved HDFS-10563.

Resolution: Invalid

> Block reports could be silently dropped by NN
> -
>
> Key: HDFS-10563
> URL: https://issues.apache.org/jira/browse/HDFS-10563
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.8.0, 2.7.3
> Reporter: Staffan Friberg
> Priority: Critical
>
> Reading through the block reporting code, I think I've spotted a case where 
> block reports can be silently dropped, leaving a thread waiting indefinitely 
> on a FutureTask that will never be executed.
> The BlockReportProcessingThread.enqueue method doesn't return any status 
> indicating whether the task was enqueued successfully, and it does not handle 
> the case where the queue is full and offer returns false.
> Going back up the call stack, BlockManager.runBlockOp indirectly calls 
> enqueue with a FutureTask and then proceeds to call get() on the task.
> So if the internal queue in the BlockReportProcessingThread is full, the 
> BR would never be handled and the thread queuing the task would wait 
> indefinitely on the FutureTask that will never be executed.






[jira] [Commented] (HDFS-10563) Block reports could be silently dropped by NN

2016-06-23 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346510#comment-15346510
 ] 

Staffan Friberg commented on HDFS-10563:


Now I see it. Completely missed the !.




[jira] [Commented] (HDFS-10563) Block reports could be silently dropped by NN

2016-06-23 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346498#comment-15346498
 ] 

Staffan Friberg commented on HDFS-10563:


Unless I'm counting {} wrong it looks like the put is inside the if statement.




[jira] [Created] (HDFS-10563) Block reports could be silently dropped by NN

2016-06-22 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-10563:
--

 Summary: Block reports could be silently dropped by NN
 Key: HDFS-10563
 URL: https://issues.apache.org/jira/browse/HDFS-10563
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.8.0, 2.7.3
Reporter: Staffan Friberg


Reading through the block reporting code, I think I've spotted a case where 
block reports can be silently dropped, leaving a thread waiting indefinitely on 
a FutureTask that will never be executed.

The BlockReportProcessingThread.enqueue method doesn't return any status 
indicating whether the task was enqueued successfully, and it does not handle 
the case where the queue is full and offer returns false.

Going back up the call stack, BlockManager.runBlockOp indirectly calls enqueue 
with a FutureTask and then proceeds to call get() on the task.

So if the internal queue in the BlockReportProcessingThread is full, the BR 
would never be handled and the thread queuing the task would wait indefinitely 
on the FutureTask that will never be executed.
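
The suspected hazard can be reproduced outside of HDFS. The following is only a sketch under assumptions, not the NameNode code: the class FullQueueSketch and all of its methods are hypothetical, and a one-slot ArrayBlockingQueue stands in for the internal queue. The first enqueue variant ignores the result of offer, which is the behavior described above. The second falls back to a blocking put when offer fails, so the task is not lost.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; none of these names come from Hadoop.
class FullQueueSketch {

  /** Enqueue variant that ignores offer()'s return value. */
  static void enqueueIgnoringResult(BlockingQueue<Runnable> queue, Runnable task) {
    queue.offer(task); // returns false, without blocking, when the queue is full
  }

  /** Enqueue variant that falls back to a blocking put() when offer() fails. */
  static void enqueueWithFallback(BlockingQueue<Runnable> queue, Runnable task)
      throws InterruptedException {
    if (!queue.offer(task)) {
      queue.put(task); // waits until a consumer frees a slot
    }
  }

  /** True if a task dropped by a full queue is still not done after a short wait. */
  static boolean droppedTaskTimesOut() {
    BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);
    queue.offer(() -> { }); // fill the only slot; nothing consumes it
    FutureTask<String> task = new FutureTask<String>(() -> "processed");
    enqueueIgnoringResult(queue, task);
    try {
      task.get(100, TimeUnit.MILLISECONDS); // an untimed get() would wait forever
      return false;
    } catch (TimeoutException e) {
      return true;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Result of a task enqueued with the fallback while a consumer drains the queue. */
  static String fallbackTaskCompletes() {
    BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);
    queue.offer(() -> { }); // the queue starts out full
    Thread consumer = new Thread(() -> {
      try {
        while (true) {
          queue.take().run();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // stop draining when interrupted
      }
    });
    consumer.setDaemon(true);
    consumer.start();
    FutureTask<String> task = new FutureTask<String>(() -> "processed");
    try {
      enqueueWithFallback(queue, task);
      return task.get(5, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException | TimeoutException e) {
      throw new IllegalStateException(e);
    } finally {
      consumer.interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("dropped task timed out: " + droppedTaskTimesOut());
    System.out.println("fallback result: " + fallbackTaskCompletes());
  }
}
```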






[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-02-01 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.018.patch

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, namenode, performance
> Affects Versions: 2.7.1
> Reporter: Staffan Friberg
> Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, 
> HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> I would like to hear people's feedback on this change.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-29 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124362#comment-15124362
 ] 

Staffan Friberg commented on HDFS-9260:
---

Patch 17:
   Added the timeout as a tunable.
   Made the read-lock region smaller and optimized the fill ratio calculation so 
   that not all nodes need to be iterated (the last node still needs to be found).
   Updated the tunable names as per Colin's suggestion.



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-29 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.017.patch



[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-28 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122433#comment-15122433
 ] 

Staffan Friberg commented on HDFS-9260:
---

Thanks for the comments.

Patch 15 (and 16) should address all your comments.

I did not change the protected to private, as there is some direct access in 
the two subclasses.



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-28 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.016.patch



[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-28 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122460#comment-15122460
 ] 

Staffan Friberg commented on HDFS-9260:
---

Hi [~jingzhao],

Thank you for your comments! Updated with the patch (version 16).

1. Done, moved to context.
2. Done.
3. Done, removed.
4. I have started to look at this multiple times while working on the 
patch, but have so far failed to find a simple way to separate it. The remove 
methods are so deeply linked when removing a block that I can't really figure 
out a clean way to lift it out, and even if it were possible, it would in itself 
be a fairly large change, I believe. Let me know if you have any ideas.
5. Done, looking up directly in the map with a new Block(replicaID).
6. Done, removed.
7. The reason for duplicating it is basically to avoid having the NN allocate 4 
LinkedLists for each block that is reported in an IBR. Potentially, 
one could change the fullBR to not rely on lists and simply add/remove as it 
finds entries. Two issues need to be thought about for this: how should 
logging be handled, since some counting of the number of handled blocks is done 
as part of it, and is it better to have multiple loops with a smaller code 
footprint than to expand the already large one with even more code to handle 
each case directly? I agree with you that having two code paths is bad, but I 
think the reduction in allocation for IBRs could be worth it.
8. Done, I do the same checks as in removeLeft/Right.
9, 10. Good point. Is it required to hold the read lock around the loops, or 
would it be enough to hold it just around the innermost iteration that 
calculates the fragmentation for a storage? That would help reduce the time 
significantly for the first iteration. I need to think a bit more about the 
second part, an abort mechanism for when defragmentation is actually being done. 
What is an OK time limit? I saw 4ms being mentioned in HDFS-9198.
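
The idea in points 9 and 10 can be sketched roughly as follows. This is not code from the patch: the class PerStorageReadLockSketch, the methods fillRatio and scanWithBudget, and the representation of a storage as a {usedSlots, totalSlots} array are all hypothetical. The read lock is taken per storage rather than around the whole loop, and the scan stops once a time budget is used up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; none of these names come from the patch.
class PerStorageReadLockSketch {

  /** Fill ratio of a storage given as {usedSlots, totalSlots}. */
  static double fillRatio(long[] storage) {
    return storage[1] == 0 ? 1.0 : (double) storage[0] / storage[1];
  }

  /** Scans storages one by one and returns how many were scanned within the budget. */
  static int scanWithBudget(ReentrantReadWriteLock lock, List<long[]> storages,
      long budgetNanos) {
    final long start = System.nanoTime();
    int scanned = 0;
    for (long[] storage : storages) {
      if (System.nanoTime() - start >= budgetNanos) {
        break; // abort: the time budget is used up
      }
      lock.readLock().lock(); // held only for this one storage
      try {
        fillRatio(storage);
        scanned++;
      } finally {
        lock.readLock().unlock(); // writers can get in between storages
      }
    }
    return scanned;
  }

  public static void main(String[] args) {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    List<long[]> storages = Arrays.asList(new long[] {3, 4}, new long[] {1, 8});
    System.out.println(scanWithBudget(lock, storages, TimeUnit.MILLISECONDS.toNanos(4)));
  }
}
```

Releasing the lock between storages means the data can change between iterations, so the result is only a snapshot per storage.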



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-27 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.015.patch



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-25 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.014.patch



[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-25 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115356#comment-15115356
 ] 

Staffan Friberg commented on HDFS-9260:
---

Fixed checkstyle on TreeSet.

Should I convert the storages field to private? (The triplets field was protected.)



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-22 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.012.patch



[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-22 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.013.patch



[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-21 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110816#comment-15110816
 ] 

Staffan Friberg commented on HDFS-9260:
---

Thanks for the comments [~cmccabe]! I made all your suggested changes, except 
the two below, which needed some further discussion.

{quote}
{code}
 >   if (shouldPostponeBlocksFromFuture) {
 > // If the block is an out-of-date generation stamp or state,
 > // but we're the standby, we shouldn't treat it as corrupt,
 > // but instead just queue it for later processing.
 > // TODO: Pretty confident this should be s/storedBlock/block below,
 > // since we should be postponing the info of the reported block, not
 > // the stored block. See HDFS-6289 for more context.
 > queueReportedBlock(storageInfo, storedBlock, reportedState,
 > QUEUE_REASON_CORRUPT_STATE);
 >   } else {
{code}
If we're really confident that this should be "block" rather than 
"storedBlock", let's fix it.
{quote}

This comment is simply copied from the method "processAndHandleReportedBlock" 
in the same class and is not mine (it doesn't show up in the diff since I didn't 
edit that method). I kept it as part of the structure since I wanted to make 
sure the algorithm behaves in the same way. So it might be best to address it in 
a separate bug.


{quote}
{code}
   /**
    * Add a replica's meta information into the map
    *
    * @param bpid block pool id
    * @param replicaInfo a replica's meta information
-   * @return previous meta information of the replica
+   * @return true if inserted into the set
    * @throws IllegalArgumentException if the input parameter is null
    */
-  ReplicaInfo add(String bpid, ReplicaInfo replicaInfo) {
+  boolean add(String bpid, ReplicaInfo replicaInfo) {
{code}
I would like to see some clear comments in this function on what happens if 
there is already a copy of the replicaInfo in the ReplicaMap. I might be wrong, 
but based on my reading of TreeSet.java, it seems like the new entry won't be 
added, which is a behavior change from what we did earlier. Unless I'm missing 
something, this doesn't seem quite right since the new ReplicaInfo might have a 
different genstamp, etc.
{quote}
Yes, this is a change in behavior compared to earlier. I started down this path 
because add on a Set doesn't replace an existing element, which unfortunately 
doesn't match what put does in the Map API. I added a "replace" method to the 
class for the cases where replace behavior is needed, and went through the code 
to ensure the right method is called in each place. I'm not really happy about 
this choice; perhaps a cleaner way would be to have an addWithReplace method on 
the TreeSet and keep the old add behavior of the ReplicaMap. I believe that 
would reduce the size of the patch and only add one "ugly" method to the 
TreeSet.
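
The difference between the two contracts can be sketched with the JDK 
collections. This is only an illustration: it uses java.util.TreeSet and 
java.util.HashMap as stand-ins for the patch's own TreeSet and the ReplicaMap, 
and the Replica class, its fields, and the method names are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class AddVersusReplaceSketch {
  /** Hypothetical stand-in for a replica; ordering looks at blockId only. */
  static final class Replica {
    final long blockId;
    final long genStamp;

    Replica(long blockId, long genStamp) {
      this.blockId = blockId;
      this.genStamp = genStamp;
    }
  }

  static final Comparator<Replica> BY_ID =
      Comparator.comparingLong(r -> r.blockId);

  /** Set.add leaves the set unchanged when an equal element is present. */
  static long genStampAfterSetAdd() {
    TreeSet<Replica> set = new TreeSet<>(BY_ID);
    set.add(new Replica(1L, 100L));
    boolean inserted = set.add(new Replica(1L, 200L)); // false, old one kept
    return inserted ? -1L : set.first().genStamp;
  }

  /** Map.put overwrites the value stored under an existing key. */
  static long genStampAfterMapPut() {
    Map<Long, Replica> map = new HashMap<>();
    map.put(1L, new Replica(1L, 100L));
    map.put(1L, new Replica(1L, 200L));
    return map.get(1L).genStamp;
  }

  /** Explicit replace on a set: remove the equal element, then add. */
  static long genStampAfterReplace() {
    TreeSet<Replica> set = new TreeSet<>(BY_ID);
    set.add(new Replica(1L, 100L));
    Replica updated = new Replica(1L, 200L);
    set.remove(updated); // removes the element that compares equal
    set.add(updated);
    return set.first().genStamp;
  }

  public static void main(String[] args) {
    System.out.println("Set.add keeps genstamp " + genStampAfterSetAdd());
    System.out.println("Map.put keeps genstamp " + genStampAfterMapPut());
    System.out.println("replace keeps genstamp " + genStampAfterReplace());
  }
}
```

Callers that relied on the old map-style add to pick up a new genstamp would 
need the replace path, which is the case the review comment is concerned about.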

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-21 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.011.patch

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFSBenchmarks.zip, 
> HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-25 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15027676#comment-15027676
 ] 

Staffan Friberg commented on HDFS-9260:
---

Added a new benchmark that does incremental block reports (IBRs) on a 
blockmap/datastorageinfo that contains 2M entries and deletes/re-adds 20% of 
those entries. The updates are spread out over multiple IBRs, and each IBR 
contains between 50 and 350 changed blocks. The IntMapping version is again the 
patch from HDFS-6658.


{noformat}
Some further benchmarking of Incremental BR.

==> benchmarks_trunkMarch11_intMapping.jar.output <==

Benchmark                                  Mode  Cnt     Score    Error  Units
IncrementalBlockReport.receivedAndDeleted  avgt   50  3969.207 ± 14.979  ms/op

==> benchmarks_treeset_baseline.jar.output <==

Benchmark                                  Mode  Cnt     Score    Error  Units
IncrementalBlockReport.receivedAndDeleted  avgt   50   387.936 ± 25.634  ms/op

==> benchmarks_treeset.jar.output <==

Benchmark                                  Mode  Cnt     Score    Error  Units
IncrementalBlockReport.receivedAndDeleted  avgt   50  1205.779 ± 75.464  ms/op
{noformat}


> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFSBenchmarks.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-25 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFSBenchmarks2.zip






[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-25 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.010.patch

Avoid LinkedList allocations

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFSBenchmarks.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-04 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990758#comment-14990758
 ] 

Staffan Friberg commented on HDFS-9260:
---

Hi Daryn,

Thanks for the comments and the additional data points. It is interesting to 
learn more about the scale of HDFS instances. I wonder if the NN was running on 
older and slower hardware in my case compared to your setup; the cluster I was 
able to get my hands on for these runs has fairly old machines.

Adds of new blocks are relatively fast: since they will be at the far right of 
the tree, the number of lookups will be minimal. However, the current 
implementation only needs to do around two writes to insert something at the 
head/end of the list, and nothing with a more complicated data structure will 
be able to match that. It will be a question of trade-offs.
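
To make the "around two writes" point concrete, here is a minimal sketch of a 
head insert into an intrusive doubly linked list. The Node class and its fields 
are hypothetical and only stand in for the linked list the current 
implementation keeps per storage; this is not the actual HDFS code.

```java
class HeadInsertSketch {
  /** Hypothetical list node; the links live inside the element itself. */
  static final class Node {
    final long blockId;
    Node prev;
    Node next;

    Node(long blockId) {
      this.blockId = blockId;
    }
  }

  /** Inserts node in front of head and returns the new head. */
  static Node insertAtHead(Node head, Node node) {
    node.next = head;     // pointer write 1
    if (head != null) {
      head.prev = node;   // pointer write 2
    }
    return node;          // no search and no rebalancing is needed
  }

  /** Walks the list from head and counts the nodes. */
  static int length(Node head) {
    int n = 0;
    for (Node cur = head; cur != null; cur = cur.next) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    Node head = null;
    for (long id = 1; id <= 3; id++) {
      head = insertAtHead(head, new Node(id));
    }
    System.out.println("head=" + head.blockId + " length=" + length(head));
  }
}
```

A sorted structure has to locate the insertion point first, so even a 
well-tuned tree does more work per insert than this.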

Also, to clarify, the microbenchmarks only measure the actual remove and insert 
of random values, not the whole process of copying files etc. I would expect 
the other parts to far outweigh the time it takes to update the data 
structures, so while the 4x sounds scary it should be a minor part of the whole 
transaction.

I think the patch you are referring to is HDFS-6658. I applied it to the 3.0.0 
branch as of March 11, 2015, which is when the patch was created, and ran it 
on the same microbenchmarks I built to test my patch. I will attach the source 
code for the benchmarks so you can check that I used the right APIs for it to 
be comparable. From what I can tell, the benchmarks should do the same thing at 
a high level. The performance overhead for adding and removing is similar 
between our two implementations.

{noformat}
fbrAllExisting  - Do a Full Block Report with the same 2M entries that are
                  already registered for the Storage in the NN.
addRemoveBulk   - Remove 32k random blocks from a StorageInfo that has 64k
                  entries, then re-add them all.
addRemoveRandom - Remove and directly re-add a block from a Storage entry,
                  repeat for 32k blocks from a StorageInfo with 64k blocks.
iterate         - Iterate and get blockID for 64k blocks associated with a
                  particular StorageInfo.

==> benchmarks_trunkMarch11_intMapping.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  379.659 ± 5.463  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   16.426 ± 0.380  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   15.401 ± 0.196  ms/op
StorageInfoAccess.iterate          avgt   25    1.496 ± 0.004  ms/op

==> benchmarks_trunk_baseline.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  288.974 ± 3.970  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25    3.157 ± 0.046  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25    2.815 ± 0.012  ms/op
StorageInfoAccess.iterate          avgt   25    0.788 ± 0.006  ms/op

==> benchmarks_trunk_treeset.jar.output <==
Benchmark                          Mode  Cnt    Score   Error  Units
FullBlockReport.fbrAllExisting     avgt   25  231.270 ± 3.450  ms/op
StorageInfoAccess.addRemoveBulk    avgt   25   11.596 ± 0.521  ms/op
StorageInfoAccess.addRemoveRandom  avgt   25   11.249 ± 0.101  ms/op
StorageInfoAccess.iterate          avgt   25    0.385 ± 0.010  ms/op
{noformat}
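
The relative cost of each variant can be read off the results above by dividing 
its score by the corresponding trunk baseline score. A minimal sketch of that 
arithmetic follows; it uses only the scores reported above, and the class and 
method names are made up for illustration.

```java
class BenchmarkRatioSketch {
  /** Variant time relative to the baseline; above 1.0 means slower. */
  static double relative(double variantMsPerOp, double baselineMsPerOp) {
    return variantMsPerOp / baselineMsPerOp;
  }

  public static void main(String[] args) {
    String[] names = {
        "fbrAllExisting", "addRemoveBulk", "addRemoveRandom", "iterate"};
    // Scores in ms/op, copied from the three result tables.
    double[] baseline = {288.974, 3.157, 2.815, 0.788};
    double[] treeset = {231.270, 11.596, 11.249, 0.385};
    double[] intMapping = {379.659, 16.426, 15.401, 1.496};

    for (int i = 0; i < names.length; i++) {
      System.out.printf("%-16s treeset %.2fx  intMapping %.2fx%n",
          names[i],
          relative(treeset[i], baseline[i]),
          relative(intMapping[i], baseline[i]));
    }
  }
}
```

On these numbers the TreeSet variant takes roughly 0.8x the baseline time for 
the full block report and about half for iteration, and roughly 3.7x to 4.0x 
for the two add/remove cases, which is consistent with the "4x" mentioned 
above.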

Do you have a good suggestion for some other perf test/stress test that would 
be good to try out? Any stress load you have on your end that would be possible 
to try it out on?

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-04 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFSBenchmarks.zip

Microbenchmarks






[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-04 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Description: 
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change.





  was:
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.






> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.





[jira] [Commented] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-11-02 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985738#comment-14985738
 ] 

Staffan Friberg commented on HDFS-9350:
---

It does, but the problem is that the call to Long.toString will actually 
allocate a new String to be appended, and the call to getBlockName will 
sometimes do the same (unless it is correctly inlined). So you don't get a 
single append chain on the StringBuilder in this case.
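
A minimal sketch of the two styles, under stated assumptions: the class, the 
method names, and the "blk_" prefix are illustrative and not taken from the 
patch. The first method builds intermediate Strings for each long; the second 
appends the primitives straight into a caller-supplied StringBuilder through 
StringBuilder.append(long).

```java
class BlockNameSketch {
  static final String PREFIX = "blk_"; // assumed prefix, for illustration

  /** Creates temporary Strings for each long before concatenating. */
  static String withTemporaries(long blockId, long genStamp) {
    return PREFIX + Long.toString(blockId) + "_" + Long.toString(genStamp);
  }

  /** Appends into the given builder without explicit temporary Strings. */
  static StringBuilder appendTo(StringBuilder sb, long blockId,
      long genStamp) {
    return sb.append(PREFIX).append(blockId).append('_').append(genStamp);
  }

  public static void main(String[] args) {
    StringBuilder sb = new StringBuilder("replica ");
    appendTo(sb, 42L, 1001L);
    System.out.println(sb);
    System.out.println(withTemporaries(42L, 1001L));
  }
}
```

Both produce the same text; the second form lets a caller that already holds a 
StringBuilder keep a single append chain.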

> Avoid creating temprorary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Minor
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use StringBuilders directly to avoid creating temporary 
> strings of Long and Block name when doing toString on a Block.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-11-02 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986062#comment-14986062
 ] 

Staffan Friberg commented on HDFS-9260:
---

Hi Daryn,

Thanks for taking a look at the patch.

1. FBR and startup improve; please see the attached PDF.
2. I will need to check what we do here (and if I still have the old logs), but 
it doesn't feel like it should be affected.
3. We will be slightly slower on file deletes and removals. The current 
algorithm goes through the LightWeightGSet to first look up/remove each 
affected blockinfo, and after that removes it from the linked list. In my case 
it will be removed from the treeset, which requires a new lookup. However, 
while this is slower, I think the time that processing takes is far outweighed 
by the time it takes to delete or redistribute blocks on all DNs. Deleting 
files with a large number of blocks seems to take on the order of hours, since 
we only send small parts of the total block list to each node on every 
heartbeat. I'm not too familiar with how aggressive the redistribution is in 
the event of a DN decommission.
4. It will decrease as long as the TreeSet is kept above ~50% fill ratio, since 
the reference to each blockinfo is now a single pointer from the treeset 
instead of the doubly linked list.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9350:
--
Status: Patch Available  (was: Open)






[jira] [Updated] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9350:
--
Attachment: HDFS-9350.001.patch

> Avoid creating temprorary strings in Block.toString() and getBlockName()
> 
>
> Key: HDFS-9350
> URL: https://issues.apache.org/jira/browse/HDFS-9350
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Priority: Minor
> Attachments: HDFS-9350.001.patch
>
>
> Minor change to use StringBuilders directly to avoid creating temporary 
> strings of Long and Block name when doing toString on a Block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg reassigned HDFS-9350:
-

Assignee: Staffan Friberg






[jira] [Created] (HDFS-9350) Avoid creating temprorary strings in Block.toString() and getBlockName()

2015-10-30 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-9350:
-

 Summary: Avoid creating temprorary strings in Block.toString() and 
getBlockName()
 Key: HDFS-9350
 URL: https://issues.apache.org/jira/browse/HDFS-9350
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: performance
Affects Versions: 2.7.1
Reporter: Staffan Friberg
Priority: Minor


Minor change to use StringBuilders directly to avoid creating temporary strings 
of Long and Block name when doing toString on a Block.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-29 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981442#comment-14981442
 ] 

Staffan Friberg commented on HDFS-9260:
---

I have been running through the other failed tests without being able to 
reproduce them locally. I was able to reproduce the failed test in 
hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes once on trunk, but 
not yet with my branch. So it seems like these might be intermittent issues.






[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-27 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.009.patch

Fix for the timed-out test 
org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks.

During iteration, elements need to be removed through the iterator and not 
directly from the tree, to avoid a concurrent modification exception.
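
The same rule applies to the JDK collections, which makes for a compact 
illustration. The sketch below uses java.util.TreeSet as a stand-in for the 
patch's own TreeSet; the class and method names are hypothetical. Removing 
through the iterator is safe, while removing through the set in the middle of 
a for-each loop trips the fail-fast check on the next call to next().

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.TreeSet;

class IteratorRemoveSketch {
  static TreeSet<Integer> sample() {
    return new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5, 6));
  }

  /** Safe: the iterator performs the removal and stays consistent. */
  static TreeSet<Integer> removeEvensViaIterator() {
    TreeSet<Integer> set = sample();
    for (Iterator<Integer> it = set.iterator(); it.hasNext();) {
      if (it.next() % 2 == 0) {
        it.remove();
      }
    }
    return set;
  }

  /** Unsafe: modifying the set directly invalidates the live iterator. */
  static boolean removeViaSetThrows() {
    TreeSet<Integer> set = sample();
    try {
      for (Integer value : set) {
        if (value % 2 == 0) {
          set.remove(value);
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("via iterator: " + removeEvensViaIterator());
    System.out.println("via set throws: " + removeViaSetThrows());
  }
}
```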






[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-26 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-9260.008.patch

Now using the right bug name on the patch...
Fixed whitespace and findbugs warnings.

The remaining warnings should hopefully be OK:

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java:60:35:
 Variable 'storages' must be private and have accessor methods.
Same as triplets was before

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:204:16:
 Variable 'storageInfoMonitorThread' must be private and have accessor methods.
Same as replicationMonitor

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:1:
 File length is 4,427 lines (max allowed is 2,000).
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:2487:
 Comment matches to-do format 'TODO:'.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:2501:
 Comment matches to-do format 'TODO:'.
The file was already too long before this patch, and the TODOs are kept from 
the earlier version of diffReport

./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:19:
 Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:28:
 Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:35:
 Inner assignments should be avoided.
./hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/TreeSet.java:221:43:
 Inner assignments should be avoided.
I can change this to assign null on separate lines, but the current version in 
the clear method, which sets them all to null in one statement, is more compact

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-26 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.007.patch

Fixed comments, white spaces and most of the 80 width warnings

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9299:
--
Attachment: HDFS-9299.001.patch

> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Priority: Trivial
> Attachments: HDFS-9299.001.patch
>
>
> Currently the log output from the Replication Monitor is the class name, by 
> setting the name on the thread the output will be easier to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.





[jira] [Created] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-23 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-9299:
-

 Summary: Give ReplicationMonitor a readable thread name
 Key: HDFS-9299
 URL: https://issues.apache.org/jira/browse/HDFS-9299
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.7.1
Reporter: Staffan Friberg
Priority: Trivial


Currently the Replication Monitor's log output shows the class name as the 
thread name. Setting a name on the thread will make the output easier to read.

Current
2015-10-23 11:07:53,344 
[org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
 INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
ReplicationMonitor.


After
2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  blockmanagement.BlockManager 
(BlockManager.java:run(4125)) - Stopping ReplicationMonitor.
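
A minimal sketch of the idea, not the actual HDFS-9299 patch. It assumes the 
"Current" name comes from the runnable's default Object.toString() (the 
ClassName@hash form seen in the log line) and shows that giving the thread an 
explicit name yields the short form. ThreadNameSketch, its nested 
ReplicationMonitor and the named() helper are hypothetical stand-ins.

```java
class ThreadNameSketch {
    // Hypothetical stand-in for BlockManager's ReplicationMonitor runnable.
    static class ReplicationMonitor implements Runnable {
        @Override
        public void run() {
            // No-op; the real monitor would do its work here.
        }
    }

    // Build a thread for the runnable and give it the requested name.
    static Thread named(Runnable r, String name) {
        Thread t = new Thread(r);
        t.setName(name);
        return t;
    }

    public static void main(String[] args) {
        Runnable monitor = new ReplicationMonitor();
        // Assumed "before" behaviour: name derived from Object.toString().
        Thread before = named(monitor, monitor.toString());
        // "After": a short, fixed name that is easy to spot in logs.
        Thread after = named(monitor, "ReplicationMonitor");
        System.out.println("[" + before.getName() + "]");
        System.out.println("[" + after.getName() + "]");
    }
}
```

Log layouts that print the thread name pick up whatever Thread.getName() 
returns, so no logging configuration change is needed for this to take effect.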





[jira] [Assigned] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg reassigned HDFS-9299:
-

Assignee: Staffan Friberg

> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Trivial
> Attachments: HDFS-9299.001.patch
>
>
> Currently the log output from the Replication Monitor is the class name, by 
> setting the name on the thread the output will be easier to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.005.patch

Fixed the last TODOs.

Handles a new NN with old DNs (unsorted entries). This is inefficient since the 
NN needs to sort the entries. However, it should only be a problem during the 
upgrade cycle, and it is avoidable if the DNs are upgraded first.

Added a StorageInfoMonitor thread that can compact the TreeSet if the fill 
ratio gets too low.

Added test to check that unsorted entries are handled correctly.
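
The fallback for unsorted reports can be sketched roughly as follows. This is 
an illustration only, not code from the patch: it assumes a report can be 
viewed as a plain long[] of block IDs, and SortedReportSketch, isSorted and 
ensureSorted are hypothetical names.

```java
import java.util.Arrays;

class SortedReportSketch {
    // Returns true if ids are in non-decreasing order (single O(n) pass).
    static boolean isSorted(long[] ids) {
        for (int i = 1; i < ids.length; i++) {
            if (ids[i - 1] > ids[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns the ids in sorted order; a copy is sorted only when needed.
    static long[] ensureSorted(long[] ids) {
        if (isSorted(ids)) {
            return ids; // fast path: the sender already sorted the report
        }
        long[] copy = Arrays.copyOf(ids, ids.length);
        Arrays.sort(copy); // slow path: O(n log n) work on the receiver
        return copy;
    }

    public static void main(String[] args) {
        long[] fromOldNode = {42L, 7L, 19L};
        // prints [7, 19, 42]
        System.out.println(Arrays.toString(ensureSorted(fromOldNode)));
    }
}
```

The extra sort is only paid for reports that arrive unsorted, which matches the 
expectation that the cost disappears once all senders are upgraded.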

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9299:
--
Status: Patch Available  (was: Open)

> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Priority: Trivial
> Attachments: HDFS-9299.001.patch
>
>
> Currently the log output from the Replication Monitor is the class name, by 
> setting the name on the thread the output will be easier to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.006.patch

Merged and re-diffed due to a conflict.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Status: Patch Available  (was: Open)

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14971575#comment-14971575
 ] 

Staffan Friberg commented on HDFS-9260:
---

Also handles negative non-striped (EC) entries as efficiently as possible.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-21 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.004.patch

All HDFS tests should now pass.

Fixes include:

   Correctly handle EC blocks, which are negative and need to be masked
   The initial report might sometimes report into a storage that already 
     contains new blocks reported by an incremental block report; 
     addStoredLast will fall back on a regular add if not sorted
   Remove debugging output
   Remove unused GSet import
   The invalidate list must contain a Block and not a Replica

Left to do:

   Handle old nodes which don't send data sorted
   Add a 'sorted' field in the report PB
   Figure out how to handle reports when the cluster contains negative 
     entries that are not EC blocks
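
The addStoredLast fallback mentioned in the fixes above can be pictured with 
the following sketch. It is not the patch's implementation: it uses an 
ArrayList of block IDs instead of the patch's data structures, and 
SortedAppendSketch, addLast, add and contents are hypothetical names chosen for 
illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedAppendSketch {
    private final List<Long> ids = new ArrayList<>();

    // Hypothetical analogue of addStoredLast: append when the new id keeps
    // the list sorted, otherwise fall back to a regular sorted insert.
    // Returns true if the fast append path was taken.
    boolean addLast(long id) {
        if (ids.isEmpty() || ids.get(ids.size() - 1) < id) {
            ids.add(id);
            return true;
        }
        add(id);
        return false;
    }

    // Regular add: binary search for the insertion point, skip duplicates.
    void add(long id) {
        int pos = Collections.binarySearch(ids, id);
        if (pos < 0) {
            ids.add(-pos - 1, id);
        }
    }

    // Read-only view of the ids, always in ascending order.
    List<Long> contents() {
        return Collections.unmodifiableList(ids);
    }

    public static void main(String[] args) {
        SortedAppendSketch storage = new SortedAppendSketch();
        storage.addLast(10L);
        storage.addLast(20L);
        storage.addLast(15L); // out of order, takes the fallback path
        System.out.println(storage.contents()); // prints [10, 15, 20]
    }
}
```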


> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks are not on any storage, so no 
> replication can occur causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-21 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Description: 
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.





  was:
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seems to be some timing issues I hit when testing the patch, not sure if 
it is a bug in the patch or something else (most likely the earlier)...

Tests that fail for me:

   The issues seems to be that the blocks are not on any storage, so no 
replication can occur causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fails in this class.




> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-20 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965242#comment-14965242
 ] 

Staffan Friberg commented on HDFS-9260:
---

The TreeSet reduces the number of references compared to the doubly linked list 
currently built using the triplets data structure.

If we were to move things off the heap, the TreeSet, if still used, would 
contain memory addresses stored as longs rather than object references, which 
are trivial for the GC to handle (no need to scan the array). Potentially the 
BlockMap could work the same way: a large long array on heap that contains the 
memory addresses of the blockinfos that are off heap, with collisions handled 
by the off-heap blockinfos (linking in the same way they are now).
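
A rough sketch of what such a layout could look like, under several stated 
assumptions: a direct ByteBuffer stands in for off-heap memory, each record is 
just a block ID plus the offset of the next record in its chain, and buffer 
offsets play the role of memory addresses. OffHeapBlockMapSketch and all of 
its members are hypothetical and not part of HDFS.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class OffHeapBlockMapSketch {
    private static final int RECORD_SIZE = 16; // 8 bytes id + 8 bytes next offset
    private static final long NULL = -1L;      // marks an empty bucket / end of chain

    private final long[] buckets;     // on heap, primitives only: nothing to trace
    private final ByteBuffer records; // direct buffer standing in for off-heap memory
    private int nextFree = 0;         // offset of the next unused record slot

    OffHeapBlockMapSketch(int bucketCount, int capacity) {
        buckets = new long[bucketCount];
        Arrays.fill(buckets, NULL);
        records = ByteBuffer.allocateDirect(capacity * RECORD_SIZE);
    }

    private int bucketOf(long blockId) {
        // floorMod keeps the bucket index non-negative for negative ids.
        return (int) Math.floorMod(blockId, (long) buckets.length);
    }

    // Adds the id if absent; throws IndexOutOfBoundsException when full.
    void put(long blockId) {
        if (contains(blockId)) {
            return;
        }
        int bucket = bucketOf(blockId);
        int offset = nextFree;
        nextFree += RECORD_SIZE;
        records.putLong(offset, blockId);
        records.putLong(offset + 8, buckets[bucket]); // link to old chain head
        buckets[bucket] = offset;                     // new record becomes the head
    }

    // Walks the collision chain stored inside the records themselves.
    boolean contains(long blockId) {
        long offset = buckets[bucketOf(blockId)];
        while (offset != NULL) {
            if (records.getLong((int) offset) == blockId) {
                return true;
            }
            offset = records.getLong((int) offset + 8);
        }
        return false;
    }

    public static void main(String[] args) {
        OffHeapBlockMapSketch map = new OffHeapBlockMapSketch(4, 8);
        map.put(1L);
        map.put(5L); // same bucket as 1, chained through the record
        System.out.println(map.contains(5L) + " " + map.contains(2L));
    }
}
```

The only on-heap state is a long[] and a buffer handle, so the collector has no 
per-block references to follow regardless of how many entries are stored.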

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks are not on any storage, so no 
> replication can occur causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-19 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963610#comment-14963610
 ] 

Staffan Friberg commented on HDFS-9260:
---

Hi Walter,

1)
I think this part could reduce the need to go off heap. If we still see 
scalability issues and need to put the block map and blockinfo off heap, this 
could potentially serve as an idea for how to structure the data off heap, 
since even if the data is off heap, continuously updating references will be 
costly because we will invalidate the CPU cache. Potentially a version of the 
TreeSet holding primitives (blockinfo addresses) could be used for fast 
iteration, but I need to think a bit further about it.

2)
Interesting idea. I think the key point would be that you could do quick 
lookups directly on the serialized data, so you don't need to instantiate the 
whole map, which might be rather large. I'm not sure whether this is easily 
doable with ProtoBuf while still keeping the message as compact as possible.
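
To make point 2 concrete, here is a small sketch of a lookup performed directly 
on serialized data. It assumes the IDs are sorted and written as fixed-width 
8-byte values, which is an assumption for illustration and not how ProtoBuf 
encodes integers by default (variable-length encodings would not allow this 
kind of index arithmetic). SerializedLookupSketch, serialize and indexOf are 
hypothetical names.

```java
import java.nio.ByteBuffer;

class SerializedLookupSketch {
    // Serializes sorted ids as consecutive big-endian 8-byte values.
    static ByteBuffer serialize(long[] sortedIds) {
        ByteBuffer buf = ByteBuffer.allocate(sortedIds.length * Long.BYTES);
        for (long id : sortedIds) {
            buf.putLong(id);
        }
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }

    // Binary search on the buffer itself; returns the entry index or -1.
    // No long[] or map is materialized from the serialized form.
    static int indexOf(ByteBuffer buf, long id) {
        int lo = 0;
        int hi = buf.remaining() / Long.BYTES - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long value = buf.getLong(buf.position() + mid * Long.BYTES);
            if (value < id) {
                lo = mid + 1;
            } else if (value > id) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ByteBuffer report = serialize(new long[] {3L, 8L, 21L, 55L});
        System.out.println(indexOf(report, 21L)); // prints 2
        System.out.println(indexOf(report, 4L));  // prints -1
    }
}
```

The trade-off is the one raised above: fixed-width entries make random access 
cheap but give up the compactness of a variable-length encoding.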

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks are not on any storage, so no 
> replication can occur causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.002.patch

Merged with latest head

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks is not on any storage, so no 
> replication can occurs causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS Block and Replica Management 20151013.pdf

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks is not on any storage, so no 
> replication can occurs causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: (was: HDFS Block and Replica Management 20151013.pdf)

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks is not on any storage, so no 
> replication can occurs causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Description: 
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seems to be some timing issues I hit when testing the patch, not sure if 
it is a bug in the patch or something else (most likely the earlier)...

Tests that fail for me:

   The issues seems to be that the blocks are not on any storage, so no 
replication can occur causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fails in this class.



  was:
This patch changes the datastructures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear peoples feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seems to be some timing issues I hit when testing the patch, not sure if 
it is a bug in the patch or something else (most likely the earlier)...

Tests that fail for me:

   The issues seems to be that the blocks is not on any storage, so no 
replication can occurs causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fails in this class.




> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks are not on any storage, so no 
> replication can occur causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.003.patch

Add null check when creating iterator of storages

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.
> There seems to be some timing issues I hit when testing the patch, not sure 
> if it is a bug in the patch or something else (most likely the earlier)...
> Tests that fail for me:
>The issues seems to be that the blocks are not on any storage, so no 
> replication can occur causing the tests to fail in different ways.
>TestDecomission.testDecommision
>If I add a little sleep after the cleanup/delete things seem to work
>TestDFSStripedOutputStreamWithFailure
>A couple of tests fails in this class.





[jira] [Created] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-9260:
-

 Summary: Improve performance and GC friendliness of startup and 
FBRs
 Key: HDFS-9260
 URL: https://issues.apache.org/jira/browse/HDFS-9260
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, namenode, performance
Affects Versions: 2.7.1
Reporter: Staffan Friberg


This patch changes the data structures used for BlockInfos and Replicas to keep 
them sorted. This allows faster and more GC friendly handling of full block 
reports.

Would like to hear people's feedback on this change and also some help 
investigating/understanding a few outstanding issues if we are interested in 
moving forward with this.

There seem to be some timing issues I hit when testing the patch, not sure if 
it is a bug in the patch or something else (most likely the former)...

Tests that fail for me:

   The issue seems to be that the blocks are not on any storage, so no 
replication can occur, causing the tests to fail in different ways.

   TestDecomission.testDecommision
   If I add a little sleep after the cleanup/delete things seem to work
   TestDFSStripedOutputStreamWithFailure
   A couple of tests fail in this class.







[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-16 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.001.patch

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
> Attachments: HDFS-7435.001.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> I would like to hear people's feedback on this change, and would also 
> appreciate some help investigating/understanding a few outstanding issues if 
> we are interested in moving forward with this.
> There seem to be some timing issues that I hit when testing the patch; I am 
> not sure if it is a bug in the patch or something else (most likely the 
> former)...
> Tests that fail for me:
> The issue seems to be that the blocks are not on any storage, so no 
> replication can occur, causing the tests to fail in different ways.
> TestDecommission.testDecommission
> If I add a little sleep after the cleanup/delete, things seem to work.
> TestDFSStripedOutputStreamWithFailure
> A couple of tests fail in this class.





[jira] [Updated] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Status: Patch Available  (was: Open)

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Attachment: (was: ReplicaState.patch)

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Attachment: HADOOP-9221.001.patch

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Commented] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2015-10-09 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951127#comment-14951127
 ] 

Staffan Friberg commented on HDFS-9221:
---

Uploaded a new version of the patch, as the old one was accidentally using the 
SVN repository.

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Status: Open  (was: Patch Available)

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Commented] (HDFS-9201) Namenode Performance Improvement : Using for loop without iterator

2015-10-09 Thread Staffan Friberg (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951153#comment-14951153
 ] 

Staffan Friberg commented on HDFS-9201:
---

For Lists, you need to make sure you have an ArrayList; other lists might need 
to iterate to access each entry.
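
As a rough illustration of that caveat (not code from the draft patch), a loop 
can check for java.util.RandomAccess, which ArrayList implements and LinkedList 
does not, before switching to index-based access. The class IndexedLoopDemo 
and the method sum are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical sketch: use an indexed loop only when get(i) is cheap.
public class IndexedLoopDemo {

  static long sum(List<Integer> list) {
    long total = 0;
    if (list instanceof RandomAccess) {
      // Constant-time get(i), and no Iterator object is created.
      for (int i = 0, n = list.size(); i < n; i++) {
        total += list.get(i);
      }
    } else {
      // For a LinkedList, get(i) walks the list, so an indexed loop would be
      // quadratic. Fall back to the iterator-based for-each loop.
      for (int v : list) {
        total += v;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sum(new ArrayList<>(Arrays.asList(1, 2, 3))));
    System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3))));
  }
}
```

Both calls return the same result; only the access pattern differs.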

> Namenode Performance Improvement : Using for loop without iterator
> --
>
> Key: HDFS-9201
> URL: https://issues.apache.org/jira/browse/HDFS-9201
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: nijel
>Assignee: nijel
>  Labels: namenode, performance
> Attachments: HDFS-9201_draft.patch
>
>
> As discussed in HBASE-12023, the for-each loop syntax will create a few extra 
> objects and garbage.
> For arrays and Lists, we can change to the traditional syntax. 
> This can improve the memory footprint and can result in a performance gain.





[jira] [Created] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)
Staffan Friberg created HDFS-9221:
-

 Summary: Reduce allocation during block reports
 Key: HDFS-9221
 URL: https://issues.apache.org/jira/browse/HDFS-9221
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: performance
Reporter: Staffan Friberg


When the BufferDecoder in BlockListAsLongs converts the stored value to a 
ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
method creates a ReplicaState[] for each call since it calls 
ReplicaState.values().

This patch creates a cached version of the values and thus avoids all 
allocation when doing the conversion.
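
The idea can be sketched as follows. This is a minimal stand-alone example, 
not the attached patch: the enum below is a hypothetical stand-in for 
HdfsServerConstants.ReplicaState, its constants and integer values are 
illustrative, and the field name CACHED_VALUES is an assumption.

```java
// Hypothetical sketch: cache the values() array once instead of cloning it
// on every lookup.
public class ReplicaStateDemo {

  enum ReplicaState {
    FINALIZED(0), RBW(1), RWR(2), RUR(3), TEMPORARY(4);

    // values() returns a fresh copy of the constants array on each call, so
    // keep a single copy for lookups. Enum constants are initialized before
    // this static field, so the array is fully populated here.
    private static final ReplicaState[] CACHED_VALUES = values();

    private final int value;

    ReplicaState(int value) {
      this.value = value;
    }

    int getValue() {
      return value;
    }

    // Allocation-free conversion. It relies on each constant's value being
    // equal to its ordinal, as in this sketch.
    static ReplicaState getState(int v) {
      return CACHED_VALUES[v];
    }
  }

  public static void main(String[] args) {
    System.out.println(ReplicaState.getState(1));
    // Two calls to values() never return the same array instance.
    System.out.println(ReplicaState.values() == ReplicaState.values());
  }
}
```

The cached array must never be handed out to callers, since they could modify 
it; that is the reason values() copies in the first place.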





[jira] [Assigned] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg reassigned HDFS-9221:
-

Assignee: Staffan Friberg

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Status: Patch Available  (was: Open)

This is a small change, so it could hopefully go into 2.8.0.

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Target Version/s: 2.8.0

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Attachment: ReplicaState.patch

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Reporter: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Affects Version/s: 2.7.1

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.





[jira] [Updated] (HDFS-9221) Reduce allocation during block reports

2015-10-09 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9221:
--
Status: Open  (was: Patch Available)

> Reduce allocation during block reports
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: ReplicaState.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int). Unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.


