[jira] [Updated] (HDFS-7663) Erasure Coding: Append on striped file

2015-09-29 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-7663:

Summary: Erasure Coding: Append on striped file  (was: Erasure Coding: 
lease recovery / append on striped file)

> Erasure Coding: Append on striped file
> --
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2015-09-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934747#comment-14934747
 ] 

Rakesh R commented on HDFS-9172:


Great, that looks fine to me.

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>
> The idea of this jira is to move the striped stream related classes to the 
> {{hadoop-hdfs-client}} project. This will help keep them in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934768#comment-14934768
 ] 

Uma Maheswara Rao G commented on HDFS-8859:
---

Thanks Yi,  +1 on the latest patch

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar to 
> {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}; the fixed size avoids full GC since there is no need to 
> resize).  We should also be able to get an element through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 
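A minimal sketch of the proposed structure, with illustrative names (not the actual patch): the element itself carries the hash-chain reference, so there is no per-entry Entry object or boxed Long key, and the bucket array can resize, unlike {{LightWeightGSet}}:

{code}
// Illustrative sketch only: "Element" stands in for ReplicaInfo; the single
// "next" reference is the +4 bytes counted above. Duplicate detection and
// removal are omitted for brevity.
class LightWeightResizableSet {
  static class Element {
    final long blockId;   // the key, already part of the element
    Element next;         // hash-chain link embedded in the element
    Element(long blockId) { this.blockId = blockId; }
  }

  private Element[] buckets = new Element[16];
  private int size;

  private int index(long blockId) {
    return (int) (blockId ^ (blockId >>> 32)) & (buckets.length - 1);
  }

  Element get(long blockId) {
    for (Element e = buckets[index(blockId)]; e != null; e = e.next) {
      if (e.blockId == blockId) {
        return e;
      }
    }
    return null;
  }

  void put(Element elem) {
    if (size >= buckets.length * 3 / 4) {
      resize();   // grows gradually instead of using one fixed big array
    }
    int i = index(elem.blockId);
    elem.next = buckets[i];
    buckets[i] = elem;
    size++;
  }

  private void resize() {
    Element[] old = buckets;
    buckets = new Element[old.length << 1];
    for (Element e : old) {
      while (e != null) {
        Element next = e.next;
        int i = index(e.blockId);
        e.next = buckets[i];
        buckets[i] = e;
        e = next;
      }
    }
  }
}
{code}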



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2015-09-29 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-8449:

Attachment: HDFS-8449-002.patch

Reduced the number of metrics from 3 to 2. The count of failed tasks can be 
calculated from the total count and the successful count.
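As a hedged sketch of what the two-counter scheme could look like (class and method names here are illustrative, not the attached patch):

{code}
import java.util.concurrent.atomic.AtomicLong;

// Sketch: record only total and successful task counts; derive failed.
class ECWorkerTaskCounters {
  private final AtomicLong totalTasks = new AtomicLong();
  private final AtomicLong successfulTasks = new AtomicLong();

  void taskCompleted(boolean success) {
    totalTasks.incrementAndGet();
    if (success) {
      successfulTasks.incrementAndGet();
    }
  }

  long getFailedTasks() {
    // Not stored as a third metric: failed = total - successful.
    return totalTasks.get() - successfulTasks.get();
  }
}
{code}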

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch
>
>
> This sub-task tries to record the EC recovery tasks that a datanode has done, 
> including total tasks, failed tasks and successful tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8632) Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes

2015-09-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934693#comment-14934693
 ] 

Rakesh R commented on HDFS-8632:


[~zhz], any more comments? Would you please review the latest patch again. 
Thank you!

> Erasure Coding: Add InterfaceAudience annotation to the erasure coding classes
> --
>
> Key: HDFS-8632
> URL: https://issues.apache.org/jira/browse/HDFS-8632
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8632-HDFS-7285-00.patch, 
> HDFS-8632-HDFS-7285-01.patch, HDFS-8632-HDFS-7285-02.patch, 
> HDFS-8632-HDFS-7285-03.patch, HDFS-8632-HDFS-7285-04.patch
>
>
> I've noticed some of the erasure coding classes are missing the 
> {{@InterfaceAudience}} annotation. It would be good to identify these classes 
> and add the proper annotation.
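For reference, the annotation in question is applied like this (the class name here is only an example):

{code}
import org.apache.hadoop.classification.InterfaceAudience;

// Marks the class as internal to the Hadoop project rather than public API.
@InterfaceAudience.Private
public class SomeErasureCodingClass {
}
{code}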



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934689#comment-14934689
 ] 

Yi Liu edited comment on HDFS-9053 at 9/29/15 6:18 AM:
---

[~jingzhao], you have given a great review, thanks a lot; it hits the two 
places where I had considered carefully how to do things better.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
I think we can use {{addOrReplace}} directly, since there should be an 
existing INode with that name.  To keep the original behavior, I will remove 
the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.




was (Author: hitliuyi):
[~jingzhao], you have given a good review; it hits the two places where I had 
considered carefully how to do things better. Thanks.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
I think we can use {{addOrReplace}} directly, since there should be an 
existing INode with that name.  To keep the original behavior, I will remove 
the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.



> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch
>
>
> This is a long standing issue; we have been trying to improve it in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. For insert/delete/search the time 
> complexity is O(log n), but insertion/deletion causes re-allocations and 
> copies of big arrays, so the operations are costly.  For example, if the 
> children grow to 1M in size, the ArrayList will resize to > 1M capacity, 
> which needs > 1M * 4 bytes = 4 MB of contiguous heap memory; this easily 
> causes full GC in an HDFS cluster where namenode heap memory is already 
> highly used.  I recap the 3 main issues:
> # Insertion/deletion 

[jira] [Updated] (HDFS-8941) DistributedFileSystem listCorruptFileBlocks API should resolve relative path

2015-09-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8941:
---
Attachment: HDFS-8941-03.patch

Rebased the patch and attached it to the jira, as HDFS-8740 moved the 
{{DistributedFileSystem.java}} class to the {{hadoop-hdfs-client}} module.

> DistributedFileSystem listCorruptFileBlocks API should resolve relative path
> 
>
> Key: HDFS-8941
> URL: https://issues.apache.org/jira/browse/HDFS-8941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8941-00.patch, HDFS-8941-01.patch, 
> HDFS-8941-02.patch, HDFS-8941-03.patch
>
>
> Presently the {{DFS#listCorruptFileBlocks(path)}} API does not resolve the 
> given path relative to the workingDir. This jira is to discuss and provide 
> an implementation for that.
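A sketch of the likely shape of the fix, assuming the usual {{FileSystem#fixRelativePart}} pattern used by other {{DistributedFileSystem}} methods (hedged; not necessarily the attached patch):

{code}
@Override
public RemoteIterator<Path> listCorruptFileBlocks(Path path) throws IOException {
  // Resolve the user-supplied path against the working directory before use.
  return new CorruptFileBlockIterator(dfs, fixRelativePart(path));
}
{code}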



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934836#comment-14934836
 ] 

Yi Liu commented on HDFS-8859:
--

Thanks Uma, since the new patch only removes the unused import, based on the 
above Jenkins result, I will commit it shortly.

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar to 
> {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}; the fixed size avoids full GC since there is no need to 
> resize).  We should also be able to get an element through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9092) Nfs silently drops overlapping write requests and causes data copying to fail

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934677#comment-14934677
 ] 

Hudson commented on HDFS-9092:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #432 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/432/])
HDFS-9092. Nfs silently drops overlapping write requests and causes data 
copying to fail. Contributed by Yongjun Zhang. (yzhang: rev 
151fca5032719e561226ef278e002739073c23ec)
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteCtx.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OpenFileCtx.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/OffsetRange.java
* 
hadoop-hdfs-project/hadoop-hdfs-nfs/src/test/java/org/apache/hadoop/hdfs/nfs/nfs3/TestWrites.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Nfs silently drops overlapping write requests and causes data copying to fail
> -
>
> Key: HDFS-9092
> URL: https://issues.apache.org/jira/browse/HDFS-9092
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nfs
>Affects Versions: 2.7.1
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-9092.001.patch, HDFS-9092.002.patch
>
>
> When NOT using the 'sync' option, NFS writes may issue the following warning:
> org.apache.hadoop.hdfs.nfs.nfs3.OpenFileCtx: Got an overlapping write 
> (1248751616, 1249677312), nextOffset=1248752400. Silently drop it now
> and the size of the data copied via NFS will stay at 1248752400.
> What happens is:
> 1. The write requests from the client are sent asynchronously. 
> 2. The NFS gateway has a handler that handles each incoming request by 
> creating an internal write request structure and putting it into a cache;
> 3. In parallel, a separate thread in the NFS gateway takes requests out of 
> the cache and writes the data to HDFS.
> The current offset is how much data has been written by the write thread in 
> step 3. The detection of an overlapping write request happens in step 2, but 
> it only checks the write request against the current offset, and trims the 
> request if necessary. Because the write requests are sent asynchronously, if 
> two requests are beyond the current offset and they overlap, the overlap is 
> not detected and both are put into the cache. This causes the symptom 
> reported in this case at step 3.
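An illustrative sketch of the missing check (hypothetical names, not the actual patch): a request that starts beyond the current offset must also be compared against the other pending cached writes, not just against nextOffset:

{code}
import java.util.Map;
import java.util.TreeMap;

// Sketch: track pending (cached but unwritten) ranges and detect overlap
// among them before caching a new request.
class PendingWrites {
  // start offset -> end offset (exclusive) of cached write requests
  private final TreeMap<Long, Long> pending = new TreeMap<Long, Long>();

  boolean overlapsPending(long start, long end) {
    // Nearest pending write starting at or before this one.
    Map.Entry<Long, Long> floor = pending.floorEntry(start);
    if (floor != null && floor.getValue() > start) {
      return true;
    }
    // Any pending write starting inside [start, end)?
    Long ceiling = pending.ceilingKey(start);
    return ceiling != null && ceiling < end;
  }

  void add(long start, long end) {
    pending.put(start, end);
  }
}
{code}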



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2015-09-29 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934740#comment-14934740
 ] 

Zhe Zhang commented on HDFS-9172:
-

Thanks for initiating the work, Rakesh. I'm doing a 'git merge' with trunk now 
and will likely make the proposed change in this JIRA. I will push the result 
of the merge to my personal github repo before pushing to upstream. It would 
be great to have your advice there.

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>
> The idea of this jira is to move the striped stream related classes to the 
> {{hadoop-hdfs-client}} project. This will help keep them in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9170) Move libhdfs / fuse-dfs / libwebhdfs to a separate module

2015-09-29 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934782#comment-14934782
 ] 

Colin Patrick McCabe commented on HDFS-9170:


I don't have any objections to moving libhdfs, fuse-dfs, and libwebhdfs to a 
separate hadoop-hdfs-native-client module.  However, if the unit tests move 
with them, they will still depend on MiniDFSCluster in a different module 
(i.e., the hadoop-hdfs module).  So I'm not sure how creating a separate 
module is better than moving everything into hadoop-hdfs-client.  Perhaps I'm 
misunderstanding something.

> Move libhdfs / fuse-dfs / libwebhdfs to a separate module
> -
>
> Key: HDFS-9170
> URL: https://issues.apache.org/jira/browse/HDFS-9170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> After HDFS-6200, the Java implementation of hdfs-client has been moved to a 
> separate hadoop-hdfs-client module.
> libhdfs, fuse-dfs and libwebhdfs still reside in the hadoop-hdfs module. 
> Ideally these modules should reside in hadoop-hdfs-client. However, to 
> write unit tests for these components, it is often necessary to run 
> MiniDFSCluster, which resides in the hadoop-hdfs module.
> This jira is to discuss how these native modules should be laid out after 
> HDFS-6200.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934826#comment-14934826
 ] 

Hadoop QA commented on HDFS-1172:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 26s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   9m  6s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 51s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 28s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 37s | The applied patch generated  1 
new checkstyle issues (total was 201, now 201). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 40s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 36s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 170m 51s | Tests failed in hadoop-hdfs. |
| | | 223m 11s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestDNFencing |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764169/HDFS-1172.010.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12728/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12728/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12728/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12728/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12728/console |


This message was automatically generated.

> Blocks in newly completed files are considered under-replicated too quickly
> ---
>
> Key: HDFS-1172
> URL: https://issues.apache.org/jira/browse/HDFS-1172
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Todd Lipcon
> Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, 
> HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.patch, hdfs-1172.txt, 
> hdfs-1172.txt, replicateBlocksFUC.patch, replicateBlocksFUC1.patch, 
> replicateBlocksFUC1.patch
>
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't 
> find an existing JIRA. It often happens that we see the NN schedule 
> replication on the last block of files very quickly after they're completed, 
> before the other DNs in the pipeline have a chance to report the new block. 
> This results in a lot of extra replication work on the cluster, as we 
> replicate the block and then end up with multiple excess replicas which are 
> very quickly deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8859:
-
Attachment: HDFS-8859.006.patch

Updated the patch to remove an unnecessary import.

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar to 
> {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}; the fixed size avoids full GC since there is no need to 
> resize).  We should also be able to get an element through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934697#comment-14934697
 ] 

Yi Liu commented on HDFS-8859:
--

Thanks Uma.  
There is an unused import; I will remove it in the new version of the patch.
{quote}
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java:69:29:
 Variable 'entries' must be private and have accessor methods.
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java:71:17:
 Variable 'hash_mask' must be private and have accessor methods.
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java:73:17:
 Variable 'size' must be private and have accessor methods.
./hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java:77:17:
 Variable 'modification' must be private and have accessor methods.
{quote}
Making the variables of a super class 'protected' and modifying them in 
subclasses is natural behavior; I don't know why checkstyle reports that we 
should make them private and access them through methods.  We always access 
the protected variables of the super class directly in other Hadoop code.  
So I will leave these checkstyle items. 

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details are:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas 
> in memory.  The key is the block id of the block replica, which is already 
> included in {{ReplicaInfo}}, so this memory can be saved.  Also, each HashMap 
> Entry has an object overhead.  We can implement a lightweight set similar to 
> {{LightWeightGSet}}, but not of fixed size ({{LightWeightGSet}} uses a fixed 
> size for the entries array, usually a big value; an example is 
> {{BlocksMap}}; the fixed size avoids full GC since there is no need to 
> resize).  We should also be able to get an element through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9173) Erasure Coding: Lease recovery for striped file

2015-09-29 Thread Walter Su (JIRA)
Walter Su created HDFS-9173:
---

 Summary: Erasure Coding: Lease recovery for striped file
 Key: HDFS-9173
 URL: https://issues.apache.org/jira/browse/HDFS-9173
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Walter Su
Assignee: Walter Su






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934689#comment-14934689
 ] 

Yi Liu edited comment on HDFS-9053 at 9/29/15 6:55 AM:
---

[~jingzhao], you have given a great review, thanks a lot; it hits the two 
places where I had considered carefully how to do things better.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
Good point, I think we can use {{addOrReplace}} directly, since there should 
be an existing INode with that name.  To keep the original behavior, I will 
remove the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS. 
I will update it according to your comment; if you feel it's not good later, 
I can change it back.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.




was (Author: hitliuyi):
[~jingzhao], you have given a great review, thanks a lot; it hits the two 
places where I had considered carefully how to do things better.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
Good point, I think we can use {{addOrReplace}} directly, since there should 
be an existing INode with that name.  To keep the original behavior, I will 
remove the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.



> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch
>
>
> This is a long standing issue; we have been trying to improve it in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. For insert/delete/search the time 
> complexity is O(log n), but insertion/deletion causes re-allocations and 
> copies of big arrays, so the operations are costly.  For example, if the 
> children grow to 1M in size, the ArrayList will resize to > 1M capacity, 
> which needs > 1M * 4 bytes = 4 MB of contiguous heap memory; this easily 
> causes full GC in an HDFS cluster where namenode heap memory is already 
> highly used.  I recap the 3 main issues:
> # Insertion/deletion operations 

[jira] [Updated] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration

2015-09-29 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8676:

Attachment: HDFS-8676.01.patch

> Delayed rolling upgrade finalization can cause heartbeat expiration
> ---
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big, busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done synchronously in 
> the service actor thread's context.  This blocks the heartbeat and can cause 
> heartbeat expiration.  
> We have seen a namenode lose hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.
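One possible direction, shown as a hedged sketch (illustrative names and wrapper, not the attached patch): run the trash deletion on its own thread so the service actor can keep heartbeating:

{code}
import java.io.File;
import java.util.List;
import org.apache.hadoop.fs.FileUtil;

// Sketch: delete trash directories asynchronously on a daemon thread
// instead of synchronously in the service actor's context.
class AsyncTrashCleanup {
  static void deleteAsync(final List<File> trashDirs) {
    Thread t = new Thread(new Runnable() {
      @Override
      public void run() {
        for (File dir : trashDirs) {
          FileUtil.fullyDelete(dir);
        }
      }
    }, "trashCleanup");
    t.setDaemon(true);
    t.start();
  }
}
{code}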



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9159) [OIV] : return value of the command is not correct if invalid value specified in "-p (processor)" option

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934702#comment-14934702
 ] 

Hadoop QA commented on HDFS-9159:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 58s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  9s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 25s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 162m 57s | Tests failed in hadoop-hdfs. |
| | | 208m 34s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764164/HDFS-9159_02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12726/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12726/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12726/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12726/console |


This message was automatically generated.

> [OIV] : return value of the command is not correct if invalid value specified 
> in "-p (processor)" option
> 
>
> Key: HDFS-9159
> URL: https://issues.apache.org/jira/browse/HDFS-9159
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: nijel
>Assignee: nijel
> Attachments: HDFS-9159_01.patch, HDFS-9159_02.patch
>
>
> Return value of the OIV command is not correct if an invalid value is 
> specified in the "-p (processor)" option;
> it needs to return an error to the user.
> The code change will be in the switch statement of
> {code}
>  try (PrintStream out = outputFile.equals("-") ?
> System.out : new PrintStream(outputFile, "UTF-8")) {
>   switch (processor) {
> {code}
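A simplified sketch of the expected behavior (hedged; not the attached patch): the switch should fall into a default branch that reports the bad option and returns a non-zero status:

{code}
// Sketch: handle an unrecognized -p value explicitly.
class OIVExitCodeSketch {
  static int runProcessor(String processor) {
    switch (processor) {
      case "XML":
        // run the XML processor here
        return 0;
      case "Web":
        // run the WebImageViewer here
        return 0;
      default:
        System.err.println("Invalid processor specified: " + processor);
        return -1;   // non-zero so scripts can detect the error
    }
  }
}
{code}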



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934689#comment-14934689
 ] 

Yi Liu edited comment on HDFS-9053 at 9/29/15 6:20 AM:
---

[~jingzhao], you have given a great review, thanks a lot; it hits the two 
places where I had considered carefully how to do things better.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
Good point, I think we can use {{addOrReplace}} directly, since there should 
be an existing INode with that name.  To keep the original behavior, I will 
remove the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.




was (Author: hitliuyi):
[~jingzhao], you have given a great review, thanks a lot; it hits the two 
places where I had considered carefully how to do things better.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
I think we can use {{addOrReplace}} directly, since there should be an 
existing INode with that name.  To keep the original behavior, I will remove 
the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.



> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch
>
>
> This is a long standing issue; we have been trying to improve it in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. For insert/delete/search the time 
> complexity is O(log n), but insertion/deletion causes re-allocations and 
> copies of big arrays, so the operations are costly.  For example, if the 
> children grow to 1M in size, the ArrayList will resize to > 1M capacity, 
> which needs > 1M * 4 bytes = 4 MB of contiguous heap memory; this easily 
> causes full GC in an HDFS cluster where namenode heap memory is already 
> highly used.  I recap the 3 main issues:
> # 

[jira] [Updated] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration

2015-09-29 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-8676:

Assignee: Walter Su
  Status: Patch Available  (was: Open)

> Delayed rolling upgrade finalization can cause heartbeat expiration
> ---
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big, busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done synchronously in 
> the service actor thread's context.  This blocks the heartbeat and can cause 
> heartbeat expiration.  
> We have seen a namenode lose hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9141:
--
Status: Patch Available  (was: Open)

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1, 3.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we create an executor service and submit volume 
> addition tasks to it, but we do not shut down the service after use. Even 
> though we do not hold the service at the instance level, the initialized 
> threads could be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
> changedVolumes.newLocations.size());
> {code}
> So a simple fix would be to shut down the service after its use.
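The fix described above, as a minimal sketch (illustrative wrapper, not the attached patch):

{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: shut the pool down after use so its threads do not leak.
class RefreshVolumesSketch {
  static void runVolumeAdditions(List<Runnable> additions) {
    ExecutorService service = Executors.newFixedThreadPool(additions.size());
    try {
      for (Runnable task : additions) {
        service.execute(task);
      }
    } finally {
      // No new tasks accepted; worker threads exit once the queue drains.
      service.shutdown();
    }
  }
}
{code}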



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-9141:
--
Attachment: HDFS-9141.00.patch

Updated the patch to handle this.

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we create an executor service and submit volume 
> addition tasks to it, but we do not shut down the service after use. Even 
> though we do not hold the service at the instance level, the initialized 
> threads could be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
> changedVolumes.newLocations.size());
> {code}
> So a simple fix would be to shut down the service after its use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7663) Erasure Coding: Append on striped file

2015-09-29 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935119#comment-14935119
 ] 

Walter Su commented on HDFS-7663:
-

Thank you for the detailed information. Extending writeBlock is a good idea; 
the whole thing is clearer now. I'll do HDFS-9173 first.

> Erasure Coding: Append on striped file
> --
>
> Key: HDFS-7663
> URL: https://issues.apache.org/jira/browse/HDFS-7663
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Walter Su
> Attachments: HDFS-7663.00.txt
>
>
> Append should be easy if we have variable length block support from 
> HDFS-3689, i.e., the new data will be appended to a new block. We need to 
> revisit whether and how to support appending data to the original last block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-29 Thread Casey Brotherton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Brotherton updated HDFS-9100:
---
Status: Open  (was: Patch Available)

> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this could cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.
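A sketch of the expected direction of the fix, shown as a fragment parallel to the snippet above (hedged; it assumes {{DatanodeID#getXferAddr(boolean)}} honors the hostname preference, as elsewhere in HDFS):

{code}
// Sketch: choose the transfer address according to the client setting
// instead of always using the IP-based address.
boolean useHostname = conf.getBoolean("dfs.client.use.datanode.hostname", false);
sock.connect(
    NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr(useHostname)),
    HdfsConstants.READ_TIMEOUT);
{code}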



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9114) NameNode and DataNode metric log file name should follow the other log file name format.

2015-09-29 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-9114:
-
Attachment: HDFS-9114-trunk.02.patch
HDFS-9114-branch-2.02.patch

Re-based patch...

> NameNode and DataNode metric log file name should follow the other log file 
> name format.
> 
>
> Key: HDFS-9114
> URL: https://issues.apache.org/jira/browse/HDFS-9114
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9114-branch-2.01.patch, 
> HDFS-9114-branch-2.02.patch, HDFS-9114-trunk.01.patch, 
> HDFS-9114-trunk.02.patch
>
>
> Currently the datanode and namenode metric log file names are 
> {{datanode-metrics.log}} and {{namenode-metrics.log}}.
> These file names should be like {{hadoop-hdfs-namenode-metric-host192.log}}, 
> matching the namenode log file name {{hadoop-hdfs-namenode-host192.log}}.
> This will help when we copy logs from different nodes for issue analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934689#comment-14934689
 ] 

Yi Liu commented on HDFS-9053:
--

[~jingzhao], you have given a good review; it hits the two places where I had 
considered carefully how to do things better. Thanks.

{quote}
In INodeDirectory#replaceChild, can we directly call addOrReplace instead of 
calling get first?
{quote}
I think we can use {{addOrReplace}} directly, since there should be an 
existing INode with that name.  To keep the original behavior, I will remove 
the added one if {{addOrReplace}} returns null.

{quote}
Do you think we can avoid the following code? Maybe we can add the EK type to 
the ReadOnlyCollection/ReadOnlyList level?
{quote}
That's a good comment; I had considered this carefully.  I also thought about 
adding the EK type as one of the generic types at the 
ReadOnlyCollection/ReadOnlyList level, but I felt it did not look natural for 
a collection/list, not all implementations of ReadOnlyList need to support 
iterating from a specified element, and it seemed OK since it's a private 
interface we use in HDFS.  I will leave this as-is in the next version of the 
patch; if you feel we'd better do it, I will update it.  I am OK with both 
ways.

{quote}
DirectoryWithSnapshotFeature#getChildrenList#iterator(EK) forgot to increase 
pos? Maybe also add a new test for this (e.g., set a small ls limit and list a 
snapshot of a directory)?
{quote}
Great catch, let me update it and add a new test in {{TestLargeDirectory}} to 
cover it.

{quote}
In getListing, instead of continuing the iteration, can we just call size() to 
calculate the number of the remaining items?
{quote}
I tried to find a better way. {{size()}} returns the total number of elements 
in the B-Tree, but we don't know the current index, so it seems we cannot 
calculate the number of the remaining items.
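As a hedged illustration of the {{addOrReplace}} change discussed above (the names and helpers here are hypothetical, not the actual patch):

{code}
// Sketch: replace a child in one call instead of get-then-put; if no child
// with that name existed, roll back to preserve the original behavior.
INode replaceChild(INode newChild) {
  INode old = children.addOrReplace(newChild);
  if (old == null) {
    // Nothing was replaced: remove the child we just added.
    children.remove(newChild.getKey());
  }
  return old;
}
{code}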



> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch
>
>
> This is a long standing issue; we have been trying to improve it in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. For insert/delete/search the time 
> complexity is O(log n), but insertion/deletion causes re-allocations and 
> copies of big arrays, so the operations are costly.  For example, if the 
> children grow to 1M in size, the ArrayList will resize to > 1M capacity, 
> which needs > 1M * 4 bytes = 4 MB of contiguous heap memory; this easily 
> causes full GC in an HDFS cluster where namenode heap memory is already 
> highly used.  I recap the 3 main issues:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous heap memory which will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174, created by [~kihwal], but uses a B-Tree 
> to solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, it will split automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  This solves the above 3 
> issues.
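A minimal sketch of the root-node behavior described above (illustrative only, far simpler than the actual patch): elements live in one small sorted array that grows gradually, and a real B-Tree would split the node once the maximum degree is exceeded:

{code}
import java.util.Arrays;

class RootOnlyNode {
  static final int MAX_DEGREE = 4096;          // illustrative degree
  private String[] elements = new String[8];   // small sorted array
  private int size;

  boolean insert(String key) {
    int pos = Arrays.binarySearch(elements, 0, size, key);
    if (pos >= 0) {
      return false;                            // key already present
    }
    pos = -pos - 1;                            // insertion point
    if (size == elements.length) {
      if (size >= MAX_DEGREE) {
        // A real B-Tree would split this node into children here.
        throw new IllegalStateException("node full");
      }
      // Grow gradually instead of pre-allocating one huge contiguous array.
      elements = Arrays.copyOf(elements, Math.min(size * 2, MAX_DEGREE));
    }
    System.arraycopy(elements, pos, elements, pos + 1, size - pos);
    elements[pos] = key;
    size++;
    return true;
  }
}
{code}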



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9166) Move hftp / hsftp filesystem to hdfs-client

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934691#comment-14934691
 ] 

Hadoop QA commented on HDFS-9166:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764135/HDFS-9166.000.branch-2.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12733/console |


This message was automatically generated.

> Move hftp / hsftp filesystem to hdfs-client
> ---
>
> Key: HDFS-9166
> URL: https://issues.apache.org/jira/browse/HDFS-9166
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9166.000.branch-2.patch
>
>
> The hftp / hsftp filesystems in branch-2 need to be moved to the hdfs-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934783#comment-14934783
 ] 

Yi Liu commented on HDFS-9141:
--

+1 pending Jenkins. Thanks Uma.

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we create an executor service and submit volume 
> addition tasks to it.
> But we do not shut down the service after use. Even though we do not keep 
> the service at the instance level, the initialized threads could be left 
> running.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
> changedVolumes.newLocations.size());
> {code}
> So a simple fix would be to shut down the service after use.
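A minimal sketch of the suggested fix (the try/finally placement is an 
assumption, not necessarily the committed patch):
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ShutdownAfterUse {
  public static void main(String[] args) {
    ExecutorService service = Executors.newFixedThreadPool(2);
    try {
      service.submit(new Runnable() {
        @Override
        public void run() { /* add a volume ... */ }
      });
    } finally {
      // Stop accepting new tasks; the pool threads exit once the
      // submitted tasks complete, so no threads are leaked.
      service.shutdown();
    }
  }
}
{code}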



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9158) [OEV-Doc] : Document does not mention about "-f" and "-r" options

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934839#comment-14934839
 ] 

Hadoop QA commented on HDFS-9158:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 58s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m  8s | Site still builds. |
| {color:red}-1{color} | checkstyle |   1m 30s | The applied patch generated  1 
new checkstyle issues (total was 40, now 40). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 30s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 23s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 173m 19s | Tests failed in hadoop-hdfs. |
| | | 225m 42s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
| Timed out tests | org.apache.hadoop.hdfs.TestFileCreation |
|   | org.apache.hadoop.cli.TestHDFSCLI |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764173/HDFS-9158_03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12730/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12730/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12730/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12730/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12730/console |


This message was automatically generated.

> [OEV-Doc] : Document does not mention about "-f" and "-r" options
> -
>
> Key: HDFS-9158
> URL: https://issues.apache.org/jira/browse/HDFS-9158
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: nijel
>Assignee: nijel
> Attachments: HDFS-9158.01.patch, HDFS-9158_02.patch, 
> HDFS-9158_03.patch
>
>
> 1. The document does not mention the "-f" and "-r" options;
> these options should be added to the document as well.
> {noformat}
> -f,--fix-txids Renumber the transaction IDs in the input,
>so that there are no gaps or invalid  transaction IDs.
> -r,--recover   When reading binary edit logs, use recovery
>mode.  This will give you the chance to skip
>corrupt parts of the edit log.
> {noformat}
> 2. The help message also contains some extra white space:
> {code}
> "so that there are no gaps or invalidtransaction IDs."
> {code}
> This can be removed as well.
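For reference, an illustrative invocation that exercises both flags (the 
input and output paths are hypothetical):
{code}
hdfs oev -p xml -f -r -i /tmp/edits -o /tmp/edits.xml
{code}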



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9151) Mover should print the exit status/reason on console like balancer tool.

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934852#comment-14934852
 ] 

Hadoop QA commented on HDFS-9151:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 54s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  3s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 15s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 22s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 26s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 163m 25s | Tests failed in hadoop-hdfs. |
| | | 209m 11s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764177/HDFS-9151.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12731/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12731/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12731/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12731/console |


This message was automatically generated.

> Mover should print the exit status/reason on console like balancer tool.
> 
>
> Key: HDFS-9151
> URL: https://issues.apache.org/jira/browse/HDFS-9151
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-9151.01.patch, HDFS-9151.02.patch
>
>
> Mover should print the exit reason on the console.
> In cases where there are no blocks to move, storages are unavailable, or any 
> other condition, the Mover tool gives no information about the exit reason 
> on the console:
> {code}
> # ./hdfs mover
> ...
> Sep 28, 2015 12:31:25 PM Mover took 10sec
> # echo $?
> 0
> # ./hdfs mover
> ...
> Sep 28, 2015 12:33:10 PM Mover took 1sec
> # echo $?
> 254
> {code}
> The Balancer, in contrast, prints the exit reason; for example:
> {code}
> # ./hdfs balancer
> ...
> {color:red}The cluster is balanced. Exiting...{color}
> Sep 28, 2015 12:18:02 PM Balancing took 1.744 seconds
> {code}
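A rough sketch of the requested behavior (reusing the Balancer's 
{{ExitStatus}} enum is an assumption about the approach, not necessarily the 
patch):
{code}
import org.apache.hadoop.hdfs.server.balancer.ExitStatus;

public class PrintExitReason {
  public static void main(String[] args) {
    // e.g. NO_MOVE_BLOCK when there is nothing left to move
    ExitStatus status = ExitStatus.NO_MOVE_BLOCK;
    System.out.println(status);          // surface the reason on the console
    System.exit(status.getExitCode());   // keep the existing exit code
  }
}
{code}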



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9151) Mover should print the exit status/reason on console like balancer tool.

2015-09-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935117#comment-14935117
 ] 

Daniel Templeton commented on HDFS-9151:


+1 (non-binding)

> Mover should print the exit status/reason on console like balancer tool.
> 
>
> Key: HDFS-9151
> URL: https://issues.apache.org/jira/browse/HDFS-9151
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-9151.01.patch, HDFS-9151.02.patch
>
>
> Mover should print the exit reason on the console.
> In cases where there are no blocks to move, storages are unavailable, or any 
> other condition, the Mover tool gives no information about the exit reason 
> on the console:
> {code}
> # ./hdfs mover
> ...
> Sep 28, 2015 12:31:25 PM Mover took 10sec
> # echo $?
> 0
> # ./hdfs mover
> ...
> Sep 28, 2015 12:33:10 PM Mover took 1sec
> # echo $?
> 254
> {code}
> The Balancer, in contrast, prints the exit reason; for example:
> {code}
> # ./hdfs balancer
> ...
> {color:red}The cluster is balanced. Exiting...{color}
> Sep 28, 2015 12:18:02 PM Balancing took 1.744 seconds
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935212#comment-14935212
 ] 

Yi Liu commented on HDFS-9053:
--

The three test failures are not related.

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Search is O(log n), but insertion and 
> deletion cause re-allocations and copies of big arrays, so those operations 
> are costly.  For example, if the children grow to 1M entries, the ArrayList 
> will resize to > 1M capacity, so it needs > 1M * 4 bytes = 4MB of contiguous 
> heap memory; this easily causes full GC in an HDFS cluster where namenode 
> heap memory is already highly used.  To recap, the 3 main issues are:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous heap memory which will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to 
> solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, it will split automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  This will solve the above 3 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9174) Fix the latest findbugs of FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Yi Liu (JIRA)
Yi Liu created HDFS-9174:


 Summary: Fix the latest findbugs of FSOutputSummer.tracer and 
DirectoryScanner$ReportCompiler.currentThread
 Key: HDFS-9174
 URL: https://issues.apache.org/jira/browse/HDFS-9174
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Yi Liu
Assignee: Yi Liu
Priority: Minor


https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html

https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9175) Change scope of 'AccessTokenProvider.getAccessToken()' and 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to public

2015-09-29 Thread Santhosh G Nayak (JIRA)
Santhosh G Nayak created HDFS-9175:
--

 Summary: Change scope of 'AccessTokenProvider.getAccessToken()' 
and 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to 
public
 Key: HDFS-9175
 URL: https://issues.apache.org/jira/browse/HDFS-9175
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 2.8.0
Reporter: Santhosh G Nayak
Assignee: Santhosh G Nayak


The {{org.apache.hadoop.hdfs.web.oauth2.AccessTokenProvider#getAccessToken()}} and 
{{org.apache.hadoop.hdfs.web.oauth2.CredentialBasedAccessTokenProvider#getCredential()}} 
abstract methods have default (package-private) scope.
Unfortunately, this prevents users from providing custom implementations of 
these classes in a different package.
So the proposal is to change the scope of these abstract methods to {{public}}.
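A minimal illustration of the restriction (package and class names are 
hypothetical; the method signature is assumed from the description):
{code}
package com.example.auth;  // any package other than o.a.h.hdfs.web.oauth2

import org.apache.hadoop.hdfs.web.oauth2.AccessTokenProvider;

public class MyTokenProvider extends AccessTokenProvider {
  // Does not compile while getAccessToken() has default scope: the
  // package-private abstract method is not visible from this package.
  @Override
  public String getAccessToken() {
    return "my-token";
  }
}
{code}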





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935101#comment-14935101
 ] 

Hadoop QA commented on HDFS-9053:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m  1s | Pre-patch trunk has 2 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 8 new or modified test files. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 20s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  6s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m 14s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 17s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   7m 42s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 165m 33s | Tests failed in hadoop-hdfs. |
| | | 220m 50s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.TestCrcCorruption |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764210/HDFS-9053.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d6fa34e |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html
 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12739/console |


This message was automatically generated.

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Search is O(log n), but insertion and 
> deletion cause re-allocations and copies of big arrays, so those operations 
> are costly.  For example, if the children grow to 1M entries, the ArrayList 
> will resize to > 1M capacity, so it needs > 1M * 4 bytes = 4MB of contiguous 
> heap memory; this easily causes full GC in an HDFS cluster where namenode 
> heap memory is already highly used.  To recap, the 3 main issues are:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous heap memory which will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to 
> solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, it will split automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  This will solve the above 3 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HDFS-9174) Fix the latest findbugs of FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-9174:
-
Attachment: HDFS-9174.001.patch

> Fix the latest findbugs of FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> --
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9175) Change scope of 'AccessTokenProvider.getAccessToken()' and 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to public

2015-09-29 Thread Santhosh G Nayak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santhosh G Nayak updated HDFS-9175:
---
Attachment: HDFS-9175.1.patch

Attaching a patch containing the proposed changes.

> Change scope of 'AccessTokenProvider.getAccessToken()' and 
> 'CredentialBasedAccessTokenProvider.getCredential()' abstract methods to 
> public
> --
>
> Key: HDFS-9175
> URL: https://issues.apache.org/jira/browse/HDFS-9175
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.8.0
>Reporter: Santhosh G Nayak
>Assignee: Santhosh G Nayak
> Attachments: HDFS-9175.1.patch
>
>
> The {{org.apache.hadoop.hdfs.web.oauth2.AccessTokenProvider#getAccessToken()}} 
> and 
> {{org.apache.hadoop.hdfs.web.oauth2.CredentialBasedAccessTokenProvider#getCredential()}} 
> abstract methods have default (package-private) scope.
> Unfortunately, this prevents users from providing custom implementations 
> of these classes in a different package.
> So the proposal is to change the scope of these abstract methods to {{public}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9174) Fix the latest findbugs of FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-9174:
-
Status: Patch Available  (was: Open)

> Fix the latest findbugs of FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> --
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9176) TestDirectoryScanner#testThrottling often fails.

2015-09-29 Thread Yi Liu (JIRA)
Yi Liu created HDFS-9176:


 Summary: TestDirectoryScanner#testThrottling often fails.
 Key: HDFS-9176
 URL: https://issues.apache.org/jira/browse/HDFS-9176
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.8.0
Reporter: Yi Liu
Priority: Minor


https://builds.apache.org/job/PreCommit-HDFS-Build/12736/testReport/
https://builds.apache.org/job/PreCommit-HADOOP-Build/7732/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9080) update htrace version to 4.0.1

2015-09-29 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9080:
---
Component/s: tracing

> update htrace version to 4.0.1
> --
>
> Key: HDFS-9080
> URL: https://issues.apache.org/jira/browse/HDFS-9080
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tracing
>Affects Versions: 2.8.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.8.0
>
> Attachments: HDFS-9080.001.patch, HDFS-9080.002.patch, 
> HDFS-9080.003.patch, HDFS-9080.004.patch, HDFS-9080.005.patch, 
> HDFS-9080.006.patch, HDFS-9080.007.patch, HDFS-9080.009.patch, 
> HDFS-9080.010.patch, HDFS-9080.011.patch, HDFS-9080.012.patch, 
> HDFS-9080.013.patch, tracing-fsshell-put.png
>
>
> Update the HTrace library version Hadoop uses to htrace 4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934922#comment-14934922
 ] 

Hadoop QA commented on HDFS-9141:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 30s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m 48s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 29s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 33s | The applied patch generated  1 
new checkstyle issues (total was 145, now 143). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 37s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 55s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 40s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 127m 41s | Tests failed in hadoop-hdfs. |
| | | 179m 19s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestEditLog |
|   | org.apache.hadoop.hdfs.security.token.block.TestBlockToken |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764189/HDFS-9141.00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12736/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12736/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12736/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12736/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12736/console |


This message was automatically generated.

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we create an executor service and submit volume 
> addition tasks to it.
> But we do not shut down the service after use. Even though we do not keep 
> the service at the instance level, the initialized threads could be left 
> running.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
> changedVolumes.newLocations.size());
> {code}
> So a simple fix would be to shut down the service after use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-9053:
-
Attachment: HDFS-9053.003.patch

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Search is O(log n), but insertion and 
> deletion cause re-allocations and copies of big arrays, so those operations 
> are costly.  For example, if the children grow to 1M entries, the ArrayList 
> will resize to > 1M capacity, so it needs > 1M * 4 bytes = 4MB of contiguous 
> heap memory; this easily causes full GC in an HDFS cluster where namenode 
> heap memory is already highly used.  To recap, the 3 main issues are:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous heap memory which will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to 
> solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, it will split automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  This will solve the above 3 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9053) Support large directories efficiently using B-Tree

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934862#comment-14934862
 ] 

Yi Liu commented on HDFS-9053:
--

Updated the patch to address Jing's comments:
1. Call {{addOrReplace}} directly in {{INodeDirectory#replaceChild}}.
2. Add the EK type at the ReadOnlyCollection/ReadOnlyList level.
3. Fix the bug in 
{{DirectoryWithSnapshotFeature#getChildrenList#iterator(K)#next()}}, and add a 
new test in {{TestLargeDirectory}} for it: list a snapshot of a directory; 
since the snapshot of the directory is large enough, this hits the code path. 
A rough sketch of the test idea is below.
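The sketch ({{dfs.ls.limit}} is the existing listing-limit config key; sizes 
and names here are illustrative, not the patch's actual test):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;

// ...
Configuration conf = new Configuration();
conf.setInt(DFSConfigKeys.DFS_LIST_LIMIT, 16);  // small ls limit => pagination
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).build();
DistributedFileSystem dfs = cluster.getFileSystem();

Path dir = new Path("/large");
dfs.mkdirs(dir);
for (int i = 0; i < 1000; i++) {   // far more children than the ls limit
  dfs.mkdirs(new Path(dir, "child-" + i));
}
dfs.allowSnapshot(dir);
dfs.createSnapshot(dir, "s1");
// Listing the snapshot path paginates through the snapshot view,
// exercising iterator(K)/next() across B-Tree nodes:
FileStatus[] statuses = dfs.listStatus(new Path(dir, ".snapshot/s1"));
assert statuses.length == 1000;
{code}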
 

> Support large directories efficiently using B-Tree
> --
>
> Key: HDFS-9053
> URL: https://issues.apache.org/jira/browse/HDFS-9053
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9053 (BTree with simple benchmark).patch, HDFS-9053 
> (BTree).patch, HDFS-9053.001.patch, HDFS-9053.002.patch, HDFS-9053.003.patch
>
>
> This is a long-standing issue that we have tried to improve in the past.  
> Currently we use an ArrayList for the children under a directory, and the 
> children are kept ordered in the list. Search is O(log n), but insertion and 
> deletion cause re-allocations and copies of big arrays, so those operations 
> are costly.  For example, if the children grow to 1M entries, the ArrayList 
> will resize to > 1M capacity, so it needs > 1M * 4 bytes = 4MB of contiguous 
> heap memory; this easily causes full GC in an HDFS cluster where namenode 
> heap memory is already highly used.  To recap, the 3 main issues are:
> # Insertion/deletion operations in large directories are expensive because 
> of re-allocations and copies of big arrays.
> # Dynamically allocating several MB of contiguous heap memory which will be 
> long-lived can easily cause full GC problems.
> # Even if most children are removed later, the directory INode still 
> occupies the same amount of heap memory, since the ArrayList never shrinks.
> This JIRA is similar to HDFS-7174 created by [~kihwal], but uses a B-Tree to 
> solve the problem, as suggested by [~shv]. 
> So the target of this JIRA is to implement a low-memory-footprint B-Tree and 
> use it to replace the ArrayList. 
> If the number of elements is not large (less than the maximum degree of a 
> B-Tree node), the B-Tree has only one root node, which contains an array for 
> the elements. If the size grows large enough, it will split automatically, 
> and if elements are removed, B-Tree nodes can merge automatically (see 
> more: https://en.wikipedia.org/wiki/B-tree).  This will solve the above 3 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9165) Move the rest of the entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934875#comment-14934875
 ] 

Hadoop QA commented on HDFS-9165:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 46s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 58s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 28s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 168m 16s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 31s | Tests passed in 
hadoop-hdfs-client. |
| | | 208m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFSNamesystem |
|   | hadoop.hdfs.web.TestWebHDFSOAuth2 |
| Timed out tests | org.apache.hadoop.net.TestNetworkTopology |
|   | org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | org.apache.hadoop.hdfs.TestPread |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764066/HDFS-9165.000.patch |
| Optional Tests | javadoc javac unit |
| git revision | trunk / 151fca5 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12732/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12732/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12732/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12732/console |


This message was automatically generated.

> Move the rest of the entries in META-INF/services/o.a.h.fs.FileSystem to 
> hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740, the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly, similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934926#comment-14934926
 ] 

Hadoop QA commented on HDFS-8676:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m  3s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  1s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 14s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 29s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 149m 38s | Tests failed in hadoop-hdfs. |
| | | 195m 29s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| Timed out tests | org.apache.hadoop.hdfs.server.mover.TestStorageMover |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764188/HDFS-8676.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12735/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12735/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12735/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12735/console |


This message was automatically generated.

> Delayed rolling upgrade finalization can cause heartbeat expiration
> ---
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big, busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done synchronously in 
> the service actor thread's context.  This blocks heartbeats and can cause 
> heartbeat expiration.  
> We have seen a namenode lose hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.
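A minimal sketch of the asynchronous idea (the helper and the thread handling 
are assumptions, not the committed patch):
{code}
public class AsyncTrashCleanup {
  // Hypothetical helper standing in for the real trash deletion.
  static void clearTrash() { /* delete the trash directories ... */ }

  public static void main(String[] args) {
    // Instead of deleting trash synchronously in the service actor
    // thread (blocking heartbeats), hand it off to a daemon thread:
    Thread cleaner = new Thread(new Runnable() {
      @Override
      public void run() { clearTrash(); }
    }, "trash-cleaner");
    cleaner.setDaemon(true);
    cleaner.start();  // heartbeats continue while trash is removed
  }
}
{code}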



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8603) In NN WebUI, some places that show time use the system locale; in a Chinese env, Chinese appears in the NN UI

2015-09-29 Thread Surendra Singh Lilhore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore resolved HDFS-8603.
--
Resolution: Implemented

This is fixed as part of HDFS-8388

> In NN WebUI, some places that show time use the system locale; in a Chinese 
> env, Chinese appears in the NN UI
> ---
>
> Key: HDFS-8603
> URL: https://issues.apache.org/jira/browse/HDFS-8603
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: huangyitian
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: browser directory.png, overview.png
>
>
> In the NN WebUI, on a machine with a Chinese environment, some timestamps 
> are displayed with Chinese characters; the UI uses the machine's locale.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934958#comment-14934958
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1196 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1196/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} per block pool to 
> store the replicas in memory.  The key is the block id of the block replica, 
> which is already included in {{ReplicaInfo}}, so this memory can be saved.  
> Also, a HashMap entry has an object overhead.  We can implement a 
> lightweight set which is similar to {{LightWeightGSet}}, but not of fixed 
> size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a 
> big value; an example is {{BlocksMap}}, where this avoids full GC since 
> there is no need to resize), and we should still be able to get an element 
> through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 
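A toy sketch of the intrusive-set idea described above (greatly simplified; 
the real {{LightWeightResizableGSet}} adds resizing, and the names here are 
illustrative):
{code}
// Each element carries its own "next" pointer, so the hash table is
// just an array of element references: no per-entry wrapper objects
// and no boxed Long keys.
class Replica {
  final long blockId;  // the key lives inside the element itself
  Replica next;        // the +4-byte reference replacing the HashMap entry
  Replica(long id) { blockId = id; }
}

class IntrusiveReplicaSet {
  private final Replica[] buckets = new Replica[1 << 10];  // power of two

  void put(Replica r) {
    int i = (int) (r.blockId & (buckets.length - 1));
    r.next = buckets[i];  // chain into the bucket
    buckets[i] = r;
  }

  Replica get(long blockId) {
    int i = (int) (blockId & (buckets.length - 1));
    for (Replica r = buckets[i]; r != null; r = r.next) {
      if (r.blockId == blockId) {
        return r;
      }
    }
    return null;
  }
}
{code}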



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934986#comment-14934986
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2401 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2401/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} per block pool to 
> store the replicas in memory.  The key is the block id of the block replica, 
> which is already included in {{ReplicaInfo}}, so this memory can be saved.  
> Also, a HashMap entry has an object overhead.  We can implement a 
> lightweight set which is similar to {{LightWeightGSet}}, but not of fixed 
> size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a 
> big value; an example is {{BlocksMap}}, where this avoids full GC since 
> there is no need to resize), and we should still be able to get an element 
> through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935046#comment-14935046
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #433 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/433/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory 
> footprint for each block replica in DataNode memory (this JIRA only talks 
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap, 
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} per block pool to 
> store the replicas in memory.  The key is the block id of the block replica, 
> which is already included in {{ReplicaInfo}}, so this memory can be saved.  
> Also, a HashMap entry has an object overhead.  We can implement a 
> lightweight set which is similar to {{LightWeightGSet}}, but not of fixed 
> size ({{LightWeightGSet}} uses a fixed size for the entries array, usually a 
> big value; an example is {{BlocksMap}}, where this avoids full GC since 
> there is no need to resize), and we should still be able to get an element 
> through its key.
> Following is a comparison of the memory footprint if we implement a 
> lightweight set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total:  -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total:  +4 bytes
> So in total we can save 40 bytes for each block replica. 
> Currently one finalized replica needs around 46 bytes (note: we ignore 
> memory alignment here).
> We can save 1 - (4 + 46) / (44 + 46) = *45%* of the memory for each block 
> replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2015-09-29 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934855#comment-14934855
 ] 

Zhe Zhang commented on HDFS-9172:
-

Just did 'git merge': 
https://github.com/zhe-thoughts/hadoop/tree/HDFS-7285-20150929

I think we can use this JIRA to track some non-ideal code changes. For 
example, I added {{StripedBlockUtil#getBlockIndex}} to avoid creating a whole 
new {{BlockIdManagerClient}} just for this method. Another case is 
{{ErasureCodingWorker#newBlockReader}}, where I hard-coded the tracer to null.

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>
> The idea of this jira is to move the striped-stream-related classes to the 
> {{hadoop-hdfs-client}} project. This will help keep them in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8941) DistributedFileSystem listCorruptFileBlocks API should resolve relative path

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934904#comment-14934904
 ] 

Hadoop QA commented on HDFS-8941:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 55s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m 11s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 17s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 56s | The applied patch generated  1 
new checkstyle issues (total was 45, now 45). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   0m 56s | Post-patch findbugs 
hadoop-hdfs-project/hadoop-hdfs compilation is broken. |
| {color:red}-1{color} | findbugs |   3m 30s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 29s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  65m  0s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 29s | Tests passed in 
hadoop-hdfs-client. |
| | | 117m 24s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| Timed out tests | org.apache.hadoop.hdfs.TestDecommission |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | org.apache.hadoop.hdfs.TestDFSOutputStream |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764199/HDFS-8941-03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12738/console |


This message was automatically generated.

> DistributedFileSystem listCorruptFileBlocks API should resolve relative path
> 
>
> Key: HDFS-8941
> URL: https://issues.apache.org/jira/browse/HDFS-8941
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-8941-00.patch, HDFS-8941-01.patch, 
> HDFS-8941-02.patch, HDFS-8941-03.patch
>
>
> Presently the {{DFS#listCorruptFileBlocks(path)}} API does not resolve the 
> given path relative to the workingDir. This jira is to discuss and provide 
> an implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934861#comment-14934861
 ] 

Hudson commented on HDFS-8859:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #8538 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8538/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9114) NameNode and DataNode metric log file name should follow the other log file name format.

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934870#comment-14934870
 ] 

Hadoop QA commented on HDFS-9114:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 39s | Pre-patch trunk has 2 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  0s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 31s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | shellcheck |   0m  6s | The applied patch generated  2 
new shellcheck (v0.3.3) issues (total was 20, now 22). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 22s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   7m 52s | Tests passed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests | 109m 31s | Tests failed in hadoop-hdfs. |
| | | 164m 40s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
| Timed out tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
|   | org.apache.hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764187/HDFS-9114-trunk.02.patch
 |
| Optional Tests | shellcheck javadoc javac unit findbugs checkstyle |
| git revision | trunk / 151fca5 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html
 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| shellcheck | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/artifact/patchprocess/diffpatchshellcheck.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12734/console |


This message was automatically generated.

> NameNode and DataNode metric log file name should follow the other log file 
> name format.
> 
>
> Key: HDFS-9114
> URL: https://issues.apache.org/jira/browse/HDFS-9114
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
> Attachments: HDFS-9114-branch-2.01.patch, 
> HDFS-9114-branch-2.02.patch, HDFS-9114-trunk.01.patch, 
> HDFS-9114-trunk.02.patch
>
>
> Currently the datanode and namenode metric log file names are 
> {{datanode-metrics.log}} and {{namenode-metrics.log}}.
> These file names should be like {{hadoop-hdfs-namenode-metric-host192.log}}, 
> matching the namenode log file name {{hadoop-hdfs-namenode-host192.log}}.
> This will help when we copy logs from different nodes for issue analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935071#comment-14935071
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2373 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2373/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-8859:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934863#comment-14934863
 ] 

Yi Liu commented on HDFS-8859:
--

Committed to trunk and branch-2, thanks [~szetszwo], [~umamaheswararao], 
[~brahmareddy] for the reviews and comments!

> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934933#comment-14934933
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #465 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/465/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2015-09-29 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934931#comment-14934931
 ] 

Rakesh R commented on HDFS-8449:


Nice work, thanks [~libo-intel] for the patch. I have a few comments; please 
take a look.

# Instead of {{recovery}}, please use {{reconstruction}} in the newly added 
code. HDFS-7955 has been raised to correct the existing code.
# In {{"Count of EC recovery tasks"}}, can you write {{Erasure Coding}} 
instead of {{EC}}?
# I prefer to avoid {{taskSucceed = false;}}. One way is to place 
{{datanode.getMetrics().incrECSuccessfulRecoveryTasks();}} right before the 
catch-throwable statement.
# Please correct the indentation
a) Make it one line
{code}
+public class
+TestDataNodeErasureCodingMetrics {
{code}
b) Split this into two lines
{code}
+out.close();   conf = new Configuration();
{code}
# Could you please refer to the 
[TestDataNodeMetrics.java#L131|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java#L131]
 unit test case. It would be good to use the existing {{MetricsAsserts}} 
instead of reading the metric values via reflection in your tests (see the 
sketch after this list).
{code}
+  private long getMetricsValue(String name, DataNodeMetrics metrics)
+  throws Exception{
+Field ecField = DataNodeMetrics.class.getDeclaredField(name);
+ecField.setAccessible(true);
+MutableCounterLong counter = (MutableCounterLong)ecField.get(metrics);
+return counter.value();
+  }
{code}
# Please change this function to {{private}} visibility
{code}public DataNode doTest{code}
# It's good practice to add a test timeout; please add one with a reasonably 
large value, e.g. {{@Test(timeout = 120000)}}
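
As a rough sketch of the {{MetricsAsserts}} style mentioned in comment 6 
(the counter name below is an assumption, not necessarily the patch's final 
name):

{code}
// Sketch only: uses the existing MetricsAsserts helpers instead of
// reflection. "ECRecoveryTasks" is a hypothetical counter name.
import static org.apache.hadoop.test.MetricsAsserts.assertCounter;
import static org.apache.hadoop.test.MetricsAsserts.getMetrics;

import org.apache.hadoop.metrics2.MetricsRecordBuilder;

// ... inside a test, after triggering one recovery task:
MetricsRecordBuilder rb = getMetrics(datanode.getMetrics().name());
assertCounter("ECRecoveryTasks", 1L, rb);
{code}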

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch
>
>
> This sub-task tries to record the EC recovery tasks that a datanode has 
> done, including total tasks, failed tasks, and successful tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8859) Improve DataNode ReplicaMap memory footprint to save about 45%

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935008#comment-14935008
 ] 

Hudson commented on HDFS-8859:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #458 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/458/])
HDFS-8859. Improve DataNode ReplicaMap memory footprint to save about 45%. 
(yliu) (yliu: rev d6fa34e014b0e2a61b24f05dd08ebe12354267fd)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/ReplicaMap.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightCache.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInfo.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightResizableGSet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java
* 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestLightWeightResizableGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/GSetByHashMap.java


> Improve DataNode ReplicaMap memory footprint to save about 45%
> --
>
> Key: HDFS-8859
> URL: https://issues.apache.org/jira/browse/HDFS-8859
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8859.001.patch, HDFS-8859.002.patch, 
> HDFS-8859.003.patch, HDFS-8859.004.patch, HDFS-8859.005.patch, 
> HDFS-8859.006.patch
>
>
> By using the following approach we can save about *45%* of the memory
> footprint for each block replica in DataNode memory (this JIRA only talks
> about the *ReplicaMap* in the DataNode). The details:
> In ReplicaMap,
> {code}
> private final Map<String, Map<Long, ReplicaInfo>> map =
>     new HashMap<String, Map<Long, ReplicaInfo>>();
> {code}
> Currently we use a HashMap {{Map<Long, ReplicaInfo>}} to store the replicas
> in memory. The key is the block id of the block replica, which is already
> included in {{ReplicaInfo}}, so this memory can be saved. Also, each HashMap
> Entry has an object overhead. We can implement a lightweight Set similar
> to {{LightWeightGSet}}, but not of a fixed size ({{LightWeightGSet}} uses a
> fixed size for the entries array, usually a big value, e.g. {{BlocksMap}};
> this avoids full GC since there is no need to resize). We should also be
> able to get an element by its key.
> Following is a comparison of memory footprint if we implement a lightweight
> set as described.
> We can save:
> {noformat}
> SIZE (bytes)   ITEM
> 20             The Key: Long (12 bytes object overhead + 8 bytes long)
> 12             HashMap Entry object overhead
> 4              reference to the key in Entry
> 4              reference to the value in Entry
> 4              hash in Entry
> {noformat}
> Total: -44 bytes
> We need to add:
> {noformat}
> SIZE (bytes)   ITEM
> 4              a reference to the next element in ReplicaInfo
> {noformat}
> Total: +4 bytes
> So in total we can save 40 bytes for each block replica.
> Currently one finalized replica needs around 46 bytes (note: we ignore
> memory alignment here), so we save 1 - (4 + 46) / (44 + 46) = *45%* of the
> memory for each block replica in the DataNode.
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-29 Thread Casey Brotherton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Brotherton updated HDFS-9100:
---
Attachment: HDFS-9100.003.patch

> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this could cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.
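
A fix would presumably choose the address the way the DFS client does; a 
rough sketch (not the committed patch), assuming the 
{{DatanodeID#getXferAddr(boolean)}} overload and the standard config keys:

{code}
// Sketch: honor dfs.client.use.datanode.hostname when building the
// target address instead of always using the raw transfer address.
boolean useHostname = conf.getBoolean(
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME,
    DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME_DEFAULT);
sock.connect(
    NetUtils.createSocketAddr(
        target.getDatanodeInfo().getXferAddr(useHostname)),
    HdfsConstants.READ_TIMEOUT);
{code}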



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-29 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935304#comment-14935304
 ] 

Yongjun Zhang commented on HDFS-9100:
-

Thanks [~caseyjbrotherton].  +1 on rev3 pending jenkins test.


> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this could cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935327#comment-14935327
 ] 

Hudson commented on HDFS-9141:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1197 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1197/])
HDFS-9141. Thread leak in Datanode#refreshVolumes. (Uma Maheswara Rao G via 
yliu) (yliu: rev 715dbddf77866bb47a4b95421091f64a3785444f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 2.8.0
>
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating an executor service and submitting 
> volume-addition tasks to it.
> But we are not shutting the service down after use. Even though we do not 
> hold an instance-level reference to the service, its initialized threads 
> can be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
>     changedVolumes.newLocations.size());
> {code}
> So, a simple fix would be to shut the service down after use.
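
A rough sketch of that shape (illustrative, not the committed patch):

{code}
// Sketch: tear the pool down even if a submitted task fails.
ExecutorService service = Executors.newFixedThreadPool(
    changedVolumes.newLocations.size());
try {
  // ... submit the volume-addition tasks and wait for their futures ...
} finally {
  service.shutdown();  // existing tasks finish, worker threads then exit
}
{code}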



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935228#comment-14935228
 ] 

Yi Liu commented on HDFS-9141:
--

The test failures are unrelated, and checkstyle is about the file contains too 
much lines, and we don't need to fix it.
Will commit the patch shortly.

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating an executor service and submitting 
> volume-addition tasks to it.
> But we are not shutting the service down after use. Even though we do not 
> hold an instance-level reference to the service, its initialized threads 
> can be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
>     changedVolumes.newLocations.size());
> {code}
> So, a simple fix would be to shut the service down after use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935239#comment-14935239
 ] 

Hudson commented on HDFS-9141:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8539 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8539/])
HDFS-9141. Thread leak in Datanode#refreshVolumes. (Uma Maheswara Rao G via 
yliu) (yliu: rev 715dbddf77866bb47a4b95421091f64a3785444f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 2.8.0
>
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating an executor service and submitting 
> volume-addition tasks to it.
> But we are not shutting the service down after use. Even though we do not 
> hold an instance-level reference to the service, its initialized threads 
> can be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
>     changedVolumes.newLocations.size());
> {code}
> So, a simple fix would be to shut the service down after use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935275#comment-14935275
 ] 

Hudson commented on HDFS-9141:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #466 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/466/])
HDFS-9141. Thread leak in Datanode#refreshVolumes. (Uma Maheswara Rao G via 
yliu) (yliu: rev 715dbddf77866bb47a4b95421091f64a3785444f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 2.8.0
>
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating an executor service and submitting 
> volume-addition tasks to it.
> But we are not shutting the service down after use. Even though we do not 
> hold an instance-level reference to the service, its initialized threads 
> can be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
>     changedVolumes.newLocations.size());
> {code}
> So, a simple fix would be to shut the service down after use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9176) TestDirectoryScanner#testThrottling often fails.

2015-09-29 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton reassigned HDFS-9176:
--

Assignee: Daniel Templeton

> TestDirectoryScanner#testThrottling often fails.
> 
>
> Key: HDFS-9176
> URL: https://issues.apache.org/jira/browse/HDFS-9176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Daniel Templeton
>Priority: Minor
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12736/testReport/
> https://builds.apache.org/job/PreCommit-HADOOP-Build/7732/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Yi Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liu updated HDFS-9141:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Uma.

> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 2.8.0
>
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating an executor service and submitting 
> volume-addition tasks to it.
> But we are not shutting the service down after use. Even though we do not 
> hold an instance-level reference to the service, its initialized threads 
> can be left behind.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
>     changedVolumes.newLocations.size());
> {code}
> So, a simple fix would be to shut the service down after use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9100) HDFS Balancer does not respect dfs.client.use.datanode.hostname

2015-09-29 Thread Casey Brotherton (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Casey Brotherton updated HDFS-9100:
---
Status: Patch Available  (was: Open)

> HDFS Balancer does not respect dfs.client.use.datanode.hostname
> ---
>
> Key: HDFS-9100
> URL: https://issues.apache.org/jira/browse/HDFS-9100
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, HDFS
>Reporter: Yongjun Zhang
>Assignee: Casey Brotherton
> Attachments: HDFS-9100.000.patch, HDFS-9100.001.patch, 
> HDFS-9100.002.patch, HDFS-9100.003.patch
>
>
> In Balancer Dispatch.java:
> {code}
>private void dispatch() {
>   LOG.info("Start moving " + this);
>   Socket sock = new Socket();
>   DataOutputStream out = null;
>   DataInputStream in = null;
>   try {
> sock.connect(
> NetUtils.createSocketAddr(target.getDatanodeInfo().getXferAddr()),
> HdfsConstants.READ_TIMEOUT);
> {code}
> getXferAddr() is called without taking the dfs.client.use.datanode.hostname 
> setting into consideration; this could cause a balancer run issued from 
> outside the cluster to fail.
> Thanks [~caseyjbrotherton] for reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9176) TestDirectoryScanner#testThrottling often fails.

2015-09-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935305#comment-14935305
 ] 

Daniel Templeton commented on HDFS-9176:


It's probably due to the timing-based nature of the test. I'll dig into the 
errors and see what I can do.

> TestDirectoryScanner#testThrottling often fails.
> 
>
> Key: HDFS-9176
> URL: https://issues.apache.org/jira/browse/HDFS-9176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Daniel Templeton
>Priority: Minor
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12736/testReport/
> https://builds.apache.org/job/PreCommit-HADOOP-Build/7732/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8384) Allow NN to startup if there are files having a lease but are not under construction

2015-09-29 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935281#comment-14935281
 ] 

Yongjun Zhang commented on HDFS-8384:
-

Hi [~szetszwo], [~jingzhao],

Thanks for your earlier work on this issue. HDFS-8384 seems to be just a 
workaround for fsimages created by HDFS-7587; do you know whether the real 
issue (preventing this kind of fsimage from being created) is addressed 
somewhere?

Thanks.



> Allow NN to startup if there are files having a lease but are not under 
> construction
> 
>
> Key: HDFS-8384
> URL: https://issues.apache.org/jira/browse/HDFS-8384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Jing Zhao
>Priority: Minor
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.8.0, 2.7.2
>
> Attachments: HDFS-8384-branch-2.6.patch, HDFS-8384-branch-2.7.patch, 
> HDFS-8384.000.patch
>
>
> When there are files that have a lease but are not under construction, the 
> NN will fail to start up with
> {code}
> 15/05/12 00:36:31 ERROR namenode.FSImage: Unable to save image for 
> /hadoop/hdfs/namenode
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getINodesUnderConstruction(LeaseManager.java:412)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFilesUnderConstruction(FSNamesystem.java:7124)
> ...
> {code}
> The actual problem is that the image could be corrupted by bugs like 
> HDFS-7587. We should have an option/conf to allow the NN to start up so 
> that the problematic files can be deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9165:
-
Summary: Move entries in META-INF/services/o.a.h.fs.FileSystem to 
hdfs-client  (was: Move the rest of the entries in 
META-INF/services/o.a.h.fs.FileSystem to hdfs-client)

> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740, the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly, similar to HDFS-9041.
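
For context, this services file is the plain-text registry that 
{{java.util.ServiceLoader}} reads to discover {{FileSystem}} implementations, 
so the entries have to move with the classes. A sketch of what the 
client-side file would contain (entries shown for illustration, not copied 
from the patch):

{noformat}
# hadoop-hdfs-client/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
# (illustrative entries; each line names an implementation that
#  java.util.ServiceLoader can discover)
org.apache.hadoop.hdfs.DistributedFileSystem
org.apache.hadoop.hdfs.web.WebHdfsFileSystem
org.apache.hadoop.hdfs.web.SWebHdfsFileSystem
{noformat}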



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9172) Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client

2015-09-29 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9172:
---
Assignee: Zhe Zhang  (was: Rakesh R)

> Erasure Coding: Move DFSStripedIO stream related classes to hadoop-hdfs-client
> --
>
> Key: HDFS-9172
> URL: https://issues.apache.org/jira/browse/HDFS-9172
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Zhe Zhang
>
> The idea of this jira is to move the striped-stream-related classes to the 
> {{hadoop-hdfs-client}} project. This will help keep them in sync with the 
> HDFS-6200 proposal.
> - DFSStripedInputStream
> - DFSStripedOutputStream
> - StripedDataStreamer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9141) Thread leak in Datanode#refreshVolumes

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935346#comment-14935346
 ] 

Hudson commented on HDFS-9141:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #459 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/459/])
HDFS-9141. Thread leak in Datanode#refreshVolumes. (Uma Maheswara Rao G via 
yliu) (yliu: rev 715dbddf77866bb47a4b95421091f64a3785444f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Thread leak in Datanode#refreshVolumes
> --
>
> Key: HDFS-9141
> URL: https://issues.apache.org/jira/browse/HDFS-9141
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.7.1
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 2.8.0
>
> Attachments: HDFS-9141.00.patch
>
>
> In refreshVolumes, we are creating executor service and submitting volume 
> addition tasks to it.
> But we are not shutting down the service after the use. Even though we are 
> not holding instance level service, the initialized thread could be left out.
> {code}
> ExecutorService service = Executors.newFixedThreadPool(
> changedVolumes.newLocations.size());
> {code}
> So, simple fix for this would be to shutdown the service after its use.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9174) Fix the latest findbugs of FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935392#comment-14935392
 ] 

Hadoop QA commented on HDFS-9174:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  19m 57s | Pre-patch trunk has 2 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 50s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  2s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 28s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 36s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 21s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings, and fixes 2 pre-existing warnings. |
| {color:red}-1{color} | common tests |   6m 38s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  87m 22s | Tests failed in hadoop-hdfs. |
| | | 141m 15s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.net.TestDNS |
|   | hadoop.hdfs.web.TestWebHDFSOAuth2 |
|   | hadoop.hdfs.server.namenode.TestFSNamesystem |
| Timed out tests | org.apache.hadoop.hdfs.TestFileAppend4 |
|   | org.apache.hadoop.hdfs.server.namenode.TestINodeAttributeProvider |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764245/HDFS-9174.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d6fa34e |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html
 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12740/console |


This message was automatically generated.

> Fix the latest findbugs of FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> --
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935427#comment-14935427
 ] 

Haohui Mai commented on HDFS-9165:
--

+1

> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740, the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly, similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935455#comment-14935455
 ] 

Hudson commented on HDFS-9165:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8541 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8541/])
HDFS-9165. Move entries in META-INF/services/o.a.h.fs.FileSystem to 
hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 
80d33b589b0683f8343575416d77c64af343c5f7)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem


> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740, the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly, similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9170) Move libhdfs / fuse-dfs / libwebhdfs to a separate module

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935404#comment-14935404
 ] 

Haohui Mai commented on HDFS-9170:
--

The main benefit that (2) offers is that the implementation and the unit tests 
reside together, so it's slightly easier to maintain. That benefit looks 
marginal, though, so (1) sounds more reasonable. I'll go with option (1).

> Move libhdfs / fuse-dfs / libwebhdfs to a separate module
> -
>
> Key: HDFS-9170
> URL: https://issues.apache.org/jira/browse/HDFS-9170
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> After HDFS-6200 the Java implementation of hdfs-client has been moved to a 
> separate hadoop-hdfs-client module.
> libhdfs, fuse-dfs and libwebhdfs still reside in the hadoop-hdfs module. 
> Ideally these modules should reside in the hadoop-hdfs-client. However, to 
> write unit tests for these components, it is often necessary to run 
> MiniDFSCluster which resides in the hadoop-hdfs module.
> This jira is to discuss how these native modules should layout after 
> HDFS-6200.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9165:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the 
contribution.

> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740, the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly, similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9151) Mover should print the exit status/reason on console like balancer tool.

2015-09-29 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935450#comment-14935450
 ] 

Surendra Singh Lilhore commented on HDFS-9151:
--

Thanks [~templedf] for the reviews.

[~szetszwo] Could you please review? :)

> Mover should print the exit status/reason on console like balancer tool.
> 
>
> Key: HDFS-9151
> URL: https://issues.apache.org/jira/browse/HDFS-9151
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Reporter: Archana T
>Assignee: Surendra Singh Lilhore
>Priority: Minor
> Attachments: HDFS-9151.01.patch, HDFS-9151.02.patch
>
>
> Mover should print the exit reason on the console --
> In cases where there are no blocks to move, storages are unavailable, or any 
> other exit condition occurs, the Mover tool gives no information about the 
> exit reason on the console --
> {code}
> # ./hdfs mover
> ...
> Sep 28, 2015 12:31:25 PM Mover took 10sec
> # echo $?
> 0
> # ./hdfs mover
> ...
> Sep 28, 2015 12:33:10 PM Mover took 1sec
> # echo $?
> 254
> {code}
> Unlike Balancer prints exit reason 
> example--
> #./hdfs balancer
> ...
> {color:red}The cluster is balanced. Exiting...{color}
> Sep 28, 2015 12:18:02 PM Balancing took 1.744 seconds
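For readers following along, a hedged sketch of the kind of change being
requested -- not the attached patch -- reusing the balancer's existing
{{ExitStatus}} enum to surface the reason on the console before exiting:

{code}
// Illustrative sketch only (not HDFS-9151.02.patch): print the exit
// reason the way Balancer does, then convert the ExitStatus to a
// process exit code.
import org.apache.hadoop.hdfs.server.balancer.ExitStatus;

public class MoverExitReporter {
  public static int report(ExitStatus status) {
    if (status != ExitStatus.SUCCESS) {
      // e.g. "Mover exiting: NO_MOVE_BLOCK"
      System.out.println("Mover exiting: " + status);
    }
    return status.getExitCode();
  }
}
{code}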



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9166) Move hftp / hsftp filesystem to hdfs-client

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935451#comment-14935451
 ] 

Haohui Mai commented on HDFS-9166:
--

+1.

Looks like test-patch somehow fails to recognize the name of the patch. Tested 
manually in branch-2. Committing.

> Move hftp / hsftp filesystem to hdfs-client
> ---
>
> Key: HDFS-9166
> URL: https://issues.apache.org/jira/browse/HDFS-9166
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9166.000.branch-2.patch
>
>
> The hftp / hsftp filesystems in branch-2 need to be moved to the hdfs-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9177) TestTextCommand: use mkdirs rather than mkdir to create test directory

2015-09-29 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-9177:
--

 Summary: TestTextCommand: use mkdirs rather than mkdir to create 
test directory
 Key: HDFS-9177
 URL: https://issues.apache.org/jira/browse/HDFS-9177
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


TestTextCommand should use mkdirs rather than mkdir to create the test 
directory.
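
In java.io.File terms the difference is simple: {{mkdir()}} fails when any
parent of the target is missing, while {{mkdirs()}} creates the whole chain,
which is what a fresh test data directory needs. A small self-contained
example (the path below is made up):

{code}
import java.io.File;

public class MkdirsExample {
  public static void main(String[] args) {
    File testDir = new File("target/test/data/texttest"); // illustrative path
    // mkdir() would fail here if target/test/data does not exist yet;
    // mkdirs() creates every missing parent as well.
    boolean created = testDir.mkdirs();
    System.out.println("created=" + created + " exists=" + testDir.exists());
  }
}
{code}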



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9166) Move hftp / hsftp filesystem to hdfs-client

2015-09-29 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9166:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the 
contribution.

> Move hftp / hsftp filesystem to hdfs-client
> ---
>
> Key: HDFS-9166
> URL: https://issues.apache.org/jira/browse/HDFS-9166
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9166.000.branch-2.patch
>
>
> The hftp / hsftp filesystems in branch-2 need to be moved to the hdfs-client 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935832#comment-14935832
 ] 

Hudson commented on HDFS-9165:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2375 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2375/])
HDFS-9165. Move entries in META-INF/services/o.a.h.fs.FileSystem to 
hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 
80d33b589b0683f8343575416d77c64af343c5f7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740 the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9174) Fix the latest findbugs of FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935843#comment-14935843
 ] 

Daniel Templeton commented on HDFS-9174:


The changes look good to me.  +1 (non-binding)

> Fix the latest findbugs of FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> --
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Minor
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9165) Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client

2015-09-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935928#comment-14935928
 ] 

Hudson commented on HDFS-9165:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #435 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/435/])
HDFS-9165. Move entries in META-INF/services/o.a.h.fs.FileSystem to 
hdfs-client. Contributed by Mingliang Liu. (wheat9: rev 
80d33b589b0683f8343575416d77c64af343c5f7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem


> Move entries in META-INF/services/o.a.h.fs.FileSystem to hdfs-client
> 
>
> Key: HDFS-9165
> URL: https://issues.apache.org/jira/browse/HDFS-9165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9165.000.patch
>
>
> After HDFS-8740 the entries in META-INF/services/o.a.h.fs.FileSystem should 
> be updated accordingly similar to HDFS-9041.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8696) Reduce the variances of latency of WebHDFS

2015-09-29 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935932#comment-14935932
 ] 

Xiaobing Zhou commented on HDFS-8696:
-

Checked that the test failures are not related.

> Reduce the variances of latency of WebHDFS
> --
>
> Key: HDFS-8696
> URL: https://issues.apache.org/jira/browse/HDFS-8696
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8696.004.patch, HDFS-8696.005.patch, 
> HDFS-8696.006.patch, HDFS-8696.007.patch, HDFS-8696.008.patch, 
> HDFS-8696.009.patch, HDFS-8696.010.patch, HDFS-8696.1.patch, 
> HDFS-8696.2.patch, HDFS-8696.3.patch
>
>
> There is an issue that appears related to the webhdfs server. When making two 
> concurrent requests, the DN will sometimes pause for extended periods (I've 
> seen 1-300 seconds), killing performance and dropping connections. 
> To reproduce: 
> 1. set up a HDFS cluster
> 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
> the time out to /tmp/times.txt
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root"
> done
> {noformat}
> It's easy to find after a minute or two that small reads will sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers time out and the DN drops the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9174) Fix findbugs warnings in FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9174:
-
Summary: Fix findbugs warnings in FSOutputSummer.tracer and 
DirectoryScanner$ReportCompiler.currentThread  (was: Fix the latest findbugs of 
FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread)

> Fix findbugs warnings in FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> 
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-4015) Safemode should count and report orphaned blocks

2015-09-29 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-4015:
---
Attachment: HDFS-4015.001.patch

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, dfsAdmin-report_with_forceExit.png, 
> dfsHealth.html.message.png
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8696) Make the lower and higher watermark in the DN Netty server configurable

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936006#comment-14936006
 ] 

Haohui Mai commented on HDFS-8696:
--

+1

> Make the lower and higher watermark in the DN Netty server configurable
> ---
>
> Key: HDFS-8696
> URL: https://issues.apache.org/jira/browse/HDFS-8696
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8696.004.patch, HDFS-8696.005.patch, 
> HDFS-8696.006.patch, HDFS-8696.007.patch, HDFS-8696.008.patch, 
> HDFS-8696.009.patch, HDFS-8696.010.patch, HDFS-8696.1.patch, 
> HDFS-8696.2.patch, HDFS-8696.3.patch
>
>
> There is an issue that appears related to the webhdfs server. When making two 
> concurrent requests, the DN will sometimes pause for extended periods (I've 
> seen 1-300 seconds), killing performance and dropping connections. 
> To reproduce: 
> 1. set up a HDFS cluster
> 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
> the time out to /tmp/times.txt
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root"
> done
> {noformat}
> It's easy to find after a minute or two that small reads will sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers time out and the DN drops the connection.
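
For context, a sketch of what making the watermarks configurable could look
like, assuming Netty 4's {{ChannelOption}} API; the helper and its parameters
are illustrative, not the committed change:

{code}
// Sketch only, assuming Netty 4.x. Writes queued above the high watermark
// flip the channel to non-writable (back-pressure); draining below the low
// watermark makes it writable again.
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelOption;

public class WatermarkConfig {
  static void applyWatermarks(ServerBootstrap bootstrap, int lowKb, int highKb) {
    bootstrap.childOption(ChannelOption.WRITE_BUFFER_LOW_WATER_MARK, lowKb * 1024);
    bootstrap.childOption(ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK, highKb * 1024);
  }
}
{code}

Tuning these bounds the per-connection write buffer, which is one way to keep
a handful of large transfers from starving small reads.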



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-09-29 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-8971:

Attachment: HDFS-8971.001.patch

The v1 patch addresses the whitespace and checkstyle warnings. Some existing 
(and independent) warnings can be addressed separately, as they are not related 
to this issue.

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8971.000.patch, HDFS-8971.001.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ log with the _slf4j_ logger. There was existing code 
> in the client package to guard the log when calling {{LOG.debug()}} and 
> {{LOG.trace()}}, e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}
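
A short standalone sketch of the point above: slf4j defers argument formatting
until the level check passes, so an explicit guard only pays off when computing
an argument is itself expensive ({{dumpState()}} below is made up for the
example):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Slf4jGuardDemo {
  private static final Logger LOG = LoggerFactory.getLogger(Slf4jGuardDemo.class);

  void demo(Object key) {
    // No guard needed: the message is only formatted if TRACE is enabled.
    LOG.trace("{}: found waitable for {}", this, key);
    // A guard is still worthwhile when building the argument is costly.
    if (LOG.isDebugEnabled()) {
      LOG.debug("state: {}", dumpState());
    }
  }

  private String dumpState() { return "..."; } // stands in for expensive work
}
{code}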



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936105#comment-14936105
 ] 

Hadoop QA commented on HDFS-4015:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  22m 20s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  4s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m 43s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  2s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   7m 40s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   7m 13s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | yarn tests |   8m 14s | Tests failed in 
hadoop-yarn-server-nodemanager. |
| {color:red}-1{color} | hdfs tests |   0m 29s | Tests failed in hadoop-hdfs. |
| {color:red}-1{color} | hdfs tests |   0m 23s | Tests failed in 
hadoop-hdfs-client. |
| | |  70m 43s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.net.TestClusterTopology |
|   | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor |
| Timed out tests | 
org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerReboot |
|   | org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels |
| Failed build | hadoop-hdfs |
|   | hadoop-hdfs-client |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764329/HDFS-4015.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 6f335e4 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12746/console |


This message was automatically generated.

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, dfsAdmin-report_with_forceExit.png, 
> dfsHealth.html.message.png
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.

[jira] [Commented] (HDFS-9174) Fix findbugs warnings in FSOutputSummer.tracer and DirectoryScanner$ReportCompiler.currentThread

2015-09-29 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936154#comment-14936154
 ] 

Yi Liu commented on HDFS-9174:
--

Thanks [~wheat9] and [~templedf] for the review, and thanks Haohui for the 
committing.

> Fix findbugs warnings in FSOutputSummer.tracer and 
> DirectoryScanner$ReportCompiler.currentThread
> 
>
> Key: HDFS-9174
> URL: https://issues.apache.org/jira/browse/HDFS-9174
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Yi Liu
>Assignee: Yi Liu
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: HDFS-9174.001.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
> https://builds.apache.org/job/PreCommit-HDFS-Build/12739/artifact/patchprocess/trunkFindbugsWarningshadoop-common.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-1172) Blocks in newly completed files are considered under-replicated too quickly

2015-09-29 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935973#comment-14935973
 ] 

Jing Zhao edited comment on HDFS-1172 at 9/29/15 9:56 PM:
--

Thanks for updating the patch, [~iwasakims]. Comments on the latest patch:
# It is not necessary to call {{curBlock.numNodes()}} again in the following 
code. We can directly use the local variable {{numNodes}}.
{code}
 int numNodes = curBlock.numNodes();
 ..
+DatanodeStorageInfo[] expectedStorages =
+curBlock.getUnderConstructionFeature().getExpectedStorageLocations();
+if (curBlock.numNodes() < expectedStorages.length) {
{code}
# We'd better place the new "adding block to pending replica queue" logic only 
in {{checkReplication}}. Several reasons for this: 
#* {{completeBlock}} is also called by {{forceCompleteBlock}}, which is invoked 
when loading edits. At this time we should not update pending replication queue 
since the NN is just being started.
#* {{completeBlock}} can often be called when the NN has received only 1 
block_received msg; updating the pending replication queue at this time means 
that later, when further IBRs (incremental block reports) come, we need to 
remove these DNs from the pending queue again.
#* Semantically updating pending queue is more closely coupled with updating 
neededReplication queue.
# Instead of making changes to {{PendingBlockInfo}}'s constructor, when 
updating the pending replication queue, you can prepare all the corresponding 
{{DatanodeDescriptor}} in an array first, and call 
{{pendingReplications.increment}} only once.
# Do we need to call {{computeAllPendingWork}} in 
{{TestReplication#pendingReplicationCount}}?
# Let's add a maximum retry count or total waiting time for 
{{waitForNoPendingReplication}}.



was (Author: jingzhao):
# It is not necessary to call {{numNodes}} again. We can directly use 
{{numNodes}}.
{code}
 int numNodes = curBlock.numNodes();
 ..
+DatanodeStorageInfo[] expectedStorages =
+curBlock.getUnderConstructionFeature().getExpectedStorageLocations();
+if (curBlock.numNodes() < expectedStorages.length) {
{code}

# We'd better place the new "adding block to pending replica queue" logic only 
in {{checkReplication}}. Several reasons for this: 
#* {{completeBlock}} is also called by {{forceCompleteBlock}}, which is invoked 
when loading edits. At this time we should not update pending replication queue 
since the NN is just being started.
#* {{completeBlock}} can often be called when NN has only received 1 
block_received msg, updating pending replication queue at this time means later 
when further IBRs (incremental block reports) come we need to remove these DN 
from pending queue again.
#* Semantically updating pending queue is more closely coupled with updating 
neededReplication queue.
# Instead of making changes to {{PendingBlockInfo}}'s constructor, when 
updating the pending replication queue, you can prepare all the corresponding 
{{DatanodeDescriptor}} in an array first, and call 
{{pendingReplications.increment}} only once.
# Do we need to call {{computeAllPendingWork}} in 
{{TestReplication#pendingReplicationCount}}?
# Let's add a maximum retry count or total waiting time for 
{{waitForNoPendingReplication}}.
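
A hedged sketch of comment #3 above (identifiers are approximate; this is not
the final patch): collect the expected locations first, then touch the pending
queue with a single call rather than once per node:

{code}
// Approximate sketch of the suggestion, as it might appear inside
// BlockManager; not the committed HDFS-1172 change.
static void addToPendingReplication(BlockInfo curBlock,
    PendingReplicationBlocks pendingReplications) {
  DatanodeStorageInfo[] expectedStorages =
      curBlock.getUnderConstructionFeature().getExpectedStorageLocations();
  DatanodeDescriptor[] targets = new DatanodeDescriptor[expectedStorages.length];
  for (int i = 0; i < expectedStorages.length; i++) {
    targets[i] = expectedStorages[i].getDatanodeDescriptor();
  }
  pendingReplications.increment(curBlock, targets); // one call, not N
}
{code}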


> Blocks in newly completed files are considered under-replicated too quickly
> ---
>
> Key: HDFS-1172
> URL: https://issues.apache.org/jira/browse/HDFS-1172
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 0.21.0
>Reporter: Todd Lipcon
>Assignee: Masatake Iwasaki
> Attachments: HDFS-1172-150907.patch, HDFS-1172.008.patch, 
> HDFS-1172.009.patch, HDFS-1172.010.patch, HDFS-1172.patch, hdfs-1172.txt, 
> hdfs-1172.txt, replicateBlocksFUC.patch, replicateBlocksFUC1.patch, 
> replicateBlocksFUC1.patch
>
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't 
> find an existing JIRA. It often happens that we see the NN schedule 
> replication on the last block of files very quickly after they're completed, 
> before the other DNs in the pipeline have a chance to report the new block. 
> This results in a lot of extra replication work on the cluster, as we 
> replicate the block and then end up with multiple excess replicas which are 
> very quickly deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-09-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935889#comment-14935889
 ] 

Hadoop QA commented on HDFS-8971:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m  9s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 45s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 29s | The applied patch generated  
11 new checkstyle issues (total was 802, now 787). |
| {color:red}-1{color} | whitespace |   0m  7s | The patch has 4  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 58s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 10s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |   0m 28s | Tests passed in 
hadoop-hdfs-client. |
| | |  44m 36s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12764309/HDFS-8971.000.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 80d33b5 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12745/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12745/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12745/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12745/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12745/console |


This message was automatically generated.

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8971.000.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ log with the _slf4j_ logger. There was existing code 
> in the client package to guard the log when calling {{LOG.debug()}} and 
> {{LOG.trace()}}, e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8971) Remove guards when calling LOG.debug() and LOG.trace() in client package

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935894#comment-14935894
 ] 

Haohui Mai commented on HDFS-8971:
--

The patch looks good to me. +1 after fixing the checkstyle / whitespace issues.

> Remove guards when calling LOG.debug() and LOG.trace() in client package
> 
>
> Key: HDFS-8971
> URL: https://issues.apache.org/jira/browse/HDFS-8971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-8971.000.patch
>
>
> We moved the {{shortcircuit}} package from {{hadoop-hdfs}} to 
> {{hadoop-hdfs-client}} module in JIRA 
> [HDFS-8934|https://issues.apache.org/jira/browse/HDFS-8934] and 
> [HDFS-8951|https://issues.apache.org/jira/browse/HDFS-8951], and 
> {{BlockReader}} in 
> [HDFS-8925|https://issues.apache.org/jira/browse/HDFS-8925]. Meanwhile, we 
> also replaced the _log4j_ log with the _slf4j_ logger. There was existing code 
> in the client package to guard the log when calling {{LOG.debug()}} and 
> {{LOG.trace()}}, e.g. in {{ShortCircuitCache.java}}, we have code like this:
> {code:title=Trace with guards|borderStyle=solid}
> 724if (LOG.isTraceEnabled()) {
> 725  LOG.trace(this + ": found waitable for " + key);
> 726}
> {code}
> In _slf4j_, this kind of guard is not necessary. We should clean the code by 
> removing the guard from the client package.
> {code:title=Trace without guards|borderStyle=solid}
> 724LOG.trace("{}: found waitable for {}", this, key);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8696) Make the lower and higher watermark in the DN Netty server configurable

2015-09-29 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8696:
-
Summary: Make the lower and higher watermark in the DN Netty server 
configurable  (was: Reduce the variances of latency of WebHDFS)

> Make the lower and higher watermark in the DN Netty server configurable
> ---
>
> Key: HDFS-8696
> URL: https://issues.apache.org/jira/browse/HDFS-8696
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8696.004.patch, HDFS-8696.005.patch, 
> HDFS-8696.006.patch, HDFS-8696.007.patch, HDFS-8696.008.patch, 
> HDFS-8696.009.patch, HDFS-8696.010.patch, HDFS-8696.1.patch, 
> HDFS-8696.2.patch, HDFS-8696.3.patch
>
>
> There is an issue that appears related to the webhdfs server. When making two 
> concurrent requests, the DN will sometimes pause for extended periods (I've 
> seen 1-300 seconds), killing performance and dropping connections. 
> To reproduce: 
> 1. set up a HDFS cluster
> 2. Upload a large file (I was using 10GB). Perform 1-byte reads, writing
> the time out to /tmp/times.txt
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e -o /tmp/times.txt -a curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root&length=1"
> done
> {noformat}
> 3. Watch for 1-byte requests that take more than one second:
> tail -F /tmp/times.txt | grep -E "^[^0]"
> 4. After it has had a chance to warm up, start doing large transfers from
> another shell:
> {noformat}
> i=1
> while (true); do 
> echo $i
> let i++
> /usr/bin/time -f %e curl -s -L -o /dev/null 
> "http://<namenode>:50070/webhdfs/v1/tmp/bigfile?op=OPEN&user.name=root"
> done
> {noformat}
> It's easy to find after a minute or two that small reads will sometimes
> pause for 1-300 seconds. In some extreme cases, it appears that the
> transfers time out and the DN drops the connection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8873) Allow the directoryScanner to be rate-limited

2015-09-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935933#comment-14935933
 ] 

Haohui Mai commented on HDFS-8873:
--

It looks like this patch introduces a new findbugs warning in trunk. Please 
see:

https://builds.apache.org/job/PreCommit-HDFS-Build/12742/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html

> Allow the directoryScanner to be rate-limited
> -
>
> Key: HDFS-8873
> URL: https://issues.apache.org/jira/browse/HDFS-8873
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Nathan Roberts
>Assignee: Daniel Templeton
> Fix For: 2.8.0
>
> Attachments: HDFS-8873.001.patch, HDFS-8873.002.patch, 
> HDFS-8873.003.patch, HDFS-8873.004.patch, HDFS-8873.005.patch, 
> HDFS-8873.006.patch, HDFS-8873.007.patch, HDFS-8873.008.patch, 
> HDFS-8873.009.patch
>
>
> The new 2-level directory layout can make directory scans expensive in terms 
> of disk seeks (see HDFS-8791) for details. 
> It would be good if the directoryScanner() had a configurable duty cycle that 
> would reduce its impact on disk performance (much like the approach in 
> HDFS-8617). 
> Without such a throttle, disks can go 100% busy for many minutes at a time 
> (assuming the common case of all inodes in cache but no directory blocks 
> cached, 64K seeks are required for full directory listing which translates to 
> 655 seconds) 
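
For illustration, one simple way to express such a duty cycle -- a sketch, not
the code in the attached patches:

{code}
// Illustrative duty-cycle throttle: scan for runMs out of every periodMs
// and sleep the remainder, capping the scanner's share of disk time at
// roughly runMs / periodMs.
class ScanThrottle {
  static void throttledScan(Runnable scanChunk, long runMs, long periodMs)
      throws InterruptedException {
    while (true) {
      final long start = System.currentTimeMillis();
      while (System.currentTimeMillis() - start < runMs) {
        scanChunk.run(); // one bounded slice of directory scanning
      }
      final long idle = periodMs - (System.currentTimeMillis() - start);
      if (idle > 0) {
        Thread.sleep(idle); // yield the disk for the rest of the period
      }
    }
  }
}
{code}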



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8855) Webhdfs client leaks active NameNode connections

2015-09-29 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-8855:

Attachment: HDFS-8855.007.patch

> Webhdfs client leaks active NameNode connections
> 
>
> Key: HDFS-8855
> URL: https://issues.apache.org/jira/browse/HDFS-8855
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Bob Hansen
>Assignee: Xiaobing Zhou
> Attachments: HDFS-8855.005.patch, HDFS-8855.006.patch, 
> HDFS-8855.007.patch, HDFS-8855.1.patch, HDFS-8855.2.patch, HDFS-8855.3.patch, 
> HDFS-8855.4.patch, HDFS_8855.prototype.patch
>
>
> The attached script simulates a process opening ~50 files via webhdfs and 
> performing random reads.  Note that there are at most 50 concurrent reads, 
> and all webhdfs sessions are kept open.  Each read is ~64k at a random 
> position.  
> The script periodically (once per second) shells into the NameNode and 
> produces a summary of the socket states. For my test cluster with 5 nodes, 
> it took ~30 seconds for the NameNode to reach ~25000 active connections and 
> fail.
> It appears that each request to the webhdfs client is opening a new 
> connection to the NameNode and keeping it open after the request is complete. 
>  If the process continues to run, eventually (~30-60 seconds), all of the 
> open connections are closed and the NameNode recovers.  
> This smells like SoftReference reaping.  Are we using SoftReferences in the 
> webhdfs client to cache NameNode connections but never re-using them?
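
For context, a standalone illustration of the client-side pattern that avoids
this failure mode with plain {{HttpURLConnection}} -- a sketch of the general
technique, not the actual webhdfs client code:

{code}
// A response stream that is never drained and closed pins the underlying
// socket until finalization, which would match the observed 30-60 second
// recovery once stale references are reaped.
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class ConnectionHygiene {
  static void readAndRelease(URL url) throws Exception {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (InputStream in = conn.getInputStream()) {
      byte[] buf = new byte[8192];
      while (in.read(buf) != -1) {
        // drain fully so the connection can be reused or closed promptly
      }
    } finally {
      conn.disconnect(); // release the socket instead of waiting for GC
    }
  }
}
{code}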



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-09-29 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936086#comment-14936086
 ] 

Arpit Agarwal commented on HDFS-4015:
-

Hi [~anu], thanks for this improvement. A few comments below; I haven't 
reviewed the test case yet.

# ClientProtocol.java:729: Perhaps we can describe it as "bytes that are at 
risk for deletion."?
# DFSAdmin.java:474: This can happen even without blocks with future generation 
stamps e.g. DN is restarted after a long downtime and reports blocks for 
deleted files. 
# FSNamesystem.java:4438: For the turn-off tip, should we check 
{{getBytesInFuture}} after the threshold of reported blocks is reached? One 
potential issue is that the administrator may see this message and immediately 
run {{-forceExit}} even before block thresholds are reached.
# FSNamesystem.java:4445: "you are ok with data loss." might also be confusing. 
Perhaps we can say "if you are certain that the NameNode was started with the 
correct FsImage and edit logs."
# FSNamesystem.java:4631: Not sure how this works. leaveSafeMode will just 
return {{if (isInStartupSafeMode() && (blockManager.getBytesInFuture() > 0))}}

Comments also posted at 
https://github.com/arp7/hadoop/commit/f16f4525a9a814f0945e76af55ad06b5fc18ecb7
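
A reconstruction of the guard discussed in comment #5, based on the quoted
condition (approximate, from the patch under review rather than committed
code):

{code}
// Approximate sketch of FSNamesystem.leaveSafeMode in the patch under
// review; identifiers follow the quoted condition above.
void leaveSafeMode(boolean force) {
  if (isInStartupSafeMode() && blockManager.getBytesInFuture() > 0 && !force) {
    // Blocks with future generation stamps were reported, so the NameNode
    // was likely started from stale metadata; refuse a normal exit.
    LOG.error("Refusing to leave safe mode. Use -forceExit only if you are "
        + "certain the NameNode was started with the correct metadata.");
    return;
  }
  // ... normal safe mode exit path ...
}
{code}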

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, dfsAdmin-report_with_forceExit.png, 
> dfsHealth.html.message.png
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-09-29 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936147#comment-14936147
 ] 

Anu Engineer commented on HDFS-4015:


Hi [~arpitagarwal], thanks for the review and comments. I will wait for the 
rest of the review comments and post a new patch.

bq.ClientProtocol.java:729: Perhaps we can describe it as "bytes that are at 
risk for deletion."?
Makes sense, I will modify this.

bq. DFSAdmin.java:474: This can happen even without blocks with future 
generation stamps e.g. DN is restarted after a long downtime and reports blocks 
for deleted files.
In this patch we track blocks with a generation stamp greater than the current 
highest generation stamp known to the NN. I have made the assumption that if a 
DN comes back on-line and reports blocks for files that have been deleted, the 
generation stamps of those blocks will be less than the current generation 
stamp of the NN. Please let me know if you think this assumption is not valid 
or breaks down in special cases. Could this happen with V1 vs V2 generation 
stamps?
bq. FSNamesystem.java:4438: For turn-off tip, should we check getBytesInFuture 
after the threshold of reported blocks isreached? One potential issue is that 
the administrator may see this message and immediately run -forceExit even 
before block thresholds are reached.

With this patch we are slightly changing the behavior of safe mode: even if 
the block threshold is reached, we will not exit while blocks with future 
generation stamps are present, under the assumption that the NN metadata has 
been modified. 

bq. FSNamesystem.java:4445: "you are ok with data loss." might also be 
confusing. Perhaps we can say "if you are certain that the NameNode was started 
with the correct FsImage and edit logs."
Agreed, I will modify this warning. But we also have the case where someone 
is actually replacing the NN metadata and is ok with data loss.

bq. FSNamesystem.java:4631: Not sure how this works. leaveSafeMode will just 
return if (isInStartupSafeMode() && (blockManager.getBytesInFuture() > 0))
As the error message says, we are refusing to leave safe mode -- we want 
users to either run -forceExit or restart the NN with the right metadata 
files before we move out of safe mode.
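
A minimal sketch of the assumption being described (illustrative, not the
patch): a reported replica whose generation stamp is newer than the NameNode's
current global stamp cannot belong to this namespace, while replicas of
deleted files carry older stamps and are not counted:

{code}
import org.apache.hadoop.hdfs.protocol.Block;

class FutureBlockTracker {
  private long bytesInFuture; // surfaced in the safe-mode message

  void onReportedBlock(Block reported, long currentGenerationStamp) {
    if (reported.getGenerationStamp() > currentGenerationStamp) {
      bytesInFuture += reported.getNumBytes(); // bytes at risk of deletion
    }
  }
}
{code}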



> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, dfsAdmin-report_with_forceExit.png, 
> dfsHealth.html.message.png
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   3   >