[jira] [Created] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)
Kai Zheng created HDFS-9705:
---

 Summary: Refine the behaviour of getFileChecksum when length = 0
 Key: HDFS-9705
 URL: https://issues.apache.org/jira/browse/HDFS-9705
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Kai Zheng
Assignee: Kai Zheng
Priority: Minor


{{FileSystem#getFileChecksum}} may accept a {{length}} parameter, and 0 is a 
valid value. Currently it returns {{null}} when length is 0, in the following 
code block:
{code}
//compute file MD5
final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
switch (crcType) {
case CRC32:
  return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
  crcPerBlock, fileMD5);
case CRC32C:
  return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
  crcPerBlock, fileMD5);
default:
  // If there is no block allocated for the file,
  // return one with the magic entry that matches what previous
  // hdfs versions return.
  if (locatedblocks.size() == 0) {
return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
  }

  // we should never get here since the validity was checked
  // when getCrcType() was called above.
  return null;
}
{code}
The comment says "we should never get here since the validity was checked", but 
we do get here. Since we use the MD5-MD5-X approach, empty content is actually 
a valid case whose MD5 value is {{d41d8cd98f00b204e9800998ecf8427e}}, so I 
suggest we return a reasonable value instead of null. At the very least, the 
returned value would then expose some useful information, such as the values 
from the block checksum header.
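
For illustration only (this is not any attached patch), here is a minimal 
sketch of what a non-null result for zero length could look like, reusing the 
classes from the snippet above; the class and method names are made up for the 
example:
{code}
// Sketch: build a checksum for empty content instead of returning null.
// Digesting an empty byte array yields the well-known empty-content MD5,
// d41d8cd98f00b204e9800998ecf8427e.
import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
import org.apache.hadoop.fs.MD5MD5CRC32GzipFileChecksum;
import org.apache.hadoop.io.MD5Hash;

public class ZeroLengthChecksumSketch {
  static MD5MD5CRC32FileChecksum zeroLengthChecksum() {
    final MD5Hash emptyMD5 = MD5Hash.digest(new byte[0]);
    // bytesPerCRC = 0 and crcPerBlock = 0 mirror the "no block" magic entry.
    return new MD5MD5CRC32GzipFileChecksum(0, 0, emptyMD5);
  }

  public static void main(String[] args) {
    // Expected to print something like
    // MD5-of-0MD5-of-0CRC32:d41d8cd98f00b204e9800998ecf8427e
    System.out.println(zeroLengthChecksum());
  }
}
{code}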



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116999#comment-15116999
 ] 

Hadoop QA commented on HDFS-9701:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 44s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 
new + 335 unchanged - 1 fixed = 335 total (was 336) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 34s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 180m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.8.0_66 Timed out junit tests | 
org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | org.apache.hadoop.hdfs.TestWriteReadStripedFile |
|   | org.apache.hadoop.fs.TestWebHdfsFileContextMainOperations |
|   | 

[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: (was: HDFS-9705-v1.patch)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v1.patch

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Status: Patch Available  (was: Open)

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9705:

Attachment: HDFS-9705-v2.patch

Made the change safer and added a comment for a confusing place.

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-01-26 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117103#comment-15117103
 ] 

Kai Zheng commented on HDFS-9694:
-

Oops, the resulting patch seems too large. Will do the not-so-relevant 
refactoring separately.

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Fix For: 3.0.0
>
>
> This is a sub-task of HDFS-8430 and will get the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing codes and layout basic work for subsequent tasks like 
> support of the new API proposed there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118764#comment-15118764
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9705:
---

Thanks Kai for working on this.  Some comments on the patch:
- Let's check if length == 0 in the very beginning of DFSClient.getFileChecksum 
and return (a rough sketch of such a guard follows below).
- For the new test, let's add it to 
TestDistributedFileSystem.testFileChecksum().  It is expensive to start a 
cluster.
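
To make the first point concrete, here is a rough sketch of such a guard; the 
Checksummer class and computeFileChecksum() method below are stand-ins for 
DFSClient and its existing logic, not the actual code:
{code}
// Sketch of the proposed early return: when length == 0, answer locally with
// the empty-content checksum and skip all datanode round trips.
import java.io.IOException;

import org.apache.hadoop.fs.MD5MD5CRC32FileChecksum;
import org.apache.hadoop.fs.MD5MD5CRC32GzipFileChecksum;
import org.apache.hadoop.io.MD5Hash;

public class Checksummer {
  public MD5MD5CRC32FileChecksum getFileChecksum(String src, long length)
      throws IOException {
    if (length == 0) {
      // MD5 over zero block MD5s: d41d8cd98f00b204e9800998ecf8427e
      return new MD5MD5CRC32GzipFileChecksum(0, 0, MD5Hash.digest(new byte[0]));
    }
    return computeFileChecksum(src, length);  // the existing, unchanged path
  }

  private MD5MD5CRC32FileChecksum computeFileChecksum(String src, long length)
      throws IOException {
    throw new UnsupportedOperationException("stand-in for the real logic");
  }
}
{code}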

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9239) DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118590#comment-15118590
 ] 

Hadoop QA commented on HDFS-9239:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-9239 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12771665/HDFS-9239.001.patch |
| JIRA Issue | HDFS-9239 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14257/console |


This message was automatically generated.



> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode 
> liveness
> ---
>
> Key: HDFS-9239
> URL: https://issues.apache.org/jira/browse/HDFS-9239
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
> Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch
>
>
> This issue proposes introduction of a new feature: the DataNode Lifeline 
> Protocol.  This is an RPC protocol that is responsible for reporting liveness 
> and basic health information about a DataNode to a NameNode.  Compared to the 
> existing heartbeat messages, it is lightweight and not prone to resource 
> contention problems that can harm accurate tracking of DataNode liveness 
> currently.  The attached design document contains more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9677) Rename generationStampV1/generationStampV2 to legacyGenerationStamp/generationStamp

2016-01-26 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118752#comment-15118752
 ] 

Vinayakumar B commented on HDFS-9677:
-

I feel the current change is fine. {{legacyGenerationStamp}} looks better and 
is in line with similar legacy things.
bq. Suppose a new user plays with generation stamp for the very first time, she 
needs to know implementation details before she is able to tell which one is 
legacy or deprecated. Even with current V1/V2 naming, we should not blame a new 
user who wonders whether a generationStampV3 version exists.
First of all, the genstamp is not user-controlled or user-exposed (I mean the 
actual user, not the developer :) ).
Second, since it's legacy, it's kept there just to support old blocks, not to 
generate new blocks. New blocks will always be generated with the new genstamp 
itself.
So IMO it's fine. No need to be so specific on implementation details.

> Rename generationStampV1/generationStampV2 to 
> legacyGenerationStamp/generationStamp
> ---
>
> Key: HDFS-9677
> URL: https://issues.apache.org/jira/browse/HDFS-9677
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9677.000.patch, HDFS-9677.001.patch
>
>
> [comment|https://issues.apache.org/jira/browse/HDFS-9542?focusedCommentId=15110531=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15110531]
>  from [~drankye] in HDFS-9542:
> {quote}
> Just wonder if it's a good idea to rename: generationStampV1 => 
> legacyGenerationStamp; generationStampV2 => generationStamp, similar for 
> other variables, as we have legacy block and block.
> {quote}
> This jira plans to do this rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9654) Code refactoring for HDFS-8578

2016-01-26 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118792#comment-15118792
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9654:
---

> 2. Nit: Maybe make doUgrade method name a little more descriptive? How about 
> hardLinkAndRename?:

I was going to rename the doUgrade method.  However, it does more than 
hardLink and rename; it additionally sets cTime and calls writeProperties.  It 
won't look good if we rename it to 
hardLinkAndSetCTimeAndWritePropertiesAndRename.  We simply cannot encode 
everything in the method name.

I was also going to fix the trailing whitespace using git, although I believe 
we should focus on the non-whitespace characters and ignore whitespace, 
especially trailing whitespace.  Since there are no other changes, I won't 
post a new patch.  We could commit the patch with the "--whitespace=fix" 
option.

> Code refactoring for HDFS-8578
> --
>
> Key: HDFS-9654
> URL: https://issues.apache.org/jira/browse/HDFS-9654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9654_20160116.patch
>
>
> This is a code refactoring JIRA in order to change Datanode to process all 
> storage/data dirs in parallel; see also HDFS-8578.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9668) Many long-time BLOCKED threads on FsDatasetImpl in a tiered storage test

2016-01-26 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118758#comment-15118758
 ] 

Vinayakumar B commented on HDFS-9668:
-

bq.  I am +1 on the concept of fixing the locking
Yes, this needs to be fixed in the existing implementation itself.

> Many long-time BLOCKED threads on FsDatasetImpl in a tiered storage test
> 
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution of some operations in FsDatasetImpl during the 
> test. Here following is the result.
> !execution_time.png!
> The operations of finalizeBlock, addBlock and createRbw on HDD in a heavy 
> load take a really long time.
> It means one slow operation of finalizeBlock, addBlock and createRbw in a 
> slow storage can block all the other same operations in the same DataNode, 
> especially in HBase when many wal/flusher/compactor are configured.
> We need a finer grained lock mechanism in a new FsDatasetImpl implementation 
> and users can choose the implementation by configuring 
> "dfs.datanode.fsdataset.factory" in DataNode.
> We can implement the lock by either storage level or block-level.
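
For what the storage-level option might look like, a purely illustrative sketch 
(VolumeLocks and the string volume id are invented for the example and are not 
the FsDatasetImpl API):
{code}
// Sketch of storage-level locking: one lock per volume, so a slow
// createRbw/finalizeBlock on an HDD volume no longer blocks the same
// operations on SSD/RAMDISK volumes of the same DataNode.
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

public class VolumeLocks {
  private final ConcurrentMap<String, ReentrantLock> locks =
      new ConcurrentHashMap<>();

  /** Run an action while holding only the lock of the target volume. */
  public <T> T withVolumeLock(String volumeId, Callable<T> action)
      throws Exception {
    ReentrantLock lock = locks.computeIfAbsent(volumeId, id -> new ReentrantLock());
    lock.lock();
    try {
      return action.call();
    } finally {
      lock.unlock();
    }
  }
}
{code}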



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118585#comment-15118585
 ] 

Hadoop QA commented on HDFS-9701:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: patch generated 0 
new + 134 unchanged - 1 fixed = 134 total (was 135) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 44s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 49m 59s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 128m 45s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12784542/HDFS-9701.04.patch |
| JIRA Issue | HDFS-9701 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5842b7933a9b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-9677) Rename generationStampV1/generationStampV2 to legacyGenerationStamp/generationStamp

2016-01-26 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118811#comment-15118811
 ] 

Mingliang Liu commented on HDFS-9677:
-

This makes perfect sense to me. Thanks for your comment and review, 
[~vinayrpet]!

> Rename generationStampV1/generationStampV2 to 
> legacyGenerationStamp/generationStamp
> ---
>
> Key: HDFS-9677
> URL: https://issues.apache.org/jira/browse/HDFS-9677
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Attachments: HDFS-9677.000.patch, HDFS-9677.001.patch
>
>
> [comment|https://issues.apache.org/jira/browse/HDFS-9542?focusedCommentId=15110531=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15110531]
>  from [~drankye] in HDFS-9542:
> {quote}
> Just wonder if it's a good idea to rename: generationStampV1 => 
> legacyGenerationStamp; generationStampV2 => generationStamp, similar for 
> other variables, as we have legacy block and block.
> {quote}
> This jira plans to do this rename.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9704) terminate progress after namenode recover finished

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118725#comment-15118725
 ] 

Hadoop QA commented on HDFS-9704:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
33s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 28s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
186 unchanged - 0 fixed = 187 total (was 186) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 51s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 9s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
34s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 206m 37s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.TestNameNodeRecovery |
| JDK v1.7.0_91 Failed 

[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118810#comment-15118810
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9705:
---

FileChecksum#EMPTY is attractive; however, it is more correct to use 
MD5-of-0MD5-of-0CRC32C:d41d8cd98f00b204e9800998ecf8427e, since non-Hadoop 
clients could use the same algorithm to compute the same value.
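
A small sketch of that interoperability point: any client with a stock MD5 
implementation, and no Hadoop on the classpath, arrives at the same digest for 
empty content (the class name is made up for the example):
{code}
// Computes the MD5 of zero bytes with the plain JDK, matching the value
// embedded in MD5-of-0MD5-of-0CRC32C:d41d8cd98f00b204e9800998ecf8427e.
import java.security.MessageDigest;

public class EmptyMd5 {
  public static void main(String[] args) throws Exception {
    byte[] digest = MessageDigest.getInstance("MD5").digest(new byte[0]);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    System.out.println(hex);  // prints d41d8cd98f00b204e9800998ecf8427e
  }
}
{code}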

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118783#comment-15118783
 ] 

Kai Zheng commented on HDFS-9705:
-

Thanks Nicholas for the comment!

bq. Let's check if length == 0 in the very beginning of 
DFSClient.getFileChecksum and return.
At the very beginning, what would you prefer to return? Still {{null}}, or a 
{{FileChecksum}} object that represents empty content? In the latter case, do 
you want to go to the datanodes to retrieve the checksum metadata from the 
replicas? Otherwise we could use a value like 
{{MD5-of-0MD5-of-0CRC32C:d41d8cd98f00b204e9800998ecf8427e}}, where 
d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty content. It is a bit 
complicated, so maybe define a constant value {{FileChecksum#EMPTY}} for the 
purpose? Thanks for your confirmation.
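
Purely as an illustration of that idea ({{FileChecksum#EMPTY}} does not exist 
today, and the holder class below is made up), such a constant could look 
roughly like:
{code}
// Hypothetical constant for the zero-length case, pinned to the
// MD5-of-0MD5-of-0CRC32C value discussed in this thread.
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.MD5MD5CRC32CastagnoliFileChecksum;
import org.apache.hadoop.io.MD5Hash;

public final class EmptyChecksums {
  /** Checksum of empty content, CRC32C flavour. */
  public static final FileChecksum EMPTY =
      new MD5MD5CRC32CastagnoliFileChecksum(0, 0, MD5Hash.digest(new byte[0]));

  private EmptyChecksums() {
  }
}
{code}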

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9654) Code refactoring for HDFS-8578

2016-01-26 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118795#comment-15118795
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9654:
---

Thanks [~ctrezzo] for reviewing the patch!

> Code refactoring for HDFS-8578
> --
>
> Key: HDFS-9654
> URL: https://issues.apache.org/jira/browse/HDFS-9654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9654_20160116.patch
>
>
> This is a code refactoring JIRA in order to change Datanode to process all 
> storage/data dirs in parallel; see also HDFS-8578.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9503) Replace -namenode option with -fs for NNThroughputBenchmark

2016-01-26 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118820#comment-15118820
 ] 

Vinayakumar B commented on HDFS-9503:
-

bq. If not specified, the default file system URI will be file:///. Otherwise, 
it should be hdfs://host:port format, parsed by GenericOptionsParser. Perhaps 
we don't need another flag to indicate new NN instance or remote NN?
This is not quite right. It imposes the restriction that core-site.xml must 
not have changed the 'fs.defaultFS' configuration either.

bq. I think, Since -fs is already provided by default, Now just need another 
flag ( may be instead of -namenode) to mention whether to use New Namenode 
instance, or just connect to remote namenode.
What I mean by this is that configuring 'fs.defaultFS' and specifying '-fs' as 
'hdfs://localhost:port' are the same thing.
The only thing we need to tell {{NNThroughputBenchmark}} is whether to use an 
existing running Namenode (which might be remote or on the same node) or to 
create its own instance. We can do this with another flag (instead of 
-namenode), such as "-remotenamenode". If it is present, the benchmark makes 
use of the '-fs' or 'fs.defaultFS' option; otherwise it creates its own 
Namenode. A rough sketch of this interplay follows below.
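
As a rough illustration of that interplay (the -remotenamenode flag is only the 
proposal above, and this parsing sketch is not NNThroughputBenchmark's actual 
code):
{code}
// Sketch: GenericOptionsParser consumes -fs and puts it into fs.defaultFS, so
// the benchmark only needs its own flag to decide between connecting to that
// filesystem and spinning up an in-process NameNode.
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.util.GenericOptionsParser;

public class BenchmarkArgsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // e.g. args = {"-fs", "hdfs://nn-host:8020", "-remotenamenode"}
    String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();

    if (Arrays.asList(remaining).contains("-remotenamenode")) {
      // Use whatever -fs / fs.defaultFS points at (an already running NameNode).
      FileSystem fs = FileSystem.get(conf);
      System.out.println("Benchmarking against " + fs.getUri());
    } else {
      // Otherwise the benchmark would create its own NameNode instance here.
      System.out.println("Would start an in-process NameNode");
    }
  }
}
{code}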


Since this change is incompatible, does anybody out there feel these changes 
are not required?

> Replace -namenode option with -fs for NNThroughputBenchmark
> ---
>
> Key: HDFS-9503
> URL: https://issues.apache.org/jira/browse/HDFS-9503
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Konstantin Shvachko
>Assignee: Mingliang Liu
> Attachments: HDFS-9053.000.patch, HDFS-9053.001.patch, 
> HDFS-9053.002.patch
>
>
> HDFS-7847 introduced a new option {{-namenode}}, which is intended to point 
> the benchmark to a remote NameNode. It should use a standard generic option 
> {{-fs}} instead, which is routinely used to specify NameNode URI in shell 
> commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118821#comment-15118821
 ] 

Kai Zheng commented on HDFS-9705:
-

Got it. I'll use the preferred option. Thanks!

> Refine the behaviour of getFileChecksum when length = 0
> ---
>
> Key: HDFS-9705
> URL: https://issues.apache.org/jira/browse/HDFS-9705
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9705-v1.patch, HDFS-9705-v2.patch
>
>
> {{FileSystem#getFileChecksum}} may accept {{length}} parameter and 0 is a 
> valid value. Currently it will return {{null}} when length is 0, in the 
> following code block:
> {code}
> //compute file MD5
> final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
> switch (crcType) {
> case CRC32:
>   return new MD5MD5CRC32GzipFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> case CRC32C:
>   return new MD5MD5CRC32CastagnoliFileChecksum(bytesPerCRC,
>   crcPerBlock, fileMD5);
> default:
>   // If there is no block allocated for the file,
>   // return one with the magic entry that matches what previous
>   // hdfs versions return.
>   if (locatedblocks.size() == 0) {
> return new MD5MD5CRC32GzipFileChecksum(0, 0, fileMD5);
>   }
>   // we should never get here since the validity was checked
>   // when getCrcType() was called above.
>   return null;
> }
> {code}
> The comment says "we should never get here since the validity was checked" 
> but it does. As we're using the MD5-MD5-X approach, and {{EMPTY--CONTENT}} 
> actually is a valid case in which the MD5 value is 
> {{d41d8cd98f00b204e9800998ecf8427e}}, so suggest we return a reasonable value 
> other than null. At least some useful information in the returned value can 
> be seen, like values from block checksum header.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118647#comment-15118647
 ] 

Hadoop QA commented on HDFS-9349:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
7s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 21s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 3 new + 
223 unchanged - 0 fixed = 226 total (was 223) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 5s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 53s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 161m 2s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12771653/HDFS-9349.002.patch |
| JIRA Issue | HDFS-9349 |
| Optional 

[jira] [Commented] (HDFS-9654) Code refactoring for HDFS-8578

2016-01-26 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118784#comment-15118784
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9654:
---

> 1. Nit: Params in javadoc do not match the params for the method (i.e. 
> missing config).  ...

These are not public APIs, so we don't require perfect javadoc.  We usually 
add javadoc/comments when the code is tricky.

> 3. Nit: the two checkstyle warnings for number of parameters in doUgrade. ...
> 4. Whitespace issue

We could safely ignore the checkstyle warnings when they are very minor or 
unreasonable.  

> 5. TestDataNodeVolumeFailure.testUnderReplicationAfterVolFailure test 
> failure. Is this related?

It does not seem related.  It also failed in some other builds, such as
- 
https://builds.apache.org/job/PreCommit-HDFS-Build/14245/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/

> Code refactoring for HDFS-8578
> --
>
> Key: HDFS-9654
> URL: https://issues.apache.org/jira/browse/HDFS-9654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9654_20160116.patch
>
>
> This is a code refactoring JIRA in order to change Datanode to process all 
> storage/data dirs in parallel; see also HDFS-8578.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9503) Replace -namenode option with -fs for NNThroughputBenchmark

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116918#comment-15116918
 ] 

Hadoop QA commented on HDFS-9503:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 55s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 105m 22s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 44s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 224m 14s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.8.0_66 Timed out junit tests | 
org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead |
|   | org.apache.hadoop.hdfs.TestAclsEndToEnd 

[jira] [Commented] (HDFS-9684) DataNode stopped sending heartbeat after getting OutOfMemoryError from DataTransfer thread.

2016-01-26 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116906#comment-15116906
 ] 

Surendra Singh Lilhore commented on HDFS-9684:
--

Thanks [~kihwal] for comments.

Yes, the ulimit was reached; we did this intentionally for reliability testing. 
We just want to see whether the datanode recovers automatically after the fault 
is removed.

> DataNode stopped sending heartbeat after getting OutOfMemoryError from 
> DataTransfer thread.
> ---
>
> Key: HDFS-9684
> URL: https://issues.apache.org/jira/browse/HDFS-9684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-9684.01.patch
>
>
> {noformat}
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:714)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlock(DataNode.java:1999)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlocks(DataNode.java:2008)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:657)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:615)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:857)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:823)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117309#comment-15117309
 ] 

Hadoop QA commented on HDFS-9705:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-hdfs-project: patch generated 1 new + 52 unchanged 
- 1 fixed = 53 total (was 53) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 44s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 54m 29s 
{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 154m 0s {color} 
| {color:black} {color} |
\\

[jira] [Commented] (HDFS-9705) Refine the behaviour of getFileChecksum when length = 0

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117357#comment-15117357
 ] 

Hadoop QA commented on HDFS-9705:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 42s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 4s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 20s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 4s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 8s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 53s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
32s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 255m 2s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 

[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2016-01-26 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117373#comment-15117373
 ] 

Daryn Sharp commented on HDFS-9525:
---

Allowing webhdfs to search for tokens with security off is a fine feature.  The 
problem is that the patch rearranged logic in getDelegationToken, which introduced 
a subtle bug that existing tests caught.  This should have been a red flag, but 
the tests were changed instead.  A feature for security off should never break 
tests that run with security on.

The 1-liner I posted should be all that's needed in webhdfs.

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, 
> HDFS-9525.008.patch, HDFS-9525.009.patch, HDFS-9525.009.patch, 
> HDFS-9525.branch-2.008.patch, HDFS-9525.branch-2.009.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9684) DataNode stopped sending heartbeat after getting OutOfMemoryError from DataTransfer thread.

2016-01-26 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117387#comment-15117387
 ] 

Kihwal Lee commented on HDFS-9684:
--

If we are to make the datanode recoverable from such conditions, we need to take 
care of the other essential services running in the datanode. E.g., I've seen DU 
threads silently terminating, causing storage reports to become stale. Sometimes 
crippled datanodes keep heartbeating, so clients are sent there and hit more 
failures.  It feels like we need a self health check in the datanode along with a 
recovery mechanism. 

> DataNode stopped sending heartbeat after getting OutOfMemoryError from 
> DataTransfer thread.
> ---
>
> Key: HDFS-9684
> URL: https://issues.apache.org/jira/browse/HDFS-9684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-9684.01.patch
>
>
> {noformat}
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:714)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlock(DataNode.java:1999)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlocks(DataNode.java:2008)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:657)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:615)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:857)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:823)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9690) ClientProtocol.addBlock is not idempotent after HDFS-8071

2016-01-26 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9690:

Summary: ClientProtocol.addBlock is not idempotent after HDFS-8071  (was: 
addBlock is not idempotent)

> ClientProtocol.addBlock is not idempotent after HDFS-8071
> -
>
> Key: HDFS-9690
> URL: https://issues.apache.org/jira/browse/HDFS-9690
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h9690_20160124.patch, h9690_20160124b.patch, 
> h9690_20160124b_branch-2.7.patch
>
>
> TestDFSClientRetries#testIdempotentAllocateBlockAndClose can illustrate the 
> bug. It failed in the following builds.
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14188/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14201/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14202/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9690) ClientProtocol.addBlock is not idempotent after HDFS-8071

2016-01-26 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117469#comment-15117469
 ] 

Vinayakumar B commented on HDFS-9690:
-

+1 for the patch for branch-2.7

Committed to branch-2.7

> ClientProtocol.addBlock is not idempotent after HDFS-8071
> -
>
> Key: HDFS-9690
> URL: https://issues.apache.org/jira/browse/HDFS-9690
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.7.3
>
> Attachments: h9690_20160124.patch, h9690_20160124b.patch, 
> h9690_20160124b_branch-2.7.patch
>
>
> TestDFSClientRetries#testIdempotentAllocateBlockAndClose can illustrate the 
> bug. It failed in the following builds.
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14188/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14201/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14202/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9690) ClientProtocol.addBlock is not idempotent after HDFS-8071

2016-01-26 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9690:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3
   Status: Resolved  (was: Patch Available)

Committed. Also changed the title to match the CHANGES.txt entry.

> ClientProtocol.addBlock is not idempotent after HDFS-8071
> -
>
> Key: HDFS-9690
> URL: https://issues.apache.org/jira/browse/HDFS-9690
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: 2.7.3
>
> Attachments: h9690_20160124.patch, h9690_20160124b.patch, 
> h9690_20160124b_branch-2.7.patch
>
>
> TestDFSClientRetries#testIdempotentAllocateBlockAndClose can illustrate the 
> bug. It failed in the following builds.
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14188/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14201/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/
> - 
> https://builds.apache.org/job/PreCommit-HDFS-Build/14202/testReport/org.apache.hadoop.hdfs/TestDFSClientRetries/testIdempotentAllocateBlockAndClose/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7694) FSDataInputStream should support "unbuffer"

2016-01-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117491#comment-15117491
 ] 

Junping Du commented on HDFS-7694:
--

Thanks [~cmccabe] for the confirmation. I will cherry-pick this patch to branch-2.6 
later, once the build failure (caused by HADOOP-12715) is figured out.

> FSDataInputStream should support "unbuffer"
> ---
>
> Key: HDFS-7694
> URL: https://issues.apache.org/jira/browse/HDFS-7694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.7.0
>
> Attachments: HDFS-7694.001.patch, HDFS-7694.002.patch, 
> HDFS-7694.003.patch, HDFS-7694.004.patch, HDFS-7694.005.patch
>
>
> For applications that have many open HDFS (or other Hadoop filesystem) files, 
> it would be useful to have an API to clear readahead buffers and sockets.  
> This could be added to the existing APIs as an optional interface, in much 
> the same way as we added setReadahead / setDropBehind / etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9701:

Attachment: HDFS-9701.03.patch

> DN may deadlock when hot-swapping under load
> 
>
> Key: HDFS-9701
> URL: https://issues.apache.org/jira/browse/HDFS-9701
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9701.01.patch, HDFS-9701.02.patch, 
> HDFS-9701.03.patch
>
>
> If the DN is under load (new blocks being written), a hot-swap task by {{hdfs 
> dfsadmin -reconfig}} may cause a dead lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117686#comment-15117686
 ] 

Xiao Chen commented on HDFS-9701:
-

The findbugs warning helped me think of a cleaner way to fix this.
Patch 3 still goes with option 1 above, but instead of moving the 
close-and-wait logic all the way out to {{DataNode}}, I just moved it to 
{{FsDatasetImpl}}, and used {{wait}} instead of {{Thread.sleep}} so the thread 
does not hold the lock on the {{FsDatasetImpl}} object while waiting. Please 
review, thanks.
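For illustration only, a minimal sketch (not the actual FsDatasetImpl code) of why {{wait}} is preferable to {{Thread.sleep}} here: {{wait}} releases the monitor while blocked, so other threads can still acquire the lock, whereas {{sleep}} would keep holding it. All names below are hypothetical.

{code}
// Hypothetical sketch of the wait/notify pattern; names are illustrative.
class VolumeRemovalSketch {
  private final Object lock = new Object();
  private int pendingReferences = 2;  // e.g. in-flight writers on the removed volume

  void waitForReferencesReleased() throws InterruptedException {
    synchronized (lock) {
      while (pendingReferences > 0) {
        lock.wait(1000);  // releases "lock" while waiting, unlike Thread.sleep()
      }
    }
  }

  void releaseReference() {
    synchronized (lock) {
      pendingReferences--;
      lock.notifyAll();  // wakes the waiting remover as soon as possible
    }
  }
}
{code}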

> DN may deadlock when hot-swapping under load
> 
>
> Key: HDFS-9701
> URL: https://issues.apache.org/jira/browse/HDFS-9701
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9701.01.patch, HDFS-9701.02.patch, 
> HDFS-9701.03.patch
>
>
> If the DN is under load (new blocks being written), a hot-swap task by {{hdfs 
> dfsadmin -reconfig}} may cause a dead lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9706) Log more details in debug logs in BlockReceiver's constructor

2016-01-26 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-9706:
---

 Summary: Log more details in debug logs in BlockReceiver's 
constructor
 Key: HDFS-9706
 URL: https://issues.apache.org/jira/browse/HDFS-9706
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Xiao Chen
Assignee: Xiao Chen
Priority: Minor


Currently {{BlockReceiver}}'s constructor has some debug logs to help 
identify problems. During my triage of HDFS-9701, I needed to add 
{{isTransfer}} to the logs to see which code path is taken later for the block.

I propose adding more details to the debug logs, to save future effort. Will 
also see whether more details need to be logged.
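For illustration, a hedged sketch of the kind of extra detail being proposed; the class and field names below are assumptions modelled on the existing debug block, not the actual patch:

{code}
// Hypothetical sketch; names are assumptions, not the real BlockReceiver fields.
class BlockReceiverLogSketch {
  private static final org.slf4j.Logger LOG =
      org.slf4j.LoggerFactory.getLogger(BlockReceiverLogSketch.class);

  void logConstructorDetails(String block, boolean isClient, boolean isDatanode,
      boolean isTransfer, String stage) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("BlockReceiver: {}\n  isClient={}\n  isDatanode={}\n"
          + "  isTransfer={}\n  stage={}",
          block, isClient, isDatanode, isTransfer, stage);
    }
  }
}
{code}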



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9706) Log more details in debug logs in BlockReceiver's constructor

2016-01-26 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9706:

Labels: supportability  (was: )

> Log more details in debug logs in BlockReceiver's constructor
> -
>
> Key: HDFS-9706
> URL: https://issues.apache.org/jira/browse/HDFS-9706
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Minor
>  Labels: supportability
>
> Currently {{BlockReceiver}}'s constructor has some debug logs to help 
> identifying problems. During my triage of HDFS-9701, I needed to add the 
> {{isTransfer}} into the logs to see which block the code goes later.
> I propose to add more details in the debug logs, to save future effort. Will 
> also see whether more details need to be logged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9706) Log more details in debug logs in BlockReceiver's constructor

2016-01-26 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9706:

Attachment: HDFS-9706.01.patch

Patch 1 adds isTransfer and some other info I think might be useful.


> Log more details in debug logs in BlockReceiver's constructor
> -
>
> Key: HDFS-9706
> URL: https://issues.apache.org/jira/browse/HDFS-9706
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9706.01.patch
>
>
> Currently {{BlockReceiver}}'s constructor has some debug logs to help 
> identifying problems. During my triage of HDFS-9701, I needed to add the 
> {{isTransfer}} into the logs to see which block the code goes later.
> I propose to add more details in the debug logs, to save future effort. Will 
> also see whether more details need to be logged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9706) Log more details in debug logs in BlockReceiver's constructor

2016-01-26 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9706:

Status: Patch Available  (was: Open)

> Log more details in debug logs in BlockReceiver's constructor
> -
>
> Key: HDFS-9706
> URL: https://issues.apache.org/jira/browse/HDFS-9706
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-9706.01.patch
>
>
> Currently {{BlockReceiver}}'s constructor has some debug logs to help 
> identifying problems. During my triage of HDFS-9701, I needed to add the 
> {{isTransfer}} into the logs to see which block the code goes later.
> I propose to add more details in the debug logs, to save future effort. Will 
> also see whether more details need to be logged.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9666) Enable hdfs-client to read even remote SSD/RAM prior to local disk replica to improve random read

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9666:
--
Fix Version/s: (was: 2.7.2)

[~aderen], please use Target Version to indicate your intention and leave the 
fix-version for committers to fill in at commit time. FYI, I fixed it on this 
JIRA myself. Tx.

> Enable hdfs-client to read even remote SSD/RAM prior to local disk replica to 
> improve random read
> -
>
> Key: HDFS-9666
> URL: https://issues.apache.org/jira/browse/HDFS-9666
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.7.0
>Reporter: ade
>Assignee: ade
> Attachments: HDFS-9666.0.patch
>
>
> We want to improve the random read performance of HDFS for HBase, so we enabled 
> heterogeneous storage in our cluster. But only ~50% of the datanode & 
> regionserver hosts have SSD, so we can set hfiles with only the ONE_SSD (not 
> ALL_SSD) storage policy, and a regionserver on a non-SSD host can only read the 
> local disk replica. So we developed this feature in the hdfs client to read even 
> a remote SSD/RAM replica prior to the local disk replica.
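A simplified, hypothetical sketch of the idea (not the attached patch): rank candidate replicas by storage type first and use locality only as a tie-breaker, so a remote SSD or RAM_DISK replica is preferred over a local spinning-disk one. The class and field names are illustrative only.

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the real hdfs-client sorting code.
class ReplicaChoiceSketch {
  enum StorageType { RAM_DISK, SSD, DISK, ARCHIVE }  // fastest first

  static class Replica {
    final String host;
    final StorageType type;
    final boolean local;
    Replica(String host, StorageType type, boolean local) {
      this.host = host; this.type = type; this.local = local;
    }
  }

  static Replica choose(List<Replica> replicas) {
    List<Replica> sorted = new ArrayList<>(replicas);
    // Faster storage wins even if remote; locality only breaks ties.
    sorted.sort(Comparator
        .comparingInt((Replica r) -> r.type.ordinal())
        .thenComparing(r -> !r.local));
    return sorted.get(0);
  }
}
{code}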



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-815) FileContext tests fail on Windows

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-815:
-
Fix Version/s: (was: 2.7.2)

> FileContext tests fail on Windows
> -
>
> Key: HDFS-815
> URL: https://issues.apache.org/jira/browse/HDFS-815
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.21.0
> Environment: Windows
>Reporter: Konstantin Shvachko
>
> The following FileContext-related tests are failing on Windows because of 
> incorrect use of the "test.build.data" system property for setting HDFS paths, 
> which end up containing "C:" as a path component, which HDFS does not support.
> {code}
> org.apache.hadoop.fs.TestFcHdfsCreateMkdir
> org.apache.hadoop.fs.TestFcHdfsPermission
> org.apache.hadoop.fs.TestHDFSFileContextMainOperations
> org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9629) Update the footer of Web UI to show year 2016

2016-01-26 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117734#comment-15117734
 ] 

Xiao Chen commented on HDFS-9629:
-

Thanks [~yzhangal] for the input.

My intention for this JIRA was just to update 2015->2016, probably in a 
more automatic way. So far it seems like using the build time is automatic and 
simple enough, and has some acceptance.

I understand build and release are two different things. In most cases, when 
people get a released package, the year shown will be the year when the release 
was built, which is usually the same as the release year. So I propose to use 
this jira for the automation enhancement, and add the Release/Build distinction 
in a separate JIRA (and, if needed, probably some other copyright/license 
lines). Does this make sense to you?

> Update the footer of Web UI to show year 2016
> -
>
> Key: HDFS-9629
> URL: https://issues.apache.org/jira/browse/HDFS-9629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>  Labels: supportability
> Attachments: HDFS-9629.01.patch, HDFS-9629.02.patch, 
> HDFS-9629.03.patch, HDFS-9629.04.patch, HDFS-9629.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9525) hadoop utilities need to support provided delegation tokens

2016-01-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117737#comment-15117737
 ] 

Allen Wittenauer commented on HDFS-9525:


bq. it will cause webhdfs to look for a token even if security is off. Nothing 
else in webhdfs should require a change.

If canRefreshDelegationToken is (or defaults to) true and no token is present 
in the UGI, then on insecure systems it will attempt to fetch a delegation 
token.  Perhaps changing

{code}
if (canRefreshDelegationToken) {
{code}

to

{code}
this.canRefreshDelegationToken = true;
...
if (canRefreshDelegationToken && 
UserGroupInformation.isSecurityEnabled()) {
{code}

would satisfy everyone.
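For illustration, a self-contained sketch of the guard pattern being proposed; the class and the fetchToken() helper below are hypothetical stand-ins, not the actual WebHdfsFileSystem code:

{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

// Hypothetical sketch; fetchToken() stands in for the real token RPC.
class TokenRefreshSketch {
  private boolean canRefreshDelegationToken = true;
  private Token<?> delegationToken;

  Token<?> getDelegationToken() throws IOException {
    if (delegationToken == null
        && canRefreshDelegationToken
        && UserGroupInformation.isSecurityEnabled()) {
      delegationToken = fetchToken();  // only reached on secure clusters
    }
    return delegationToken;
  }

  private Token<?> fetchToken() throws IOException {
    return null;  // placeholder for the real delegation token fetch
  }
}
{code}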

> hadoop utilities need to support provided delegation tokens
> ---
>
> Key: HDFS-9525
> URL: https://issues.apache.org/jira/browse/HDFS-9525
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-7984.001.patch, HDFS-7984.002.patch, 
> HDFS-7984.003.patch, HDFS-7984.004.patch, HDFS-7984.005.patch, 
> HDFS-7984.006.patch, HDFS-7984.007.patch, HDFS-7984.patch, 
> HDFS-9525.008.patch, HDFS-9525.009.patch, HDFS-9525.009.patch, 
> HDFS-9525.branch-2.008.patch, HDFS-9525.branch-2.009.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7764) DirectoryScanner shouldn't abort the scan if one directory had an error

2016-01-26 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117766#comment-15117766
 ] 

Rakesh R commented on HDFS-7764:


Hi [~cmccabe], could you share your opinion about the latest patch when you get 
a chance? Thanks!

> DirectoryScanner shouldn't abort the scan if one directory had an error
> ---
>
> Key: HDFS-7764
> URL: https://issues.apache.org/jira/browse/HDFS-7764
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-7764-01.patch, HDFS-7764-02.patch, 
> HDFS-7764-03.patch, HDFS-7764-04.patch, HDFS-7764.patch
>
>
> If there is an exception while preparing the ScanInfo for the blocks in a 
> directory, DirectoryScanner immediately throws an exception and exits the 
> current scan cycle. The idea of this jira is to discuss & improve the 
> exception handling mechanism.
> DirectoryScanner.java
> {code}
> for (Entry<Integer, Future<ScanInfoPerBlockPool>> report :
>     compilersInProgress.entrySet()) {
>   try {
>     dirReports[report.getKey()] = report.getValue().get();
>   } catch (Exception ex) {
>     LOG.error("Error compiling report", ex);
>     // Propagate ex to DataBlockScanner to deal with
>     throw new RuntimeException(ex);
>   }
> }
> {code}
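One possible direction, sketched below under the assumption that a missing per-volume report can simply be skipped for the cycle (this is an illustration of the idea being discussed, not the attached patch):

{code}
for (Entry<Integer, Future<ScanInfoPerBlockPool>> report :
    compilersInProgress.entrySet()) {
  try {
    dirReports[report.getKey()] = report.getValue().get();
  } catch (Exception ex) {
    // Log the failing volume and keep scanning the rest instead of aborting.
    LOG.error("Error compiling report for volume " + report.getKey()
        + "; skipping it for this scan cycle", ex);
    dirReports[report.getKey()] = null;
  }
}
{code}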



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9541:
---
Status: Patch Available  (was: In Progress)

> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: https://issues.apache.org/jira/browse/HDFS-9541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs
>Affects Versions: 0.20.1
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9541.001.patch, HDFS-9541.002.patch
>
>
> We should have a new API in libhdfs which will support creating files with a 
> default block size that is more than 31 bits in size.  We should also make 
> this a builder API so that it is easy to add more options later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9629) Update the footer of Web UI to show year 2016

2016-01-26 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117782#comment-15117782
 ] 

Yongjun Zhang commented on HDFS-9629:
-

Thanks Xiao, agree.


> Update the footer of Web UI to show year 2016
> -
>
> Key: HDFS-9629
> URL: https://issues.apache.org/jira/browse/HDFS-9629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>  Labels: supportability
> Attachments: HDFS-9629.01.patch, HDFS-9629.02.patch, 
> HDFS-9629.03.patch, HDFS-9629.04.patch, HDFS-9629.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117787#comment-15117787
 ] 

Zhe Zhang commented on HDFS-9541:
-

Thanks Colin for the update. +1 on the v02 patch.

> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: https://issues.apache.org/jira/browse/HDFS-9541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs
>Affects Versions: 0.20.1
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-9541.001.patch, HDFS-9541.002.patch
>
>
> We should have a new API in libhdfs which will support creating files with a 
> default block size that is more than 31 bits in size.  We should also make 
> this a builder API so that it is easy to add more options later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8676) Delayed rolling upgrade finalization can cause heartbeat expiration and write failures

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8676:
--
Fix Version/s: (was: 3.0.0)

> Delayed rolling upgrade finalization can cause heartbeat expiration and write 
> failures
> --
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks 
> can pile up in the datanode trash directories until an upgrade is finalized.  
> When it is finally finalized, the deletion of trash is done in the service 
> actor thread's context synchronously.  This blocks the heartbeat and can 
> cause heartbeat expiration.  
> We have seen a namenode losing hundreds of nodes after a delayed upgrade 
> finalization.  The deletion of trash directories should be made asynchronous.
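A hypothetical sketch of the general direction (not the committed fix): hand the trash deletion off to a background thread so the heartbeat thread returns immediately. All names below are illustrative.

{code}
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; not the actual DataNode/FsDataset code.
class AsyncTrashCleanerSketch {
  private final ExecutorService deleter = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "trash-deleter");
    t.setDaemon(true);
    return t;
  });

  // Called from the service actor thread; must not block the heartbeat.
  void clearTrashAsync(File trashRoot) {
    deleter.execute(() -> deleteRecursively(trashRoot));
  }

  private void deleteRecursively(File dir) {
    File[] children = dir.listFiles();
    if (children != null) {
      for (File f : children) {
        deleteRecursively(f);
      }
    }
    dir.delete();
  }
}
{code}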



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8950) NameNode refresh doesn't remove DataNodes that are no longer in the allowed list

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8950:
--
Fix Version/s: (was: 3.0.0)

> NameNode refresh doesn't remove DataNodes that are no longer in the allowed 
> list
> 
>
> Key: HDFS-8950
> URL: https://issues.apache.org/jira/browse/HDFS-8950
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.6.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>  Labels: 2.7.2-candidate
> Fix For: 2.7.2
>
> Attachments: HDFS-8950.001.patch, HDFS-8950.002.patch, 
> HDFS-8950.003.patch, HDFS-8950.004.patch, HDFS-8950.005.patch, 
> HDFS-8950.branch-2.7.patch
>
>
> If you remove a DN from the NN's allowed host list (HDFS was HA) and then do an 
> NN refresh, the DN is not actually removed and the NN UI keeps showing that node. 
> The NN may also try to allocate some blocks to that DN during an MR job.  This 
> issue is independent of DN decommission.
> To reproduce:
> 1. Add a DN to dfs_hosts_allow
> 2. Refresh NN
> 3. Start DN. Now NN starts seeing DN.
> 4. Stop DN
> 5. Remove DN from dfs_hosts_allow
> 6. Refresh NN -> NN is still reporting DN as being used by HDFS.
> This is different from decom because there the DN is added to the exclude list 
> in addition to being removed from the allowed list, and in that case everything 
> works correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9220) Reading small file (< 512 bytes) that is open for append fails due to incorrect checksum

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9220:
--
Fix Version/s: (was: 3.0.0)

> Reading small file (< 512 bytes) that is open for append fails due to 
> incorrect checksum
> 
>
> Key: HDFS-9220
> URL: https://issues.apache.org/jira/browse/HDFS-9220
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Bogdan Raducanu
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-9220.000.patch, HDFS-9220.001.patch, 
> HDFS-9220.002.patch, test2.java
>
>
> Exception:
> 2015-10-09 14:59:40 WARN  DFSClient:1150 - fetchBlockByteRange(). Got a 
> checksum exception for /tmp/file0.05355529331575182 at 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882:0 from 
> DatanodeInfoWithStorage[10.10.10.10]:5001
> All 3 replicas cause this exception and the read fails entirely with:
> BlockMissingException: Could not obtain block: 
> BP-353681639-10.10.10.10-1437493596883:blk_1075692769_9244882 
> file=/tmp/file0.05355529331575182
> Code to reproduce is attached.
> Does not happen in 2.7.0.
> Data is read correctly if checksum verification is disabled.
> More generally, the failure happens when reading from the last block of a 
> file and the last block has <= 512 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8850) VolumeScanner thread exits with exception if there is no block pool to be scanned but there are suspicious blocks

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8850:
--
Fix Version/s: (was: 3.0.0)

> VolumeScanner thread exits with exception if there is no block pool to be 
> scanned but there are suspicious blocks
> -
>
> Key: HDFS-8850
> URL: https://issues.apache.org/jira/browse/HDFS-8850
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.7.2
>
> Attachments: HDFS-8850.001.patch
>
>
> The VolumeScanner threads inside the BlockScanner exit with an exception if 
> there is no block pool to be scanned but there are suspicious blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8879) Quota by storage type usage incorrectly initialized upon namenode restart

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8879:
--
Fix Version/s: (was: 3.0.0)

> Quota by storage type usage incorrectly initialized upon namenode restart
> -
>
> Key: HDFS-8879
> URL: https://issues.apache.org/jira/browse/HDFS-8879
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Kihwal Lee
>Assignee: Xiaoyu Yao
> Fix For: 2.7.2
>
> Attachments: HDFS-8879.01.patch
>
>
> This was found by [~kihwal] as part of HDFS-8865 work in this 
> [comment|https://issues.apache.org/jira/browse/HDFS-8865?focusedCommentId=14660904=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660904].
> The unit test 
> testQuotaByStorageTypePersistenceInFsImage/testQuotaByStorageTypePersistenceInFsEdit
>  failed to detect this because they were using an obsolete
> FSDirectory instance. Once the highlighted line below is added, the issue can be 
> reproduced.
> {code}
> >fsdir = cluster.getNamesystem().getFSDirectory();
> INode testDirNodeAfterNNRestart = fsdir.getINode4Write(testDir.toString());
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7645:
--
Fix Version/s: (was: 3.0.0)

> Rolling upgrade is restoring blocks from trash multiple times
> -
>
> Key: HDFS-7645
> URL: https://issues.apache.org/jira/browse/HDFS-7645
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0
>Reporter: Nathan Roberts
>Assignee: Keisuke Ogiwara
> Fix For: 2.7.2
>
> Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, 
> HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch, 
> HDFS-7645.06.patch, HDFS-7645.07.patch
>
>
> When performing an HDFS rolling upgrade, the trash directory is getting 
> restored twice, when under normal circumstances it shouldn't need to be 
> restored at all. IIUC, the only time these blocks should be restored is if we 
> need to roll back a rolling upgrade. 
> On a busy cluster, this can cause significant and unnecessary block churn 
> both on the datanodes, and more importantly in the namenode.
> The two times this happens are:
> 1) restart of DN onto new software
> {code}
>   private void doTransition(DataNode datanode, StorageDirectory sd,
>   NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
> if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
>   Preconditions.checkState(!getTrashRootDir(sd).exists(),
>   sd.getPreviousDir() + " and " + getTrashRootDir(sd) + " should not 
> " +
>   " both be present.");
>   doRollback(sd, nsInfo); // rollback if applicable
> } else {
>   // Restore all the files in the trash. The restored files are retained
>   // during rolling upgrade rollback. They are deleted during rolling
>   // upgrade downgrade.
>   int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
>   LOG.info("Restored " + restored + " block files from trash.");
> }
> {code}
> 2) When heartbeat response no longer indicates a rollingupgrade is in progress
> {code}
>   /**
>* Signal the current rolling upgrade status as indicated by the NN.
>* @param inProgress true if a rolling upgrade is in progress
>*/
>   void signalRollingUpgrade(boolean inProgress) throws IOException {
> String bpid = getBlockPoolId();
> if (inProgress) {
>   dn.getFSDataset().enableTrash(bpid);
>   dn.getFSDataset().setRollingUpgradeMarker(bpid);
> } else {
>   dn.getFSDataset().restoreTrash(bpid);
>   dn.getFSDataset().clearRollingUpgradeMarker(bpid);
> }
>   }
> {code}
> HDFS-6800 and HDFS-6981 modified this behavior, making it not completely 
> clear whether this is somehow intentional. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9221) HdfsServerConstants#ReplicaState#getState should avoid calling values() since it creates a temporary array

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9221:
--
Fix Version/s: (was: 3.0.0)

> HdfsServerConstants#ReplicaState#getState should avoid calling values() since 
> it creates a temporary array
> --
>
> Key: HDFS-9221
> URL: https://issues.apache.org/jira/browse/HDFS-9221
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Fix For: 2.7.2
>
> Attachments: HADOOP-9221.001.patch
>
>
> When the BufferDecoder in BlockListAsLongs converts the stored value to a 
> ReplicaState enum, it calls ReplicaState.getState(int); unfortunately, this 
> method creates a ReplicaState[] for each call since it calls 
> ReplicaState.values().
> This patch creates a cached version of the values and thus avoids all 
> allocation when doing the conversion.
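A minimal sketch of the caching technique (the enum constants mirror ReplicaState for illustration; this is not the patch itself): compute values() once into a static array and index into it on every getState(int) call.

{code}
// Illustrative sketch of caching enum values to avoid per-call array allocation.
enum ReplicaStateSketch {
  FINALIZED(0), RBW(1), RWR(2), RUR(3), TEMPORARY(4);

  // values() clones the backing array on every call; cache it once instead.
  private static final ReplicaStateSketch[] CACHED_VALUES = values();

  private final int value;

  ReplicaStateSketch(int value) {
    this.value = value;
  }

  public int getValue() {
    return value;
  }

  public static ReplicaStateSketch getState(int v) {
    return CACHED_VALUES[v];  // no temporary array created here
  }
}
{code}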



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8656) Preserve compatibility of ClientProtocol#rollingUpgrade after finalization

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8656:
--
Fix Version/s: (was: 3.0.0)

> Preserve compatibility of ClientProtocol#rollingUpgrade after finalization
> --
>
> Key: HDFS-8656
> URL: https://issues.apache.org/jira/browse/HDFS-8656
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: hdfs-8656.001.patch, hdfs-8656.002.patch, 
> hdfs-8656.003.patch, hdfs-8656.004.patch
>
>
> HDFS-7645 changed rollingUpgradeInfo to still return an RUInfo after 
> finalization, so the DNs can differentiate between rollback and a 
> finalization. However, this breaks compatibility for the user facing APIs, 
> which always expect a null after finalization. Let's fix this and edify it in 
> unit tests.
> As an additional improvement, isFinalized and isStarted are part of the Java 
> API, but not in the JMX output of RollingUpgradeInfo. It'd be nice to expose 
> these booleans so JMX users don't need to do the != 0 check that possibly 
> exposes our implementation details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9178) Slow datanode I/O can cause a wrong node to be marked bad

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9178:
--
Fix Version/s: (was: 3.0.0)

> Slow datanode I/O can cause a wrong node to be marked bad
> -
>
> Key: HDFS-9178
> URL: https://issues.apache.org/jira/browse/HDFS-9178
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-9178.branch-2.6.patch, HDFS-9178.patch
>
>
> When a non-leaf datanode in a pipeline is slow on or stuck at disk I/O, the 
> downstream node can time out on reading packets since even the heartbeat 
> packets will not be relayed down.  
> The packet read timeout is set in {{DataXceiver#run()}}:
> {code}
>   peer.setReadTimeout(dnConf.socketTimeout);
> {code}
> When the downstream node times out and closes the connection to the upstream, 
> the upstream node's {{PacketResponder}} gets {{EOFException}} and sends an 
> ack upstream with the downstream node's status set to {{ERROR}}.  This causes 
> the client to exclude the downstream node, even though the upstream node was 
> the one that got stuck.
> The connection to the downstream has a longer timeout, so the downstream will 
> always time out first. That timeout is set in {{writeBlock()}}:
> {code}
>   int timeoutValue = dnConf.socketTimeout +
>   (HdfsConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout +
>   (HdfsConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
>   NetUtils.connect(mirrorSock, mirrorTarget, timeoutValue);
>   OutputStream unbufMirrorOut = NetUtils.getOutputStream(mirrorSock,
>   writeTimeout);
> {code}
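For illustration only (the numbers here are assumptions chosen for the example, not quoted from the code): with a socketTimeout of 60 s, a READ_TIMEOUT_EXTENSION of 5 s, and 2 downstream targets, the upstream's timeout toward the mirror would be 60 + 5 * 2 = 70 s, while the downstream's plain read timeout fires at 60 s; the downstream therefore always gives up first and the upstream reports it as ERROR.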



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9541:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2.

> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: https://issues.apache.org/jira/browse/HDFS-9541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs
>Affects Versions: 0.20.1
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-9541.001.patch, HDFS-9541.002.patch
>
>
> We should have a new API in libhdfs which will support creating files with a 
> default block size that is more than 31 bits in size.  We should also make 
> this a builder API so that it is easy to add more options later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9290:
--
Fix Version/s: (was: 3.0.0)

> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined the 2 RPC calls used at file append into a single one; 
> specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (protobuf), a 
> newer client's {{append()}} call does not work with older NameNodes. One will 
> run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode only replies with the first. 
> The exception itself does not reflect this, and one has to look at the HDFS 
> source code to really understand what happened.
> We can have the client detect that it is talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we could improve the exception being 
> thrown to accurately reflect the cause of failure.
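
A minimal, hypothetical sketch of the first option; the AppendResponse and 
NameNodeStub types below are stand-ins invented for illustration, not the real 
DFSClient classes. If the combined response carries no file status, the client 
falls back to the extra {{getFileInfo()}} RPC.
{code}
// Hypothetical model of the fallback; not the actual DFSClient code.
public class AppendCompatSketch {
  static class AppendResponse {        // stand-in for the combined RPC result
    final Object lastBlock;
    final Object fileStatus;           // null when an old NameNode replied
    AppendResponse(Object lastBlock, Object fileStatus) {
      this.lastBlock = lastBlock;
      this.fileStatus = fileStatus;
    }
  }

  interface NameNodeStub {             // stand-in for the RPC proxy
    AppendResponse append(String src, String clientName);
    Object getFileInfo(String src);
  }

  static Object appendAndGetStatus(NameNodeStub nn, String src, String client) {
    AppendResponse resp = nn.append(src, client);
    if (resp.fileStatus != null) {
      return resp.fileStatus;          // new NameNode: one RPC is enough
    }
    // Old NameNode: it only returned the last block, so spend one extra RPC
    // instead of failing later with an opaque NullPointerException.
    return nn.getFileInfo(src);
  }
}
{code}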



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9445) Datanode may deadlock while handling a bad volume

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9445:
--
Fix Version/s: (was: 3.0.0)

> Datanode may deadlock while handling a bad volume
> -
>
> Key: HDFS-9445
> URL: https://issues.apache.org/jira/browse/HDFS-9445
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kihwal Lee
>Assignee: Walter Su
>Priority: Blocker
> Fix For: 2.7.2, 2.6.4
>
> Attachments: HDFS-9445-branch-2.6.02.patch, 
> HDFS-9445-branch-2.6_02.patch, HDFS-9445.00.patch, HDFS-9445.01.patch, 
> HDFS-9445.02.patch
>
>
> {noformat}
> Found one Java-level deadlock:
> =
> "DataXceiver for client DFSClient_attempt_xxx at /1.2.3.4:100 [Sending block 
> BP-x:blk_123_456]":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> "Thread-565":
>   waiting for ownable synchronizer 0xd55613c8, (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "DataNode: heartbeating to my-nn:8020"
> "DataNode: heartbeating to my-nn:8020":
>   waiting to lock monitor 0x7f77d0731768 (object 0xd60d9930, a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl),
>   which is held by "Thread-565"
> {noformat}
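
As a generic illustration of how such cycles are usually broken (an assumption 
about the fix direction, not the HDFS-9445 patch itself): do the blocking lock 
acquisition only after the object monitor has been released.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockOrderingSketch {
  private final Object datasetMonitor = new Object();      // like FsDatasetImpl
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
  private final List<String> volumes = new ArrayList<>();

  void handleBadVolume(String volume) {
    List<String> snapshot;
    synchronized (datasetMonitor) {
      // Only cheap bookkeeping while holding the monitor.
      volumes.remove(volume);
      snapshot = new ArrayList<>(volumes);
    }
    // The blocking acquisition happens with the monitor already released, so a
    // heartbeat thread that holds rwLock can still enter the monitor.
    rwLock.writeLock().lock();
    try {
      publish(snapshot);
    } finally {
      rwLock.writeLock().unlock();
    }
  }

  private void publish(List<String> remainingVolumes) {
    // e.g. update configuration or metrics with the surviving volumes
  }
}
{code}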



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117809#comment-15117809
 ] 

Hadoop QA commented on HDFS-9541:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 36s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12784038/HDFS-9541.002.patch |
| JIRA Issue | HDFS-9541 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux c682bb5f4624 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d0d7c22 |
| Default Java | 1.7.0_91 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_66 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_91 |
| JDK v1.7.0_91  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14249/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Max memory used | 77MB |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14249/console |


This message was automatically generated.



> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: 

[jira] [Updated] (HDFS-8384) Allow NN to startup if there are files having a lease but are not under construction

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8384:
--
Fix Version/s: (was: 2.8.0)

> Allow NN to startup if there are files having a lease but are not under 
> construction
> 
>
> Key: HDFS-8384
> URL: https://issues.apache.org/jira/browse/HDFS-8384
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Jing Zhao
>Priority: Minor
>  Labels: 2.6.1-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-8384-branch-2.6.patch, HDFS-8384-branch-2.7.patch, 
> HDFS-8384.000.patch
>
>
> When there are files that have a lease but are not under construction, the NN 
> will fail to start up with
> {code}
> 15/05/12 00:36:31 ERROR namenode.FSImage: Unable to save image for 
> /hadoop/hdfs/namenode
> java.lang.IllegalStateException
> at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:129)
> at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getINodesUnderConstruction(LeaseManager.java:412)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFilesUnderConstruction(FSNamesystem.java:7124)
> ...
> {code}
> The actual problem is that the image could be corrupted by bugs like 
> HDFS-7587. We should have an option/conf to allow the NN to start up so that 
> the problematic files can be deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7609) Avoid retry cache collision when Standby NameNode loading edits

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7609:
--
Fix Version/s: (was: 2.8.0)

> Avoid retry cache collision when Standby NameNode loading edits
> ---
>
> Key: HDFS-7609
> URL: https://issues.apache.org/jira/browse/HDFS-7609
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
>Reporter: Carrey Zhan
>Assignee: Ming Ma
>Priority: Critical
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-7609-2.patch, HDFS-7609-3.patch, 
> HDFS-7609-CreateEditsLogWithRPCIDs.patch, HDFS-7609-branch-2.7.2.txt, 
> HDFS-7609.patch, recovery_do_not_use_retrycache.patch
>
>
> One day my namenode crashed because two journal nodes timed out at the same 
> time under very high load, leaving behind about 100 million transactions in 
> the edits log. (I still have no idea why they were not rolled into fsimage.)
> I tried to restart the namenode, but it showed that almost 20 hours would be 
> needed to finish, and it was loading fsedits most of the time. I also tried 
> to restart the namenode in recovery mode; the loading speed was no different.
> I looked into the stack trace and judged that it was caused by the retry 
> cache. So I set dfs.namenode.enable.retrycache to false, and the restart 
> process finished in half an hour.
> I think the retry cache is useless during startup, at least during the 
> recovery process.
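
For reference, the workaround described above amounts to flipping one boolean 
setting before the recovery restart. A minimal sketch using the Configuration 
API (normally this would go into hdfs-site.xml instead of code):
{code}
import org.apache.hadoop.conf.Configuration;

public class DisableRetryCacheExample {
  public static void main(String[] args) {
    // Equivalent of setting dfs.namenode.enable.retrycache=false in
    // hdfs-site.xml before restarting the NameNode for recovery.
    Configuration conf = new Configuration();
    conf.setBoolean("dfs.namenode.enable.retrycache", false);
    System.out.println("retry cache enabled: "
        + conf.getBoolean("dfs.namenode.enable.retrycache", true));
  }
}
{code}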



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8846) Add a unit test for INotify functionality across a layout version upgrade

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8846:
--
Fix Version/s: (was: 2.8.0)

> Add a unit test for INotify functionality across a layout version upgrade
> -
>
> Key: HDFS-8846
> URL: https://issues.apache.org/jira/browse/HDFS-8846
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-8846-branch-2.6.1.txt, HDFS-8846.00.patch, 
> HDFS-8846.01.patch, HDFS-8846.02.patch, HDFS-8846.03.patch
>
>
> Per discussion under HDFS-8480, we should create some edit log files with old 
> layout version, to test whether they can be correctly handled in upgrades.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9294) DFSClient deadlock when close file and failed to renew lease

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9294:
--
Fix Version/s: (was: 2.8.0)

> DFSClient  deadlock when close file and failed to renew lease
> -
>
> Key: HDFS-9294
> URL: https://issues.apache.org/jira/browse/HDFS-9294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.2.0, 2.7.1
> Environment: Hadoop 2.2.0
>Reporter: DENG FEI
>Assignee: Brahma Reddy Battula
>Priority: Blocker
> Fix For: 2.7.2, 2.6.4
>
> Attachments: HDFS-9294-002.patch, HDFS-9294-002.patch, 
> HDFS-9294-branch-2.6.patch, HDFS-9294-branch-2.7.patch, 
> HDFS-9294-branch-2.patch, HDFS-9294.patch
>
>
> We found a deadlock in our HBase (0.98) cluster (the Hadoop version is 
> 2.2.0), and it appears to be an HDFS bug; at the time our network was not 
> stable.
> Below is the stack:
> *
> Found one Java-level deadlock:
> =
> "MemStoreFlusher.1":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   waiting to lock monitor 0x7ff2e67e16a8 (object 0x000486ce6620, a 
> org.apache.hadoop.hdfs.DFSOutputStream),
>   which is held by "MemStoreFlusher.0"
> "MemStoreFlusher.0":
>   waiting to lock monitor 0x7ff27cfa5218 (object 0x0002fae5ebe0, a 
> org.apache.hadoop.hdfs.LeaseRenewer),
>   which is held by "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel"
> Java stack information for the threads listed above:
> ===
> "MemStoreFlusher.1":
>   at org.apache.hadoop.hdfs.LeaseRenewer.addClient(LeaseRenewer.java:216)
>   - waiting to lock <0x0002fae5ebe0> (a 
> org.apache.hadoop.hdfs.LeaseRenewer)
>   at org.apache.hadoop.hdfs.LeaseRenewer.getInstance(LeaseRenewer.java:81)
>   at org.apache.hadoop.hdfs.DFSClient.getLeaseRenewer(DFSClient.java:648)
>   at org.apache.hadoop.hdfs.DFSClient.endFileLease(DFSClient.java:659)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1882)
>   - locked <0x00055b606cb0> (a org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:71)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:104)
>   at 
> org.apache.hadoop.hbase.io.hfile.AbstractHFileWriter.finishClose(AbstractHFileWriter.java:250)
>   at 
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2.close(HFileWriterV2.java:402)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:974)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:78)
>   at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:75)
>   - locked <0x00059869eed8> (a java.lang.Object)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:812)
>   at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:1974)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1795)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1678)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1591)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:472)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:211)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$500(MemStoreFlusher.java:66)
>   at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:238)
>   at java.lang.Thread.run(Thread.java:744)
> "LeaseRenewer:hbaseadmin@hbase-ns-gdt-sh-marvel":
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.abort(DFSOutputStream.java:1822)
>   - waiting to lock <0x000486ce6620> (a 
> org.apache.hadoop.hdfs.DFSOutputStream)
>   at 
> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:780)
>   at org.apache.hadoop.hdfs.DFSClient.abort(DFSClient.java:753)
>   at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:453)
>   - locked <0x0002fae5ebe0> (a org.apache.hadoop.hdfs.LeaseRenewer)
>   at 

[jira] [Updated] (HDFS-9273) ACLs on root directory may be lost after NN restart

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9273:
--
Fix Version/s: (was: 2.8.0)

> ACLs on root directory may be lost after NN restart
> ---
>
> Key: HDFS-9273
> URL: https://issues.apache.org/jira/browse/HDFS-9273
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-9273.001.patch, HDFS-9273.002.patch
>
>
> After restarting the namenode, the ACLs on the root directory ("/") may be 
> lost if they have been rolled over into the fsimage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8219) setStoragePolicy with folder behavior is different after cluster restart

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8219:
--
Fix Version/s: (was: 2.8.0)

> setStoragePolicy with folder behavior is different after cluster restart
> 
>
> Key: HDFS-8219
> URL: https://issues.apache.org/jira/browse/HDFS-8219
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Peter Shi
>Assignee: Surendra Singh Lilhore
>  Labels: 2.6.1-candidate, 2.7.2-candidate, BB2015-05-RFC
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-8219.patch, HDFS-8219.unittest-norepro.patch
>
>
> Reproduction steps:
> 1) mkdir a directory named /temp
> 2) put one file A under /temp
> 3) change the /temp storage policy to COLD
> 4) use -getStoragePolicy to query file A's storage policy; it is the same as 
> /temp's
> 5) change the /temp folder's storage policy again; file A's storage policy 
> stays the same as the parent folder's.
> Then restart the cluster.
> Do 3) and 4) again: file A's storage policy no longer changes while the 
> parent folder's storage policy changes. The behaviour is different.
> As I debugged, I found this code in INodeFile.getStoragePolicyID:
> {code}
>   public byte getStoragePolicyID() {
> byte id = getLocalStoragePolicyID();
> if (id == BLOCK_STORAGE_POLICY_ID_UNSPECIFIED) {
>   return this.getParent() != null ?
>   this.getParent().getStoragePolicyID() : id;
> }
> return id;
>   }
> {code}
> If the file does not have its own storage policy, it uses the parent's. But 
> after a cluster restart, the file turns out to have its own storage policy.
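
The inheritance rule can be modelled in a few lines. This is a self-contained 
toy model of the behaviour described above, not HDFS code; the policy ids are 
made up for illustration.
{code}
public class StoragePolicyInheritanceModel {
  static final byte UNSPECIFIED = 0;

  static class Node {
    Node parent;
    byte localPolicy = UNSPECIFIED;

    byte effectivePolicy() {
      if (localPolicy == UNSPECIFIED) {
        // Fall back to the parent, as INodeFile.getStoragePolicyID does.
        return parent != null ? parent.effectivePolicy() : UNSPECIFIED;
      }
      return localPolicy;
    }
  }

  public static void main(String[] args) {
    Node temp = new Node();       // the /temp folder
    Node fileA = new Node();      // file A under /temp
    fileA.parent = temp;

    temp.localPolicy = 2;                          // step 3: set the folder policy
    System.out.println(fileA.effectivePolicy());   // 2: inherited, as expected

    fileA.localPolicy = 2;                         // what the file effectively has
                                                   // after the restart
    temp.localPolicy = 5;                          // change the folder again
    System.out.println(fileA.effectivePolicy());   // still 2: no longer follows the
                                                   // parent, i.e. the reported bug
  }
}
{code}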



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8431) hdfs crypto class not found in Windows

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8431:
--
Fix Version/s: (was: 2.8.0)

> hdfs crypto class not found in Windows
> --
>
> Key: HDFS-8431
> URL: https://issues.apache.org/jira/browse/HDFS-8431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 2.6.0
> Environment: Windows only
>Reporter: Sumana Sathish
>Assignee: Anu Engineer
>Priority: Critical
>  Labels: 2.6.1-candidate, 2.7.2-candidate, encryption, scripts, 
> windows
> Fix For: 2.6.1, 2.7.2
>
> Attachments: Screen Shot 2015-05-18 at 6.27.11 PM.png, 
> hdfs-8431.001.patch, hdfs-8431.002.patch
>
>
> The attached screenshot shows that hdfs could not find the 'crypto' class on 
> Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7314) When the DFSClient lease cannot be renewed, abort open-for-write files rather than the entire DFSClient

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-7314:
--
Fix Version/s: (was: 2.8.0)

> When the DFSClient lease cannot be renewed, abort open-for-write files rather 
> than the entire DFSClient
> ---
>
> Key: HDFS-7314
> URL: https://issues.apache.org/jira/browse/HDFS-7314
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
>  Labels: 2.6.1-candidate, 2.7.2-candidate, BB2015-05-TBR
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-7314-2.patch, HDFS-7314-3.patch, HDFS-7314-4.patch, 
> HDFS-7314-5.patch, HDFS-7314-6.patch, HDFS-7314-7.patch, HDFS-7314-8.patch, 
> HDFS-7314-9.patch, HDFS-7314-branch-2.7.2.txt, HDFS-7314.patch
>
>
> It happened in a YARN nodemanager scenario, but it could happen to any 
> long-running service that uses a cached instance of DistributedFileSystem.
> 1. The active NN is under heavy load, so it became unavailable for 10 
> minutes; any DFSClient request will get ConnectTimeoutException.
> 2. The YARN nodemanager uses DFSClient for certain write operations such as 
> the log aggregator or the shared cache in YARN-1492. The DFSClient used by 
> YARN NM's renewLease RPC got ConnectTimeoutException.
> {noformat}
> 2014-10-29 01:36:19,559 WARN org.apache.hadoop.hdfs.LeaseRenewer: Failed to 
> renew lease for [DFSClient_NONMAPREDUCE_-550838118_1] for 372 seconds.  
> Aborting ...
> {noformat}
> 3. After DFSClient is in Aborted state, YARN NM can't use that cached 
> instance of DistributedFileSystem.
> {noformat}
> 2014-10-29 20:26:23,991 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
>  Failed to download rsrc...
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:727)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1780)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
> at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:237)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:340)
> at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:57)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We can make YARN or DFSClient more tolerant of temporary NN unavailability. 
> Given that the call stack is YARN -> DistributedFileSystem -> DFSClient, this 
> can be addressed at different layers.
> * YARN closes the DistributedFileSystem object when it receives some 
> well-defined exception. Then the next HDFS call will create a new instance of 
> DistributedFileSystem (a rough sketch of this option follows below). We would 
> have to fix all the places in YARN, and other HDFS applications would need to 
> address this as well.
> * DistributedFileSystem detects an aborted DFSClient and creates a new 
> instance of DFSClient. We would need to fix all the places where 
> DistributedFileSystem calls DFSClient.
> * After DFSClient gets into the Aborted state, it doesn't have to reject all 
> requests; instead it can retry. If the NN is available again it can 
> transition back to a healthy state.
> Comments?
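
A rough sketch of the first option, under the assumption that the application 
keys off the "Filesystem closed" IOException; it is illustrative only, not the 
HDFS-7314 patch.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecreateFsOnAbort {
  private FileSystem fs;
  private final Configuration conf = new Configuration();

  boolean exists(Path p) throws IOException {
    try {
      return fs().exists(p);
    } catch (IOException e) {
      if (e.getMessage() != null && e.getMessage().contains("Filesystem closed")) {
        fs = null;            // drop the aborted instance and retry once
        return fs().exists(p);
      }
      throw e;
    }
  }

  private synchronized FileSystem fs() throws IOException {
    if (fs == null) {
      // newInstance() bypasses the FileSystem cache, so we do not get the
      // same aborted object back.
      fs = FileSystem.newInstance(conf);
    }
    return fs;
  }
}
{code}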



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8046) Allow better control of getContentSummary

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8046:
--
Fix Version/s: (was: 2.8.0)

> Allow better control of getContentSummary
> -
>
> Key: HDFS-8046
> URL: https://issues.apache.org/jira/browse/HDFS-8046
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>  Labels: 2.6.1-candidate, 2.7.2-candidate
> Fix For: 2.6.1, 2.7.2
>
> Attachments: HDFS-8046-branch-2.6.1.txt, HDFS-8046.v1.patch
>
>
> On busy clusters, users performing quota checks against a big directory 
> structure can affect namenode performance. This has become a lot better since 
> HDFS-4995, but as clusters get bigger and busier, it is apparent that we need 
> finer-grained control to avoid a long read lock causing a throughput drop.
> Even with the unfair namesystem lock setting, a long read lock (tens of 
> milliseconds) can starve many readers and especially writers. So the locking 
> duration should be reduced, which can be done by imposing a lower 
> count-per-iteration limit in the existing implementation. But HDFS-4995 came 
> with a fixed amount of sleep between locks. This needs to be made 
> configurable so that {{getContentSummary()}} doesn't get exceedingly slow.
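
The pattern being discussed is roughly the following: bound the work done per 
read-lock acquisition and make the inter-lock sleep configurable. This is a 
generic sketch with assumed knob names, not the actual FSDirectory code.
{code}
import java.util.Iterator;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class YieldingTraversal {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Both knobs would come from configuration in a real implementation.
  private final int countPerIteration = 5000;
  private final long sleepBetweenLocksMs = 1;

  long countEntries(Iterable<Object> entries) throws InterruptedException {
    long total = 0;
    Iterator<Object> it = entries.iterator();
    while (true) {
      lock.readLock().lock();
      try {
        int processed = 0;
        while (it.hasNext() && processed < countPerIteration) {
          it.next();
          total++;
          processed++;
        }
        if (!it.hasNext()) {
          return total;
        }
      } finally {
        lock.readLock().unlock();
      }
      Thread.sleep(sleepBetweenLocksMs); // let writers in before the next batch
    }
  }
}
{code}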



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2016-01-26 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-6945:
--
Fix Version/s: (was: 2.8.0)

> BlockManager should remove a block from excessReplicateMap and decrement 
> ExcessBlocks metric when the block is removed
> --
>
> Key: HDFS-6945
> URL: https://issues.apache.org/jira/browse/HDFS-6945
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Akira AJISAKA
>Assignee: Akira AJISAKA
>Priority: Critical
>  Labels: metrics
> Fix For: 2.7.2, 2.6.4
>
> Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
> HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch
>
>
> I'm seeing the ExcessBlocks metric increase to more than 300K in some 
> clusters; however, there are no over-replicated blocks (confirmed by fsck).
> After further research, I noticed that when deleting a block, BlockManager 
> does not remove the block from excessReplicateMap or decrement 
> excessBlocksCount.
> Usually the metric is decremented when processing a block report; however, if 
> the block has already been deleted, BlockManager does not remove the block 
> from excessReplicateMap or decrement the metric.
> That way the metric and excessReplicateMap can grow without bound (i.e. a 
> memory leak can occur).
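
A simplified sketch of the fix direction (an assumption, not the committed 
patch): the block-removal path also cleans up the excess map and the counter.
{code}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class ExcessBlockBookkeepingSketch {
  // datanode UUID -> set of block ids with excess replicas on that node
  private final Map<String, Set<Long>> excessReplicateMap = new ConcurrentHashMap<>();
  private final AtomicLong excessBlocksCount = new AtomicLong();

  void removeBlock(long blockId) {
    // ... normal block removal bookkeeping would happen here ...
    for (Set<Long> excess : excessReplicateMap.values()) {
      if (excess.remove(blockId)) {
        // Keep the metric in sync even though no block report will ever
        // mention this block again.
        excessBlocksCount.decrementAndGet();
      }
    }
  }
}
{code}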



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117815#comment-15117815
 ] 

Colin Patrick McCabe commented on HDFS-9579:


Thanks for this patch, [~mingma].  As [~sjlee0] commented, we need to use 
something thread-safe, so {{HashMap}} is not the right choice.

I think it's reasonable to use individual {{long}} values as you've proposed.  
If the distance only goes up to six, should the last bucket be "six or longer"?

This might be a dumb question, but is it possible to have a distance of one or 
three?

> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -
>
> Key: HDFS-9579
> URL: https://issues.apache.org/jira/browse/HDFS-9579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, 
> HDFS-9579.patch, MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight 
> as to the traffic volume for each network distance to distinguish cross-DC 
> traffic, local-DC-remote-rack, etc.
> FileSystem's existing {{bytesRead}} metrics tracks all the bytes read. To 
> provide additional metrics for each network distance, we can add additional 
> metrics to FileSystem level and have {{DFSInputStream}} update the value 
> based on the network distance between client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its 
> initialization. It doesn't need to resolve datanode's network location for 
> each read as {{DatanodeInfo}} already has the info.
> There are existing HDFS-specific metrics such as {{ReadStatistics}} and 
> {{DFSHedgedReadMetrics}}, but these metrics are only accessible via 
> {{DFSClient}} or {{DFSInputStream}}, not something that application 
> frameworks such as MR and Tez can get to. That is the benefit of storing 
> these new metrics in FileSystem.Statistics.
> This jira only covers metrics generation by HDFS. The consumption of these 
> metrics in MR and Tez will be tracked by separate jiras.
> We can add similar metrics for the HDFS write path later if necessary.
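
One thread-safe shape for such counters is a fixed-size atomic array indexed by 
distance. This is a sketch under the assumption of a /dc/rack/host topology, 
not the final HDFS-9579 patch.
{code}
import java.util.concurrent.atomic.AtomicLongArray;

public class BytesReadByDistance {
  // Distances in a typical /dc/rack/host topology are 0 (local), 2 (same rack),
  // 4 (same DC, remote rack) and 6 (remote DC); size 7 covers them all.
  private final AtomicLongArray bytesReadByDistance = new AtomicLongArray(7);

  void incrementBytesRead(int distance, long bytes) {
    int idx = Math.min(distance, bytesReadByDistance.length() - 1);
    bytesReadByDistance.addAndGet(idx, bytes);
  }

  long getBytesRead(int distance) {
    int idx = Math.min(distance, bytesReadByDistance.length() - 1);
    return bytesReadByDistance.get(idx);
  }
}
{code}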



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117817#comment-15117817
 ] 

Hudson commented on HDFS-9541:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9187 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9187/])
HDFS-9541. Add hdfsStreamBuilder API to libhdfs to support (zhz: rev 
cf8af7bb459b21babaad2d972330a3b4c6bb222d)
* hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/include/hdfs/hdfs.h


> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: https://issues.apache.org/jira/browse/HDFS-9541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs
>Affects Versions: 0.20.1
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-9541.001.patch, HDFS-9541.002.patch
>
>
> We should have a new API in libhdfs which will support creating files with a 
> default block size that is more than 31 bits in size.  We should also make 
> this a builder API so that it is easy to add more options later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )

2016-01-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117849#comment-15117849
 ] 

Jing Zhao commented on HDFS-9494:
-

Thanks for updating the patch, [~demongaorui]! The latest patch looks good to 
me. A few minor comments:
# We do not need to create a thread pool instance for each 
{{flushAllInternals}} call. It would be better to reuse one.
# In {{handleCurrentStreamerFailure}}, let's still set {{currentPacket}} to 
null.
# This should be {{new ConcurrentHashMap<>()}}:
{code}
+final Map streamersExceptionMap = new
+ConcurrentHashMap();
{code}
# We do not need to remove the entry from the temporary map here.
{code}
+  iterator.remove();
+}
+executor.shutdownNow();
{code}
# The exception thrown by {{waitForAckedSeqno}} can be caught as 
{{ExecutionException}} while calling {{Future#get}}. Thus I think we do not 
need to use a concurrent hashmap to catch the exceptions. Instead, we can 
create a map to track the mapping between streamers and corresponding futures. 
In this way we can remove both {{healthyStreamerCount}} and 
{{streamersExceptionMap}}. Please see {{DFSStripedInputStream#readStripe}} as 
an example.

bq. After checking the related codes, it seems that we haven't set a timeout 
for waitForAckedSeqno(). Maybe we could consider to set a timeout for it in 
another new Jira. 

It will be very hard to set this timeout value. Let's still depend on the 
read/write timeout set on the connection instead of adding a timeout for 
{{waitForAckedSeqno}}.

> Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
> 
>
> Key: HDFS-9494
> URL: https://issues.apache.org/jira/browse/HDFS-9494
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: GAO Rui
>Assignee: GAO Rui
>Priority: Minor
> Attachments: HDFS-9494-origin-trunk.00.patch, 
> HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, 
> HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, 
> HDFS-9494-origin-trunk.05.patch
>
>
> Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and 
> wait for flushInternal( ) in sequence. So the runtime flow is like:
> {code}
> Streamer0#flushInternal( )
> Streamer0#waitForAckedSeqno( )
> Streamer1#flushInternal( )
> Streamer1#waitForAckedSeqno( )
> …
> Streamer8#flushInternal( )
> Streamer8#waitForAckedSeqno( )
> {code}
> It would be better to trigger flushInternal( ) on all the streamers, wait for 
> all of them to return from waitForAckedSeqno( ), and only then return from 
> flushAllInternals( ).
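
A minimal sketch of that parallel flow, along the lines of the review comments 
above: submit one flush task per streamer, keep a streamer-to-future map, and 
let exceptions surface through Future.get(). The Streamer interface here is a 
stand-in, not the real DFSStripedOutputStream code.
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFlushSketch {
  interface Streamer {
    void flushInternal() throws IOException;  // hypothetical stand-in
  }

  // Reused across calls rather than created per flushAllInternals().
  private final ExecutorService flushPool = Executors.newFixedThreadPool(9);

  void flushAllInternals(List<Streamer> streamers) throws IOException {
    Map<Streamer, Future<Void>> futures = new HashMap<>();
    for (final Streamer s : streamers) {
      futures.put(s, flushPool.submit(new Callable<Void>() {
        @Override public Void call() throws IOException {
          s.flushInternal();   // includes waitForAckedSeqno() in the real code
          return null;
        }
      }));
    }
    for (Map.Entry<Streamer, Future<Void>> e : futures.entrySet()) {
      try {
        e.getValue().get();
      } catch (ExecutionException ee) {
        // The real code would call handleCurrentStreamerFailure(e.getKey(), ...)
        throw new IOException("flush failed for " + e.getKey(), ee.getCause());
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted while flushing", ie);
      }
    }
  }
}
{code}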



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9679) libhdfs++: Fix inconsistencies with libhdfs C API

2016-01-26 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9679:
--
Summary: libhdfs++: Fix inconsistencies with libhdfs C API  (was: 
libhdfs++: Implement lseek in the extended C API)

> libhdfs++: Fix inconsistencies with libhdfs C API
> -
>
> Key: HDFS-9679
> URL: https://issues.apache.org/jira/browse/HDFS-9679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>
> It'd be nice to have a version of hdfsSeek that returns the new offset as a 
> 64 bit int similar to a posix lseek.  The underlying C++ implementation 
> already does this for free and doing a seek with whence=SEEK_CUR and then 
> checking the resulting offset seems like a common operation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9698) Long running Balancer should renew TGT

2016-01-26 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117862#comment-15117862
 ] 

Andrew Wang commented on HDFS-9698:
---

Hey Zhe, the overall idea seems good, though I wonder if there's a place we can 
do this closer to the RPC, e.g. in DFSClient or openConnection, pretty low 
down. 

Which call is it that hits the authentication error? I'm guessing it's one of 
the RPC proxies in NameNodeConnector, but confirming will help us figure out 
where exactly to add this call. If we have opportunities to use DFS rather than 
a raw RPC proxy, that'll also help address this issue.

> Long running Balancer should renew TGT
> --
>
> Key: HDFS-9698
> URL: https://issues.apache.org/jira/browse/HDFS-9698
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, security
>Affects Versions: 2.6.3
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9698.00.patch
>
>
> When the {{Balancer}} runs beyond the configured TGT lifetime, the current 
> logic won't renew TGT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9679) libhdfs++: Fix inconsistencies with libhdfs C API

2016-01-26 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9679:
--
Description: There is at least 1 minor inconsistency with the libhdfs API.  
Currently hdfsSeek returns a 32-bit int indicating the resulting offset of a 
seek, similar to a posix lseek, when it should return -1 and 0 as error codes.  
If someone starts using hdfsSeek like they would lseek, everything works great 
until they hit a >4 GB offset and the upper bits are truncated.  (was: It'd be 
nice to have a version of hdfsSeek that returns the new offset as a 64 bit int 
similar to a posix lseek.  The underlying C++ implementation already does this 
for free and doing a seek with whence=SEEK_CUR and then checking the resulting 
offset seems like a common operation.)

> libhdfs++: Fix inconsistencies with libhdfs C API
> -
>
> Key: HDFS-9679
> URL: https://issues.apache.org/jira/browse/HDFS-9679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
>
> There is at least 1 minor inconsistency with the libhdfs API.  Currently 
> hdfsSeek returns a 32-bit int indicating the resulting offset of a seek, 
> similar to a posix lseek, when it should return -1 and 0 as error codes.  If 
> someone starts using hdfsSeek like they would lseek, everything works great 
> until they hit a >4 GB offset and the upper bits are truncated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9698) Long running Balancer should renew TGT

2016-01-26 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117894#comment-15117894
 ] 

Zhe Zhang commented on HDFS-9698:
-

Thanks Andrew for taking a look. Below is one possible stack trace caused by a 
long-running balancer. There are a few other cases where we didn't capture the 
stack trace. I think it's also possible for the {{Dispatcher}} to hit an 
expired TGT before {{LeaseRenewer}} does, since it calls the NN connector for 
multiple purposes.
{code}
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)] 
2015-06-02 09:18:48,316 WARN [LeaseRenewer:hdfs@nameservice1] ipc.Client 
(Client.java:run(670)) - Couldn't setup connection for xxx
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)] 
at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
 
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413) 
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:552) 
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367) 
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:717) 
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:713) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712) 
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367) 
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463) 
at org.apache.hadoop.ipc.Client.call(Client.java:1382) 
at org.apache.hadoop.ipc.Client.call(Client.java:1364) 
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
 
at com.sun.proxy.$Proxy16.renewLease(Unknown Source) 
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:563)
 
at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:606) 
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
 
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
 
at com.sun.proxy.$Proxy17.renewLease(Unknown Source) 
at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:845) 
at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417) 
at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442) 
at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71) 
at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298) 
at java.lang.Thread.run(Thread.java:745)
{code}

I think there's a tradeoff about where to put the renew logic. The code will be 
cleaner and more consolidated if we put it low down (a single piece of renew 
logic covers multiple cases), but it will also cause the renew method 
({{checkTGTAndReloginFromKeytab}}, sketched below) to be called too often. It's 
not a terribly heavy method, but it still adds some overhead.
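
For reference, the renew call itself is a one-liner; a minimal sketch follows. 
Where exactly to place it is the open question in this thread, and calling it 
before every low-level RPC is the overhead trade-off mentioned above.
{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class BalancerTgtRenewSketch {
  static void beforeNamenodeCall() throws IOException {
    // Re-login from the keytab only if the TGT is close to expiring;
    // cheap when the ticket is still fresh.
    UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
  }
}
{code}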

> Long running Balancer should renew TGT
> --
>
> Key: HDFS-9698
> URL: https://issues.apache.org/jira/browse/HDFS-9698
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, security
>Affects Versions: 2.6.3
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9698.00.patch
>
>
> When the {{Balancer}} runs beyond the configured TGT lifetime, the current 
> logic won't renew TGT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9598) Use heap buffer in PacketReceiver If hdfs client has no enough free direct buffer.

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117906#comment-15117906
 ] 

Colin Patrick McCabe commented on HDFS-9598:


Hmm.  It looks like the code already has a fallback.

{code}
  /**
   * Allocate a direct buffer of the specified size, in bytes.
   * If a pooled buffer is available, returns that. Otherwise
   * allocates a new one.
   */
  public ByteBuffer getBuffer(int size) {
Queue<WeakReference<ByteBuffer>> list = buffersBySize.get(size);
if (list == null) {
  // no available buffers for this size
  return ByteBuffer.allocateDirect(size);
}

WeakReference<ByteBuffer> ref;
while ((ref = list.poll()) != null) {
  ByteBuffer b = ref.get();
  if (b != null) {
return b;
  }
}
  
return ByteBuffer.allocateDirect(size);
  }
{code}

It sounds like what is happening here is that we failed to allocate a direct 
byte buffer.  I'm not sure what we should do in this case...
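
One option, matching the fallback this issue proposes (an assumption, not 
committed behaviour), is to catch the direct-memory OutOfMemoryError and 
degrade to a heap buffer:
{code}
import java.nio.ByteBuffer;

public class BufferAllocation {
  static ByteBuffer allocate(int size) {
    try {
      return ByteBuffer.allocateDirect(size);
    } catch (OutOfMemoryError e) {
      // "Direct buffer memory" OOME: degrade gracefully to a heap buffer
      // instead of failing the read.
      return ByteBuffer.allocate(size);
    }
  }
}
{code}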

> Use heap buffer in PacketReceiver If hdfs client has no enough free direct 
> buffer.
> --
>
> Key: HDFS-9598
> URL: https://issues.apache.org/jira/browse/HDFS-9598
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yong Zhang
>Assignee: Yong Zhang
> Attachments: HDFS-9598.001.patch, HDFS-9598.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9541) Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater than 2 GB

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117909#comment-15117909
 ] 

Colin Patrick McCabe commented on HDFS-9541:


Thanks, [~zhz].

> Add hdfsStreamBuilder API to libhdfs to support defaultBlockSizes greater 
> than 2 GB
> ---
>
> Key: HDFS-9541
> URL: https://issues.apache.org/jira/browse/HDFS-9541
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs
>Affects Versions: 0.20.1
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.9.0
>
> Attachments: HDFS-9541.001.patch, HDFS-9541.002.patch
>
>
> We should have a new API in libhdfs which will support creating files with a 
> default block size that is more than 31 bits in size.  We should also make 
> this a builder API so that it is easy to add more options later.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9611) DiskBalancer : Replace htrace json imports with jackson

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117921#comment-15117921
 ] 

Colin Patrick McCabe commented on HDFS-9611:


Thanks for this, [~anu].  Just one small comment... it seems like it might have 
been more convenient to do on trunk directly?

> DiskBalancer : Replace htrace json imports with jackson
> ---
>
> Key: HDFS-9611
> URL: https://issues.apache.org/jira/browse/HDFS-9611
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Minor
> Fix For: HDFS-1312
>
> Attachments: HDFS-9611-HDFS-1312.001.patch
>
>
> Replace imports with correct json imports.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9610) cmake tests don't fail when they should?

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117935#comment-15117935
 ] 

Colin Patrick McCabe commented on HDFS-9610:


This is not a bug.  {{test_libhdfs_threaded_hdfs_static}} tests that things 
work properly when it tries to access a file that doesn't exist.  That's why 
you see {{java.io.FileNotFoundException: File does not exist: 
/tlhData0001/file1}}.

The error message about an illegal open mode comes from here:
{code}
/* hdfsOpenFile should not accept mode = 3 */
EXPECT_NULL(hdfsOpenFile(fs, paths->file1, 3, 0, 0, 0));
{code}

And so on, and so forth.

We do want to keep those stderr and stdout log messages, since they are needed 
to debug actual failures that might happen.  Maybe we can add a per-test option 
for the pom.xml to suppress stderr output on tests we know are noisy.

This reminds me... Yetus should be keeping the .stderr and .stdout files that 
the CMake test plugin generates... I should file a JIRA for that.

> cmake tests don't fail when they should?
> 
>
> Key: HDFS-9610
> URL: https://issues.apache.org/jira/browse/HDFS-9610
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Allen Wittenauer
>Assignee: James Clampffer
> Attachments: HDFS-9610.HDFS-8707.000.patch, LastTest.log
>
>
> Playing around with adding ctest output support to Yetus, and I stumbled upon 
> a case where the tests throw errors left and right but claim success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9611) DiskBalancer : Replace htrace json imports with jackson

2016-01-26 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117937#comment-15117937
 ] 

Anu Engineer commented on HDFS-9611:


This code is not in trunk yet; it was a mistake made in the disk balancer code 
and fortunately fixed here itself.


> DiskBalancer : Replace htrace json imports with jackson
> ---
>
> Key: HDFS-9611
> URL: https://issues.apache.org/jira/browse/HDFS-9611
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Minor
> Fix For: HDFS-1312
>
> Attachments: HDFS-9611-HDFS-1312.001.patch
>
>
> Replace imports with correct json imports.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15117943#comment-15117943
 ] 

Hadoop QA commented on HDFS-9701:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
134 unchanged - 1 fixed = 135 total (was 135) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 42s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 22s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 128m 6s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
|   | hadoop.hdfs.TestRollingUpgrade |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.TestRecoverStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12784453/HDFS-9701.03.patch |
| JIRA Issue | HDFS-9701 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 759cc5ff23fc 

[jira] [Commented] (HDFS-9641) IOException in hdfs write process causes file leases not released

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118023#comment-15118023
 ] 

Colin Patrick McCabe commented on HDFS-9641:


This seems like a duplicate of HDFS-4504.

> IOException in hdfs write process causes file leases not released
> -
>
> Key: HDFS-9641
> URL: https://issues.apache.org/jira/browse/HDFS-9641
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0, 2.6.1, 2.6.2, 2.6.3
> Environment: hadoop 2.6.0, 
>Reporter: Yongtao Yang
>
> When writing a file, an IOException may be raised in 
> DFSOutputStream.DataStreamer.run(); then 'streamerClosed' may be set to true 
> and closeInternal() will be invoked, where DFSOutputStream.closed is set to 
> true. That is to say, 'closed' is true before DFSOutputStream.close() is 
> invoked, so dfsClient.endFileLease(fileId) will not be executed. The 
> references to the DFSOutputStream objects will still be held in 
> DFSClient.filesBeingWritten until the client quits. The related resources 
> will not be released. HDFS-4504 is a related issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9629) Update the footer of Web UI to show year 2016

2016-01-26 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118045#comment-15118045
 ] 

Xiao Chen commented on HDFS-9629:
-

Thanks Yongjun, created HDFS-9707 for the above.

> Update the footer of Web UI to show year 2016
> -
>
> Key: HDFS-9629
> URL: https://issues.apache.org/jira/browse/HDFS-9629
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Xiao Chen
>Assignee: Xiao Chen
>  Labels: supportability
> Attachments: HDFS-9629.01.patch, HDFS-9629.02.patch, 
> HDFS-9629.03.patch, HDFS-9629.04.patch, HDFS-9629.05.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9707) Add more info to the Web UI footer

2016-01-26 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-9707:
---

 Summary: Add more info to the Web UI footer
 Key: HDFS-9707
 URL: https://issues.apache.org/jira/browse/HDFS-9707
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Xiao Chen
Priority: Minor


As discussed by Yongjun in HDFS-9629, we could display separate dates for the 
build and the release on the Web UI. We could also display license info if needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9698) Long running Balancer should renew TGT

2016-01-26 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118072#comment-15118072
 ] 

Jing Zhao commented on HDFS-9698:
-

Thanks for working on this, Zhe. Looks like we do have the relogin-from-keytab 
logic in the client RPC layer ({{client#handleSaslConnectionFailure}}). Do you know 
why it was not triggered in this failure?
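
For context, here is a minimal sketch (not the attached patch) of how a 
long-running loop can renew its own TGT with the existing UserGroupInformation 
API; {{runOneIteration()}} is a hypothetical placeholder for the Balancer's 
per-round work:

{code}
// Sketch only, assuming the process logged in from a keytab.
UserGroupInformation ugi = UserGroupInformation.getLoginUser();
while (shouldRun) {
  // No-op unless the TGT is close to expiring; otherwise relogs in from the keytab.
  ugi.checkTGTAndReloginFromKeytab();
  runOneIteration();
}
{code}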

> Long running Balancer should renew TGT
> --
>
> Key: HDFS-9698
> URL: https://issues.apache.org/jira/browse/HDFS-9698
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, security
>Affects Versions: 2.6.3
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9698.00.patch
>
>
> When the {{Balancer}} runs beyond the configured TGT lifetime, the current 
> logic won't renew TGT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9610) test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which looks like a failure even though it isn't

2016-01-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9610:
---
Summary: test_libhdfs_threaded_hdfs_static generates a lot of noise on 
stderr which looks like a failure even though it isn't  (was: cmake tests don't 
fail when they should?)

> test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which 
> looks like a failure even though it isn't
> 
>
> Key: HDFS-9610
> URL: https://issues.apache.org/jira/browse/HDFS-9610
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Allen Wittenauer
>Assignee: James Clampffer
> Attachments: HDFS-9610.HDFS-8707.000.patch, LastTest.log
>
>
> Playing around with adding ctest output support to Yetus, and I stumbled upon 
> a case where the tests throw errors left and right but claim success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9610) test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which looks like a failure even though it isn't

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118129#comment-15118129
 ] 

Colin Patrick McCabe commented on HDFS-9610:


Also, the native unit tests in Hadoop include more than just ctest.  Most 
native unit tests are simple binaries that we just run directly.  ctest is just 
a special case of that.

> test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which 
> looks like a failure even though it isn't
> 
>
> Key: HDFS-9610
> URL: https://issues.apache.org/jira/browse/HDFS-9610
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Allen Wittenauer
>Assignee: James Clampffer
> Attachments: HDFS-9610.HDFS-8707.000.patch, LastTest.log
>
>
> Playing around with adding ctest output support to Yetus, and I stumbled upon 
> a case where the tests throw errors left and right but claim success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9654) Code refactoring for HDFS-8578

2016-01-26 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118203#comment-15118203
 ] 

Chris Trezzo commented on HDFS-9654:


Thanks [~szetszwo] for the updated patch. Overall it is looking good! A few 
nits:

1. Nit: the params in the javadoc do not match the params of the method (i.e., the 
new config parameter is missing). These are some examples; there may be others as 
well (a hedged javadoc sketch follows after these nits):

BlockPoolSliceStorage
{noformat}
147  private StorageDirectory loadStorageDirectory(NamespaceInfo nsInfo,
148  File dataDir, StartupOption startOpt, Configuration conf)
149  throws IOException {
{noformat}
{noformat}
209  List<StorageDirectory> loadBpStorageDirectories(NamespaceInfo nsInfo,
210  Collection<File> dataDirs, StartupOption startOpt,
211  Configuration conf) throws IOException {
{noformat}
{noformat}
244  List<StorageDirectory> recoverTransitionRead(NamespaceInfo nsInfo,
245  Collection<File> dataDirs, StartupOption startOpt, Configuration conf)
246  throws IOException {
{noformat}
{noformat}
355  private boolean doTransition(StorageDirectory sd, NamespaceInfo nsInfo,
356  StartupOption startOpt, Configuration conf) throws IOException {
{noformat}
{noformat}
427  private void doUpgrade(final StorageDirectory bpSd,
428  final NamespaceInfo nsInfo, final Configuration conf) throws 
IOException {
{noformat}
{noformat}
658  private static void linkAllBlocks(File fromDir, File toDir,
659  int diskLayoutVersion, Configuration conf) throws IOException {
{noformat}
DataStorage
{noformat}
661  private boolean doTransition(StorageDirectory sd, NamespaceInfo nsInfo,
662  StartupOption startOpt, Configuration conf) throws IOException {
{noformat}
{noformat}
734  void doUpgrade(final StorageDirectory sd, final NamespaceInfo nsInfo,
735  final Configuration conf) throws IOException {
{noformat}

2. Nit: maybe make the doUgrade method name a little more descriptive? How about 
hardLinkAndRename, for example:
{noformat}
469  private void doUgrade(String name, final StorageDirectory bpSd,
470  NamespaceInfo nsInfo, final File bpPrevDir, final File bpTmpDir,
471  final File bpCurDir, final int oldLV, Configuration conf)
472  throws IOException {
{noformat}

3. Nit: there are two checkstyle warnings for the number of parameters in doUgrade. 
If we want to follow checkstyle to the letter, we could move this method into the 
anonymous Callable inner class. If you want, we can fix this in the next patch, 
where we actually parallelize the method; whatever you think is best for this 
patch.

4. Whitespace issue

5. TestDataNodeVolumeFailure.testUnderReplicationAfterVolFailure test failure. 
Is this related?
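
A hedged javadoc sketch for the first nit (illustrative wording only, not part 
of the patch):

{code}
/**
 * Loads one storage directory.
 *
 * @param nsInfo   namespace information
 * @param dataDir  the root path of the storage directory
 * @param startOpt startup option
 * @param conf     configuration (this is the @param that is currently missing)
 * @return the loaded storage directory
 * @throws IOException on failure
 */
private StorageDirectory loadStorageDirectory(NamespaceInfo nsInfo,
    File dataDir, StartupOption startOpt, Configuration conf)
    throws IOException {
  // ...
}
{code}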

Thanks!


> Code refactoring for HDFS-8578
> --
>
> Key: HDFS-9654
> URL: https://issues.apache.org/jira/browse/HDFS-9654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>Priority: Minor
> Attachments: h9654_20160116.patch
>
>
> This is a code refactoring JIRA in order to change Datanode to process all 
> storage/data dirs in parallel; see also HDFS-8578.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Steve Loughran (JIRA)
Steve Loughran created HDFS-9708:


 Summary: FSNamesystem.initAuditLoggers() doesn't trim classnames
 Key: HDFS-9708
 URL: https://issues.apache.org/jira/browse/HDFS-9708
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fs
Affects Versions: 2.8.0
Reporter: Steve Loughran


The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}

What it doesn't do is trim each entry, so if there's a space or newline in the
list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-9708:
-
Priority: Minor  (was: Major)

> FSNamesystem.initAuditLoggers() doesn't trim classnames
> ---
>
> Key: HDFS-9708
> URL: https://issues.apache.org/jira/browse/HDFS-9708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
> from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}
> What it doesn't do is trim each entry, so if there's a space or newline in the
> list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118239#comment-15118239
 ] 

Mingliang Liu commented on HDFS-9708:
-

Will {{conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}} work 
for this case?
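
A minimal sketch of that suggestion (the class names below are made-up 
examples, not from the patch):

{code}
// getTrimmedStringCollection() trims the whitespace around each comma-separated
// entry, so a value such as "org.example.AuditLoggerA,\n org.example.AuditLoggerB"
// still yields loadable class names.
Collection<String> alClasses =
    conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
for (String className : alClasses) {
  // Class.forName(className) now sees "org.example.AuditLoggerB",
  // not " org.example.AuditLoggerB".
}
{code}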

> FSNamesystem.initAuditLoggers() doesn't trim classnames
> ---
>
> Key: HDFS-9708
> URL: https://issues.apache.org/jira/browse/HDFS-9708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
> from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}
> What it doesn't do is trim each entry, so if there's a space or newline in the
> list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9701) DN may deadlock when hot-swapping under load

2016-01-26 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118247#comment-15118247
 ] 

Lei (Eddy) Xu commented on HDFS-9701:
-

Hi, [~xiaochen] 

Thanks a lot for digging into this bug. The patch looks good in general.

A few minor issues:

* Can you move {{TestDatanodeImpl#testRemoveVolumeBeingWritten}} to 
{{TestDataNodeHotSwapVolumes}}?
Also, you don't need two {{try}} statements here (a simplified sketch follows 
after these items):

{code}
try {
  try (ReplicaHandler replica =
  dataset.createRbw(StorageType.DEFAULT, eb, false)) {
{code}

* Could you verify in the test that a block report can still be sent while a 
volume is being removed?

* In {{FsDatasetImpl#removeVolume}}, {{LOG.info("All volumes are removed"); }} 
is not accurate: only one volume is removed at a time, and the removal 
operation has not completed yet (i.e., the ReplicaInfo objects are still stored 
in the DN's memory). Would you mind rephrasing that?

* I realized that {{FsVolumeList#checkDirs}} and {{FsDatasetImpl#checkDirs}} 
share the same "waiting for checkVolumeRemoved()" code; would you mind putting 
it into one function?
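
Regarding the first item, a simplified sketch of the single try-with-resources 
form (sketch only, not the patch):

{code}
// One try-with-resources statement is enough; it closes the ReplicaHandler for us.
try (ReplicaHandler replica =
    dataset.createRbw(StorageType.DEFAULT, eb, false)) {
  // exercise the volume removal while the replica is open
}
{code}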

Thanks!


> DN may deadlock when hot-swapping under load
> 
>
> Key: HDFS-9701
> URL: https://issues.apache.org/jira/browse/HDFS-9701
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9701.01.patch, HDFS-9701.02.patch, 
> HDFS-9701.03.patch
>
>
> If the DN is under load (new blocks being written), a hot-swap task by {{hdfs 
> dfsadmin -reconfig}} may cause a dead lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9708:

Attachment: HDFS-9708.000.patch

> FSNamesystem.initAuditLoggers() doesn't trim classnames
> ---
>
> Key: HDFS-9708
> URL: https://issues.apache.org/jira/browse/HDFS-9708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HDFS-9708.000.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
> from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}
> What it doesn't do is trim each entry, so if there's a space or newline in the
> list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HDFS-9708:
---

Assignee: Mingliang Liu

> FSNamesystem.initAuditLoggers() doesn't trim classnames
> ---
>
> Key: HDFS-9708
> URL: https://issues.apache.org/jira/browse/HDFS-9708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-9708.000.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
> from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}
> What it doesn't do is trim each entry, so if there's a space or newline in the
> list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9708) FSNamesystem.initAuditLoggers() doesn't trim classnames

2016-01-26 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9708:

Status: Patch Available  (was: Open)

> FSNamesystem.initAuditLoggers() doesn't trim classnames
> ---
>
> Key: HDFS-9708
> URL: https://issues.apache.org/jira/browse/HDFS-9708
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Priority: Minor
> Attachments: HDFS-9708.000.patch
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The {{FSNamesystem.initAuditLoggers()}} method reads a list of audit loggers 
> from a call to {{ conf.getStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);}}
> What it doesn't do is trim each entry, so if there's a space or newline in the
> list, the classname is invalid and won't load, and HDFS won't come out to play.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9610) test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which looks like a failure even though it isn't

2016-01-26 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118259#comment-15118259
 ] 

James Clampffer commented on HDFS-9610:
---

bq. this is not a bug. test_libhdfs_threaded_hdfs_static tests that things work 
properly when it tries to access a file that doesn't exist. That's why you see 
java.io.FileNotFoundException: File does not exist: /tlhData0001/file1.
Oh, that makes a lot of sense.  Thanks for the explanation [~cmccabe]!

bq. We do want to keep those stderr and stdout log messages, since they are 
needed to debug actual failures that might happen. Maybe we can add a per-test 
option for the pom.xml to suppress stderr output on tests we know are noisy.
This seems like a good solution to make the log noise a little less 
disconcerting.  If a suppression flag were added, would it be appropriate to 
discard the stderr data when it is set, and rely on someone manually turning it 
off and rerunning the test when there is a failure?  Or should the output 
currently going to stderr always be retained in a separate file for 
investigating failures?

bq. Also, the native unit tests in Hadoop include more than just ctest. Most 
native unit tests are simple binaries that we just run directly. ctest is just 
a special case of that.
Thanks for the pointer; I just checked out 
hadoop-common-project/hadoop-common/pom.xml and saw a few good examples of 
that.  I've been writing little sanity tests for HDFS-8765 and HDFS-9227 on my 
side already so it's good to know I can reuse them once I get around to 
finishing those patches.

> test_libhdfs_threaded_hdfs_static generates a lot of noise on stderr which 
> looks like a failure even though it isn't
> 
>
> Key: HDFS-9610
> URL: https://issues.apache.org/jira/browse/HDFS-9610
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Allen Wittenauer
>Assignee: James Clampffer
> Attachments: HDFS-9610.HDFS-8707.000.patch, LastTest.log
>
>
> Playing around with adding ctest output support to Yetus, and I stumbled upon 
> a case where the tests throw errors left and right but claim success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118269#comment-15118269
 ] 

Colin Patrick McCabe commented on HDFS-9260:


bq. Should I convert storages field to private? (The triplets field was 
protected)

That sounds like a good idea.  We have functions to manipulate these fields, so 
the subclasses don't need to directly poke at the fields.

If that involves code changes to the subclasses, though, let's just do it in a 
follow-on JIRA.  This is one case where checkstyle is not that useful, since 
it's warning about a problem that already exists before this patch (and isn't 
even really a "problem," just an infelicity).

{code}
-  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY 
= "dfs.namenode.replication.max-streams-hard-limit";
+  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY =
+  "dfs.namenode.replication.max-streams-hard-limit";
{code}
Can we skip this change?  It's not really part of this work and it makes the 
diff bigger.

{code}
+  public static final String DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_KEY
+  = "dfs.namenode.storageinfo.efficiency.interval";
+  public static final int DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_DEFAULT
+  = 600;
{code}
Should be something like {{dfs.namenode.storageinfo.defragmenter.interval.ms}} 
to indicate that it's a scan interval, and that it's in milliseconds.

{code}
-  String poolId, StorageBlockReport[] reports, BlockReportContext context)
+  String poolId, StorageBlockReport[] reports, boolean sorted,
+  BlockReportContext context)
{code}
Another unnecessary diff

{code}
+Object[] old = storages;
+storages = new DatanodeStorageInfo[(last+num)];
+System.arraycopy(old, 0, storages, 0, last);
{code}
Now "old" can have type {{DatanodeStorageInfo[]}} rather than {{Object[]}}, 
right?
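
For illustration, the typed version would then be (sketch only):

{code}
DatanodeStorageInfo[] old = storages;
storages = new DatanodeStorageInfo[last + num];
System.arraycopy(old, 0, storages, 0, last);
{code}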

{code}
+  storageInfoMonitorThread.interrupt();
{code}
Maybe this should be something like {{storageInfoDefragmenterThread}}? 
"monitor" suggests something like the block scanner, not defragmentation (at 
least in my mind?)

{code}
import org.apache.hadoop.hdfs.util.TreeSet;
{code}
I think this might be less confusing if you called it {{ChunkedTreeSet}}.  If I 
were just a new developer looking at the code (or even an experienced 
developer), I wouldn't really expect us to be using something called 
{{TreeSet}} which was actually completely different than {{java.util.TreeSet}}.

{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
@@ -246,6 +246,7 @@ message BlockReportRequestProto {
   required string blockPoolId = 2;
   repeated StorageBlockReportProto reports = 3;
   optional BlockReportContextProto context = 4;
+  optional bool sorted = 5 [default = false];
 }
{code}
The other fields in {{BlockReportRequestProto}} have comments explaining what 
they are.  Let's add one for "sorted"

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFSBenchmarks.zip, 
> HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear peoples feedback on this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2016-01-26 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118269#comment-15118269
 ] 

Colin Patrick McCabe edited comment on HDFS-9260 at 1/26/16 11:30 PM:
--

bq. Should I convert storages field to private? (The triplets field was 
protected)

That sounds like a good idea.  We have functions to manipulate these fields, so 
the subclasses don't need to directly poke at the fields.

If that involves code changes to the subclasses, though, let's just do it in a 
follow-on JIRA.  This is one case where checkstyle is not that useful, since 
it's warning about a problem that already exists before this patch (and isn't 
even really a "problem," just an infelicity).

{code}
-  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY 
= "dfs.namenode.replication.max-streams-hard-limit";
+  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY =
+  "dfs.namenode.replication.max-streams-hard-limit";
{code}
Can we skip this change?  It's not really part of this work and it makes the 
diff bigger.

{code}
+  public static final String DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_KEY
+  = "dfs.namenode.storageinfo.efficiency.interval";
+  public static final int DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_DEFAULT
+  = 600;
{code}
Should be something like {{dfs.namenode.storageinfo.defragmenter.interval.ms}} 
to indicate that it's a scan interval, and that it's in milliseconds.

{code}
+Object[] old = storages;
+storages = new DatanodeStorageInfo[(last+num)];
+System.arraycopy(old, 0, storages, 0, last);
{code}
Now "old" can have type {{DatanodeStorageInfo[]}} rather than {{Object[]}}, 
right?

{code}
+  storageInfoMonitorThread.interrupt();
{code}
Maybe this should be something like {{storageInfoDefragmenterThread}}? 
"monitor" suggests something like the block scanner, not defragmentation (at 
least in my mind?)

{code}
import org.apache.hadoop.hdfs.util.TreeSet;
{code}
I think this might be less confusing if you called it {{ChunkedTreeSet}}.  If I 
were just a new developer looking at the code (or even an experienced 
developer), I wouldn't really expect us to be using something called 
{{TreeSet}} which was actually completely different than {{java.util.TreeSet}}.

{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/DatanodeProtocol.proto
@@ -246,6 +246,7 @@ message BlockReportRequestProto {
   required string blockPoolId = 2;
   repeated StorageBlockReportProto reports = 3;
   optional BlockReportContextProto context = 4;
+  optional bool sorted = 5 [default = false];
 }
{code}
The other fields in {{BlockReportRequestProto}} have comments explaining what 
they are.  Let's add one for "sorted"


was (Author: cmccabe):
bq. Should I convert storages field to private? (The triplets field was 
protected)

That sounds like a good idea.  We have functions to manipulate these fields, so 
the subclasses don't need to directly poke at the fields.

If that involves code changes to the subclasses, though, let's just do it in a 
follow-on JIRA.  This is one case where checkstyle is not that useful, since 
it's warning about a problem that already exists before this patch (and isn't 
even really a "problem," just an infelicity).

{code}
-  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY 
= "dfs.namenode.replication.max-streams-hard-limit";
+  public static final String  DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY =
+  "dfs.namenode.replication.max-streams-hard-limit";
{code}
Can we skip this change?  It's not really part of this work and it makes the 
diff bigger.

{code}
+  public static final String DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_KEY
+  = "dfs.namenode.storageinfo.efficiency.interval";
+  public static final int DFS_NAMENODE_STORAGEINFO_EFFICIENCY_INTERVAL_DEFAULT
+  = 600;
{code}
Should be something like {{dfs.namenode.storageinfo.defragmenter.interval.ms}} 
to indicate that it's a scan interval, and that it's in milliseconds.

{code}
-  String poolId, StorageBlockReport[] reports, BlockReportContext context)
+  String poolId, StorageBlockReport[] reports, boolean sorted,
+  BlockReportContext context)
{code}
Another unnecessary diff

{code}
+Object[] old = storages;
+storages = new DatanodeStorageInfo[(last+num)];
+System.arraycopy(old, 0, storages, 0, last);
{code}
Now "old" can have type {{DatanodeStorageInfo[]}} rather than {{Object[]}}, 
right?

{code}
+  storageInfoMonitorThread.interrupt();
{code}
Maybe this should be something like {{storageInfoDefragmenterThread}}? 
"monitor" suggests something like the block scanner, not defragmentation (at 
least in my mind?)

{code}
import org.apache.hadoop.hdfs.util.TreeSet;
{code}
I think this might be 

[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level

2016-01-26 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118276#comment-15118276
 ] 

Ming Ma commented on HDFS-9579:
---

Thanks [~cmccabe]. Yes, it is possible to have a distance of three or five under 
temporary failure scenarios in certain network topologies. I don't know if we 
really need to support that at this point, given that NetworkTopology and the 
topology script are static (they do not change after they are built). Another 
option is to use something like {{bytesReadDistanceOfOneOrTwo}}. Thoughts?
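
To make the bucketing idea concrete, a hypothetical sketch (all names below are 
illustrative, not the patch's API) of a per-distance counter that 
{{DFSInputStream}} could update:

{code}
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch only. Distances beyond the last bucket fold into it.
class BytesReadByDistance {
  private final AtomicLongArray bytes = new AtomicLongArray(5);

  /** Attribute {@code len} bytes read to the bucket for the given network distance. */
  void increment(int distance, long len) {
    int bucket = Math.min(distance, bytes.length() - 1);
    bytes.addAndGet(bucket, len);
  }

  long get(int distance) {
    return bytes.get(Math.min(distance, bytes.length() - 1));
  }
}
{code}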

> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -
>
> Key: HDFS-9579
> URL: https://issues.apache.org/jira/browse/HDFS-9579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, 
> HDFS-9579.patch, MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight 
> as to the traffic volume for each network distance to distinguish cross-DC 
> traffic, local-DC-remote-rack, etc.
> FileSystem's existing {{bytesRead}} metrics tracks all the bytes read. To 
> provide additional metrics for each network distance, we can add additional 
> metrics to FileSystem level and have {{DFSInputStream}} update the value 
> based on the network distance between client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its 
> initialization. It doesn't need to resolve datanode's network location for 
> each read as {{DatanodeInfo}} already has the info.
> There are existing HDFS specific metrics such as {{ReadStatistics}} and 
> {{DFSHedgedReadMetrics}}. But these metrics are only accessible via 
> {{DFSClient}} or {{DFSInputStream}}. Not something that application framework 
> such as MR and Tez can get to. That is the benefit of storing these new 
> metrics in FileSystem.Statistics.
> This jira only includes metrics generation by HDFS. The consumption of these 
> metrics at MR and Tez will be tracked by separated jiras.
> We can add similar metrics for HDFS write scenario later if it is necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

