[jira] [Reopened] (HDFS-7346) Erasure Coding: perform stripping erasure encoding work given block reader and writer

2016-01-13 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo reopened HDFS-7346:
-

> Erasure Coding: perform stripping erasure encoding work given block reader 
> and writer
> -
>
> Key: HDFS-7346
> URL: https://issues.apache.org/jira/browse/HDFS-7346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
>
> Assuming the facilities like block reader and writer are ready, this 
> implements and performs erasure encoding work in the *stripping* case, 
> utilizing the erasure codec and coder provided by the codec framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095978#comment-15095978
 ] 

Kai Zheng commented on HDFS-8430:
-

Well, I may wait a few more days for comments. Anyhow, I would proceed next 
week and provide a formal patch as I summarized above. In the initial 
version, very probably: 
* for the new API {{getFileChecksum}}, it may try distributing the computing 
task to DataNodes to avoid network congestion in the client, as Nicholas said; 
* it will use the current MD5MD5CRC32 approach rather than CRC64, leaving that 
for a subsequent revision or follow-on task according to review comments.

> Erasure coding: compute file checksum for stripe files
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7346) Erasure Coding: perform stripping erasure encoding work given block reader and writer

2016-01-13 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7346:

Release Note:   (was: The jira is very old and close it because we'll not 
handle it in the near future.)

> Erasure Coding: perform stripping erasure encoding work given block reader 
> and writer
> -
>
> Key: HDFS-7346
> URL: https://issues.apache.org/jira/browse/HDFS-7346
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Kai Zheng
>Assignee: Li Bo
>
> Assuming the facilities like block reader and writer are ready, this 
> implements and performs erasure encoding work in the *stripping* case, 
> utilizing the erasure codec and coder provided by the codec framework.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096192#comment-15096192
 ] 

Daniel Templeton commented on HDFS-9415:


' * ' is valid?  I thought only one space was allowed, or at least specified to 
be allowed.

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations

2016-01-13 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9643:
--
Attachment: HDFS-9643.HDFS-8707.000.patch

Initial patch. I've manually tested it, but I need to sort out the failures 
hidden by HDFS-9610 before I can write decent unit tests.

Open questions:
-Right now the cancel logic is added directly to each continuation in the 
remote block reader.  On one hand this is simple and works; on the other, it's 
boilerplate code.  Is this worth pushing into the continuation pipeline code at 
the moment?  I think it's worth keeping it simple until NN operations become 
cancelable.

-In this implementation FileHandle::CancelOperations is irreversible and 
prevents the FileHandle from being used again.  Can anyone think of a reason not 
to have it also close the file, or at least clear the vector?

-Should the FileHandle have a callback for when it knows that there are no 
pending operations?  It should be possible to just check the reference count on 
the CancelHandle to verify.
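
For discussion's sake, here is a rough sketch (in Java, not the actual 
libhdfs++ C++ classes; all names are made up) of the pattern described above: a 
shared cancel handle that each continuation checks before running, plus a 
pending-operation count that could drive a "no pending operations" callback.

{code}
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the cancellation pattern discussed above.
class CancelHandle {
  private final AtomicBoolean cancelled = new AtomicBoolean(false);
  private final AtomicInteger pendingOps = new AtomicInteger(0);

  void cancel() { cancelled.set(true); }            // irreversible, like CancelOperations
  boolean isCancelled() { return cancelled.get(); }

  void opStarted() { pendingOps.incrementAndGet(); }
  // Returns true when the last pending operation finishes, i.e. the point
  // where a "no pending operations" callback could be fired.
  boolean opFinished() { return pendingOps.decrementAndGet() == 0; }
}

// Each continuation carries the same boilerplate check before doing its work.
class ReadContinuation {
  private final CancelHandle handle;
  ReadContinuation(CancelHandle handle) { this.handle = handle; }

  void run(Runnable next) {
    if (handle.isCancelled()) {
      return;          // drop out of the pipeline as quickly as possible
    }
    next.run();        // otherwise hand off to the next stage
  }
}
{code}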

> libhdfs++: Support async cancellation of read operations
> 
>
> Key: HDFS-9643
> URL: https://issues.apache.org/jira/browse/HDFS-9643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9643.HDFS-8707.000.patch
>
>
> It should be possible for any thread to cancel operations in progress on a 
> FileHandle.  Any ephemeral objects created by the FileHandle should free 
> resources as quickly as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode

2016-01-13 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096827#comment-15096827
 ] 

Chris Trezzo commented on HDFS-1312:


I will dial into the call as well. Thanks for posting.

> Re-balance disks within a Datanode
> --
>
> Key: HDFS-1312
> URL: https://issues.apache.org/jira/browse/HDFS-1312
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Reporter: Travis Crawford
>Assignee: Anu Engineer
> Attachments: Architecture_and_testplan.pdf, disk-balancer-proposal.pdf
>
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations 
> where certain disks are full while others are significantly less used. Users 
> at many different sites have experienced this issue, and HDFS administrators 
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles and filling 
> disks at roughly the same rate. Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode. 
> In write-heavy environments this will still make use of all spindles, 
> equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are 
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is 
> not needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations

2016-01-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096845#comment-15096845
 ] 

Andrew Wang commented on HDFS-9624:
---

Hi [~linyiqun] thanks for revving the patch,

First two fixes look good, but the test needs a little more work. What I meant 
by timer injection is something like the org.apache.hadoop.util.Timer class, it 
lets you explicitly advance the time rather than having to wait for the system 
clock to advance. This means the test will run in milliseconds instead of 
seconds, which is a lot faster.

There are some examples of how to mock Timer in other unit tests; let me know 
if it's still unclear, though.
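
For reference, a minimal sketch of what I mean, mocking 
{{org.apache.hadoop.util.Timer}} with Mockito so the test advances the clock 
itself (the {{CachedValue}} class below is just a stand-in, not the code in the 
patch):

{code}
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.apache.hadoop.util.Timer;
import org.junit.Test;

public class TestTimerInjectionSketch {

  /** Stand-in component: caches a value and reports it stale after an interval. */
  static class CachedValue {
    private final Timer timer;
    private final long refreshIntervalMs;
    private final long lastRefreshMs;

    CachedValue(Timer timer, long refreshIntervalMs) {
      this.timer = timer;
      this.refreshIntervalMs = refreshIntervalMs;
      this.lastRefreshMs = timer.monotonicNow();
    }

    boolean isStale() {
      return timer.monotonicNow() - lastRefreshMs >= refreshIntervalMs;
    }
  }

  @Test
  public void testStalenessWithInjectedTimer() {
    // The mocked Timer lets the test advance time explicitly instead of sleeping.
    Timer timer = mock(Timer.class);
    when(timer.monotonicNow()).thenReturn(0L, 1000L, 600000L);

    CachedValue value = new CachedValue(timer, 600000L);
    assertFalse(value.isStale());  // 1 000 ms elapsed, below the 600 000 ms interval
    assertTrue(value.isStale());   // 600 000 ms elapsed, interval reached
  }
}
{code}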

> DataNode start slowly due to the initial DU command operations
> --
>
> Key: HDFS-9624
> URL: https://issues.apache.org/jira/browse/HDFS-9624
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, 
> HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch
>
>
> The datanode seems to start very slowly when I finish migrating datanodes 
> and restart them. Looking at the DN logs:
> {code}
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK
> 2016-01-06 16:05:08,176 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Registered FSDatasetState MBean
> 2016-01-06 16:05:08,177 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544
> 2016-01-06 16:05:08,178 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data2/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data3/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data4/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data5/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data6/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data7/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data8/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data9/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data10/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data11/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data12/current...
> 2016-01-06 16:09:49,646 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data7/current: 281466ms
> 2016-01-06 16:09:54,235 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> 

[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations

2016-01-13 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9643:
--
Status: Patch Available  (was: Open)

> libhdfs++: Support async cancellation of read operations
> 
>
> Key: HDFS-9643
> URL: https://issues.apache.org/jira/browse/HDFS-9643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9643.HDFS-8707.000.patch
>
>
> It should be possible for any thread to cancel operations in progress on a 
> FileHandle.  Any ephemeral objects created by the FileHandle should free 
> resources as quickly as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-6221) Webhdfs should recover from dead DNs

2016-01-13 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee resolved HDFS-6221.
--
Resolution: Not A Problem

> Webhdfs should recover from dead DNs
> 
>
> Key: HDFS-6221
> URL: https://issues.apache.org/jira/browse/HDFS-6221
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>
> We've repeatedly observed the jetty acceptor thread silently dying in the 
> DNs.  The webhdfs servlet may also "disappear" and jetty returns non-json 
> 404s.
> One approach to make webhdfs more resilient to bad DNs is dfsclient-like 
> fetching of block locations to directly access the DNs instead of relying on 
> a NN redirect that may repeatedly send the client to the same faulty DN(s).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9643) libhdfs++: Support async cancellation of read operations

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096969#comment-15096969
 ] 

Hadoop QA commented on HDFS-9643:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 7m 35s 
{color} | {color:red} Docker failed to build yetus/hadoop:0cf5e66. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12782121/HDFS-9643.HDFS-8707.000.patch
 |
| JIRA Issue | HDFS-9643 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14115/console |


This message was automatically generated.



> libhdfs++: Support async cancellation of read operations
> 
>
> Key: HDFS-9643
> URL: https://issues.apache.org/jira/browse/HDFS-9643
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-9643.HDFS-8707.000.patch
>
>
> It should be possible for any thread to cancel operations in progress on a 
> FileHandle.  Any ephemeral objects created by the FileHandle should free 
> resources as quickly as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9613) Avoid checking file checksums after copy when possible

2016-01-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096784#comment-15096784
 ] 

Yongjun Zhang commented on HDFS-9613:
-

Hi [~drankye],

Thanks for clarifying, and sorry for my delayed reply; I was stuck with a 
critical issue. I did not have time to do a very thorough review, but here are 
some comments.

# Good idea to separate out the clean-up code (including most of the changes to 
import statements) into a different jira, HDFS-9630. I suggest pruning the patch 
of this jira to only address the checksum checking change.
# It seems the conditions to check in deciding whether checksum comparison is 
needed are:
## cond0: whether the file system supports checksums
## cond1: skipCrc
## cond2: fileAttributes.contains(FileAttribute.CHECKSUMTYPE)
## cond3: fileAttributes.contains(FileAttribute.BLOCKSIZE)
## cond4: (sourceFileStatus.getBlockSize() == 
targetFS.getDefaultBlockSize(targetPath))
# Some derived logic:
## !cond0 ==> cond1 will be ignored
## if cond2 is true, we still compare checksums even if cond1 is true, which is 
not intuitive. We should issue a warning message at the parameter-checking stage.
## if cond2 implies cond3, we probably need to enforce that cond3 is true when 
cond2 is true at parameter-checking time, but this enforcement may not be 
backward compatible. At the least we need to issue a warning message.
## cond3 ==> cond4

The combined logic may be:

* boolean needToCompareChecksum = cond0 && ((!cond1) || cond2) && (cond3 || 
cond4);

I may be wrong here, but wonder if this makes sense. 
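
For illustration, the same combined condition as a small Java method (the 
parameters are just placeholders for cond0..cond4, not actual distcp fields):

{code}
/**
 * Sketch of the combined condition above; the parameters map to cond0..cond4.
 */
static boolean needToCompareChecksum(boolean fsSupportsChecksum,   // cond0
                                     boolean skipCrc,              // cond1
                                     boolean preserveChecksumType, // cond2
                                     boolean preserveBlockSize,     // cond3
                                     boolean sameBlockSize) {       // cond4
  return fsSupportsChecksum
      && (!skipCrc || preserveChecksumType)
      && (preserveBlockSize || sameBlockSize);
}
{code}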

Hi [~jingzhao], thanks for your earlier comment, welcome to discuss further.

Thanks.


> Avoid checking file checksums after copy when possible
> --
>
> Key: HDFS-9613
> URL: https://issues.apache.org/jira/browse/HDFS-9613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>Priority: Minor
> Attachments: HDFS-9613-v1.patch, HDFS-9613-v2.patch
>
>
> While working on a related issue, it was noticed that there are some places in 
> {{distcp}} that would be better improved and cleaned up. Particularly, after a 
> file is copied to the target cluster, distcp checks whether the copied file is 
> fine or not. For replicated files, when checking, if the source block size and 
> checksum option are not preserved while copying, we can avoid comparing the 
> file checksums, which may save some time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9094) Add command line option to ask NameNode reload configuration.

2016-01-13 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9094:

Attachment: HDFS-9094-HDFS-9000.004.patch

> Add command line option to ask NameNode reload configuration.
> -
>
> Key: HDFS-9094
> URL: https://issues.apache.org/jira/browse/HDFS-9094
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9094-HDFS-9000.002.patch, 
> HDFS-9094-HDFS-9000.003.patch, HDFS-9094-HDFS-9000.004.patch, 
> HDFS-9094.001.patch
>
>
> This work is going to add DFS admin command that allows reloading NameNode 
> configuration. This is sibling work related to HDFS-6808.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096950#comment-15096950
 ] 

Xiaobing Zhou commented on HDFS-9415:
-

Yes, that's valid. There is one test case in 
TestAccessControlList#testWildCardAccessControlList.

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6221) Webhdfs should recover from dead DNs

2016-01-13 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096830#comment-15096830
 ] 

Kihwal Lee commented on HDFS-6221:
--

bq. We've repeatedly observed the jetty acceptor thread silently dying in the 
DNs
After converting DN to use netty, this is no longer a problem. jetty is still 
there, but only handles infrequent non-webhdfs http requests.

> Webhdfs should recover from dead DNs
> 
>
> Key: HDFS-6221
> URL: https://issues.apache.org/jira/browse/HDFS-6221
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, webhdfs
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>
> We've repeatedly observed the jetty acceptor thread silently dying in the 
> DNs.  The webhdfs servlet may also "disappear" and jetty returns non-json 
> 404s.
> One approach to make webhdfs more resilient to bad DNs is dfsclient-like 
> fetching of block locations to directly access the DNs instead of relying on 
> a NN redirect that may repeatedly send the client to the same faulty DN(s).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load

2016-01-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096871#comment-15096871
 ] 

Andrew Wang commented on HDFS-9635:
---

Hmm, so to clarify, do you plan to extend AvailableSpaceVolumeChoosingPolicy 
with IO load information, or write a new policy? I'd like to see it in ASVCP if 
possible, and if you're already using this policy by default, it sounds like 
this would work for you too.

As you mention, IO wait is a great way of measuring load on a disk. We can try 
to collect it in HDFS, but the OS also exposes IO wait information (e.g. 
iostat). IMO the OS info is better since it's more complete. The OS is aware of 
the actual writes to disk, whereas HDFS's writes are buffered by the page cache. 
Also, HDFS's IO wait info will only be as up-to-date as the last time it wrote, 
which is an issue when HDFS shares disks with other apps like MR (common).

In any case, I'm sure there'll be some experimentation to find the right 
signals and thresholds. Looking forward to your findings!
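
Just to make the idea concrete, a toy sketch of a least-loaded chooser driven 
by an xceiver-count signal (this is not the real {{VolumeChoosingPolicy}} 
interface; the classes and fields below are invented for illustration):

{code}
import java.util.List;

/** Toy model of a volume; the real FsVolumeSpi API differs. */
class VolumeInfo {
  final String path;
  final int activeXceivers;   // hypothetical per-volume IO-load signal
  final long availableBytes;

  VolumeInfo(String path, int activeXceivers, long availableBytes) {
    this.path = path;
    this.activeXceivers = activeXceivers;
    this.availableBytes = availableBytes;
  }
}

class LeastLoadedVolumeChooser {
  /** Pick the volume with the fewest active xceivers that can hold the replica. */
  VolumeInfo choose(List<VolumeInfo> volumes, long replicaSize) {
    VolumeInfo best = null;
    for (VolumeInfo v : volumes) {
      if (v.availableBytes < replicaSize) {
        continue;             // skip volumes without enough space
      }
      if (best == null
          || v.activeXceivers < best.activeXceivers
          || (v.activeXceivers == best.activeXceivers
              && v.availableBytes > best.availableBytes)) {
        best = v;             // fewer xceivers wins; free space breaks ties
      }
    }
    return best;              // null if no volume has room
  }
}
{code}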

> Add one more volume choosing policy with considering volume IO load
> ---
>
> Key: HDFS-9635
> URL: https://issues.apache.org/jira/browse/HDFS-9635
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>
> We have RoundRobinVolumeChoosingPolicy and 
> AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load.
> This jira will add one more volume choosing policy, based on the xceiver 
> count on each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096997#comment-15096997
 ] 

Arpit Agarwal commented on HDFS-9415:
-

The test case does make it explicit that we accept {{"*"}} as a valid wildcard. 
I see no harm in documenting it if we are going to document the wildcard 
behavior.

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9094) Add command line option to ask NameNode reload configuration.

2016-01-13 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096858#comment-15096858
 ] 

Xiaobing Zhou commented on HDFS-9094:
-

Thanks [~arpitagarwal] for review, patch V004 fixed the issues.

> Add command line option to ask NameNode reload configuration.
> -
>
> Key: HDFS-9094
> URL: https://issues.apache.org/jira/browse/HDFS-9094
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9094-HDFS-9000.002.patch, 
> HDFS-9094-HDFS-9000.003.patch, HDFS-9094-HDFS-9000.004.patch, 
> HDFS-9094.001.patch
>
>
> This work is going to add DFS admin command that allows reloading NameNode 
> configuration. This is sibling work related to HDFS-6808.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load

2016-01-13 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096928#comment-15096928
 ] 

Anu Engineer commented on HDFS-9635:


[~java8964] Thanks for the comments. I am looking at this from the perspective 
of HDFS-1312. Both HDFS-8538 and HDFS-1312 try to minimize the internal disk 
usage imbalance, that is, disks holding different amounts of data. However, if 
we *only* use volume IO as the criterion for selecting where a block will be 
placed, then we might rapidly run into an issue where a set of small writes all 
go to one disk and some large writes go to another. To avoid that scenario, 
would it not make sense to combine this with HDFS-1804 and solve HDFS-8538? 
Just as placing a block without considering I/O is not very efficient (as this 
JIRA illustrates), I think a block placement policy that does not consider how 
much space is left on a volume can create other forms of inefficiency. In fact, 
we might end up with two incomplete solutions instead of one whole working 
solution. As for OS-level I/O, that is an aspirational goal; it is more than 
enough if we consider only the HDFS I/O happening to a volume. Would you please 
let me know if I am missing any use case for your clusters if we just solve 
HDFS-8538?


> Add one more volume choosing policy with considering volume IO load
> ---
>
> Key: HDFS-9635
> URL: https://issues.apache.org/jira/browse/HDFS-9635
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>
> We have RoundRobinVolumeChoosingPolicy and 
> AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load.
> This jira will add one more volume choosing policy, based on the xceiver 
> count on each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096968#comment-15096968
 ] 

Daniel Templeton commented on HDFS-9415:


Just because there's a test case, it doesn't mean it's a valid configuration.  
In the case of ' * ', the string is split on the first space, giving a user of 
'' and a group of '* '.  The group is then trimmed before splitting on comma, 
giving groups of \['*'\].  That's a long way to say that ' * ' is a poorly 
formatted version of ' *' and hence should not be mentioned in the docs.
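
For anyone following along, a tiny sketch of the split-then-trim behavior 
described above (not the actual {{AccessControlList}} code):

{code}
public class AclParseSketch {
  public static void main(String[] args) {
    String aclString = " * ";
    // Split on the first space: the user part is "" and the remainder is "* ".
    String[] userGroup = aclString.split(" ", 2);
    String users = userGroup[0];               // ""
    String groups = userGroup[1].trim();       // "* " trimmed to "*"
    String[] groupList = groups.split(",");    // ["*"]
    System.out.println("users='" + users + "', groups='" + groupList[0] + "'");
  }
}
{code}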

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-9648:
-

 Summary: Test TestStartup.testImageChecksum keeps failing 
 Key: HDFS-9648
 URL: https://issues.apache.org/jira/browse/HDFS-9648
 Project: Hadoop HDFS
  Issue Type: Bug
 Environment: Jenkins
Reporter: Wei-Chiu Chuang


I saw in the Jenkins log that TestStartup.testImageChecksum has failed 5 times 
in a row.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Description: 
I saw in the Jenkins log that TestStartup.testImageChecksum has failed 5 times 
in a row.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/

It seems HDFS-9569 by Yongjun changed the exception message, and this test was 
looking for the exact message.

  was:
I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
consecutively 5 times.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/


> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>
> I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
> consecutively 5 times.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
> looking for the exact message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097695#comment-15097695
 ] 

Hadoop QA commented on HDFS-9624:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
28s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 22s 
{color} | {color:red} Patch generated 4 new checkstyle issues in 
hadoop-hdfs-project/hadoop-hdfs (total was 543, now 546). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 55s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 7s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 127m 56s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion |
|   | hadoop.hdfs.server.namenode.TestStartup |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| JDK v1.7.0_91 Failed junit tests | hadoop.hdfs.server.namenode.TestStartup |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 

[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097033#comment-15097033
 ] 

Daniel Templeton commented on HDFS-9415:


My concern is that we clearly state that the correct format is 'users groups', 
but then say that ' * ' is also valid, which doesn't follow that format.  I 
don't see how that can improve clarity.  If what you want to say is that 
trailing spaces are allowed, then say that instead.  (It's probably not a bad 
thing to add in any case.)

I will yield on this one.  It's not worth arguing over 7 characters. :)

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9094) Add command line option to ask NameNode reload configuration.

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097143#comment-15097143
 ] 

Hadoop QA commented on HDFS-9094:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 35s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 34s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s 
{color} | {color:red} Patch generated 16 new checkstyle issues in 
hadoop-hdfs-project (total was 307, now 315). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m 43s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 7s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 167m 43s {color} 
| {color:black} {color} |
\\
\\
|| 

[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9646:

Attachment: test-reconstruct-stripe-file.patch

Uploading the unit test from [~tfukudom] that reproduces the issue.

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097195#comment-15097195
 ] 

Arpit Agarwal commented on HDFS-9415:
-

bq. If what you want to say is that trailing spaces are allowed, then say that 
instead. (It's probably not a bad thing to add in any case.)
I never talked about trailing spaces. :-) But I see what you are saying now. 
The Jira font made it easy to miss the surrounding spaces in your comment. I 
agree it's fair to omit that one.
[~xiaobingo], do you want to post an updated patch that removes the {{" * "}} 
wildcard option?

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Assignee: (was: Wei-Chiu Chuang)

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>
> I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
> consecutively 5 times.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
> looking for the exact message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-9648:
-

Assignee: Wei-Chiu Chuang

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
> consecutively 5 times.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
> looking for the exact message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Description: 
I saw in the Jenkins log that TestStartup.testImageChecksum has failed 5 times 
in a row.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/

It seems HDFS-9569 by Yongjun changed the exception message, and this test was 
looking for the exact message.

Expected to find 'Failed to load an FSImage file!' but got unexpected 
exception: java.io.IOException: Failed to load FSImage file, see error(s) above 
for more info.

  was:
I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
consecutively 5 times.

https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/

Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
looking for the exact message.


> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>
> I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
> consecutively 5 times.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Attachment: HDFS-9648.001.patch

Rev01: match the new exception message.
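
In case it helps review, the gist is simply asserting on a stable substring of 
the new message; a self-contained illustration (the real TestStartup assertion 
may be written differently):

{code}
import java.io.IOException;

import org.apache.hadoop.test.GenericTestUtils;
import org.junit.Test;

public class ExceptionMessageAssertSketch {
  @Test
  public void testMatchesNewMessage() {
    // The message produced since HDFS-9569; the test should look for a stable
    // substring of it rather than the old "Failed to load an FSImage file!".
    IOException ioe = new IOException(
        "Failed to load FSImage file, see error(s) above for more info.");
    GenericTestUtils.assertExceptionContains("Failed to load FSImage file", ioe);
  }
}
{code}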

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
> Attachments: HDFS-9648.001.patch
>
>
> I saw the Jenkins log shows TestStartup.testImageChecksum has been failing 
> consecutively 5 times.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> Seems like HDFS-9569 by Yongjun changed exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files

2016-01-13 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097645#comment-15097645
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8430:
---

[~drankye], sorry for the late reply.  Your suggestion sounds good in general.  
Some minor comments:

> First, add a new API like getFileChecksum(int cell) using the New Algorithm 
> 2. ...

It is better to add the new API as getFileChecksum(String algorithm) since it 
is more general and more in sync with the Java API such as MessageDigest.  We 
don't want to change/modify the FileSystem API further if we want to support 
different algorithms in the future.

We may need another FileSystem API supportFileChecksum(String algorithm) for 
distcp or other tools to check if a particular algorithm is supported; see 
below.

> distcp will be updated to favor the new APIs and use the two APIs 
> appropriately. ...

distcp probably needs to first check whether the same algorithm is supported in 
both the source and the destination clusters.  If they don't support the same 
algorithm, it may fall back to using the file length.
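
To make that concrete, a hypothetical sketch of the API shape and the distcp 
check being discussed (none of this exists in FileSystem today; all names are 
placeholders):

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.Path;

/** Hypothetical additions under discussion; not part of the FileSystem API today. */
interface ChecksumAlgorithmAware {
  /** Compute the file checksum with the named algorithm, e.g. "MD5MD5CRC32". */
  FileChecksum getFileChecksum(Path path, String algorithm) throws IOException;

  /** Whether this file system supports the named checksum algorithm. */
  boolean supportFileChecksum(String algorithm);
}

/** Sketch of how distcp could decide whether checksums are comparable. */
final class DistCpChecksumCheckSketch {
  static boolean canCompareChecksums(ChecksumAlgorithmAware srcFs,
                                     ChecksumAlgorithmAware dstFs,
                                     String algorithm) {
    // Compare checksums only when both clusters support the same algorithm;
    // otherwise fall back to comparing file lengths, as suggested above.
    return srcFs.supportFileChecksum(algorithm)
        && dstFs.supportFileChecksum(algorithm);
  }
}
{code}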

Thanks a lot!

> Erasure coding: compute file checksum for stripe files
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097717#comment-15097717
 ] 

Hadoop QA commented on HDFS-8999:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 41s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 38s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 53s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 53s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 48s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 40s 
{color} | {color:red} Patch generated 7 new checkstyle issues in 
hadoop-hdfs-project (total was 1043, now 1044). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 16s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 18s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 180m 40s {color} 
| {color:black} {color} |
\\
\\
|| 

[jira] [Assigned] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reassigned HDFS-9648:
-

Assignee: Wei-Chiu Chuang

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Labels: test  (was: )

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
>  Labels: test
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Priority: Trivial  (was: Major)

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Affects Version/s: 3.0.0

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Status: Patch Available  (was: Open)

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9648) Test TestStartup.testImageChecksum keeps failing

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9648:
--
Component/s: namenode

> Test TestStartup.testImageChecksum keeps failing 
> -
>
> Key: HDFS-9648
> URL: https://issues.apache.org/jira/browse/HDFS-9648
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
>  Labels: test
> Attachments: HDFS-9648.001.patch
>
>
> I saw in the Jenkins log that TestStartup.testImageChecksum has been failing 
> for 5 consecutive runs.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2724/testReport/org.apache.hadoop.hdfs.server.namenode/TestStartup/testImageChecksum/
> It seems that HDFS-9569 by Yongjun changed the exception message, and this test was 
> looking for the exact message.
> Expected to find 'Failed to load an FSImage file!' but got unexpected 
> exception:java.io.IOException: Failed to load FSImage file, see error(s) 
> above for more info.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8430) Erasure coding: compute file checksum for stripe files

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097692#comment-15097692
 ] 

Kai Zheng commented on HDFS-8430:
-

Thanks Nicholas for the great elaboration and confirmation. I think the 
comments have resolved all my concerns and questions so far. I will surely 
proceed soon. Wish you a nice day!

> Erasure coding: compute file checksum for stripe files
> --
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: HDFS-7285
>Reporter: Walter Su
>Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for striped 
> block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9612) DistCp worker threads are not terminated after jobs are done.

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-9612:
--
Attachment: HDFS-9612.007.patch

Rev07:
Thanks [~yzhangal] for the comments. I uploaded a patch that addresses most issues. 
The slf4j change was used in conjunction with GenericTestUtils.setLogLevel to 
set the log level to DEBUG. GenericTestUtils.setLogLevel is a useful tool, but 
unfortunately it requires slf4j. It is not a necessary part of the fix, so I 
removed those changes.

About the tests, they use GenericTestUtils.waitForThreadTermination(), which 
periodically checks whether there are any threads whose name matches the 
pattern "pool-.*thread.*" (a regular expression). These are the threads 
created by ExecutorService. If the fix works, those threads should terminate 
right away after ProducerConsumer.shutdown() is called.
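
For illustration, a minimal standalone sketch of the kind of check described above 
(enumerate live threads and match their names against the regex) is below. It is 
not the actual GenericTestUtils.waitForThreadTermination() implementation, and the 
class name and timeout values are made up:

{code:java}
import java.util.regex.Pattern;

// Illustrative sketch only, not the GenericTestUtils implementation.
public class ThreadLeakCheck {
  /** Returns true if any live thread's name matches the given regex. */
  static boolean anyThreadMatches(String regex) {
    Pattern p = Pattern.compile(regex);
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t.isAlive() && p.matcher(t.getName()).matches()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Threads created by the default ExecutorService thread factory are named
    // "pool-N-thread-M", so this pattern catches leaked worker threads.
    String pattern = "pool-.*thread.*";
    long deadline = System.currentTimeMillis() + 5000;
    while (anyThreadMatches(pattern) && System.currentTimeMillis() < deadline) {
      Thread.sleep(100); // poll until the workers terminate or we time out
    }
    System.out.println("leaked worker threads present: " + anyThreadMatches(pattern));
  }
}
{code}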

> DistCp worker threads are not terminated after jobs are done.
> -
>
> Key: HDFS-9612
> URL: https://issues.apache.org/jira/browse/HDFS-9612
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch, 
> HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch, 
> HDFS-9612.006.patch, HDFS-9612.007.patch
>
>
> In HADOOP-11827, a producer-consumer style thread pool was introduced to 
> parallelize the task of listing files/directories.
> We have a use case where a distcp job is run during the commit phase of a MR2 
> job. However, it was found distcp does not terminate ProducerConsumer thread 
> pools properly. Because threads are not terminated, those MR2 jobs never 
> finish.
> In a more typical use case where distcp is run as a standalone job, those 
> threads are terminated forcefully when the java process is terminated. So 
> these leaked threads did not become a problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9415) Document dfs.cluster.administrators and dfs.permissions.superusergroup

2016-01-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097024#comment-15097024
 ] 

Arpit Agarwal commented on HDFS-9415:
-

Also IME customers routinely skip the trailing space.

{code}
<property>
  <name>dfs.cluster.administrators</name>
  <value>hdfs</value>
</property>
{code}
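
For comparison only (not part of the patch), a value that names both an admin user 
and an admin group would follow the usual Hadoop ACL form, with users before the 
single separating space and groups after it; the group name below is made up:

{code}
<property>
  <name>dfs.cluster.administrators</name>
  <!-- users come before the single space, groups after it -->
  <value>hdfs hdfsadmingroup</value>
</property>
{code}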

I plan to commit this patch later today.

> Document dfs.cluster.administrators and dfs.permissions.superusergroup
> --
>
> Key: HDFS-9415
> URL: https://issues.apache.org/jira/browse/HDFS-9415
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Arpit Agarwal
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9415.001.patch, HDFS-9415.002.patch, 
> HDFS-9415.003.patch
>
>
> dfs.cluster.administrators and dfs.permissions.superusergroup documentation 
> is not clear enough.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097201#comment-15097201
 ] 

Wei-Chiu Chuang commented on HDFS-9466:
---

[~cmccabe] Xiao is right about what I thought. It does appear there is a race. 
From your perspective, do you think that's by design, or an unintended bug 
in the code?

Thanks for the reviews!

> TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
> 
>
> Key: HDFS-9466
> URL: https://issues.apache.org/jira/browse/HDFS-9466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, hdfs-client
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch
>
>
> This test is flaky and fails quite frequently in trunk.
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684)
> {noformat}
> Thanks to [~xiaochen] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097200#comment-15097200
 ] 

Jing Zhao commented on HDFS-9646:
-

The failure can also be reproduced with the following change on 
{{TestRecoverStripedFile}}:
{code}
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestRecoverStripedFile.java
@@ -212,7 +212,7 @@ private void assertFileBlocksRecovery(String fileName, int 
fileLen,
 
 int[] toDead = new int[toRecoverBlockNum];
 int n = 0;
-for (int i = 0; i < indices.length; i++) {
+for (int i = indices.length - 1; i >= 0; i--) {
   if (n < toRecoverBlockNum) {
 if (recovery == 0) {
   if (indices[i] >= dataBlkNum) {
{code}

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097214#comment-15097214
 ] 

Jing Zhao edited comment on HDFS-9646 at 1/13/16 10:59 PM:
---

{{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the 
first internal block to decide whether to continue the recovery work:
{code}
long firstStripedBlockLength = getBlockLen(blockGroup, 0);
while (positionInBlock < firstStripedBlockLength) {
{code}

However, if we are recovering a block whose length is less than the first one 
(e.g., the last stripe like the following), we will run into an unnecessary 
iteration which generates decoded result filled with 0.

| b0 | b1 | b2 | b3 | b4 | b5 | p0 | p1 | p2 |
| 64k | 64k | 64k | 64k |  |  | 64k | 64k | 64k |

Then at the end of {{recoverTargets}}, we set the limit of the decoding output 
buffer based on the length of the block-to-be-recovered:
{code}
  long blockLen = getBlockLen(blockGroup, targetIndices[i]);
  long remaining = blockLen - positionInBlock;
  if (remaining < 0) {
targetBuffers[i].limit(0);
  } else if (remaining < toRecoverLen) {
targetBuffers[i].limit((int)remaining);
  }
{code}

This will set the buffer limit to 0, and cause {{transferData2Targets}} to 
return 0.
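
To make the intended fix concrete, here is a sketch of the general idea (not the 
actual patch): bound the loop by the longest block being recovered instead of by 
the first internal block. {{getBlockLen}}, {{blockGroup}}, {{targetIndices}} and 
{{positionInBlock}} are the names from the snippets above; {{maxTargetLength}} is 
an illustrative local variable.

{code}
// Sketch only, not the actual HDFS-9646 patch.
long maxTargetLength = 0;
for (short targetIndex : targetIndices) {
  maxTargetLength = Math.max(maxTargetLength, getBlockLen(blockGroup, targetIndex));
}
// The reconstruction loop is then bounded by maxTargetLength instead of by
// getBlockLen(blockGroup, 0):
//   while (positionInBlock < maxTargetLength) { read, decode, transfer }
{code}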


was (Author: jingzhao):
{{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the 
first internal block to decide whether to continue the recovery work:
{code}
long firstStripedBlockLength = getBlockLen(blockGroup, 0);
while (positionInBlock < firstStripedBlockLength) {
{code}

However, if we are recovering a block whose length is less than the first one, 
we will run into an unnecessary iteration which generates decoded result filled 
with 0. Then at the end of {{recoverTargets}}, we set the limit of the decoding 
output buffer based on the length of the block-to-be-recovered:
{code}
  long blockLen = getBlockLen(blockGroup, targetIndices[i]);
  long remaining = blockLen - positionInBlock;
  if (remaining < 0) {
targetBuffers[i].limit(0);
  } else if (remaining < toRecoverLen) {
targetBuffers[i].limit((int)remaining);
  }
{code}

This will set the buffer limit to 0, and cause {{transferData2Targets}} to 
return 0.

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9646:

Attachment: HDFS-9646.000.patch

Upload a patch to fix the recover length calculation in ErasureCodingWorker.

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-9646:
---

 Summary: ErasureCodingWorker may fail when recovering data blocks 
with length less than the first internal block
 Key: HDFS-9646
 URL: https://issues.apache.org/jira/browse/HDFS-9646
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: erasure-coding
Affects Versions: 3.0.0
Reporter: Takuya Fukudome
Assignee: Jing Zhao
Priority: Critical


This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
following exception when recovering a non-full internal block.

{code}
2016-01-06 11:14:44,740 WARN  datanode.DataNode 
(ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
54322288_29751
java.io.IOException: Transfer failed for all targets.
at 
org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9612) DistCp worker threads are not terminated after jobs are done.

2016-01-13 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097207#comment-15097207
 ] 

Yongjun Zhang commented on HDFS-9612:
-

Thanks to [~jojochuang] for the work here and to [~3opan] for the review.

Overall the patch looks good. I have the following comments:

# About the following code
{code}
executor.shutdown();
executor.shutdownNow();
{code}
it looks a bit weird to me. Instead of calling two methods, why not just call 
{{executor.shutdownNow()}}?
# Agree with Zoran that separating the log4j change to a different jira would 
be better.
# About
{code}
try {
  work = inputQueue.take();
} catch (InterruptedException e) {
  LOG.debug("Interrupted while waiting for request from inputQueue.");
  // if interrupt is triggered by shutdown(), terminate the thread
  // otherwise, attempt to take again
  Thread.currentThread().interrupt();
  return;
}

boolean isDone = false;
while (!isDone) {
  try {
// assume processor.processItem() is stateless
WorkReport result = processor.processItem(work);
outputQueue.put(result);
isDone = true;
  } catch (InterruptedException ie) {
LOG.debug("Could not put report into outputQueue. Retrying...");
  }
}
{code}
## The call to {{Thread.currentThread().interrupt();}} can be dropped.
## If I understand it correctly, the comment "if interrupt is triggered by 
shutdown(), terminate the thread; otherwise, attempt to take again" can be 
improved, such as: "If the interrupt happens when taking work out of the queue, 
it is likely triggered by the shutdown() call, so exit the thread; if 
the interrupt happens while the work is being processed, go back and process the 
same work again."
## The message "Could not put report into outputQueue" is not accurate, since the 
interrupt can be triggered either from within processItem or from the put operation.
## Add javadoc to this method, and probably to the class itself, to say that it 
assumes "processor.processItem() is stateless" (a rough sketch of the revised loop 
follows below).
# About the test, would you please add a comment indicating how the test 
would fail without the fix and how it passes with the fix?

Thanks.
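
Putting the suggestions above together, a rough sketch of how such a worker loop 
could look is below. It is illustrative only, not the actual HADOOP-11827 
ProducerConsumer code; the class, generics and constructor are made up around the 
{{inputQueue}}/{{outputQueue}}/{{processor}} names quoted above.

{code:java}
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only, not the actual ProducerConsumer worker from
// HADOOP-11827. Assumes processItem() is stateless, so re-processing the same
// item after an interrupt during put() is safe.
class WorkerSketch<T, R> implements Runnable {
  interface ItemProcessor<T, R> { R processItem(T work); }

  private final BlockingQueue<T> inputQueue;
  private final BlockingQueue<R> outputQueue;
  private final ItemProcessor<T, R> processor;

  WorkerSketch(BlockingQueue<T> in, BlockingQueue<R> out, ItemProcessor<T, R> p) {
    this.inputQueue = in;
    this.outputQueue = out;
    this.processor = p;
  }

  @Override
  public void run() {
    while (true) {
      T work;
      try {
        work = inputQueue.take();
      } catch (InterruptedException e) {
        // An interrupt while waiting for work is most likely shutdown()
        // interrupting an idle worker: exit the thread.
        return;
      }
      boolean isDone = false;
      while (!isDone) {
        try {
          R result = processor.processItem(work); // assumed stateless
          outputQueue.put(result);
          isDone = true;
        } catch (InterruptedException ie) {
          // Interrupted while processing or publishing the result:
          // retry the same work item rather than dropping it.
        }
      }
    }
  }
}
{code}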





> DistCp worker threads are not terminated after jobs are done.
> -
>
> Key: HDFS-9612
> URL: https://issues.apache.org/jira/browse/HDFS-9612
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9612.001.patch, HDFS-9612.002.patch, 
> HDFS-9612.003.patch, HDFS-9612.004.patch, HDFS-9612.005.patch, 
> HDFS-9612.006.patch
>
>
> In HADOOP-11827, a producer-consumer style thread pool was introduced to 
> parallelize the task of listing files/directories.
> We have a use case where a distcp job is run during the commit phase of a MR2 
> job. However, it was found distcp does not terminate ProducerConsumer thread 
> pools properly. Because threads are not terminated, those MR2 jobs never 
> finish.
> In a more typical use case where distcp is run as a standalone job, those 
> threads are terminated forcefully when the java process is terminated. So 
> these leaked threads did not become a problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097238#comment-15097238
 ] 

James Clampffer commented on HDFS-9628:
---

Looks good to me, +1.

> libhdfs++: Implement builder apis from C bindings
> -
>
> Key: HDFS-9628
> URL: https://issues.apache.org/jira/browse/HDFS-9628
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9628.HDFS-8707.000.patch, 
> HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, 
> HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, 
> HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, 
> HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, 
> HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, 
> HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch, 
> HDFS-9628.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9289:
--
Fix Version/s: (was: 2.7.3)
   (was: 3.0.0)
   2.7.2

> Make DataStreamer#block thread safe and verify genStamp in commitBlock
> --
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-9289-branch-2.6.patch, HDFS-9289.1.patch, 
> HDFS-9289.2.patch, HDFS-9289.3.patch, HDFS-9289.4.patch, HDFS-9289.5.patch, 
> HDFS-9289.6.patch, HDFS-9289.7.patch, HDFS-9289.branch-2.7.patch, 
> HDFS-9289.branch-2.patch
>
>
> we have seen a case of a corrupt block which is caused by file complete after a 
> pipelineUpdate, but the file completed with the old block genStamp. This 
> caused the replicas of two datanodes in the updated pipeline to be viewed as 
> corrupt. Propose to check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9517) Make TestDistCpUtils.testUnpackAttributes testable

2016-01-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097346#comment-15097346
 ] 

Colin Patrick McCabe commented on HDFS-9517:


+1.  Thanks, [~jojochuang].

> Make TestDistCpUtils.testUnpackAttributes testable
> --
>
> Key: HDFS-9517
> URL: https://issues.apache.org/jira/browse/HDFS-9517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: HDFS-9517.001.patch
>
>
> testUnpackAttributes() test method in TestDistCpUtils does not have @Test 
> annotation and is not testable.
> I searched around and saw no discussion it was omitted, so I assume it was 
> just unintentional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes

2016-01-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9517:
---
Summary: Fix missing @Test annotation on 
TestDistCpUtils.testUnpackAttributes  (was: Make 
TestDistCpUtils.testUnpackAttributes testable)

> Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
> 
>
> Key: HDFS-9517
> URL: https://issues.apache.org/jira/browse/HDFS-9517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: HDFS-9517.001.patch
>
>
> testUnpackAttributes() test method in TestDistCpUtils does not have @Test 
> annotation and is not testable.
> I searched around and saw no discussion it was omitted, so I assume it was 
> just unintentional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8615) Correct HTTP method in WebHDFS document

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8615:
--
Fix Version/s: (was: 2.7.3)
   (was: 2.8.0)
   2.7.2

Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing 
fix-versions to reflect the same.

> Correct HTTP method in WebHDFS document
> ---
>
> Key: HDFS-8615
> URL: https://issues.apache.org/jira/browse/HDFS-8615
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.4.1
>Reporter: Akira AJISAKA
>Assignee: Brahma Reddy Battula
>  Labels: newbie
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-8615.branch-2.6.patch, HDFS-8615.patch
>
>
> For example, {{-X PUT}} should be removed from the following curl command.
> {code:title=WebHDFS.md}
> ### Get ACL Status
> * Submit a HTTP GET request.
> curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETACLSTATUS"
> {code}
> Other than this example, there are several commands which {{-X PUT}} should 
> be removed from. We should fix them all.
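
For reference, the corrected form of the example above simply drops the {{-X PUT}} 
flag (placeholders as in the WebHDFS documentation):

{code}
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=GETACLSTATUS"
{code}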



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9597) BaseReplicationPolicyTest should update data node stats after adding a data node

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097351#comment-15097351
 ] 

Vinod Kumar Vavilapalli commented on HDFS-9597:
---

[~benoyantony], there is a branch-2.8 where you need to land this patch for it 
to be in 2.8.0.

> BaseReplicationPolicyTest should update data node stats after adding a data 
> node
> 
>
> Key: HDFS-9597
> URL: https://issues.apache.org/jira/browse/HDFS-9597
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: HDFS-9597.001.patch
>
>
> Looks like HDFS-9034 broke 
> TestReplicationPolicyConsiderLoad#testChooseTargetWithDecomNodes.
> This test has been failing since yesterday.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9466) TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky

2016-01-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097350#comment-15097350
 ] 

Colin Patrick McCabe commented on HDFS-9466:


Hmm.  Can you be clearer on what the race condition is here?

> TestShortCircuitCache#testDataXceiverCleansUpSlotsOnFailure is flaky
> 
>
> Key: HDFS-9466
> URL: https://issues.apache.org/jira/browse/HDFS-9466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fs, hdfs-client
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-9466.001.patch, HDFS-9466.002.patch
>
>
> This test is flaky and fails quite frequently in trunk.
> Error Message
> expected:<1> but was:<2>
> Stacktrace
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache$17.accept(TestShortCircuitCache.java:636)
>   at 
> org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.visit(ShortCircuitRegistry.java:395)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.checkNumberOfSegmentsAndSlots(TestShortCircuitCache.java:631)
>   at 
> org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache.testDataXceiverCleansUpSlotsOnFailure(TestShortCircuitCache.java:684)
> {noformat}
> Thanks to [~xiaochen] for identifying the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097348#comment-15097348
 ] 

Vinod Kumar Vavilapalli commented on HDFS-9493:
---

[~eddyxu], there is a branch-2.8 where you need to land this patch for it to 
make it to 2.8.0.

> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch, 
> HDFS-9493.003.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9431) DistributedFileSystem#concat fails if the target path is relative.

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-9431:
--
Fix Version/s: (was: 2.7.3)
   (was: 2.8.0)
   2.7.2

Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing 
fix-versions to reflect the same.

> DistributedFileSystem#concat fails if the target path is relative.
> --
>
> Key: HDFS-9431
> URL: https://issues.apache.org/jira/browse/HDFS-9431
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Kazuho Fujii
>Assignee: Kazuho Fujii
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-9431.001.patch, HDFS-9431.002.patch
>
>
> {{DistributedFileSystem#concat}} fails if the target path is relative.
> The method tries to send a relative path to DFSClient at the first call.
> bq.  dfs.concat(getPathName(trg), srcsStr);
> But, {{getPathName}} failed. It seems that {{trg}} should be {{absF}} like 
> the second call.
> bq.  dfs.concat(getPathName(absF), srcsStr);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9647) DiskBalancer : Add getRuntimeSettings

2016-01-13 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-9647:
---
Attachment: HDFS-9647-HDFS-1312.001.patch

This patch depends on HDFS-9645. Attaching the patch for code review purposes.

> DiskBalancer : Add getRuntimeSettings
> -
>
> Key: HDFS-9647
> URL: https://issues.apache.org/jira/browse/HDFS-9647
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-1312
>
> Attachments: HDFS-9647-HDFS-1312.001.patch
>
>
> Adds an RPC to read the runtime values of disk balancer like disk bandwidth.
> This is similar to getdiskbandwidth used by balancer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-9646:

Status: Patch Available  (was: Open)

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) Make DataStreamer#block thread safe and verify genStamp in commitBlock

2016-01-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097181#comment-15097181
 ] 

Vinod Kumar Vavilapalli commented on HDFS-9289:
---

Pulled this into 2.7.2 to keep the release up-to-date with 2.6.3. Changing 
fix-versions to reflect the same.

> Make DataStreamer#block thread safe and verify genStamp in commitBlock
> --
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Fix For: 2.7.2, 2.6.3
>
> Attachments: HDFS-9289-branch-2.6.patch, HDFS-9289.1.patch, 
> HDFS-9289.2.patch, HDFS-9289.3.patch, HDFS-9289.4.patch, HDFS-9289.5.patch, 
> HDFS-9289.6.patch, HDFS-9289.7.patch, HDFS-9289.branch-2.7.patch, 
> HDFS-9289.branch-2.patch
>
>
> we have seen a case of a corrupt block which is caused by file complete after a 
> pipelineUpdate, but the file completed with the old block genStamp. This 
> caused the replicas of two datanodes in the updated pipeline to be viewed as 
> corrupt. Propose to check the genStamp when committing the block.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9631) Restarting namenode after deleting a directory with snapshot will fail

2016-01-13 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097203#comment-15097203
 ] 

Wei-Chiu Chuang commented on HDFS-9631:
---

I know [~yzhangal] has hit a similar issue in production. Maybe this test 
failure will be fixed after Yongjun finds a solution.

> Restarting namenode after deleting a directory with snapshot will fail
> --
>
> Key: HDFS-9631
> URL: https://issues.apache.org/jira/browse/HDFS-9631
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>
> I found a number of {{TestOpenFilesWithSnapshot}} tests failed quite 
> frequently. 
> {noformat}
> FAILED:  
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testParentDirWithUCFileDeleteWithSnapShot
> Error Message:
> Timed out waiting for Mini HDFS Cluster to start
> Stack Trace:
> java.io.IOException: Timed out waiting for Mini HDFS Cluster to start
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1345)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2024)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1985)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testParentDirWithUCFileDeleteWithSnapShot(TestOpenFilesWithSnapshot.java:82)
> {noformat}
> These tests ({{testParentDirWithUCFileDeleteWithSnapshot}}, 
> {{testOpenFilesWithRename}}, {{testWithCheckpoint}}) are unable to reconnect 
> to the namenode after restart. It looks like the reconnection failed due to 
> an EOFException when BPServiceActor sends a heartbeat.
> {noformat}
> 2016-01-07 23:25:43,678 [main] WARN  hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:waitClusterUp(1338)) - Waiting for the Mini HDFS Cluster 
> to start...
> 2016-01-07 23:25:44,679 [main] WARN  hdfs.MiniDFSCluster 
> (MiniDFSCluster.java:waitClusterUp(1338)) - Waiting for the Mini HDFS Cluster 
> to start...
> 2016-01-07 23:25:44,720 [DataNode: 
> [[[DISK]file:/home/weichiu/hadoop2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data1/,
>  [DISK]file:
> /home/weichiu/hadoop2/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data2/]]
>   heartbeating to localhost/127.0.0.1:60472] WARN  datanode
> .DataNode (BPServiceActor.java:offerService(752)) - IOException in 
> offerService
> java.io.EOFException: End of File Exception between local host is: 
> "weichiu.vpc.cloudera.com/172.28.211.219"; destination host is: 
> "localhost":6047
> 2; :; For more details see:  http://wiki.apache.org/hadoop/EOFException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:793)
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:766)
> at org.apache.hadoop.ipc.Client.call(Client.java:1452)
> at org.apache.hadoop.ipc.Client.call(Client.java:1385)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy18.sendHeartbeat(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:154)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:557)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:660)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:851)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1110)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1005)
> {noformat}
> It appears that these three tests all call {{doWriteAndAbort()}}, which 
> creates files and then aborts them, then takes a snapshot of the parent 
> directory, and then deletes the parent directory. 
> Interestingly, if the parent directory does not have a snapshot, the tests 
> will not fail. Additionally, if the parent directory is not deleted, the 
> tests will not fail.
> The following test will fail intermittently:
> {code:java}
> public void testDeleteParentDirWithSnapShot() throws 

[jira] [Created] (HDFS-9647) DiskBalancer : Add getRuntimeSettings

2016-01-13 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-9647:
--

 Summary: DiskBalancer : Add getRuntimeSettings
 Key: HDFS-9647
 URL: https://issues.apache.org/jira/browse/HDFS-9647
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: balancer & mover
Affects Versions: HDFS-1312
Reporter: Anu Engineer
Assignee: Anu Engineer
 Fix For: HDFS-1312


Adds an RPC to read the runtime values of disk balancer like disk bandwidth.
This is similar to getdiskbandwidth used by balancer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097214#comment-15097214
 ] 

Jing Zhao commented on HDFS-9646:
-

{{ErasureCodingWorker#ReconstructAndTransferBlock}} uses the length of the 
first internal block to decide whether to continue the recovery work:
{code}
long firstStripedBlockLength = getBlockLen(blockGroup, 0);
while (positionInBlock < firstStripedBlockLength) {
{code}

However, if we are recovering a block whose length is less than the first one, 
we will run into an unnecessary iteration which generates decoded result filled 
with 0. Then at the end of {{recoverTargets}}, we set the limit of the decoding 
output buffer based on the length of the block-to-be-recovered:
{code}
  long blockLen = getBlockLen(blockGroup, targetIndices[i]);
  long remaining = blockLen - positionInBlock;
  if (remaining < 0) {
targetBuffers[i].limit(0);
  } else if (remaining < toRecoverLen) {
targetBuffers[i].limit((int)remaining);
  }
{code}

This will set the buffer limit to 0, and cause {{transferData2Targets}} to 
return 0.

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097320#comment-15097320
 ] 

Hadoop QA commented on HDFS-8999:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 36s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s 
{color} | {color:red} Patch generated 7 new checkstyle issues in 
hadoop-hdfs-project (total was 1043, now 1044). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 169m 26s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 168m 56s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 30s 
{color} | {color:red} Patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 381m 34s {color} 
| 

[jira] [Commented] (HDFS-9493) Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk

2016-01-13 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097368#comment-15097368
 ] 

Lei (Eddy) Xu commented on HDFS-9493:
-

[~vinodkv] Thanks for reminding me! Cherry picked it into {{branch-2.8}} now.

> Test o.a.h.hdfs.server.namenode.TestMetaSave fails in trunk
> ---
>
> Key: HDFS-9493
> URL: https://issues.apache.org/jira/browse/HDFS-9493
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Mingliang Liu
>Assignee: Tony Wu
> Fix For: 2.8.0
>
> Attachments: HDFS-9493.001.patch, HDFS-9493.002.patch, 
> HDFS-9493.003.patch
>
>
> Tested in both Gentoo Linux and Mac.
> {quote}
> ---
>  T E S T S
> ---
> Running org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 34.159 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestMetaSave
> testMetasaveAfterDelete(org.apache.hadoop.hdfs.server.namenode.TestMetaSave)  
> Time elapsed: 15.318 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestMetaSave.testMetasaveAfterDelete(TestMetaSave.java:154)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097443#comment-15097443
 ] 

Kai Zheng commented on HDFS-9646:
-

Hi [~jingzhao],

The patch looks great! I'm reading it and the related code. So far I have a 
question: the current code probably assumes that {{maxTargetLength}} in your sense 
is exactly the length of the first block in the group, i.e. {{firstStripedBlockLength 
= getBlockLen(blockGroup, 0)}}. If so, I would think that reasoning is correct. 
Maybe {{getBlockLen}} doesn't return the exact length of the first block as 
one might expect?

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097523#comment-15097523
 ] 

Kai Zheng commented on HDFS-9646:
-

The patch is a great fix along with good refactorings. Some comments:
1. It's good to refactor and avoid duplicated code and computation around 
{{getReadLength}}. Minor points: 1) {{positionInBlock}} should be 
explicitly initialized to 0 at the beginning of the {{run}} method; 2) 
{{toRecover}} would be better kept as the original name {{toRecoverLen}}; 3) {{success}} 
could be {{successList}}.

2. In the test, introducing {{RecoveryType}} is nice. Suggestion: change {{Any}} 
to {{Both}}, and its logic can generate dead blocks from both the data blocks 
and the parity blocks, which makes the test more thorough (a small sketch of 
such an enum follows below). A minor: {{toDead}} could be {{toDie}}.

3. Question: do we need new test code to expose the issue and ensure it is 
fixed? I'm not sure, because the existing tests already cover all sorts of 
file lengths, though perhaps not the exact one needed for the reported case as you 
described above (the maximum length of the targeted blocks should be smaller than 
the first block).
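
For illustration, a minimal sketch of the suggested test enum; the names follow the 
comment above and are not necessarily what the committed patch uses:
{code}
// Sketch only: which kind of internal blocks the test kills before recovery.
enum RecoveryType {
  DataOnly,   // kill data blocks only
  ParityOnly, // kill parity blocks only
  Both        // kill a mix of data and parity blocks for a more thorough test
}
{code}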

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097538#comment-15097538
 ] 

Hadoop QA commented on HDFS-9646:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 42s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 52s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 27s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
30s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 126m 32s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailure210 |
|   | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.server.namenode.TestStartup |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestScrLazyPersistFiles |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.TestFSInputChecker |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.server.namenode.TestStartup |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 

[jira] [Updated] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes

2016-01-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9517:
---
  Resolution: Fixed
   Fix Version/s: 2.9.0
Target Version/s: 2.9.0
  Status: Resolved  (was: Patch Available)

> Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
> 
>
> Key: HDFS-9517
> URL: https://issues.apache.org/jira/browse/HDFS-9517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Fix For: 2.9.0
>
> Attachments: HDFS-9517.001.patch
>
>
> The testUnpackAttributes() method in TestDistCpUtils does not have the @Test 
> annotation, so it is never run as a test.
> I searched around and saw no discussion of why it was omitted, so I assume it 
> was just unintentional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097447#comment-15097447
 ] 

Kai Zheng commented on HDFS-9646:
-

Oh, I got your point. You want to read and recover only up to the maximum length of 
the {{target}} blocks being recovered. That sounds like a good optimization in addition to 
the fix.
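
A minimal sketch of that idea, with hypothetical variable names rather than the actual 
patch code:
{code}
// Cap the reconstruction work at the longest target replica instead of the
// first internal block of the group.
long maxTargetLength = 0L;
for (ExtendedBlock target : targetBlocks) {   // replicas to be reconstructed
  maxTargetLength = Math.max(maxTargetLength, target.getNumBytes());
}
// Read and decode at most maxTargetLength bytes per internal block.
{code}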

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9595) DiskBalancer : Add cancelPlan RPC

2016-01-13 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097454#comment-15097454
 ] 

Arpit Agarwal commented on HDFS-9595:
-

The patch looks great. Nitpick typo - _Cancels and executing disk balancer 
plan_ should be _Cancels an executing disk balancer plan_.

+1 otherwise.

> DiskBalancer : Add cancelPlan RPC
> -
>
> Key: HDFS-9595
> URL: https://issues.apache.org/jira/browse/HDFS-9595
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-1312
>
> Attachments: HDFS-9595-HDFS-1312.001.patch
>
>
> Add an RPC that allows users to cancel a running disk balancer plan



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.

2016-01-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8999:
--
Attachment: h8999_20160114.patch

Oops, the index calculated was incorrect in the last patch.

h8999_20160114.patch: fixes the bug.

> Namenode need not wait for {{blockReceived}} for the last block before 
> completing a file.
> -
>
> Key: HDFS-8999
> URL: https://issues.apache.org/jira/browse/HDFS-8999
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jitendra Nath Pandey
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8999_20151228.patch, h8999_20160106.patch, 
> h8999_20160106b.patch, h8999_20160106c.patch, h8999_20160111.patch, 
> h8999_20160113.patch, h8999_20160114.patch
>
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment 
> from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to 
> announce the replica is safe. Looking into the code, now we have
># NameNode knows the DataNodes involved when initially setting up the 
> writing pipeline
># If any DataNode fails during the writing, client bumps the GS and 
> finally reports all the DataNodes included in the new pipeline to NameNode 
> through the updatePipeline RPC.
># When the client received the ack for the last packet of the block (and 
> before the client tries to close the file on NameNode), the replica has been 
> finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, 
> the NameNode already knows the latest replicas for the block. Currently the 
> checkReplication call only counts in all the replicas that NN has already 
> received the block_received msg, but based on the above #2 and #3, it may be 
> safe to also count in all the replicas in the 
> BlockUnderConstructionFeature#replicas?
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097511#comment-15097511
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9646:
---

Thanks [~tfukudom] and [~jingzhao].

Patch looks good.  Just a minor comment.
- getReadLength can safely return int since it returns the min of remaining and 
recoverLength, where recoverLength is an int.
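
For illustration, a minimal sketch of that shape (a hypothetical helper, not the patch 
itself):
{code}
// recoverLength is an int, so the min of a long and an int fits in an int.
private static int getReadLength(long remaining, int recoverLength) {
  return (int) Math.min(remaining, recoverLength);
}
{code}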



> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9624) DataNode start slowly due to the initial DU command operations

2016-01-13 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9624:

Attachment: HDFS-9624.006.patch

Sorry, [~andrew.wang], I misunderstood your point about timer injection. I now use the 
FakeTimer in the test case, and the test finishes in milliseconds instead of 
waiting several seconds. Updated the patch.
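
For context, a rough sketch of the timer-injection idea, assuming Hadoop's test 
{{FakeTimer}} with an {{advance(millis)}} method; the injection point and names are 
hypothetical placeholders:
{code}
// Sketch only: advance time virtually instead of really sleeping.
FakeTimer timer = new FakeTimer();
duRefresher.setTimer(timer);           // hypothetical injection point
timer.advance(refreshIntervalMs + 1);  // simulate the elapsed refresh interval
// assert that the cached DU value was refreshed, with no real waiting
{code}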

> DataNode start slowly due to the initial DU command operations
> --
>
> Key: HDFS-9624
> URL: https://issues.apache.org/jira/browse/HDFS-9624
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, 
> HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch, 
> HDFS-9624.006.patch
>
>
> It seems the DataNode starts very slowly after I finish migrating the 
> DataNodes and restart them. Looking at the DN logs:
> {code}
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK
> 2016-01-06 16:05:08,176 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Registered FSDatasetState MBean
> 2016-01-06 16:05:08,177 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544
> 2016-01-06 16:05:08,178 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data2/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data3/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data4/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data5/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data6/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data7/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data8/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data9/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data10/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data11/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data12/current...
> 2016-01-06 16:09:49,646 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data7/current: 281466ms
> 2016-01-06 16:09:54,235 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data9/current: 286054ms
> 2016-01-06 16:09:57,859 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data2/current: 289680ms
> 2016-01-06 

[jira] [Commented] (HDFS-9517) Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes

2016-01-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097406#comment-15097406
 ] 

Hudson commented on HDFS-9517:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9104 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9104/])
HDFS-9517. Fix missing @Test annotation on (cmccabe: rev 
8315582c4ff2951144b096c23a64e753f397572d)
* 
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java
* hadoop-common-project/hadoop-common/CHANGES.txt


> Fix missing @Test annotation on TestDistCpUtils.testUnpackAttributes
> 
>
> Key: HDFS-9517
> URL: https://issues.apache.org/jira/browse/HDFS-9517
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Trivial
> Fix For: 2.9.0
>
> Attachments: HDFS-9517.001.patch
>
>
> The testUnpackAttributes() method in TestDistCpUtils does not have the @Test 
> annotation, so it is never run as a test.
> I searched around and saw no discussion of why it was omitted, so I assume it 
> was just unintentional.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9646) ErasureCodingWorker may fail when recovering data blocks with length less than the first internal block

2016-01-13 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097502#comment-15097502
 ] 

Kai Sasaki commented on HDFS-9646:
--

Hello, [~jingzhao]
BTW, the patch seems to include the fix for HDFS-9585. Do you think we can close 
HDFS-9585 after this JIRA is fixed?
https://issues.apache.org/jira/browse/HDFS-9585

> ErasureCodingWorker may fail when recovering data blocks with length less 
> than the first internal block
> ---
>
> Key: HDFS-9646
> URL: https://issues.apache.org/jira/browse/HDFS-9646
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Takuya Fukudome
>Assignee: Jing Zhao
>Priority: Critical
> Attachments: HDFS-9646.000.patch, test-reconstruct-stripe-file.patch
>
>
> This is reported by [~tfukudom]: ErasureCodingWorker may fail with the 
> following exception when recovering a non-full internal block.
> {code}
> 2016-01-06 11:14:44,740 WARN  datanode.DataNode 
> (ErasureCodingWorker.java:run(467)) - Failed to recover striped block: 
> BP-987302662-172.29.4.13-1450757377698:blk_-92233720368
> 54322288_29751
> java.io.IOException: Transfer failed for all targets.
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.ErasureCodingWorker$ReconstructAndTransferBlock.run(ErasureCodingWorker.java:455)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load

2016-01-13 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15097541#comment-15097541
 ] 

Kai Zheng commented on HDFS-9635:
-

Just in case it helps: regarding volume choosing policies with multiple 
storage types, there are some optimizations in HDFS-9608. I wonder if we could 
consolidate all of these inputs, thoughts, and efforts to come up with one 
comprehensive policy that allows various kinds of configuration and tuning. After all, 
with several such policies on the list, users may find it hard to choose.

> Add one more volume choosing policy with considering volume IO load
> ---
>
> Key: HDFS-9635
> URL: https://issues.apache.org/jira/browse/HDFS-9635
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>
> We have RoundRobinVolumeChoosingPolicy and 
> AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load.
> This jira will add one more volume choosing policy based on the xceiver 
> count on each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9624) DataNode start slowly due to the initial DU command operations

2016-01-13 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9624:

Attachment: HDFS-9624.005.patch

Thanks [~andrew.wang] for the comments. I found that doing timer injection in this 
test is not very convenient, so I create block files to consume time in the 
test instead. Updated the patch.

> DataNode start slowly due to the initial DU command operations
> --
>
> Key: HDFS-9624
> URL: https://issues.apache.org/jira/browse/HDFS-9624
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, 
> HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch
>
>
> It seems the DataNode starts very slowly after I finish migrating the 
> DataNodes and restart them. Looking at the DN logs:
> {code}
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK
> 2016-01-06 16:05:08,176 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Registered FSDatasetState MBean
> 2016-01-06 16:05:08,177 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544
> 2016-01-06 16:05:08,178 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data2/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data3/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data4/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data5/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data6/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data7/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data8/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data9/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data10/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data11/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data12/current...
> 2016-01-06 16:09:49,646 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data7/current: 281466ms
> 2016-01-06 16:09:54,235 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data9/current: 286054ms
> 2016-01-06 16:09:57,859 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data2/current: 289680ms
> 2016-01-06 16:10:00,333 INFO 
> 

[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9628:
-
Attachment: HDFS-9628.HDFS-8707.009.patch

New patch: rebased on latest HDFS-8707

> libhdfs++: Implement builder apis from C bindings
> -
>
> Key: HDFS-9628
> URL: https://issues.apache.org/jira/browse/HDFS-9628
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9628.HDFS-8707.000.patch, 
> HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, 
> HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, 
> HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, 
> HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, 
> HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, 
> HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9636) libhdfs++: for consistency, include files should be in hdfspp

2016-01-13 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9636:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to HDFS-8707.  Thanks for the patch Bob!

> libhdfs++: for consistency, include files should be in hdfspp
> -
>
> Key: HDFS-9636
> URL: https://issues.apache.org/jira/browse/HDFS-9636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9636.HDFS-8707.000.patch, 
> HDFS-9636.HDFS-8707.001.patch, HDFS-9636.HDFS-8707.001.patch
>
>
> The existing hdfs library resides in hdfs/hdfs.h.  To maintain Least 
> Astonishment, we should move the libhdfspp files into hdfspp/hdfspp.h 
> (they're currently in the libhdfspp/ directory).
> Likewise, the install step in the root directory should put the include files 
> in /include/hdfspp and include/hdfs (it currently erroneously puts the hdfs 
> file into libhdfs/)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9624) DataNode start slowly due to the initial DU command operations

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096432#comment-15096432
 ] 

Hadoop QA commented on HDFS-9624:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} 
| {color:red} HDFS-9624 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12782076/HDFS-9624.005.patch |
| JIRA Issue | HDFS-9624 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14110/console |


This message was automatically generated.



> DataNode start slowly due to the initial DU command operations
> --
>
> Key: HDFS-9624
> URL: https://issues.apache.org/jira/browse/HDFS-9624
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9624.001.patch, HDFS-9624.002.patch, 
> HDFS-9624.003.patch, HDFS-9624.004.patch, HDFS-9624.005.patch
>
>
> It seems the DataNode starts very slowly after I finish migrating the 
> DataNodes and restart them. Looking at the DN logs:
> {code}
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> new volume: DS-70097061-42f8-4c33-ac27-2a6ca21e60d4
> 2016-01-06 16:05:08,118 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Added 
> volume - /home/data/data/hadoop/dfs/data/data12/current, StorageType: DISK
> 2016-01-06 16:05:08,176 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Registered FSDatasetState MBean
> 2016-01-06 16:05:08,177 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544
> 2016-01-06 16:05:08,178 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data2/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data3/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data4/current...
> 2016-01-06 16:05:08,179 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data5/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data6/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data7/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data8/current...
> 2016-01-06 16:05:08,180 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data9/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data10/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data11/current...
> 2016-01-06 16:05:08,181 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning 
> block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on volume 
> /home/data/data/hadoop/dfs/data/data12/current...
> 2016-01-06 16:09:49,646 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time 
> taken to scan block pool BP-1942012336-xx.xx.xx.xx-1406726500544 on 
> /home/data/data/hadoop/dfs/data/data7/current: 281466ms
> 2016-01-06 

[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9628:
-
Attachment: HDFS-9628.HDFS-8707.008.patch

New patch: fixed hdfs_builder_test main function

> libhdfs++: Implement builder apis from C bindings
> -
>
> Key: HDFS-9628
> URL: https://issues.apache.org/jira/browse/HDFS-9628
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9628.HDFS-8707.000.patch, 
> HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, 
> HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, 
> HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, 
> HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, 
> HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, 
> HDFS-9628.HDFS-8707.008.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread Bob Hansen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Hansen updated HDFS-9628:
-
Attachment: HDFS-9628.HDFS-8707.010.patch

New patch: brings the new code up to date after the rebase

> libhdfs++: Implement builder apis from C bindings
> -
>
> Key: HDFS-9628
> URL: https://issues.apache.org/jira/browse/HDFS-9628
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Bob Hansen
> Attachments: HDFS-9628.HDFS-8707.000.patch, 
> HDFS-9628.HDFS-8707.001.patch, HDFS-9628.HDFS-8707.002.patch, 
> HDFS-9628.HDFS-8707.003.patch, HDFS-9628.HDFS-8707.003.patch, 
> HDFS-9628.HDFS-8707.004.patch, HDFS-9628.HDFS-8707.005.patch, 
> HDFS-9628.HDFS-8707.005.patch, HDFS-9628.HDFS-8707.006.patch, 
> HDFS-9628.HDFS-8707.006.patch, HDFS-9628.HDFS-8707.007.patch, 
> HDFS-9628.HDFS-8707.008.patch, HDFS-9628.HDFS-8707.009.patch, 
> HDFS-9628.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load

2016-01-13 Thread Yong Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096240#comment-15096240
 ] 

Yong Zhang commented on HDFS-9635:
--

Hi [~andrew.wang], thanks for your comment. BlockPlacementPolicy considers 
DataNode write load, but the current VolumeChoosingPolicy does not consider the write 
load on each disk. Some of our customers have seen some disks IO-busy while others were 
idle within the same DataNode, and we want to balance the writer threads across the 
different disks of the same storage type.
So a DataXceiver monitor and some metrics will be added, and the new 
VolumeChoosingPolicy will choose the least busy disk.
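
To make the proposal concrete, a rough sketch of the selection logic only (not the real 
{{VolumeChoosingPolicy}} interface); {{getActiveXceiverCount()}} is a hypothetical 
per-volume metric:
{code}
// Sketch: pick the volume with the fewest active xceivers, i.e. the least
// write-busy disk among volumes of the requested storage type.
interface VolumeLoad {
  int getActiveXceiverCount();  // hypothetical per-volume write-load metric
}

static <V extends VolumeLoad> V chooseLeastBusyVolume(java.util.List<V> volumes) {
  V best = null;
  for (V v : volumes) {
    if (best == null || v.getActiveXceiverCount() < best.getActiveXceiverCount()) {
      best = v;
    }
  }
  return best;
}
{code}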

> Add one more volume choosing policy with considering volume IO load
> ---
>
> Key: HDFS-9635
> URL: https://issues.apache.org/jira/browse/HDFS-9635
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>
> We have RoundRobinVolumeChoosingPolicy and 
> AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load.
> This jira will add one more volume choosing policy based on the xceiver 
> count on each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9635) Add one more volume choosing policy with considering volume IO load

2016-01-13 Thread Yong Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096274#comment-15096274
 ] 

Yong Zhang commented on HDFS-9635:
--

As discussed in HDFS-8538, especially the comment you mentioned at 
https://issues.apache.org/jira/browse/HDFS-8538?focusedCommentId=14574914=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14574914.
 IMO, AvailableSpaceVolumeChoosingPolicy is fine for the scenario of disks with 
different free space or capacity; we also use 
AvailableSpaceVolumeChoosingPolicy by default.
By the way, I don't think collecting OS-level IO statistics is a good idea: on 
heterogeneous machines, write performance depends not only on disk IO 
but also on CPU, network bandwidth, hardware age, and so on. So I think 
we can collect data-write delay metrics for both BlockPlacementPolicy and 
VolumeChoosingPolicy in future work; that would be useful for multi-tenant clusters.

> Add one more volume choosing policy with considering volume IO load
> ---
>
> Key: HDFS-9635
> URL: https://issues.apache.org/jira/browse/HDFS-9635
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yong Zhang
>Assignee: Yong Zhang
>
> We have RoundRobinVolumeChoosingPolicy and 
> AvailableSpaceVolumeChoosingPolicy, but neither considers volume IO load.
> This jira will add one more volume choosing policy based on the xceiver 
> count on each volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096553#comment-15096553
 ] 

Hadoop QA commented on HDFS-9628:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
41s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 0s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 3s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 4m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
9s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 48s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 46s 
{color} | {color:green} hadoop-hdfs-native-client in the patch passed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 38s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12782084/HDFS-9628.HDFS-8707.010.patch
 |
| JIRA Issue | HDFS-9628 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  cc  |
| uname | Linux d520cc98d10e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / 5276e19 |
| Default Java | 1.7.0_91 |
| Multi-JDK versions |  

[jira] [Commented] (HDFS-9047) Retire libwebhdfs

2016-01-13 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096566#comment-15096566
 ] 

Kihwal Lee commented on HDFS-9047:
--

Also fixed the BUILDING.txt in trunk, branch-2 and branch-2.8 that was missed 
in the original commit.

> Retire libwebhdfs
> -
>
> Key: HDFS-9047
> URL: https://issues.apache.org/jira/browse/HDFS-9047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Allen Wittenauer
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-9047-branch-2.7.patch, HDFS-9047.000.patch
>
>
> This library is basically a mess:
> * It's not part of the mvn package
> * It's missing functionality and barely maintained
> * It's not in the precommit runs so doesn't get exercised regularly
> * It's not part of the unit tests (at least, that I can see)
> * It isn't documented in any official documentation
> But most importantly:  
> * It fails at its primary mission of being pure C (HDFS-3917 is STILL open)
> Let's cut our losses and just remove it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-3917) Remove JNI code from libwebhdfs (C client library)

2016-01-13 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee resolved HDFS-3917.
--
Resolution: Not A Problem

libwebhdfs has been removed.

> Remove JNI code from libwebhdfs (C client library)
> --
>
> Key: HDFS-3917
> URL: https://issues.apache.org/jira/browse/HDFS-3917
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> The current implementation of libwebhdfs (C client library) uses JNI for 
> loading NameNode configuration and implementing hdfsCopy/hdfsMove. We need to 
> implement the same functionalities in libwebhdfs without using JNI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9628) libhdfs++: Implement builder apis from C bindings

2016-01-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096512#comment-15096512
 ] 

Hadoop QA commented on HDFS-9628:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
41s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 58s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 55s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s 
{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s 
{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 2m 14s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 2m 14s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 14s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 2m 11s 
{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 2m 11s {color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 11s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 15s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 19s {color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 10s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12782083/HDFS-9628.HDFS-8707.009.patch
 |
| JIRA Issue | HDFS-9628 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  xml  cc  |
| uname | Linux 9a46d6b2d7d0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 

[jira] [Commented] (HDFS-9047) Retire libwebhdfs

2016-01-13 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096557#comment-15096557
 ] 

Kihwal Lee commented on HDFS-9047:
--

Removed from branch-2.7.

> Retire libwebhdfs
> -
>
> Key: HDFS-9047
> URL: https://issues.apache.org/jira/browse/HDFS-9047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Allen Wittenauer
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-9047-branch-2.7.patch, HDFS-9047.000.patch
>
>
> This library is basically a mess:
> * It's not part of the mvn package
> * It's missing functionality and barely maintained
> * It's not in the precommit runs so doesn't get exercised regularly
> * It's not part of the unit tests (at least, that I can see)
> * It isn't documented in any official documentation
> But most importantly:  
> * It fails at its primary mission of being pure C (HDFS-3917 is STILL open)
> Let's cut our losses and just remove it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9047) Retire libwebhdfs

2016-01-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096592#comment-15096592
 ] 

Hudson commented on HDFS-9047:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9100 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9100/])
Supplement to HDFS-9047. (kihwal: rev c722b62908984f8fb6ab2e0bfd40c090e8c830c7)
* BUILDING.txt


> Retire libwebhdfs
> -
>
> Key: HDFS-9047
> URL: https://issues.apache.org/jira/browse/HDFS-9047
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Allen Wittenauer
>Assignee: Haohui Mai
> Fix For: 2.8.0
>
> Attachments: HDFS-9047-branch-2.7.patch, HDFS-9047.000.patch
>
>
> This library is basically a mess:
> * It's not part of the mvn package
> * It's missing functionality and barely maintained
> * It's not in the precommit runs so doesn't get exercised regularly
> * It's not part of the unit tests (at least, that I can see)
> * It isn't documented in any official documentation
> But most importantly:  
> * It fails at its primary mission of being pure C (HDFS-3917 is STILL open)
> Let's cut our losses and just remove it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.

2016-01-13 Thread Tsz Wo Nicholas Sze (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo Nicholas Sze updated HDFS-8999:
--
Attachment: h8999_20160113.patch

h8999_20160113.patch: adds a test and fixes some bugs.


> Namenode need not wait for {{blockReceived}} for the last block before 
> completing a file.
> -
>
> Key: HDFS-8999
> URL: https://issues.apache.org/jira/browse/HDFS-8999
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Jitendra Nath Pandey
>Assignee: Tsz Wo Nicholas Sze
> Attachments: h8999_20151228.patch, h8999_20160106.patch, 
> h8999_20160106b.patch, h8999_20160106c.patch, h8999_20160111.patch, 
> h8999_20160113.patch
>
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment 
> from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to 
> announce the replica is safe. Looking into the code, now we have
># NameNode knows the DataNodes involved when initially setting up the 
> writing pipeline
># If any DataNode fails during the writing, client bumps the GS and 
> finally reports all the DataNodes included in the new pipeline to NameNode 
> through the updatePipeline RPC.
># When the client received the ack for the last packet of the block (and 
> before the client tries to close the file on NameNode), the replica has been 
> finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, 
> the NameNode already knows the latest replicas for the block. Currently the 
> checkReplication call only counts in all the replicas that NN has already 
> received the block_received msg, but based on the above #2 and #3, it may be 
> safe to also count in all the replicas in the 
> BlockUnderConstructionFeature#replicas?
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)