[jira] [Resolved] (HDFS-10171) Balancer should log config values

2016-03-15 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge resolved HDFS-10171.
---
   Resolution: Duplicate
Fix Version/s: 2.8.0

{noformat}
2016-03-15 22:42:31,618 [Thread-0] INFO  balancer.Balancer 
(Balancer.java:getLong(231)) - dfs.balancer.movedWinWidth = 2000 
(default=540)
2016-03-15 22:42:31,618 [Thread-0] INFO  balancer.Balancer 
(Balancer.java:getInt(240)) - dfs.balancer.moverThreads = 1000 (default=1000)
2016-03-15 22:42:31,618 [Thread-0] INFO  balancer.Balancer 
(Balancer.java:getInt(240)) - dfs.balancer.dispatcherThreads = 200 (default=200)
2016-03-15 22:42:31,618 [Thread-0] INFO  balancer.Balancer 
(Balancer.java:getInt(240)) - dfs.datanode.balance.max.concurrent.moves = 5 
(default=5)
2016-03-15 22:42:31,618 [Thread-0] INFO  balancer.Balancer 
(Balancer.java:getLong(231)) - dfs.balancer.getBlocks.size = 2147483648 
{noformat}
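
The log lines above follow a "key = value (default=d)" shape. A minimal sketch of a helper that produces that shape is below. It is not the committed Balancer code: the class name {{ConfigValueLogger}} is hypothetical, and a plain {{Map}} stands in for Hadoop's {{Configuration}} so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; a plain Map stands in for Hadoop's Configuration. */
class ConfigValueLogger {
  private final Map<String, String> conf;

  ConfigValueLogger(Map<String, String> conf) {
    this.conf = conf;
  }

  /** Formats one entry in the "key = value (default=d)" shape seen in the log. */
  static String format(String key, long value, long defaultValue) {
    return key + " = " + value + " (default=" + defaultValue + ")";
  }

  /** Reads a long setting, logs the effective and default values, returns it. */
  long getLong(String key, long defaultValue) {
    String raw = conf.get(key);
    long value = (raw == null) ? defaultValue : Long.parseLong(raw.trim());
    // A real implementation would use the class logger instead of stdout.
    System.out.println(format(key, value, defaultValue));
    return value;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("dfs.balancer.moverThreads", "1000");
    ConfigValueLogger logger = new ConfigValueLogger(conf);
    logger.getLong("dfs.balancer.moverThreads", 1000L);      // explicitly set
    logger.getLong("dfs.balancer.dispatcherThreads", 200L);  // falls back to default
  }
}
```

Logging the default next to the effective value makes it easy to spot in a support bundle which settings were overridden.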

> Balancer should log config values
> -
>
> Key: HDFS-10171
> URL: https://issues.apache.org/jira/browse/HDFS-10171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> To improve supportability, Balancer should log config values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9262) Support reconfiguring dfs.datanode.lazywriter.interval.sec without DN restart

2016-03-15 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9262:

Attachment: HDFS-9262-HDFS-9000.005.patch

V005 is rebased on trunk.

> Support reconfiguring dfs.datanode.lazywriter.interval.sec without DN restart
> -
>
> Key: HDFS-9262
> URL: https://issues.apache.org/jira/browse/HDFS-9262
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9262-HDFS-9000.002.patch, 
> HDFS-9262-HDFS-9000.003.patch, HDFS-9262-HDFS-9000.004.patch, 
> HDFS-9262-HDFS-9000.005.patch, HDFS-9262.001.patch
>
>
> This is to reconfigure
> {code}
> dfs.datanode.lazywriter.interval.sec
> {code}
> without restarting DN.





[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-15 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196786#comment-15196786
 ] 

Zhe Zhang commented on HDFS-9857:
-

Thanks Rakesh. +1 pending Jenkins. Nice work here!

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]





[jira] [Updated] (HDFS-10171) Balancer should log config values

2016-03-15 Thread John Zhuge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Zhuge updated HDFS-10171:
--
Description: To improve supportability, Balancer should log config values.  
(was: To improve supportability, Balancer should log config values and 
iteration termination reasons.
* In {{Dispatcher$Dispatcher}}, log all parameters.
* In {{Dispatcher$dispatchBlocks}}, log termination reasons.)
Summary: Balancer should log config values  (was: Balancer should log 
config values and iteration termination reasons)

> Balancer should log config values
> -
>
> Key: HDFS-10171
> URL: https://issues.apache.org/jira/browse/HDFS-10171
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.7.2
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
>
> To improve supportability, Balancer should log config values.





[jira] [Created] (HDFS-10171) Balancer should log config values and iteration termination reasons

2016-03-15 Thread John Zhuge (JIRA)
John Zhuge created HDFS-10171:
-

 Summary: Balancer should log config values and iteration 
termination reasons
 Key: HDFS-10171
 URL: https://issues.apache.org/jira/browse/HDFS-10171
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: balancer & mover
Affects Versions: 2.7.2
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Minor


To improve supportability, Balancer should log config values and iteration 
termination reasons.
* In {{Dispatcher$Dispatcher}}, log all parameters.
* In {{Dispatcher$dispatchBlocks}}, log termination reasons.





[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196761#comment-15196761
 ] 

Lin Yiqun commented on HDFS-9904:
-

Thanks [~kihwal] for the commit!

> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did a checkpoint on its own and bumped up 
> the txid right before the check was performed.





[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-15 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196759#comment-15196759
 ] 

Rakesh R commented on HDFS-9857:


Thanks [~zhz] for the reviews. I've attached a patch addressing the comments.

The following changes were made compared to the previous patch:
# Fixed 1st review comment.
# Modified variable {{underReplicatedBlocksCount}} to 
{{lowRedundancyBlocksCount}}
{code}
-  private volatile long underReplicatedBlocksCount = 0L;
+  private volatile long lowRedundancyBlocksCount = 0L;
{code}
# Modified {{neededReplications}} to {{neededReconstruction}}, 
{{under-replicated}} to {{low redundancy}} in logs/comments
# Renamed BlockManager method {{#processOverReplicatedBlocksOnReCommission}} 
to {{#processExtraRedundancyBlocksOnReCommission}}
# Made a few changes in TestReplicationPolicy.java: renamed 
{{ChooseUnderReplicatedBlocks}} to {{ChooseLowRedundancyBlocks}} and modified 
comments.

bq. fileReplication should be renamed. We can take care of it when we rename 
getExpectedReplicaNum
Since the current patch contains many changes, how about addressing a couple of 
other items, including this one, in another sub-task?

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]





[jira] [Updated] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-15 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-9857:
---
Attachment: HDFS-9857-02.patch

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch, HDFS-9857-02.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]





[jira] [Commented] (HDFS-9960) OzoneHandler : Add localstorage support for keys

2016-03-15 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196745#comment-15196745
 ] 

Chris Nauroth commented on HDFS-9960:
-

Hi [~anu].  It looks like these Checkstyle and Findbugs warnings are 
potentially relevant.  {{Hashtable}} is generally avoided in favor of 
{{HashMap}} or {{ConcurrentHashMap}}, because {{Hashtable}} uses 
coarse-grained locking that doesn't perform as well as the others.  Use of the 
platform default encoding is discouraged, because it can cause unpredictable 
behavior when code starts running on a system with an unexpected default 
encoding.  We generally try to stick to UTF-8 everywhere.  Could you please 
take a look?
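
For illustration, here is a minimal sketch of both suggestions applied together. It is not code from the patch: the class {{KeyStoreSketch}} and its methods are hypothetical. It uses {{ConcurrentHashMap}} rather than {{Hashtable}}, and passes {{StandardCharsets.UTF_8}} explicitly instead of relying on the platform default encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory key store, used only to illustrate the two points. */
class KeyStoreSketch {
  // ConcurrentHashMap gives thread safety without Hashtable's single coarse lock.
  private final Map<String, byte[]> keys = new ConcurrentHashMap<>();

  /** Stores the value as UTF-8 bytes, independent of the platform default. */
  void put(String name, String value) {
    keys.put(name, value.getBytes(StandardCharsets.UTF_8));
  }

  /** Decodes with the same explicit charset; returns null for unknown keys. */
  String get(String name) {
    byte[] data = keys.get(name);
    return data == null ? null : new String(data, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    KeyStoreSketch store = new KeyStoreSketch();
    store.put("bucket/key1", "h\u00e9llo");
    System.out.println(store.get("bucket/key1"));
  }
}
```

The no-argument {{String#getBytes()}} and {{new String(byte[])}} overloads are the ones that pick up the platform default encoding, which is what Findbugs typically flags.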

> OzoneHandler : Add localstorage support for keys
> 
>
> Key: HDFS-9960
> URL: https://issues.apache.org/jira/browse/HDFS-9960
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9960-HDFS-7240.001.patch
>
>
> Adds local storage handler support for keys. This allows all REST APIs to be 
> exercised via MiniDFSCluster.





[jira] [Updated] (HDFS-9961) Ozone: Add buckets commands to CLI

2016-03-15 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-9961:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

+1 for the patch.  Checkstyle warnings are not actionable, and test failures 
are not related.  I have committed this to the HDFS-7240 feature branch.  
[~anu], thank you.

> Ozone: Add buckets commands to CLI
> --
>
> Key: HDFS-9961
> URL: https://issues.apache.org/jira/browse/HDFS-9961
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: HDFS-7240
>
> Attachments: HDFS-9961-HDFS-7240.001.patch
>
>
> Add command for buckets to ozone CLI





[jira] [Updated] (HDFS-9952) Expose FSNamesystem lock wait time as metrics

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9952:

Attachment: HDFS-9952-01.patch

Attaching the patch.

> Expose FSNamesystem lock wait time as metrics
> -
>
> Key: HDFS-9952
> URL: https://issues.apache.org/jira/browse/HDFS-9952
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9952-01.patch
>
>
> Expose FSNamesystem's readLock() and writeLock() wait time as metrics.
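
One way to picture the idea is a thin wrapper that times each lock acquisition. This is only a sketch under assumptions, not the attached patch: the class {{TimedNamesystemLock}} and its getter names are hypothetical, and a real implementation would publish the totals through the Hadoop metrics system rather than plain getters.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical wrapper that accumulates time spent waiting for each lock. */
class TimedNamesystemLock {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private final LongAdder readWaitNanos = new LongAdder();
  private final LongAdder writeWaitNanos = new LongAdder();

  /** Acquires the read lock and records how long the caller waited. */
  void readLock() {
    long start = System.nanoTime();
    lock.readLock().lock();
    readWaitNanos.add(System.nanoTime() - start);
  }

  void readUnlock() {
    lock.readLock().unlock();
  }

  /** Acquires the write lock and records how long the caller waited. */
  void writeLock() {
    long start = System.nanoTime();
    lock.writeLock().lock();
    writeWaitNanos.add(System.nanoTime() - start);
  }

  void writeUnlock() {
    lock.writeLock().unlock();
  }

  long getReadLockWaitNanos() {
    return readWaitNanos.sum();
  }

  long getWriteLockWaitNanos() {
    return writeWaitNanos.sum();
  }

  public static void main(String[] args) {
    TimedNamesystemLock fsLock = new TimedNamesystemLock();
    fsLock.writeLock();
    fsLock.writeUnlock();
    fsLock.readLock();
    fsLock.readUnlock();
    System.out.println("read wait ns: " + fsLock.getReadLockWaitNanos());
    System.out.println("write wait ns: " + fsLock.getWriteLockWaitNanos());
  }
}
```

{{LongAdder}} is used here because many handler threads would update the counters concurrently, so the bookkeeping stays cheap under contention.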





[jira] [Updated] (HDFS-9952) Expose FSNamesystem lock wait time as metrics

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-9952:

Status: Patch Available  (was: Open)

> Expose FSNamesystem lock wait time as metrics
> -
>
> Key: HDFS-9952
> URL: https://issues.apache.org/jira/browse/HDFS-9952
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-9952-01.patch
>
>
> Expose FSNamesystem's readLock() and writeLock() wait time as metrics.





[jira] [Commented] (HDFS-9959) add log when block removed from last live datanode

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196729#comment-15196729
 ] 

Hadoop QA commented on HDFS-9959:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 20m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 0s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 50s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
32s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 233m 50s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || 

[jira] [Updated] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units

2016-03-15 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9847:

Attachment: HDFS-9847.004.patch

That's a good idea. Updated the latest patch to address the comments.

> HDFS configuration without time unit name should accept friendly time units
> ---
>
> Key: HDFS-9847
> URL: https://issues.apache.org/jira/browse/HDFS-9847
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9847.001.patch, HDFS-9847.002.patch, 
> HDFS-9847.003.patch, HDFS-9847.004.patch, timeduration-w-y.patch
>
>
> HDFS-9821 discusses letting existing keys use friendly 
> units, e.g. 60s, 5m, 1d, 6w etc. But some configuration key names 
> contain a time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make 
> the other configurations, whose names have no time unit, accept friendly 
> time units. The time unit {{seconds}} is frequently used in HDFS. We can 
> update these configurations first.
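
As a rough illustration of what "friendly time units" means for a seconds-based setting, here is a standalone sketch. It is not the attached patch: Hadoop's {{Configuration}} already offers {{getTimeDuration}}, which a real change would presumably build on. The class {{FriendlyDuration}} is hypothetical, handles only the suffixes ms, s, m, h and d (not the "w" mentioned above), and treats a bare number as seconds by assumption.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

/** Hypothetical parser for values such as "60s", "5m", "1d", or a bare "30". */
class FriendlyDuration {

  /** Converts the raw setting to whole seconds; a bare number means seconds. */
  static long toSeconds(String raw) {
    String v = raw.trim().toLowerCase(Locale.ROOT);
    TimeUnit unit = TimeUnit.SECONDS;
    int suffixLen = 1;
    if (v.endsWith("ms")) {          // check "ms" before "s"
      unit = TimeUnit.MILLISECONDS;
      suffixLen = 2;
    } else if (v.endsWith("s")) {
      unit = TimeUnit.SECONDS;
    } else if (v.endsWith("m")) {
      unit = TimeUnit.MINUTES;
    } else if (v.endsWith("h")) {
      unit = TimeUnit.HOURS;
    } else if (v.endsWith("d")) {
      unit = TimeUnit.DAYS;
    } else {
      suffixLen = 0;                 // no suffix: keep the seconds default
    }
    long amount = Long.parseLong(v.substring(0, v.length() - suffixLen).trim());
    return unit.toSeconds(amount);
  }

  public static void main(String[] args) {
    for (String s : new String[] {"60s", "5m", "1d", "30"}) {
      System.out.println(s + " -> " + toSeconds(s) + " seconds");
    }
  }
}
```

Keeping a bare number equal to seconds is what lets existing configuration files continue to work unchanged.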





[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196712#comment-15196712
 ] 

John Zhuge commented on HDFS-9940:
--

I like 2) better because all rebalance configuration is done at Balancer.

We need to design a solution to support HDFS-7466 as well.

> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.
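
A minimal sketch of how the Balancer side of such a rename might read the setting is below. The fallback to the old datanode key is an assumption made for illustration, not something the issue states, and the class {{BalancerMovesSketch}} is hypothetical. A plain {{Map}} stands in for Hadoop's {{Configuration}}.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup: prefer the new Balancer key, else the old datanode key. */
class BalancerMovesSketch {
  static final String NEW_KEY = "dfs.balancer.max.concurrent.moves";
  static final String OLD_KEY = "dfs.datanode.balance.max.concurrent.moves";
  static final int DEFAULT_MOVES = 5; // assumed default for this sketch

  static int maxConcurrentMoves(Map<String, String> conf) {
    String raw = conf.get(NEW_KEY);
    if (raw == null) {
      raw = conf.get(OLD_KEY); // assumed fallback so existing setups keep working
    }
    return raw == null ? DEFAULT_MOVES : Integer.parseInt(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(OLD_KEY, "20");
    System.out.println(maxConcurrentMoves(conf)); // old key only
    conf.put(NEW_KEY, "50");
    System.out.println(maxConcurrentMoves(conf)); // new key takes precedence
  }
}
```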





[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.

2016-03-15 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196709#comment-15196709
 ] 

Brahma Reddy Battula commented on HDFS-9917:


I meant to say, we can avoid RPCs to the NameNode and unnecessary GC for these IBRs.

> IBR accumulate more objects when SNN was down for sometime.
> ---
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> The SNN was down for some time. After restarting the SNN, it 
> became unresponsive because 
> - 29 DNs were each sending about 5 million IBRs (most of them delete IBRs), 
> whereas each datanode had only ~2.5 million blocks.
> - GC can't reclaim these objects since they are all held in the RPC queue. 
> To recover (to clear these objects), we restarted all the DNs one by 
> one. This issue happened in 2.4.1, where splitting of block reports was not 
> available.





[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196705#comment-15196705
 ] 

John Zhuge commented on HDFS-9940:
--

[~yzhangal], great idea if we don't have to manually config each DN for 
rebalance. Since we already have {{hdfs dfsadmin -setBalancerBandwidth}}, we 
have 2 choices.

1) Add {{hdfs dfsadmin -setBalancerConcurrentMoves}} and {{hdfs dfsadmin 
-getBalancerConcurrentMoves}}

2) Balancer automatically calls Namenode API {{setBalancerBandwidth}} and newly 
added {{setBalancerConcurrentMoves}} based on config values (or even command 
options). Obsolete {{hdfs dfsadmin -setBalancerBandwidth}} and {{hdfs dfsadmin 
-getBalancerBandwidth}}.



> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196701#comment-15196701
 ] 

Hadoop QA commented on HDFS-9349:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 59s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 72m 38s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 41s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
36s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 201m 4s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.tools.TestDFSAdmin |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.tools.TestDFSAdmin |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793658/HDFS-9349-HDFS-9000.005.patch
 |
| JIRA Issue | HDFS-9349 |
| Optional Tests |  asflicense  compile  javac  

[jira] [Updated] (HDFS-8901) Use ByteBuffer in striping positional read

2016-03-15 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8901:

Attachment: HDFS-8901-v5.patch

Rebased.

> Use ByteBuffer in striping positional read
> --
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-8901-v2.patch, HDFS-8901-v3.patch, 
> HDFS-8901-v4.patch, HDFS-8901-v5.patch, initial-poc.patch
>
>
> The native erasure coder prefers direct ByteBuffers for performance 
> reasons. To prepare for it, this change uses ByteBuffer throughout the 
> code implementing striping positional read. It also avoids 
> unnecessary data copying between striping read chunk buffers and decode input 
> buffers.





[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.

2016-03-15 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196635#comment-15196635
 ] 

Brahma Reddy Battula commented on HDFS-9917:


bq. I suggest that NN could just ignore the pending IBRs before the first full 
BR. Would it fix the problem?

Yes, I think it's the same as clearing on reRegister() at the datanode itself.
The advantage of clearing on reRegister() in the DN itself is that it avoids 
unnecessary RPCs to the NameNode, and the NameNode avoids unnecessary GC for 
these IBRs.

We may also need to limit how many IBRs the DN keeps accumulating, since they 
use a lot of memory.
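
To make the two ideas concrete (clear pending IBRs on re-register, and cap how many accumulate), here is a small sketch. It is not DataNode code: the class {{PendingIbrBuffer}}, its method names, and the drop-oldest policy are all assumptions for illustration, and a {{String}} stands in for a real incremental block report entry.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical bounded buffer of pending incremental block report entries. */
class PendingIbrBuffer {
  private final Deque<String> pending = new ArrayDeque<>();
  private final int maxPending;

  PendingIbrBuffer(int maxPending) {
    if (maxPending <= 0) {
      throw new IllegalArgumentException("maxPending must be positive");
    }
    this.maxPending = maxPending;
  }

  /** Queues an entry; once the cap is reached the oldest entry is dropped. */
  synchronized void add(String blockEvent) {
    if (pending.size() >= maxPending) {
      pending.pollFirst();
    }
    pending.addLast(blockEvent);
  }

  /** Called on re-registration: a full block report follows, so drop everything. */
  synchronized void onReRegister() {
    pending.clear();
  }

  /** Hands the queued entries to the sender and empties the buffer. */
  synchronized List<String> drain() {
    List<String> out = new ArrayList<>(pending);
    pending.clear();
    return out;
  }

  synchronized int size() {
    return pending.size();
  }

  public static void main(String[] args) {
    PendingIbrBuffer buf = new PendingIbrBuffer(2);
    buf.add("deleted blk_1");
    buf.add("deleted blk_2");
    buf.add("deleted blk_3"); // evicts blk_1 because the cap is 2
    System.out.println(buf.drain());
    buf.add("deleted blk_4");
    buf.onReRegister();       // nothing is sent after re-registration
    System.out.println(buf.size());
  }
}
```

Whether dropping entries is safe depends on a full block report always following re-registration, which is the premise of the suggestion quoted above.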

> IBR accumulate more objects when SNN was down for sometime.
> ---
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> The SNN was down for some time. After restarting the SNN, it 
> became unresponsive because 
> - 29 DNs were each sending about 5 million IBRs (most of them delete IBRs), 
> whereas each datanode had only ~2.5 million blocks.
> - GC can't reclaim these objects since they are all held in the RPC queue. 
> To recover (to clear these objects), we restarted all the DNs one by 
> one. This issue happened in 2.4.1, where splitting of block reports was not 
> available.





[jira] [Updated] (HDFS-8905) Refactor DFSInputStream#ReaderStrategy

2016-03-15 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-8905:

Attachment: HDFS-8905-v9.patch

Rebased one more time.

> Refactor DFSInputStream#ReaderStrategy
> --
>
> Key: HDFS-8905
> URL: https://issues.apache.org/jira/browse/HDFS-8905
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-8905-HDFS-7285-v1.patch, HDFS-8905-v2.patch, 
> HDFS-8905-v3.patch, HDFS-8905-v4.patch, HDFS-8905-v5.patch, 
> HDFS-8905-v6.patch, HDFS-8905-v7.patch, HDFS-8905-v8.patch, HDFS-8905-v9.patch
>
>
> The DFSInputStream#ReaderStrategy family doesn't look very good. This refactors 
> it a little bit so that it makes more sense.





[jira] [Commented] (HDFS-9260) Improve the performance and GC friendliness of NameNode startup and full block reports

2016-03-15 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196630#comment-15196630
 ] 

Vinayakumar B commented on HDFS-9260:
-

How about bringing this into branch-2?

> Improve the performance and GC friendliness of NameNode startup and full 
> block reports
> --
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Fix For: 3.0.0
>
> Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFS-9260.017.patch, HDFS-9260.018.patch, 
> HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change.





[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196566#comment-15196566
 ] 

Yongjun Zhang commented on HDFS-9940:
-

Hi Guys,

I wonder if this can be a balancer config only, so that we don't need to set it 
on the datanode side. That is, when the balancer starts, it reads this config 
and tells the NN about it, then the NN can tell each datanode about it as a 
piggyback on the heartbeat response. This is similar to how 

{{final static int DNA_BALANCERBANDWIDTHUPDATE = 8; // update balancer 
bandwidth}}

works.

If what I'm proposing here works, then we can just use 
{{dfs.balancer.max.concurrent.moves}}, and the user doesn't need to set it on 
the DN. 

Thanks.



> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Updated] (HDFS-9954) Test RPC timeout fix of HADOOP-12672 against HDFS

2016-03-15 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9954:
---
Resolution: Invalid
Status: Resolved  (was: Patch Available)

It turned out that creating an HDFS issue and attaching a patch does not invoke 
HDFS tests. test-patch runs tests based on the contents of the patch.

> Test RPC timeout fix of HADOOP-12672 against HDFS
> -
>
> Key: HDFS-9954
> URL: https://issues.apache.org/jira/browse/HDFS-9954
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>  Labels: test
> Attachments: HDFS-9954.006.patch
>
>






[jira] [Commented] (HDFS-9857) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-1]

2016-03-15 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196506#comment-15196506
 ] 

Zhe Zhang commented on HDFS-9857:
-

Seems the patch needs a small rebase. I also found a nit and a follow-on task.
# Should be {{blocksToReconstruct}}?
{code}
  blocksToReplicate = neededReconstruction
  .chooseLowRedundancyBlocks(blocksToProcess);
{code}
# {{fileReplication}} should be renamed. We can take care of it when we rename 
{{getExpectedReplicaNum}}.
{code}
short fileReplication = getExpectedReplicaNum(storedBlock);
{code}

+1 after addressing. Thanks Rakesh for the work!

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-1]
> ---
>
> Key: HDFS-9857
> URL: https://issues.apache.org/jira/browse/HDFS-9857
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-9857-001.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager as,
> - {{UnderReplicatedBlocks}} to {{LowRedundancyBlocks}}
> - {{neededReplications}} to {{neededReconstruction}}
> - {{replicationQueuesInitializer}} to {{reconstructionQueuesInitializer}}
> Thanks [~zhz], [~andrew.wang] for the useful 
> [discussions|https://issues.apache.org/jira/browse/HDFS-7955?focusedCommentId=15149406=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15149406]





[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-03-15 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196503#comment-15196503
 ] 

Arpit Agarwal commented on HDFS-9349:
-

Thank you for updating the patch [~xiaobingo]. The synchronization needs more 
work as the caller of {{getProtectedDirectories}} assumes the set will not be 
modified. Modifications to {{protectedDirectories}} will be rare so let's just 
make it a volatile reference and {{setProtectedDirectories}} can replace the 
reference atomically with a newly constructed set. Also you can use 
{{parseProtectedDirectories}} to construct the new set.
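
The suggestion above, a volatile reference that is replaced atomically with a newly built set, might look roughly like the following. This is a sketch under assumptions, not the patch: {{ProtectedDirsHolder}} is a hypothetical stand-in for the class that owns {{protectedDirectories}}, and {{parse}} is a simplified placeholder for {{parseProtectedDirectories}} that only splits a comma-separated string.

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical holder; not the real NameNode class. */
final class ProtectedDirsHolder {
  // Readers always see a complete, immutable snapshot through this reference.
  private volatile SortedSet<String> protectedDirectories =
      Collections.unmodifiableSortedSet(new TreeSet<String>());

  /** Simplified placeholder for parseProtectedDirectories (assumed comma-separated input). */
  static SortedSet<String> parse(String csv) {
    TreeSet<String> dirs = new TreeSet<>();
    for (String d : csv.split(",")) {
      String trimmed = d.trim();
      if (!trimmed.isEmpty()) {
        dirs.add(trimmed);
      }
    }
    return Collections.unmodifiableSortedSet(dirs);
  }

  /** Callers may iterate the returned set freely; it is never modified in place. */
  SortedSet<String> getProtectedDirectories() {
    return protectedDirectories;
  }

  /** Builds a new set first, then publishes it with a single volatile write. */
  void setProtectedDirectories(String csv) {
    protectedDirectories = parse(csv);
  }
}
```

Because the set is never mutated after publication, a caller that already holds the old reference keeps a consistent view, which matches the assumption made by callers of {{getProtectedDirectories}}.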

> Support reconfiguring fs.protected.directories without NN restart
> -
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9349-HDFS-9000.003.patch, 
> HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, 
> HDFS-9349.001.patch, HDFS-9349.002.patch
>
>
> This is to reconfigure
> {code}
> fs.protected.directories
> {code}
> without restarting NN.





[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196482#comment-15196482
 ] 

John Zhuge commented on HDFS-9940:
--

Balancer uses config {{dfs.datanode.balance.max.concurrent.moves}} to set field 
{{Dispatcher$maxConcurrentMovesPerNode}} that sets the size of the thread pool 
{{moveExecutor}} local to {{Dispatcher$executePendingMove}}.
* If the value is higher than {{dfs.datanode.balance.max.concurrent.moves}} on 
the Datanode, Balancer may send more requests than DN can handle; DN will log 
"Not able to copy block ... because threads quota is exceeded" and return ERROR 
to Balancer. Thus some Balancer threads are wasted.
* If the value is smaller, the DN's capacity is not fully used.
I can understand the original author's decision to use the same config name.

How about {{dfs.balancer.max.concurrent.moves.per.datanode}}?
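
The mismatch described in the first bullet can be illustrated with a toy model. It assumes the datanode simply admits moves up to a fixed quota and rejects the rest; the {{Semaphore}} and the class name {{MoveQuotaModel}} are illustrative choices and are not Hadoop code.

```java
import java.util.concurrent.Semaphore;

/** Toy model, not Hadoop code: a datanode that admits at most a fixed number of concurrent moves. */
final class MoveQuotaModel {
  private final Semaphore slots;

  MoveQuotaModel(int datanodeMaxConcurrentMoves) {
    this.slots = new Semaphore(datanodeMaxConcurrentMoves);
  }

  /** Returns true if the move is admitted, false if the quota is already used up. */
  boolean tryStartMove() {
    return slots.tryAcquire();
  }

  /** Frees one slot when a move completes. */
  void finishMove() {
    slots.release();
  }

  /** Counts how many of balancerThreads simultaneous requests the datanode would reject. */
  static int rejected(int balancerThreads, int datanodeQuota) {
    MoveQuotaModel dn = new MoveQuotaModel(datanodeQuota);
    int rejected = 0;
    for (int i = 0; i < balancerThreads; i++) {
      if (!dn.tryStartMove()) {
        rejected++;
      }
    }
    return rejected;
  }
}
```

In this model, a Balancer configured with 8 while the DN allows 5 has 3 of its simultaneous requests rejected, and a Balancer configured with 3 leaves 2 DN slots idle.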

> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Assigned] (HDFS-8210) Ozone: Implement storage container manager

2016-03-15 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth reassigned HDFS-8210:
---

Assignee: Chris Nauroth  (was: Jitendra Nath Pandey)

> Ozone: Implement storage container manager 
> ---
>
> Key: HDFS-8210
> URL: https://issues.apache.org/jira/browse/HDFS-8210
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Jitendra Nath Pandey
>Assignee: Chris Nauroth
> Attachments: HDFS-8210-HDFS-7240.1.patch, 
> HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, 
> HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch
>
>
> The storage container manager collects datanode heartbeats, manages 
> replication and exposes API to lookup containers. This jira implements 
> storage container manager by re-using the block manager implementation in 
> namenode. This jira provides initial implementation that works with 
> datanodes. The additional protocols will be added in subsequent jiras.





[jira] [Commented] (HDFS-9956) LDAP PERFORMANCE ISSUE AND FAIL OVER

2016-03-15 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196443#comment-15196443
 ] 

Allen Wittenauer commented on HDFS-9956:


Is a naming services caching daemon being used, or is this just a raw LDAP 
connection?

> LDAP PERFORMANCE ISSUE AND FAIL OVER
> 
>
> Key: HDFS-9956
> URL: https://issues.apache.org/jira/browse/HDFS-9956
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: sanjay kenganahalli vamanna
>
> The typical LDAP group name resolution works well under typical scenarios. 
> However, we have seen cases where a user is mapped to many groups (in an 
> extreme case, a user is mapped to more than 100 groups). The current 
> implementation makes resolving groups from ActiveDirectory very slow in this 
> case, which causes the namenode to fail over.
> Instead of failing over, we can use the 
> parameter (ha.zookeeper.session-timeout.ms) in the getgroups method to 
> time out and send a failed response back to the user, so that we can prevent 
> namenode failover. 





[jira] [Updated] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-03-15 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-9349:

Attachment: HDFS-9349-HDFS-9000.005.patch

> Support reconfiguring fs.protected.directories without NN restart
> -
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9349-HDFS-9000.003.patch, 
> HDFS-9349-HDFS-9000.004.patch, HDFS-9349-HDFS-9000.005.patch, 
> HDFS-9349.001.patch, HDFS-9349.002.patch
>
>
> This is to reconfigure
> {code}
> fs.protected.directories
> {code}
> without restarting NN.





[jira] [Commented] (HDFS-9349) Support reconfiguring fs.protected.directories without NN restart

2016-03-15 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196437#comment-15196437
 ] 

Xiaobing Zhou commented on HDFS-9349:
-

V005 is rebased on trunk.

> Support reconfiguring fs.protected.directories without NN restart
> -
>
> Key: HDFS-9349
> URL: https://issues.apache.org/jira/browse/HDFS-9349
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9349-HDFS-9000.003.patch, 
> HDFS-9349-HDFS-9000.004.patch, HDFS-9349.001.patch, HDFS-9349.002.patch
>
>
> This is to reconfigure
> {code}
> fs.protected.directories
> {code}
> without restarting NN.





[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196389#comment-15196389
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9940:
---

dfs.balancer.max.concurrent.moves is also confusing since 
"max.concurrent.moves" is per datanode.

Ignoring incompatibility for a moment, what are the best names for these two 
properties?

> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-15 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196374#comment-15196374
 ] 

Arpit Agarwal commented on HDFS-9668:
-

bq. It means one slow operation of finalizeBlock, addBlock and createRbw in a 
slow storage can block all the other same operations in the same DataNode, 
especially in HBase when many wal/flusher/compactor are configured.
Detecting slow disks is a known problem for DataNodes. If this problem does not 
manifest in regular operation perhaps we should try to add slow disk detection 
instead.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution time of some operations in FsDatasetImpl during the 
> test. The result follows.
> !execution_time.png!
> The operations finalizeBlock, addBlock and createRbw on HDD take a really long 
> time under heavy load.
> It means one slow operation of finalizeBlock, addBlock and createRbw in a 
> slow storage can block all the other same operations in the same DataNode, 
> especially in HBase when many wal/flusher/compactor are configured.
> We need a finer grained lock mechanism in a new FsDatasetImpl implementation 
> and users can choose the implementation by configuring 
> "dfs.datanode.fsdataset.factory" in DataNode.
> We can implement the lock by 

[jira] [Commented] (HDFS-9940) Rename dfs.balancer.max.concurrent.moves to avoid confusion

2016-03-15 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196363#comment-15196363
 ] 

John Zhuge commented on HDFS-9940:
--

+1 HDFS-7466

> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> ---
>
> Key: HDFS-9940
> URL: https://issues.apache.org/jira/browse/HDFS-9940
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover
>Affects Versions: 2.6.0
>Reporter: John Zhuge
>Assignee: John Zhuge
>Priority: Minor
>  Labels: supportability
> Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.





[jira] [Assigned] (HDFS-7285) Erasure Coding Support inside HDFS

2016-03-15 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers reassigned HDFS-7285:


Assignee: Zhe Zhang  (was: Matt Hardy)

> Erasure Coding Support inside HDFS
> --
>
> Key: HDFS-7285
> URL: https://issues.apache.org/jira/browse/HDFS-7285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Weihua Jiang
>Assignee: Zhe Zhang
> Fix For: 3.0.0
>
> Attachments: Compare-consolidated-20150824.diff, 
> Consolidated-20150707.patch, Consolidated-20150806.patch, 
> Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, 
> HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, 
> HDFS-7285-merge-consolidated-01.patch, 
> HDFS-7285-merge-consolidated-trunk-01.patch, 
> HDFS-7285-merge-consolidated.trunk.03.patch, 
> HDFS-7285-merge-consolidated.trunk.04.patch, 
> HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, 
> HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, 
> HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, 
> HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, 
> HDFSErasureCodingSystemTestPlan-20150824.pdf, 
> HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf
>
>
> Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing 
> data reliability, compared to the existing HDFS 3-replica approach. For 
> example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, 
> with storage overhead only being 40%. This makes EC a quite attractive 
> alternative for big data storage, particularly for cold data. 
> Facebook had a related open source project called HDFS-RAID. It used to be 
> one of the contrib packages in HDFS but has been removed since Hadoop 2.0 
> for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends 
> on MapReduce to do encoding and decoding tasks; 2) it can only be used for 
> cold files that are not intended to be appended to anymore; 3) the pure Java EC 
> coding implementation is extremely slow in practical use. Due to these, it 
> might not be a good idea to just bring HDFS-RAID back.
> We (Intel and Cloudera) are working on a design to build EC into HDFS that 
> gets rid of any external dependencies and makes it self-contained and 
> independently maintained. This design layers the EC feature on the storage type 
> support and considers compatibility with existing HDFS features like caching, 
> snapshots, encryption, high availability, etc. This design will also 
> support different EC coding schemes, implementations and policies for 
> different deployment scenarios. By utilizing advanced libraries (e.g. Intel 
> ISA-L library), an implementation can greatly improve the performance of EC 
> encoding/decoding and make the EC solution even more attractive. We will 
> post the design document soon. 
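
The 40% figure quoted in the description follows from simple arithmetic: a 10+4 code stores 4 parity blocks for every 10 data blocks, while 3-way replication stores 2 extra copies of every block. A minimal sketch of that calculation, with a hypothetical helper class {{StorageOverhead}} that is not part of HDFS:

```java
/** Hypothetical helper for the overhead arithmetic; not part of HDFS. */
final class StorageOverhead {
  /** Extra storage, as a percentage of the raw data size, for a k+m erasure code. */
  static double ecOverheadPercent(int dataBlocks, int parityBlocks) {
    return 100.0 * parityBlocks / dataBlocks;
  }

  /** Extra storage, as a percentage of the raw data size, for n-way replication. */
  static double replicationOverheadPercent(int replicas) {
    return 100.0 * (replicas - 1);
  }

  public static void main(String[] args) {
    System.out.println("RS 10+4 overhead: " + ecOverheadPercent(10, 4) + "%");
    System.out.println("3-replica overhead: " + replicationOverheadPercent(3) + "%");
  }
}
```

For 10+4 this gives 40% extra storage while tolerating the loss of any 4 blocks in a group; 3-way replication costs 200% extra and tolerates the loss of 2 replicas of a block.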





[jira] [Assigned] (HDFS-7285) Erasure Coding Support inside HDFS

2016-03-15 Thread Matt Hardy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Hardy reassigned HDFS-7285:


Assignee: Matt Hardy  (was: Zhe Zhang)

> Erasure Coding Support inside HDFS
> --
>
> Key: HDFS-7285
> URL: https://issues.apache.org/jira/browse/HDFS-7285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Weihua Jiang
>Assignee: Matt Hardy
> Fix For: 3.0.0
>
> Attachments: Compare-consolidated-20150824.diff, 
> Consolidated-20150707.patch, Consolidated-20150806.patch, 
> Consolidated-20150810.patch, ECAnalyzer.py, ECParser.py, 
> HDFS-7285-Consolidated-20150911.patch, HDFS-7285-initial-PoC.patch, 
> HDFS-7285-merge-consolidated-01.patch, 
> HDFS-7285-merge-consolidated-trunk-01.patch, 
> HDFS-7285-merge-consolidated.trunk.03.patch, 
> HDFS-7285-merge-consolidated.trunk.04.patch, 
> HDFS-EC-Merge-PoC-20150624.patch, HDFS-EC-merge-consolidated-01.patch, 
> HDFS-bistriped.patch, HDFSErasureCodingDesign-20141028.pdf, 
> HDFSErasureCodingDesign-20141217.pdf, HDFSErasureCodingDesign-20150204.pdf, 
> HDFSErasureCodingDesign-20150206.pdf, HDFSErasureCodingPhaseITestPlan.pdf, 
> HDFSErasureCodingSystemTestPlan-20150824.pdf, 
> HDFSErasureCodingSystemTestReport-20150826.pdf, fsimage-analysis-20150105.pdf
>
>
> Erasure Coding (EC) can greatly reduce the storage overhead without sacrificing 
> data reliability, compared to the existing HDFS 3-replica approach. For 
> example, if we use a 10+4 Reed Solomon coding, we can allow loss of 4 blocks, 
> with storage overhead only being 40%. This makes EC a quite attractive 
> alternative for big data storage, particularly for cold data. 
> Facebook had a related open source project called HDFS-RAID. It used to be 
> one of the contrib packages in HDFS but has been removed since Hadoop 2.0 
> for maintenance reasons. The drawbacks are: 1) it is on top of HDFS and depends 
> on MapReduce to do encoding and decoding tasks; 2) it can only be used for 
> cold files that are not intended to be appended to anymore; 3) the pure Java EC 
> coding implementation is extremely slow in practical use. Due to these, it 
> might not be a good idea to just bring HDFS-RAID back.
> We (Intel and Cloudera) are working on a design to build EC into HDFS that 
> gets rid of any external dependencies and makes it self-contained and 
> independently maintained. This design layers the EC feature on the storage type 
> support and considers compatibility with existing HDFS features like caching, 
> snapshots, encryption, high availability, etc. This design will also 
> support different EC coding schemes, implementations and policies for 
> different deployment scenarios. By utilizing advanced libraries (e.g. Intel 
> ISA-L library), an implementation can greatly improve the performance of EC 
> encoding/decoding and make the EC solution even more attractive. We will 
> post the design document soon. 





[jira] [Updated] (HDFS-9959) add log when block removed from last live datanode

2016-03-15 Thread yunjiong zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yunjiong zhao updated HDFS-9959:

Attachment: HDFS-9959.1.patch

Updated patch:
1. Log after releasing the write lock.
2. Change error to info.

> add log when block removed from last live datanode
> --
>
> Key: HDFS-9959
> URL: https://issues.apache.org/jira/browse/HDFS-9959
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
>Priority: Minor
> Attachments: HDFS-9959.1.patch, HDFS-9959.patch
>
>
> Adding logs like "BLOCK* No live nodes contain block blk_1073741825_1001, last 
> datanode contain it is node: 127.0.0.1:65341" in BlockStateChange should help 
> to identify which datanode should be fixed first to recover missing blocks.





[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196237#comment-15196237
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9668:
---

> Currently, read operations have no special advantage over write operations. 
> Using a reader/writer lock changes that. ...

[~cmccabe], good point.  We probably should use a fair lock.  Or we could add a 
conf similar to dfs.namenode.fslock.fair.

[~jingcheng...@intel.com], thanks for working on this.  The idea sounds good.  
One issue that concerns me is: how could we make sure that the 
synchronization is correct, especially outside the class?  The current patch 
only changes FsDatasetImpl but the FsDatasetImpl object is also synchronized in 
other classes such as FsVolumeImpl.
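
For reference, the fair-lock option mentioned above maps onto the fairness flag of {{java.util.concurrent.locks.ReentrantReadWriteLock}}. The sketch below assumes the flag would come from a datanode config similar to dfs.namenode.fslock.fair; no datanode key is named in this thread, so the flag is passed in as a plain constructor argument, and {{DatasetLockSketch}} is a hypothetical class rather than part of the patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical wrapper showing a reader/writer lock with configurable fairness. */
final class DatasetLockSketch {
  private final ReentrantReadWriteLock lock;

  /** fair = true grants the lock in approximate arrival order, so writers are not starved by readers. */
  DatasetLockSketch(boolean fair) {
    this.lock = new ReentrantReadWriteLock(fair);
  }

  boolean isFair() {
    return lock.isFair();
  }

  /** Runs a read-only operation under the shared read lock. */
  <T> T read(Supplier<T> op) {
    lock.readLock().lock();
    try {
      return op.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Runs a mutating operation under the exclusive write lock. */
  void write(Runnable op) {
    lock.writeLock().lock();
    try {
      op.run();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Note that a wrapper like this only protects code that goes through it. Any other class that still uses {{synchronized}} on the FsDatasetImpl object would not be coordinated with this lock, which is the concern raised above.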

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
> {noformat}
> We measured the execution time of some FsDatasetImpl operations during the 
> test. The results are shown here:
> !execution_time.png!
> Under heavy load, the finalizeBlock, addBlock and createRbw operations on 
> HDD take a very long time.
> Because these operations contend for the same FsDatasetImpl lock, one slow 
> finalizeBlock, addBlock or createRbw call on slow storage can block the 
> same operations everywhere else in the same DataNode, especially with HBase 
> when many WAL writers, flushers and compactors are configured.
> 
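The contention described above can be pictured with a small sketch. It is only an illustration of one finer-grained alternative, a lock per storage volume instead of a single dataset-wide monitor; the thread does not say this is the approach taken. The class name PerVolumeLocks, the method names and the volume names "hdd-0" and "ssd-0" are all assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: one lock per storage volume instead of one dataset-wide monitor. */
class PerVolumeLocks {
  private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

  /** Returns the lock guarding a single volume, creating it on first use. */
  ReentrantLock lockFor(String volume) {
    return locks.computeIfAbsent(volume, v -> new ReentrantLock());
  }

  /** Reports whether a different thread could start work on the given volume right now. */
  boolean canProceed(String volume) {
    final boolean[] acquired = new boolean[1];
    Thread probe = new Thread(() -> {
      ReentrantLock lock = lockFor(volume);
      if (lock.tryLock()) {
        acquired[0] = true;
        lock.unlock();
      }
    });
    probe.start();
    try {
      probe.join(); // join makes the probe's write to acquired[0] visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired[0];
  }

  public static void main(String[] args) {
    PerVolumeLocks locks = new PerVolumeLocks();
    ReentrantLock hdd = locks.lockFor("hdd-0");
    hdd.lock(); // stands in for a slow createRbw on a loaded HDD
    try {
      System.out.println("ssd-0 free: " + locks.canProceed("ssd-0"));
      System.out.println("hdd-0 free: " + locks.canProceed("hdd-0"));
    } finally {
      hdd.unlock();
    }
  }
}
```

With a single shared monitor, as in the jstack output above, both probes would have to wait for the slow HDD operation; with per-volume locks only work on the same volume waits.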

[jira] [Commented] (HDFS-9917) IBR accumulate more objects when SNN was down for sometime.

2016-03-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196158#comment-15196158
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9917:
---

> Before Full BR, all pending IBRs will be flushed. ...

Yes, this is the current problem.  I suggest that NN could just ignore the 
pending IBRs before the first full BR.  Would it fix the problem?
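
A minimal sketch of the suggestion above, assuming the NN keeps a per-datanode record of whether the first full BR has been processed. The class IbrGateSketch and its methods are hypothetical names for illustration, not NameNode code.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: ignore incremental block reports (IBRs) until the first full block report (BR). */
class IbrGateSketch {
  // Datanodes whose first full block report has already been processed.
  private final Set<String> fullReportSeen = new HashSet<>();

  /** Records that a full block report from this datanode was processed. */
  synchronized void onFullReport(String datanodeId) {
    fullReportSeen.add(datanodeId);
  }

  /** Returns true if an IBR from this datanode should be processed, false if it can be ignored. */
  synchronized boolean shouldProcessIncremental(String datanodeId) {
    return fullReportSeen.contains(datanodeId);
  }

  public static void main(String[] args) {
    IbrGateSketch gate = new IbrGateSketch();
    System.out.println(gate.shouldProcessIncremental("dn-1")); // before the full BR
    gate.onFullReport("dn-1");
    System.out.println(gate.shouldProcessIncremental("dn-1")); // after the full BR
  }
}
```

The idea is that a full BR already describes every block on the datanode, so IBRs queued before it carry nothing the full BR will not cover.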

> IBR accumulate more objects when SNN was down for sometime.
> ---
>
> Key: HDFS-9917
> URL: https://issues.apache.org/jira/browse/HDFS-9917
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>
> The SNN was down for some time. After it was restarted, it became 
> unresponsive because:
> - 29 DNs were each sending IBRs with about 5 million entries (most of them 
> delete IBRs), whereas each datanode had only ~2.5 million blocks.
> - GC could not reclaim these objects because they were all still held in 
> the RPC queue.
> To recover (that is, to clear these objects), we restarted all the DNs one 
> by one. This issue happened in 2.4.1, where block report splitting was not 
> available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch

2016-03-15 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195995#comment-15195995
 ] 

Arpit Agarwal commented on HDFS-10170:
--

+1 for force rebase to make DiskBalancer testable on Mac. I suspect we'd have 
hit a similar issue on Windows.

> DiskBalancer: Force rebase diskbalancer branch
> --
>
> Key: HDFS-10170
> URL: https://issues.apache.org/jira/browse/HDFS-10170
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Minor
> Fix For: HDFS-1312
>
>
> In one of the patches we renamed DiskbalancerException.java to 
> DiskBalancerException.java. The only change was the lowercase b ==> B. This 
> causes issues on a Mac, where the file system may not be case sensitive.
> So when you clone the repo, git ends up creating DiskbalancerException.java 
> with a lowercase ‘b’ and tries to rename it to the uppercase form. However, 
> on a Mac the rename fails and we get Java files where the class name is 
> different from the file name.
> We can fix this issue by re-writing the git history.





[jira] [Updated] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8457:

Fix Version/s: (was: HDFS-7240)

> Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into 
> parent interface
> -
>
> Key: HDFS-8457
> URL: https://issues.apache.org/jira/browse/HDFS-8457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8457-HDFS-7240.01.patch, 
> HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, 
> HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, 
> HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch
>
>
> FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The 
> HDFS-specific parts can continue to be retained in FsDatasetSpi while those 
> relating to volume management, block pools and upgrade can be moved to a 
> parent interface.
> There will be no change to implementations of FsDatasetSpi.
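
The split described above might look roughly like the following. This is a sketch under stated assumptions: the interface names carry a "Sketch" suffix to mark them as hypothetical, and getVolumes and addBlockPool are illustrative stand-ins for the volume-management and block-pool methods, not the actual FsDatasetSpi signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parent interface holding the HDFS-agnostic parts. */
interface DatasetSpiSketch {
  List<String> getVolumes();
  void addBlockPool(String bpid);
}

/** Hypothetical child interface keeping the HDFS-specific parts. */
interface FsDatasetSpiSketch extends DatasetSpiSketch {
  void finalizeBlock(long blockId);
}

/** An existing implementation keeps implementing the child interface unchanged. */
class FsDatasetImplSketch implements FsDatasetSpiSketch {
  private final List<String> volumes = new ArrayList<>();
  private final List<String> blockPools = new ArrayList<>();
  private final List<Long> finalized = new ArrayList<>();

  FsDatasetImplSketch(List<String> volumes) {
    this.volumes.addAll(volumes);
  }

  @Override public List<String> getVolumes() { return volumes; }
  @Override public void addBlockPool(String bpid) { blockPools.add(bpid); }
  @Override public void finalizeBlock(long blockId) { finalized.add(blockId); }
}
```

Because the child interface extends the parent, code that only needs volume or block-pool management can take the parent type, while implementations of the child need no changes.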





[jira] [Updated] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8661:

Fix Version/s: (was: HDFS-7240)

> DataNode should filter the set of NameSpaceInfos passed to Datasets
> ---
>
> Key: HDFS-8661
> URL: https://issues.apache.org/jira/browse/HDFS-8661
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7240
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8661-HDFS-7240.01.patch, 
> HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, 
> HDFS-8661-HDFS-7240.04.patch, v03-v04.diff
>
>
> {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset 
> when adding new volumes.
> This list should be filtered by the correct NodeType(s) for each dataset. 
> e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block 
> pools and Ozone datasets would be notified of Ozone block pool(s).
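
The filtering described above can be sketched as follows. OwnerType and NsInfo are hypothetical, trimmed-down stand-ins for the real NodeType and NamespaceInfo classes, and the block pool IDs are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NamespaceFilterSketch {
  /** Hypothetical stand-in for the node type that owns a block pool. */
  enum OwnerType { NAME_NODE, OZONE }

  /** Hypothetical, trimmed-down stand-in for NamespaceInfo. */
  static final class NsInfo {
    final String blockPoolId;
    final OwnerType owner;

    NsInfo(String blockPoolId, OwnerType owner) {
      this.blockPoolId = blockPoolId;
      this.owner = owner;
    }
  }

  /** Returns the block pool IDs that a dataset serving the given owner type should be told about. */
  static List<String> poolsFor(List<NsInfo> all, OwnerType wanted) {
    List<String> result = new ArrayList<>();
    for (NsInfo ns : all) {
      if (ns.owner == wanted) {
        result.add(ns.blockPoolId);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<NsInfo> all = Arrays.asList(
        new NsInfo("BP-hdfs-1", OwnerType.NAME_NODE),
        new NsInfo("BP-ozone-1", OwnerType.OZONE),
        new NsInfo("BP-hdfs-2", OwnerType.NAME_NODE));
    System.out.println(poolsFor(all, OwnerType.NAME_NODE));
    System.out.println(poolsFor(all, OwnerType.OZONE));
  }
}
```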





[jira] [Updated] (HDFS-8679) Move DatasetSpi to new package

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8679:

Fix Version/s: (was: HDFS-7240)

> Move DatasetSpi to new package
> --
>
> Key: HDFS-8679
> URL: https://issues.apache.org/jira/browse/HDFS-8679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8679-HDFS-7240.01.patch, 
> HDFS-8679-HDFS-7240.02.patch
>
>
> The DatasetSpi and VolumeSpi interfaces are currently in 
> {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a 
> new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}.





[jira] [Updated] (HDFS-8392) DataNode support for multiple datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8392:

Fix Version/s: (was: HDFS-7240)

> DataNode support for multiple datasets
> --
>
> Key: HDFS-8392
> URL: https://issues.apache.org/jira/browse/HDFS-8392
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8392-HDFS-7240.01.patch, 
> HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch
>
>
> For HDFS-7240 we would like to share available DataNode storage across HDFS 
> blocks and Ozone objects.
> The DataNode already supports sharing available storage across multiple block 
> pool IDs for the federation feature. However all federated block pools use 
> the same dataset implementation i.e. {{FsDatasetImpl}}.
> We can extend the DataNode to support multiple dataset implementations so the 
> same storage space can be shared across one or more HDFS block pools and one 
> or more Ozone block pools.





[jira] [Resolved] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-8677.
-
Resolution: Fixed

We'll revisit FsDataset changes later.

> Ozone: Introduce KeyValueContainerDatasetSpi
> 
>
> Key: HDFS-8677
> URL: https://issues.apache.org/jira/browse/HDFS-8677
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8677-HDFS-7240.01.patch, 
> HDFS-8677-HDFS-7240.02.patch, HDFS-8677-HDFS-7240.03.patch, 
> HDFS-8677-HDFS-7240.04.patch, HDFS-8677-HDFS-7240.05.patch
>
>
> KeyValueContainerDatasetSpi will be a new interface for Ozone containers, 
> just as FsDatasetSpi is an interface for manipulating HDFS block files.
> The interface will have support for both key-value containers for storing 
> Ozone metadata and blobs for storing user data.





[jira] [Resolved] (HDFS-8679) Move DatasetSpi to new package

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-8679.
-
Resolution: Later

We'll revisit FsDataset changes later.

> Move DatasetSpi to new package
> --
>
> Key: HDFS-8679
> URL: https://issues.apache.org/jira/browse/HDFS-8679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8679-HDFS-7240.01.patch, 
> HDFS-8679-HDFS-7240.02.patch
>
>
> The DatasetSpi and VolumeSpi interfaces are currently in 
> {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a 
> new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}.





[jira] [Resolved] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-8661.
-
Resolution: Later

We'll revisit FsDataset changes later.

> DataNode should filter the set of NameSpaceInfos passed to Datasets
> ---
>
> Key: HDFS-8661
> URL: https://issues.apache.org/jira/browse/HDFS-8661
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7240
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8661-HDFS-7240.01.patch, 
> HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, 
> HDFS-8661-HDFS-7240.04.patch, v03-v04.diff
>
>
> {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset 
> when adding new volumes.
> This list should be filtered by the correct NodeType(s) for each dataset. 
> e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block 
> pools and Ozone datasets would be notified of Ozone block pool(s).





[jira] [Reopened] (HDFS-8679) Move DatasetSpi to new package

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8679:
-

> Move DatasetSpi to new package
> --
>
> Key: HDFS-8679
> URL: https://issues.apache.org/jira/browse/HDFS-8679
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8679-HDFS-7240.01.patch, 
> HDFS-8679-HDFS-7240.02.patch
>
>
> The DatasetSpi and VolumeSpi interfaces are currently in 
> {{org.apache.hadoop.hdfs.server.datanode.fsdataset}}. They can be moved to a 
> new package {{org.apache.hadoop.hdfs.server.datanode.dataset}}.





[jira] [Reopened] (HDFS-8677) Ozone: Introduce KeyValueContainerDatasetSpi

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8677:
-

> Ozone: Introduce KeyValueContainerDatasetSpi
> 
>
> Key: HDFS-8677
> URL: https://issues.apache.org/jira/browse/HDFS-8677
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-8677-HDFS-7240.01.patch, 
> HDFS-8677-HDFS-7240.02.patch, HDFS-8677-HDFS-7240.03.patch, 
> HDFS-8677-HDFS-7240.04.patch, HDFS-8677-HDFS-7240.05.patch
>
>
> KeyValueContainerDatasetSpi will be a new interface for Ozone containers, 
> just as FsDatasetSpi is an interface for manipulating HDFS block files.
> The interface will have support for both key-value containers for storing 
> Ozone metadata and blobs for storing user data.





[jira] [Resolved] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-8457.
-
Resolution: Later

We'll revisit FsDataset changes later.

> Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into 
> parent interface
> -
>
> Key: HDFS-8457
> URL: https://issues.apache.org/jira/browse/HDFS-8457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8457-HDFS-7240.01.patch, 
> HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, 
> HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, 
> HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch
>
>
> FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The 
> HDFS-specific parts can continue to be retained in FsDatasetSpi while those 
> relating to volume management, block pools and upgrade can be moved to a 
> parent interface.
> There will be no change to implementations of FsDatasetSpi.





[jira] [Reopened] (HDFS-8661) DataNode should filter the set of NameSpaceInfos passed to Datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8661:
-

> DataNode should filter the set of NameSpaceInfos passed to Datasets
> ---
>
> Key: HDFS-8661
> URL: https://issues.apache.org/jira/browse/HDFS-8661
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-7240
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8661-HDFS-7240.01.patch, 
> HDFS-8661-HDFS-7240.02.patch, HDFS-8661-HDFS-7240.03.patch, 
> HDFS-8661-HDFS-7240.04.patch, v03-v04.diff
>
>
> {{DataNode#refreshVolumes}} passes the list of NamespaceInfos to each dataset 
> when adding new volumes.
> This list should be filtered by the correct NodeType(s) for each dataset. 
> e.g. in a shared HDFS+Ozone cluster, FsDatasets would be notified of NN block 
> pools and Ozone datasets would be notified of Ozone block pool(s).





[jira] [Resolved] (HDFS-8392) DataNode support for multiple datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-8392.
-
Resolution: Later

We'll revisit FsDataset changes later.

> DataNode support for multiple datasets
> --
>
> Key: HDFS-8392
> URL: https://issues.apache.org/jira/browse/HDFS-8392
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8392-HDFS-7240.01.patch, 
> HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch
>
>
> For HDFS-7240 we would like to share available DataNode storage across HDFS 
> blocks and Ozone objects.
> The DataNode already supports sharing available storage across multiple block 
> pool IDs for the federation feature. However all federated block pools use 
> the same dataset implementation i.e. {{FsDatasetImpl}}.
> We can extend the DataNode to support multiple dataset implementations so the 
> same storage space can be shared across one or more HDFS block pools and one 
> or more Ozone block pools.





[jira] [Reopened] (HDFS-8457) Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into parent interface

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8457:
-

> Ozone: Refactor FsDatasetSpi to pull up HDFS-agnostic functionality into 
> parent interface
> -
>
> Key: HDFS-8457
> URL: https://issues.apache.org/jira/browse/HDFS-8457
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8457-HDFS-7240.01.patch, 
> HDFS-8457-HDFS-7240.02.patch, HDFS-8457-HDFS-7240.03.patch, 
> HDFS-8457-HDFS-7240.04.patch, HDFS-8457-HDFS-7240.05.patch, 
> HDFS-8457-HDFS-7240.06.patch, HDFS-8457-HDFS-7240.07.patch
>
>
> FsDatasetSpi can be split up into HDFS-specific and HDFS-agnostic parts. The 
> HDFS-specific parts can continue to be retained in FsDatasetSpi while those 
> relating to volume management, block pools and upgrade can be moved to a 
> parent interface.
> There will be no change to implementations of FsDatasetSpi.





[jira] [Reopened] (HDFS-8392) DataNode support for multiple datasets

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8392:
-

> DataNode support for multiple datasets
> --
>
> Key: HDFS-8392
> URL: https://issues.apache.org/jira/browse/HDFS-8392
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Fix For: HDFS-7240
>
> Attachments: HDFS-8392-HDFS-7240.01.patch, 
> HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch
>
>
> For HDFS-7240 we would like to share available DataNode storage across HDFS 
> blocks and Ozone objects.
> The DataNode already supports sharing available storage across multiple block 
> pool IDs for the federation feature. However all federated block pools use 
> the same dataset implementation i.e. {{FsDatasetImpl}}.
> We can extend the DataNode to support multiple dataset implementations so the 
> same storage space can be shared across one or more HDFS block pools and one 
> or more Ozone block pools.





[jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195976#comment-15195976
 ] 

Andrew Wang commented on HDFS-3702:
---

Since we're just adding a flag to the existing {{create}} flags enumset, it 
doesn't affect our API signature. Note there are no changes in FileSystem or 
DistributedFileSystem. It also doesn't involve any NN memory overhead, which is 
a nice bonus compared to a storage policy with xattrs.

I also like this scheme since it gives us a lot of flexibility at the 
application level. For example, applications like distcp or the httpfs and nfs 
gateway might always want this flag on (no matter the destination folder), to 
avoid data load imbalance. For HBase's WAL, it would give them the flexibility 
to redo their filesystem layout, for instance if all WALs no longer go in a 
single "/logs" directory.

Overall, it feels a lot like Linux-y filesystem hints like fadvise / madvise, 
and a good use of flags.
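
To illustrate why adding a flag leaves the API signature alone, here is a self-contained sketch. The Flag enum and the name NO_LOCAL_WRITE are assumptions made for the example; they are not taken from the patch, and the method is a hypothetical placement check rather than HDFS client code.

```java
import java.util.EnumSet;

class CreateFlagSketch {
  /** Hypothetical flags; NO_LOCAL_WRITE is an illustrative name for the new hint. */
  enum Flag { CREATE, OVERWRITE, NO_LOCAL_WRITE }

  /** The parameter is an EnumSet, so adding a new flag value does not change this signature. */
  static boolean mayPlaceReplicaOnLocalNode(EnumSet<Flag> flags) {
    return !flags.contains(Flag.NO_LOCAL_WRITE);
  }

  public static void main(String[] args) {
    EnumSet<Flag> ordinary = EnumSet.of(Flag.CREATE, Flag.OVERWRITE);
    EnumSet<Flag> walStyle = EnumSet.of(Flag.CREATE, Flag.NO_LOCAL_WRITE);
    System.out.println(mayPlaceReplicaOnLocalNode(ordinary));
    System.out.println(mayPlaceReplicaOnLocalNode(walStyle));
  }
}
```

Callers that never set the new flag keep their existing behavior, and each application decides per file whether to pass the hint.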

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.





[jira] [Updated] (HDFS-8210) Ozone: Implement storage container manager

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-8210:

Fix Version/s: (was: HDFS-7240)

> Ozone: Implement storage container manager 
> ---
>
> Key: HDFS-8210
> URL: https://issues.apache.org/jira/browse/HDFS-8210
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-8210-HDFS-7240.1.patch, 
> HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, 
> HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch
>
>
> The storage container manager collects datanode heartbeats, manages 
> replication and exposes an API to look up containers. This jira implements 
> the storage container manager by re-using the block manager implementation 
> in the namenode. It provides an initial implementation that works with 
> datanodes. Additional protocols will be added in subsequent jiras.





[jira] [Reopened] (HDFS-8210) Ozone: Implement storage container manager

2016-03-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reopened HDFS-8210:
-

We've rebased the branch and this change was left out. Reopening the Jira.

> Ozone: Implement storage container manager 
> ---
>
> Key: HDFS-8210
> URL: https://issues.apache.org/jira/browse/HDFS-8210
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Fix For: HDFS-7240
>
> Attachments: HDFS-8210-HDFS-7240.1.patch, 
> HDFS-8210-HDFS-7240.2.patch, HDFS-8210-HDFS-7240.3.patch, 
> HDFS-8210-HDFS-7240.4.patch, HDFS-8210-HDFS-7240.5.patch
>
>
> The storage container manager collects datanode heartbeats, manages 
> replication and exposes an API to look up containers. This jira implements 
> the storage container manager by re-using the block manager implementation 
> in the namenode. It provides an initial implementation that works with 
> datanodes. Additional protocols will be added in subsequent jiras.





[jira] [Resolved] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch

2016-03-15 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer resolved HDFS-10170.
-
Resolution: Fixed

> DiskBalancer: Force rebase diskbalancer branch
> --
>
> Key: HDFS-10170
> URL: https://issues.apache.org/jira/browse/HDFS-10170
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer & mover
>Affects Versions: HDFS-1312
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Minor
> Fix For: HDFS-1312
>
>
> In one of the patches we renamed DiskbalancerException.java to 
> DiskBalancerException.java. The only change was the lowercase b ==> B. This 
> causes issues on a Mac, where the file system may not be case sensitive.
> So when you clone the repo, git ends up creating DiskbalancerException.java 
> with a lowercase ‘b’ and tries to rename it to the uppercase form. However, 
> on a Mac the rename fails and we get Java files where the class name is 
> different from the file name.
> We can fix this issue by re-writing the git history.





[jira] [Created] (HDFS-10170) DiskBalancer: Force rebase diskbalancer branch

2016-03-15 Thread Anu Engineer (JIRA)
Anu Engineer created HDFS-10170:
---

 Summary: DiskBalancer: Force rebase diskbalancer branch
 Key: HDFS-10170
 URL: https://issues.apache.org/jira/browse/HDFS-10170
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: balancer & mover
Affects Versions: HDFS-1312
Reporter: Anu Engineer
Assignee: Anu Engineer
Priority: Minor
 Fix For: HDFS-1312


In one of the patches we renamed DiskbalancerException.java to 
DiskBalancerException.java. The only change was the lowercase b ==> B. This 
causes issues on a Mac, where the file system may not be case sensitive.

So when you clone the repo, git ends up creating DiskbalancerException.java 
with a lowercase ‘b’ and tries to rename it to the uppercase form. However, on 
a Mac the rename fails and we get Java files where the class name is different 
from the file name.

We can fix this issue by re-writing the git history.
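
As an aside, names that differ only by case can be spotted mechanically. The sketch below is a hypothetical helper, not part of any patch: it assumes you have a list of every path name that appears in the branch history and groups the ones that collide on a case-insensitive file system. The "a/" path prefix in the example is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class CaseCollisionSketch {
  /** Groups paths by their lower-cased form and keeps only groups with more than one spelling. */
  static Map<String, List<String>> collisions(List<String> paths) {
    Map<String, List<String>> byFolded = new TreeMap<>();
    for (String p : paths) {
      byFolded.computeIfAbsent(p.toLowerCase(Locale.ROOT), k -> new ArrayList<>()).add(p);
    }
    // Drop names that have a single spelling; they are safe on any file system.
    byFolded.values().removeIf(group -> group.size() < 2);
    return byFolded;
  }

  public static void main(String[] args) {
    List<String> paths = Arrays.asList(
        "a/DiskbalancerException.java",
        "a/DiskBalancerException.java",
        "a/Other.java");
    System.out.println(collisions(paths));
  }
}
```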






[jira] [Commented] (HDFS-7648) Verify the datanode directory layout

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195908#comment-15195908
 ] 

Colin Patrick McCabe commented on HDFS-7648:


Hi [~rakesh_r], can you rebase the patch on trunk?

{code}
  LOG.warn("Block: " + blockId
  + " has to be upgraded to block ID-based layout");
{code}
Perhaps "Block XYZ is in the wrong directory" would be clearer?

+1 once these are addressed.

> Verify the datanode directory layout
> 
>
> Key: HDFS-7648
> URL: https://issues.apache.org/jira/browse/HDFS-7648
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Rakesh R
> Attachments: HDFS-7648-3.patch, HDFS-7648-4.patch, HDFS-7648-5.patch, 
> HDFS-7648.patch, HDFS-7648.patch
>
>
> HDFS-6482 changed datanode layout to use block ID to determine the directory 
> to store the block.  We should have some mechanism to verify it.  Either 
> DirectoryScanner or block report generation could do the check.
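
A minimal sketch of such a check follows. It assumes the block-ID-based layout derives two subdirectory levels from bits of the block ID; the shift amounts, the 0xff mask and the "subdirN/subdirM" naming are assumptions for illustration and should be checked against the DataNode code of the target version. The class and method names are hypothetical.

```java
class BlockDirSketch {
  /** Computes the directory a block is expected to live in, relative to the finalized dir (assumed scheme). */
  static String expectedSubdir(long blockId) {
    int d1 = (int) ((blockId >> 16) & 0xff); // first level, from bits 16..23 (assumption)
    int d2 = (int) ((blockId >> 8) & 0xff);  // second level, from bits 8..15 (assumption)
    return "subdir" + d1 + "/subdir" + d2;
  }

  /** Returns true if the block was found in the directory its ID maps to. */
  static boolean inCorrectDir(long blockId, String actualRelativeDir) {
    return expectedSubdir(blockId).equals(actualRelativeDir);
  }

  public static void main(String[] args) {
    long blockId = 0x123456L;
    System.out.println(expectedSubdir(blockId));
    System.out.println(inCorrectDir(blockId, "subdir0/subdir0"));
  }
}
```

A scanner that already walks the directories could call a check like inCorrectDir for each block file it finds and log a warning on a mismatch.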





[jira] [Updated] (HDFS-7648) Verify that HDFS blocks are in the correct datanode directories

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7648:
---
Summary: Verify that HDFS blocks are in the correct datanode directories  
(was: Verify the datanode directory layout)

> Verify that HDFS blocks are in the correct datanode directories
> ---
>
> Key: HDFS-7648
> URL: https://issues.apache.org/jira/browse/HDFS-7648
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Rakesh R
> Attachments: HDFS-7648-3.patch, HDFS-7648-4.patch, HDFS-7648-5.patch, 
> HDFS-7648.patch, HDFS-7648.patch
>
>
> HDFS-6482 changed datanode layout to use block ID to determine the directory 
> to store the block.  We should have some mechanism to verify it.  Either 
> DirectoryScanner or block report generation could do the check.





[jira] [Resolved] (HDFS-9955) DataNode won't self-heal after some block dirs were manually misplaced

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe resolved HDFS-9955.

Resolution: Duplicate

> DataNode won't self-heal after some block dirs were manually misplaced
> --
>
> Key: HDFS-9955
> URL: https://issues.apache.org/jira/browse/HDFS-9955
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.6.0
> Environment: CentOS 6, Cloudera 5.4.4 (patched Hadoop 2.6.0)
>Reporter: David Watzke
>  Labels: data-integrity
>
> I accidentally ran this tool on top of a DataNode's datadirs (the 
> datanode was shut down at the time): 
> https://github.com/killerwhile/volume-balancer
> The tool makes assumptions about block directory placement that are no longer 
> valid in hadoop 2.6.0 and it was just moving them around between different 
> datadirs to make the disk usage balanced. OK, it was not a good idea to run 
> it but my concern is the way the datanode was (not) handling the resulting 
> state. I've seen these messages in DN log (see below) which means DN knew 
> about this but didn't do anything to fix it (self-heal by copying the other 
> replica) - which seems like a bug to me. If you need any additional info 
> please just ask.
> {noformat}
> 2016-03-04 12:40:06,008 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
> block BP-680964103-A.B.C.D-1375882473930:blk_-3159875140074863904_0 on volume 
> /data/18/cdfs/dn
> 2016-03-04 12:40:06,009 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
> block BP-680964103-A.B.C.D-1375882473930:blk_8369468090548520777_0 on volume 
> /data/18/cdfs/dn
> 2016-03-04 12:40:06,011 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
> block BP-680964103-A.B.C.D-1375882473930:blk_1226431637_0 on volume 
> /data/18/cdfs/dn
> 2016-03-04 12:40:06,012 WARN 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: I/O error while finding 
> block BP-680964103-A.B.C.D-1375882473930:blk_1169332185_0 on volume 
> /data/18/cdfs/dn
> 2016-03-04 12:40:06,825 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> opReadBlock BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 
> received exception java.io.IOException: BlockId 1226781281 is not valid.
> 2016-03-04 12:40:06,825 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(X.Y.Z.30, 
> datanodeUuid=9da950ca-87ae-44ee-9391-0bca669c796b, infoPort=50075, 
> ipcPort=50020, 
> storageInfo=lv=-56;cid=cluster12;nsid=1625487778;c=1438754073236):Got 
> exception while serving 
> BP-680964103-A.B.C.D-1375882473930:blk_1226781281_1099829669050 to 
> /X.Y.Z.30:48146
> java.io.IOException: BlockId 1226781281 is not valid.
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:282)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:243)
> at java.lang.Thread.run(Thread.java:745)
> 2016-03-04 12:40:06,826 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: 
> prg04-002.xyz.tld:50010:DataXceiver error processing READ_BLOCK operation  
> src: /X.Y.Z.30:48146 dst: /X.Y.Z.30:50010
> java.io.IOException: BlockId 1226781281 is not valid.
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:650)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFile(FsDatasetImpl.java:641)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getMetaDataInputStream(FsDatasetImpl.java:214)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:282)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:529)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
> at 
> 

[jira] [Commented] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195795#comment-15195795
 ] 

Colin Patrick McCabe commented on HDFS-9951:


Thanks for working on this.  Can you put the string constants into 
{{PBImageXmlWriter.java}}?  {{OfflineImageReconstructor#Node}} is not a public 
class.
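
A minimal sketch of the suggestion above, assuming simplified stand-in types. {{XmlTagNames}}, {{SimpleNode}} and the {{CACHE_MANAGER_SECTION_PATH}} constant are hypothetical names used only for illustration; they are not the actual {{PBImageXmlWriter}} or {{OfflineImageReconstructor}} code. The idea is that the tag names live in one accessible class and both the writer and the reconstructor refer to them:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for XML tag names; in the proposal these constants
// would live in a public class such as PBImageXmlWriter.
final class XmlTagNames {
  static final String CACHE_MANAGER_SECTION_EXPIRATION = "expiration";
  static final String CACHE_MANAGER_SECTION_PATH = "path"; // illustrative only

  private XmlTagNames() {}
}

// Simplified stand-in for the reconstructor's non-public Node type.
final class SimpleNode {
  private final Map<String, SimpleNode> children = new HashMap<>();
  private final String value;

  SimpleNode(String value) {
    this.value = value;
  }

  void addChild(String key, SimpleNode child) {
    children.put(key, child);
  }

  // Removes and returns the child stored under key, or null if absent.
  SimpleNode removeChild(String key) {
    return children.remove(key);
  }

  String getValue() {
    return value;
  }
}

class XmlTagNamesDemo {
  public static void main(String[] args) {
    SimpleNode directive = new SimpleNode(null);
    directive.addChild(XmlTagNames.CACHE_MANAGER_SECTION_EXPIRATION,
        new SimpleNode("1000"));
    // The key is referenced by constant, not by a repeated string literal.
    SimpleNode expiration =
        directive.removeChild(XmlTagNames.CACHE_MANAGER_SECTION_EXPIRATION);
    System.out.println(expiration.getValue());
  }
}
```

With a single definition, a misspelled tag name becomes a compile error instead of a silent lookup miss.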

> Use string constants for XML tags in OfflineImageReconstructor
> --
>
> Key: HDFS-9951
> URL: https://issues.apache.org/jira/browse/HDFS-9951
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Minor
> Attachments: HDFS-9551.001.patch, HDFS-9551.002.patch
>
>
> In class {{OfflineImageReconstructor}}, many {{SectionProcessors}} are used to 
> process XML files and load the subtree of the XML into a Node structure. But 
> in many places the code removes a key by writing the string value directly 
> in the method rather than defining a constant first. Like this:
> {code}
> Node expiration = directive.removeChild("expiration");
> {code}
> We could improve this by defining them in Node and then invoking them like this:
> {code}
> Node expiration=directive.removeChild(Node.CACHE_MANAGER_SECTION_EXPIRATION);
> {code}
> This will also make it easier to manage the node key names in the future.





[jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-15 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195791#comment-15195791
 ] 

Arpit Agarwal commented on HDFS-3702:
-

Thanks [~stack], [~eddyxu].

It would be great if we can avoid one-off {{createFile}} parameters. What do 
you think of per-target block placement policies as [proposed in this 
comment|https://issues.apache.org/jira/browse/HDFS-3702?focusedCommentId=13420775=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13420775], 
e.g. setting a custom placement policy for /hbase/.logs/? The implementation will 
be easier now that we have extended attributes.

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.





[jira] [Comment Edited] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195782#comment-15195782
 ] 

Colin Patrick McCabe edited comment on HDFS-9668 at 3/15/16 5:48 PM:
-

Hi [~jingcheng...@intel.com],

Thanks again for your comments.  I agree that consistency is always a headache. 
 However, we are already "inconsistent" in a bunch of cases.  For example, 
{{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and 
block ID.  But since it drops the lock when it returns, the genstamp may change 
in between the call to {{getStoredBlock}} and the actual usage of that 
information.

bq. But still it is hard to remove the locks in createRbw, etc where the 
long-time blocking occur. I think this is what we have to tackle in the future.

For {{createRbw}}, it seems like we could:
1. add the entry to the volumeMap
2. drop the lock and attempt to create the block file on-disk
3. if the creation failed, take back the lock and remove the entry from the 
volumeMap

Step #1 would ensure that if another thread attempted to create the same RBW 
replica, it would fail.
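
The three steps above can be sketched as follows. This is only an illustration under simplifying assumptions: {{ReplicaRegistry}} is a hypothetical stand-in, its {{volumeMap}} is a plain map from block ID to file, and the method is not the real {{FsDatasetImpl#createRbw}}.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the dataset; not the real FsDatasetImpl.
class ReplicaRegistry {
  private final Object lock = new Object();
  private final Map<Long, File> volumeMap = new HashMap<>();

  File createRbw(long blockId, File dir) throws IOException {
    File blockFile = new File(dir, "blk_" + blockId);

    // Step 1: reserve the entry while holding the lock, so a concurrent
    // attempt to create the same replica fails here.
    synchronized (lock) {
      if (volumeMap.containsKey(blockId)) {
        throw new IOException("Replica " + blockId + " already exists");
      }
      volumeMap.put(blockId, blockFile);
    }

    // Step 2: do the slow filesystem I/O without holding the lock.
    boolean created = false;
    try {
      created = blockFile.createNewFile();
      if (!created) {
        throw new IOException("File already exists on disk: " + blockFile);
      }
      return blockFile;
    } finally {
      // Step 3: if creation failed, retake the lock and roll back the entry.
      if (!created) {
        synchronized (lock) {
          volumeMap.remove(blockId);
        }
      }
    }
  }

  boolean contains(long blockId) {
    synchronized (lock) {
      return volumeMap.containsKey(blockId);
    }
  }
}
```

Other threads can see the reserved entry before the file exists on disk, so a real implementation would have to make sure readers tolerate that window.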

bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to 
support fairness?

Currently, read operations have no special advantage over write operations.  
Using a reader/writer lock changes that.  It's easy to come up with a workload 
where read requests come in often enough so that there is no time at all for 
write requests.  This is especially true since we are doing filesystem I/O 
while holding the reader lock.  We have observed Java Reader/Writer locks to 
starve writers in practice.  That's why there is an option for the FSNamesystem 
lock to be fair.

Hmm.  I wonder if, as a first step, we could try moving all the filesystem I/O 
that we can outside the lock?  That would provide a huge performance boost just 
by itself.  And it would make it much easier to have a reader/writer lock later 
if required.


was (Author: cmccabe):
Hi [~jingcheng...@intel.com],

Thanks again for your comments.  I agree that consistency is always a headache. 
 However, we are already "inconsistent" in a bunch of cases.  For example, 
{{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and 
block ID.  But since it drops the lock when it returns, the genstamp may change 
in between the call to {{getStoredBlock}} and the actual usage of that 
information.

bq. But still it is hard to remove the locks in createRbw, etc where the 
long-time blocking occur. I think this is what we have to tackle in the future.

For {{createRbw}}, it seems like we could:
1. add the entry to the volumeMap
2. drop the lock and attempt to create the block file on-disk
3. if the creation failed, take back the lock and remove the entry from the 
volumeMap

Step #1 would ensure that if another thread attempted to create the same RBW 
replica, it would fail.

bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to 
support fairness?

Currently, read operations have no special advantage over write operations.  
Using a reader/writer lock changes that.  It's easy, even trivial, to come up 
with a workload where read requests come in often enough so that there is no 
time at all for write requests.  This is especially true since we are doing 
filesystem I/O while holding the reader lock.  We have observed Java 
Reader/Writer locks to starve writers in practice.  That's why there is an 
option for the FSNamesystem lock to be fair.

Hmm.  I wonder if, as a first step, we could try moving all the filesystem I/O 
that we can outside the lock?  That would provide a huge performance boost just 
by itself.  And it would make it much easier to have a reader/writer lock later 
if required.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> 

[jira] [Commented] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195782#comment-15195782
 ] 

Colin Patrick McCabe commented on HDFS-9668:


Hi [~jingcheng...@intel.com],

Thanks again for your comments.  I agree that consistency is always a headache. 
 However, we are already "inconsistent" in a bunch of cases.  For example, 
{{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and 
block ID.  But since it drops the lock when it returns, the genstamp may change 
in between the call to {{getStoredBlock}} and the actual usage of that 
information.

bq. But still it is hard to remove the locks in createRbw, etc where the 
long-time blocking occur. I think this is what we have to tackle in the future.

For {{createRbw}}, it seems like we could:
1. add the entry to the volumeMap
2. drop the lock and attempt to create the block file on-disk
3. if the creation failed, take back the lock and remove the entry from the 
volumeMap

Step #1 would ensure that if another thread attempted to create the same RBW 
replica, it would fail.

bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to 
support fairness?

Currently, both read operations have no special advantage over write 
operations.  Using a reader/writer lock changes that.  It's easy, even trivial, 
to come up with a workload where read requests come in often enough so that 
there is no time at all for write requests.  This is especially true since we 
are doing filesystem I/O while holding the reader lock.  We have observed Java 
Reader/Writer locks to starve writers in practice.  That's why there is an 
option for the FSNamesystem lock to be fair.
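
For reference, a small sketch of what requesting fairness looks like with the JDK's {{ReentrantReadWriteLock}}. The {{FairLockedCounter}} class is a hypothetical example, not code from the DataNode or from FSNamesystem:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example class guarded by a fair reader/writer lock.
class FairLockedCounter {
  // Passing true selects the fair ordering policy; the no-argument
  // constructor creates a non-fair lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private long value;

  long read() {
    lock.readLock().lock();
    try {
      return value;
    } finally {
      lock.readLock().unlock();
    }
  }

  void increment() {
    lock.writeLock().lock();
    try {
      value++;
    } finally {
      lock.writeLock().unlock();
    }
  }

  boolean isFair() {
    return lock.isFair();
  }
}
```

In fair mode the lock favors the longest-waiting thread, so a waiting writer is not bypassed indefinitely by newly arriving readers. The cost is usually lower throughput than the non-fair default.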

Hmm.  I wonder if, as a first step, we could try moving all the filesystem I/O 
that we can outside the lock?  That would provide a huge performance boost just 
by itself.  And it would make it much easier to have a reader/writer lock later 
if required.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl) owned by 
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" t@93335
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:113)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
>   at java.lang.Thread.run(Thread.java:745)
>Locked ownable synchronizers:
>   - None
>   
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48520 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779271_40851]" - Thread 
> t@93335
>java.lang.Thread.State: RUNNABLE
>   at java.io.UnixFileSystem.createFileExclusively(Native Method)
>   at java.io.File.createNewFile(File.java:1012)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:66)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:271)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:286)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1140)
>   - locked <18324c9> (a 
> 

[jira] [Comment Edited] (HDFS-9668) Optimize the locking in FsDatasetImpl

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195782#comment-15195782
 ] 

Colin Patrick McCabe edited comment on HDFS-9668 at 3/15/16 5:47 PM:
-

Hi [~jingcheng...@intel.com],

Thanks again for your comments.  I agree that consistency is always a headache. 
 However, we are already "inconsistent" in a bunch of cases.  For example, 
{{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and 
block ID.  But since it drops the lock when it returns, the genstamp may change 
in between the call to {{getStoredBlock}} and the actual usage of that 
information.

bq. But still it is hard to remove the locks in createRbw, etc where the 
long-time blocking occur. I think this is what we have to tackle in the future.

For {{createRbw}}, it seems like we could:
1. add the entry to the volumeMap
2. drop the lock and attempt to create the block file on-disk
3. if the creation failed, take back the lock and remove the entry from the 
volumeMap

Step #1 would ensure that if another thread attempted to create the same RBW 
replica, it would fail.

bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to 
support fairness?

Currently, read operations have no special advantage over write operations.  
Using a reader/writer lock changes that.  It's easy, even trivial, to come up 
with a workload where read requests come in often enough so that there is no 
time at all for write requests.  This is especially true since we are doing 
filesystem I/O while holding the reader lock.  We have observed Java 
Reader/Writer locks to starve writers in practice.  That's why there is an 
option for the FSNamesystem lock to be fair.

Hmm.  I wonder if, as a first step, we could try moving all the filesystem I/O 
that we can outside the lock?  That would provide a huge performance boost just 
by itself.  And it would make it much easier to have a reader/writer lock later 
if required.


was (Author: cmccabe):
Hi [~jingcheng...@intel.com],

Thanks again for your comments.  I agree that consistency is always a headache. 
 However, we are already "inconsistent" in a bunch of cases.  For example, 
{{FsDatasetSpi#getStoredBlock}} returns a Block structure with a genstamp and 
block ID.  But since it drops the lock when it returns, the genstamp may change 
in between the call to {{getStoredBlock}} and the actual usage of that 
information.

bq. But still it is hard to remove the locks in createRbw, etc where the 
long-time blocking occur. I think this is what we have to tackle in the future.

For {{createRbw}}, it seems like we could:
1. add the entry to the volumeMap
2. drop the lock and attempt to create the block file on-disk
3. if the creation failed, take back the lock and remove the entry from the 
volumeMap

Step #1 would ensure that if another thread attempted to create the same RBW 
replica, it would fail.

bq. But "synchronized" doesn't guarantee fairness, is it fair to ask lock to 
support fairness?

Currently, both read operations have no special advantage over write 
operations.  Using a reader/writer lock changes that.  It's easy, even trivial, 
to come up with a workload where read requests come in often enough so that 
there is no time at all for write requests.  This is especially true since we 
are doing filesystem I/O while holding the reader lock.  We have observed Java 
Reader/Writer locks to starve writers in practice.  That's why there is an 
option for the FSNamesystem lock to be fair.

Hmm.  I wonder if, as a first step, we could try moving all the filesystem I/O 
that we can outside the lock?  That would provide a huge performance boost just 
by itself.  And it would make it much easier to have a reader/writer lock later 
if required.

> Optimize the locking in FsDatasetImpl
> -
>
> Key: HDFS-9668
> URL: https://issues.apache.org/jira/browse/HDFS-9668
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Attachments: HDFS-9668-1.patch, HDFS-9668-2.patch, execution_time.png
>
>
> During the HBase test on a tiered storage of HDFS (WAL is stored in 
> SSD/RAMDISK, and all other files are stored in HDD), we observe many 
> long-time BLOCKED threads on FsDatasetImpl in DataNode. The following is part 
> of the jstack result:
> {noformat}
> "DataXceiver for client DFSClient_NONMAPREDUCE_-1626037897_1 at 
> /192.168.50.16:48521 [Receiving block 
> BP-1042877462-192.168.50.13-1446173170517:blk_1073779272_40852]" - Thread 
> t@93336
>java.lang.Thread.State: BLOCKED
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:)
>   - waiting to lock <18324c9> (a 
> 

[jira] [Commented] (HDFS-9579) Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level

2016-03-15 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195706#comment-15195706
 ] 

Ming Ma commented on HDFS-9579:
---

Thanks [~liuml07]! Any comments from others? 

> Provide bytes-read-by-network-distance metrics at FileSystem.Statistics level
> -
>
> Key: HDFS-9579
> URL: https://issues.apache.org/jira/browse/HDFS-9579
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9579-2.patch, HDFS-9579-3.patch, HDFS-9579-4.patch, 
> HDFS-9579-5.patch, HDFS-9579-6.patch, HDFS-9579-7.patch, HDFS-9579-8.patch, 
> HDFS-9579-9.patch, HDFS-9579.patch, MR job counters.png
>
>
> For cross DC distcp or other applications, it becomes useful to have insight 
> as to the traffic volume for each network distance to distinguish cross-DC 
> traffic, local-DC-remote-rack, etc.
> FileSystem's existing {{bytesRead}} metrics tracks all the bytes read. To 
> provide additional metrics for each network distance, we can add additional 
> metrics to FileSystem level and have {{DFSInputStream}} update the value 
> based on the network distance between client and the datanode.
> {{DFSClient}} will resolve client machine's network location as part of its 
> initialization. It doesn't need to resolve datanode's network location for 
> each read as {{DatanodeInfo}} already has the info.
> There are existing HDFS specific metrics such as {{ReadStatistics}} and 
> {{DFSHedgedReadMetrics}}. But these metrics are only accessible via 
> {{DFSClient}} or {{DFSInputStream}}, which application frameworks 
> such as MR and Tez cannot get to. That is the benefit of storing these new 
> metrics in FileSystem.Statistics.
> This jira only includes metrics generation by HDFS. The consumption of these 
> metrics at MR and Tez will be tracked by separate jiras.
> We can add similar metrics for HDFS write scenario later if it is necessary.
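
A rough sketch of the bookkeeping described above, assuming the network distance is available as a small integer per read. {{DistanceReadStats}} and its method names are hypothetical and are not the {{FileSystem.Statistics}} API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-distance read counters; not the FileSystem.Statistics API.
class DistanceReadStats {
  private final LongAdder bytesRead = new LongAdder();
  private final Map<Integer, LongAdder> bytesByDistance =
      new ConcurrentHashMap<>();

  // Called by the input stream after each read, with the distance between
  // the client and the datanode that served the bytes.
  void incrementBytesRead(int distance, long bytes) {
    bytesRead.add(bytes);
    bytesByDistance.computeIfAbsent(distance, d -> new LongAdder()).add(bytes);
  }

  long getBytesRead() {
    return bytesRead.sum();
  }

  long getBytesReadByDistance(int distance) {
    LongAdder counter = bytesByDistance.get(distance);
    return counter == null ? 0L : counter.sum();
  }
}
```

The total stays consistent with the existing {{bytesRead}} notion, and the per-distance values break that total down.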





[jira] [Commented] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195689#comment-15195689
 ] 

Colin Patrick McCabe commented on HDFS-9895:


I think this JIRA was misleadingly titled.  The patch is about removing the 
reference to the {{Configuration}} object inside {{DataNode.java}}, since we 
have a reference to the exact same configuration object in the base class.  It 
doesn't change which aspects of the configuration we cache.

I don't think this affects the thread-safety of anything, or the 
reconfiguration logic.  Just like before the patch, {{DataNode.java}} is still 
playing with a reference to a thread-safe (but mutable) Configuration.  
Reconfiguration still happens by means of the {{ReconfigurationThread}} 
invoking the {{reconfigureProperty}} method.
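
To illustrate the point, here is a sketch with hypothetical stand-in classes ({{Settings}}, {{ConfiguredBase}}, {{SketchDataNode}}); none of them are the real Hadoop classes. The subclass keeps no field of its own and always goes through the reference held by the base class, so a reconfiguration is visible through either path:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a thread-safe but mutable configuration object.
class Settings {
  private final Map<String, String> props = new ConcurrentHashMap<>();

  void set(String key, String value) {
    props.put(key, value);
  }

  String get(String key) {
    return props.get(key);
  }
}

// Stand-in for a base class that already holds the configuration reference.
class ConfiguredBase {
  private final Settings conf;

  ConfiguredBase(Settings conf) {
    this.conf = conf;
  }

  Settings getConf() {
    return conf;
  }
}

// The subclass no longer keeps a second field pointing at the same object.
class SketchDataNode extends ConfiguredBase {
  SketchDataNode(Settings conf) {
    super(conf);
  }

  // Mutates the shared object in place; no instance is swapped.
  void reconfigureProperty(String key, String value) {
    getConf().set(key, value);
  }

  String heartbeatInterval() {
    return getConf().get("heartbeat.interval");
  }
}
```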

> Remove DataNode#conf because there is already a copy in the base class
> --
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, DataNode#conf should be removed for the 
> purpose of brevity.





[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a reference to it in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9895:
---
Summary: Remove DataNode#conf because there is already a reference to it in 
the base class  (was: Remove DataNode#conf because there is already a copy in 
the base class)

> Remove DataNode#conf because there is already a reference to it in the base 
> class
> -
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, DataNode#conf should be removed for the 
> purpose of brevity.





[jira] [Updated] (HDFS-9931) Remove NameNode#conf because there is already a reference to it in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9931:
---
Summary: Remove NameNode#conf because there is already a reference to it in 
the base class  (was: Remove all cached configuration from NameNode)

> Remove NameNode#conf because there is already a reference to it in the base 
> class
> -
>
> Key: HDFS-9931
> URL: https://issues.apache.org/jira/browse/HDFS-9931
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
>
> Since NameNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, all cached configurations in NameNode 
> should be removed for brevity and consistency purpose.





[jira] [Comment Edited] (HDFS-9895) Remove DataNode#conf because there is already a reference to it in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195689#comment-15195689
 ] 

Colin Patrick McCabe edited comment on HDFS-9895 at 3/15/16 5:13 PM:
-

I think this JIRA was misleadingly titled.  The patch is about removing the 
reference to the {{Configuration}} object inside {{DataNode.java}}, since we 
have a reference to the exact same configuration object in the base class.  It 
doesn't change which aspects of the configuration we cache.

I don't think this affects the thread-safety of anything, or the 
reconfiguration logic.  Just like before the patch, {{DataNode.java}} is still 
playing with a reference to a thread-safe (but mutable) Configuration.  
Reconfiguration still happens by means of the {{ReconfigurationThread}} 
invoking the {{reconfigureProperty}} method.  There is no case where we "swap a 
configuration instance"; the {{Configured}} base class doesn't support 
swapping in a new object anyway.


was (Author: cmccabe):
I think this JIRA was misleadingly titled.  The patch is about removing the 
reference to the {{Configuration}} object inside {{DataNode.java}}, since we 
have a reference to the exact same configuration object in the base class.  It 
doesn't change which aspects of the configuration we cache.

I don't think this affects the thread-safety of anything, or the 
reconfiguration logic.  Just like before the patch, {{DataNode.java}} is still 
playing with a reference to a thread-safe (but mutable) Configuration.  
Reconfiguration still happens by means of the {{ReconfigurationThread}} 
invoking the {{reconfigureProperty}} method.

> Remove DataNode#conf because there is already a reference to it in the base 
> class
> -
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, DataNode#conf should be removed for the 
> purpose of brevity.





[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9895:
---
Description: Since DataNode inherits ReconfigurableBase with Configured as 
base class where configuration is maintained, DataNode#conf should be removed 
for the purpose of brevity.  (was: Since DataNode inherits ReconfigurableBase 
with Configured as base class where configuration is maintained, all cached 
configurations in DataNode should be removed for brevity and consistency 
purpose.)

> Remove DataNode#conf because there is already a copy in the base class
> --
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, DataNode#conf should be removed for the 
> purpose of brevity.





[jira] [Updated] (HDFS-9895) Remove DataNode#conf because there is already a copy in the base class

2016-03-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-9895:
---
Summary: Remove DataNode#conf because there is already a copy in the base 
class  (was: Remove all cached configuration from DataNode)

> Remove DataNode#conf because there is already a copy in the base class
> --
>
> Key: HDFS-9895
> URL: https://issues.apache.org/jira/browse/HDFS-9895
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HDFS-9895.000.patch
>
>
> Since DataNode inherits ReconfigurableBase with Configured as base class 
> where configuration is maintained, all cached configurations in DataNode 
> should be removed for brevity and consistency purpose.





[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units

2016-03-15 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195658#comment-15195658
 ] 

Arpit Agarwal commented on HDFS-9847:
-

Thank you [~linyiqun]. I don't think we need getLongTimeSeconds and 
getLongTimeMillis either. Callers can just use {{getTimeDuration}}. I also 
suggest adding a getTimeDuration overload that accepts defaultValue as String 
so defaults can be defined with units.

e.g. 
{code}
// Seconds for backwards compatibility.
conf.getTimeDuration(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
    DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT, TimeUnit.SECONDS);
{code}

where {{DFS_HEARTBEAT_INTERVAL_DEFAULT = "3s"}}.
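
To make the intended behaviour concrete, here is a standalone sketch of parsing a value with an optional unit suffix and falling back to a default unit when none is given. {{DurationParser}} is a hypothetical helper written for this illustration and supports only a few suffixes; it is not the implementation behind {{Configuration#getTimeDuration}}.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; illustrates the idea only.
final class DurationParser {
  private DurationParser() {}

  // Parses values such as "3", "3s", "5m", "2h", "1d" or "1500ms".
  // A bare number is interpreted in defaultUnit, which keeps existing
  // unit-less settings working. The result is converted to target.
  static long parse(String raw, TimeUnit defaultUnit, TimeUnit target) {
    String s = raw.trim().toLowerCase(Locale.ROOT);
    TimeUnit unit = defaultUnit;
    String digits = s;
    if (s.endsWith("ms")) { // must be checked before "s" and "m"
      unit = TimeUnit.MILLISECONDS;
      digits = s.substring(0, s.length() - 2);
    } else if (s.endsWith("s")) {
      unit = TimeUnit.SECONDS;
      digits = s.substring(0, s.length() - 1);
    } else if (s.endsWith("m")) {
      unit = TimeUnit.MINUTES;
      digits = s.substring(0, s.length() - 1);
    } else if (s.endsWith("h")) {
      unit = TimeUnit.HOURS;
      digits = s.substring(0, s.length() - 1);
    } else if (s.endsWith("d")) {
      unit = TimeUnit.DAYS;
      digits = s.substring(0, s.length() - 1);
    }
    return target.convert(Long.parseLong(digits.trim()), unit);
  }
}

class DurationParserDemo {
  public static void main(String[] args) {
    // Both the old unit-less value and a suffixed default resolve to seconds.
    System.out.println(
        DurationParser.parse("3", TimeUnit.SECONDS, TimeUnit.SECONDS));
    System.out.println(
        DurationParser.parse("3s", TimeUnit.SECONDS, TimeUnit.SECONDS));
  }
}
```

With a String default such as "3s", the default carries its own unit, while the {{TimeUnit}} argument still decides how a bare number in an existing config file is read.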


> HDFS configuration without time unit name should accept friendly time units
> ---
>
> Key: HDFS-9847
> URL: https://issues.apache.org/jira/browse/HDFS-9847
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-9847.001.patch, HDFS-9847.002.patch, 
> HDFS-9847.003.patch, timeduration-w-y.patch
>
>
> HDFS-9821 discusses letting existing keys use friendly 
> units, e.g. 60s, 5m, 1d, 6w etc. But some configuration key names 
> contain a time unit name, like {{dfs.blockreport.intervalMsec}}, so we can make 
> the other configurations, whose names have no time unit, accept friendly 
> time units. The time unit {{seconds}} is frequently used in HDFS. We can 
> update these configurations first.





[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195578#comment-15195578
 ] 

Hudson commented on HDFS-9904:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9464 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9464/])
HDFS-9904. testCheckpointCancellationDuringUpload occasionally fails. (kihwal: 
rev d4574017845cfa7521e703f80efd404afd09b8c4)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java


> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did checkpoint on its own and bumped up 
> the txid, right before the check was performed. 





[jira] [Resolved] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee resolved HDFS-9904.
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3

> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did checkpoint on its own and bumped up 
> the txid, right before the check was performed. 





[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195554#comment-15195554
 ] 

Kihwal Lee commented on HDFS-9904:
--

I've committed this to trunk through branch-2.7. Thanks for working on this Lin 
Yiqun.

> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did checkpoint on its own and bumped up 
> the txid, right before the check was performed. 





[jira] [Updated] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9904:
-
Assignee: Lin Yiqun

> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
>Assignee: Lin Yiqun
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did checkpoint on its own and bumped up 
> the txid, right before the check was performed. 





[jira] [Commented] (HDFS-9904) testCheckpointCancellationDuringUpload occasionally fails

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195531#comment-15195531
 ] 

Kihwal Lee commented on HDFS-9904:
--

+1 I've verified that the config is only set for the specific test case.

> testCheckpointCancellationDuringUpload occasionally fails 
> --
>
> Key: HDFS-9904
> URL: https://issues.apache.org/jira/browse/HDFS-9904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.3
>Reporter: Kihwal Lee
> Attachments: HDFS-9904.001.patch, HDFS-9904.002.patch
>
>
> The failure was at the end of the test case where the txid of the standby 
> (former active) is checked. Since the checkpoint/uploading was canceled, it 
> is not supposed to have the new checkpoint. Looking at the test log, that was 
> still the case, but the standby then did checkpoint on its own and bumped up 
> the txid, right before the check was performed. 





[jira] [Commented] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195522#comment-15195522
 ] 

Hadoop QA commented on HDFS-9694:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 51s | trunk passed |
| +1 | compile | 1m 18s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 1m 21s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 26s | trunk passed |
| +1 | mvneclipse | 0m 25s | trunk passed |
| +1 | findbugs | 3m 35s | trunk passed |
| +1 | javadoc | 1m 23s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 2m 10s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 17s | the patch passed |
| +1 | compile | 1m 18s | the patch passed with JDK v1.8.0_74 |
| +1 | cc | 1m 18s | the patch passed |
| -1 | javac | 2m 44s | hadoop-hdfs-project-jdk1.8.0_74 with JDK v1.8.0_74 generated 1 new + 48 unchanged - 1 fixed = 49 total (was 49) |
| +1 | javac | 1m 18s | the patch passed |
| +1 | compile | 1m 19s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 1m 19s | the patch passed |
| -1 | javac | 4m 3s | hadoop-hdfs-project-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 50 unchanged - 1 fixed = 51 total (was 51) |
| +1 | javac | 1m 19s | the patch passed |
| +1 | checkstyle | 0m 27s | the patch passed |
| +1 | mvnsite | 1m 19s | the patch passed |
| +1 | mvneclipse | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | findbugs | 2m 0s | hadoop-hdfs-project/hadoop-hdfs-client generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) |
| +1 | javadoc | 1m 25s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 2m 6s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 51s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 58m 

[jira] [Commented] (HDFS-9847) HDFS configuration without time unit name should accept friendly time units

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195406#comment-15195406
 ] 

Hadoop QA commented on HDFS-9847:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 49s | trunk passed |
| +1 | compile | 6m 3s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 6m 44s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 15s | trunk passed |
| +1 | mvnsite | 2m 21s | trunk passed |
| +1 | mvneclipse | 0m 40s | trunk passed |
| +1 | findbugs | 5m 8s | trunk passed |
| +1 | javadoc | 2m 19s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 15s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 57s | the patch passed |
| +1 | compile | 5m 57s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 5m 57s | the patch passed |
| +1 | compile | 6m 44s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 6m 44s | the patch passed |
| -1 | checkstyle | 1m 16s | root: patch generated 5 new + 1014 unchanged - 1 fixed = 1019 total (was 1015) |
| +1 | mvnsite | 2m 17s | the patch passed |
| +1 | mvneclipse | 0m 40s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 5m 47s | the patch passed |
| +1 | javadoc | 2m 13s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 3m 11s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 6m 27s | hadoop-common in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 0m 50s | hadoop-hdfs-client in the patch passed with JDK v1.8.0_74. |
| -1 | unit | 57m 3s | hadoop-hdfs in the patch failed with JDK v1.8.0_74. |
| +1 | unit | 7m 13s | hadoop-common in the patch passed with JDK v1.7.0_95. |
| 

[jira] [Commented] (HDFS-9945) Datanode command for evicting writers

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195386#comment-15195386
 ] 

Kihwal Lee commented on HDFS-9945:
--

HDFS-2043 TestHFlush
HDFS-9780 TestRollingFileSystemSinkWithSecureHdfs
HDFS-9950 TestDecommissioningStatus
HDFS-10169 TestEditLog 
HDFS-9767 TestFileAppend
HDFS-6532 TestCrcCorruption
I will work on some of these.

The two checkstyle warnings are about the existing method length being over 150 
lines.

> Datanode command for evicting writers
> -
>
> Key: HDFS-9945
> URL: https://issues.apache.org/jira/browse/HDFS-9945
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-9945.patch, HDFS-9945.v2.patch
>
>
> It will be useful if there is a command to evict writers from a datanode. 
> When a set of datanodes are being decommissioned, they can get blocked by 
> slow writers at the end.  It was rare in the old days since mapred jobs 
> didn't last too long, but with many different types of apps running on 
> today's YARN cluster, we often see a very long tail in datanode 
> decommissioning.
> I propose a new dfsadmin command, {{evictWriters}}, to be added. I initially 
> thought about having the namenode automatically tell datanodes on 
> decommissioning, but realized that having a command is more flexible. E.g. 
> users can choose not to do this at all, choose when to evict writers, or 
> whether to try multiple times for whatever reasons.





[jira] [Commented] (HDFS-6532) Intermittent test failure org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195362#comment-15195362
 ] 

Kihwal Lee commented on HDFS-6532:
--

Still happening.

{noformat}
testCorruptionDuringWrt(org.apache.hadoop.hdfs.TestCrcCorruption)  Time elapsed: 50.284 sec  <<< ERROR!
java.lang.Exception: test timed out after 5 milliseconds
    at java.lang.Object.wait(Native Method)
    at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:764)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:689)
    at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:770)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:747)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
    at org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt(TestCrcCorruption.java:136)
{noformat}

> Intermittent test failure 
> org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt
> --
>
> Key: HDFS-6532
> URL: https://issues.apache.org/jira/browse/HDFS-6532
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs-client
>Affects Versions: 2.4.0
>Reporter: Yongjun Zhang
>
> Per https://builds.apache.org/job/Hadoop-Hdfs-trunk/1774/testReport, we had 
> the following failure. Local rerun is successful
> {code}
> Regression
> org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt
> Failing for the past 1 build (Since Failed#1774 )
> Took 50 sec.
> Error Message
> test timed out after 5 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 5 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2024)
>   at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:2008)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2107)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70)
>   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:98)
>   at org.apache.hadoop.hdfs.TestCrcCorruption.testCorruptionDuringWrt(TestCrcCorruption.java:133)
> {code}
> See relevant exceptions in log
> {code}
> 2014-06-14 11:56:15,283 WARN  datanode.DataNode (BlockReceiver.java:verifyChunks(404)) - Checksum error in block BP-1675558312-67.195.138.30-1402746971712:blk_1073741825_1001 from /127.0.0.1:41708
> org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_-1139495951_8 at 64512 exp: 1379611785 got: -12163112
>   at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:353)
>   at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:284)
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:402)
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:537)
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:734)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:741)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:234)
>   at java.lang.Thread.run(Thread.java:662)
> 2014-06-14 11:56:15,285 WARN  datanode.DataNode (BlockReceiver.java:run(1207)) - IOException in BlockReceiver.run(): 
> java.io.IOException: Shutting down writer and responder due to a checksum error in received data. The error response has been sent upstream.
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1352)
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1278)
>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1199)
>   at java.lang.Thread.run(Thread.java:662)
> ...
> {code}





[jira] [Created] (HDFS-10169) TestEditLog.testBatchedSyncWithClosedLogs sometimes fails.

2016-03-15 Thread Kihwal Lee (JIRA)

Kihwal Lee created HDFS-10169:
-

 Summary: TestEditLog.testBatchedSyncWithClosedLogs sometimes fails.
 Key: HDFS-10169
 URL: https://issues.apache.org/jira/browse/HDFS-10169
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Kihwal Lee


This failure has been seen in multiple precommit builds recently.
{noformat}
testBatchedSyncWithClosedLogs[1](org.apache.hadoop.hdfs.server.namenode.TestEditLog)  Time elapsed: 0.377 sec  <<< FAILURE!
java.lang.AssertionError: logging edit without syncing should do not affect txid expected:<1> but was:<2>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testBatchedSyncWithClosedLogs(TestEditLog.java:594)
{noformat}





[jira] [Commented] (HDFS-2043) TestHFlush failing intermittently

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195338#comment-15195338
 ] 

Kihwal Lee commented on HDFS-2043:
--

This seems to be an actual race in the code.

> TestHFlush failing intermittently
> -
>
> Key: HDFS-2043
> URL: https://issues.apache.org/jira/browse/HDFS-2043
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Aaron T. Myers
>
> I can't reproduce this failure reliably, but it seems like TestHFlush has 
> been failing intermittently, with the frequency increasing of late.
> Note the following two pre-commit test runs from different JIRAs where 
> TestHFlush seems to have failed spuriously:
> https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/





[jira] [Commented] (HDFS-2043) TestHFlush failing intermittently

2016-03-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195335#comment-15195335
 ] 

Kihwal Lee commented on HDFS-2043:
--

This is how it fails nowadays in precommit.
{noformat}
testHFlushInterrupted(org.apache.hadoop.hdfs.TestHFlush)  Time elapsed: 2.259 sec  <<< ERROR!
java.nio.channels.ClosedByInterruptException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:501)
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at java.io.DataOutputStream.flush(DataOutputStream.java:123)
    at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:653)
{noformat}
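
For context on the exception at the top of this trace, the JDK behaviour can be reproduced in isolation. The sketch below is not HDFS code and says nothing about where the race in DataStreamer actually is; it only shows that a write on an interruptible NIO channel, made while the calling thread's interrupt status is set, closes the channel and fails with ClosedByInterruptException. The class name InterruptedWriteSketch and the use of a Pipe instead of a socket are choices made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

class InterruptedWriteSketch {

  /**
   * Writes to an interruptible channel while the calling thread's interrupt
   * status is set. Returns true if the write failed with
   * ClosedByInterruptException and the channel was closed as a side effect.
   */
  static boolean writeWhileInterrupted() {
    try {
      Pipe pipe = Pipe.open();
      try {
        // Set the interrupt status before the blocking write is attempted.
        Thread.currentThread().interrupt();
        pipe.sink().write(ByteBuffer.wrap(new byte[] {1}));
        return false; // the write went through, so nothing was interrupted
      } catch (ClosedByInterruptException e) {
        // The channel is closed by the JDK before the exception is thrown.
        return !pipe.sink().isOpen();
      } finally {
        // Clear the interrupt status so later code is unaffected, then clean up.
        Thread.interrupted();
        pipe.source().close();
        pipe.sink().close();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("interrupted write closed the channel: " + writeWhileInterrupted());
  }
}
```

Because the channel is closed as part of the interrupt handling, any later write on the same channel fails as well, so an interrupt that lands on the thread doing the write cannot simply be retried on the same stream.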



> TestHFlush failing intermittently
> -
>
> Key: HDFS-2043
> URL: https://issues.apache.org/jira/browse/HDFS-2043
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Aaron T. Myers
>
> I can't reproduce this failure reliably, but it seems like TestHFlush has 
> been failing intermittently, with the frequency increasing of late.
> Note the following two pre-commit test runs from different JIRAs where 
> TestHFlush seems to have failed spuriously:
> https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/





[jira] [Commented] (HDFS-9961) Ozone: Add buckets commands to CLI

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195302#comment-15195302
 ] 

Hadoop QA commented on HDFS-9961:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 32s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 11m 44s | HDFS-7240 passed |
| +1 | compile | 1m 38s | HDFS-7240 passed with JDK v1.8.0_74 |
| +1 | compile | 1m 15s | HDFS-7240 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 36s | HDFS-7240 passed |
| +1 | mvnsite | 1m 31s | HDFS-7240 passed |
| +1 | mvneclipse | 0m 24s | HDFS-7240 passed |
| +1 | findbugs | 3m 8s | HDFS-7240 passed |
| +1 | javadoc | 2m 45s | HDFS-7240 passed with JDK v1.8.0_74 |
| +1 | javadoc | 4m 38s | HDFS-7240 passed with JDK v1.7.0_95 |
| +1 | mvninstall | 1m 22s | the patch passed |
| +1 | compile | 1m 36s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 1m 36s | the patch passed |
| +1 | compile | 1m 11s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 1m 11s | the patch passed |
| -1 | checkstyle | 0m 29s | hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | mvnsite | 1m 25s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 26s | the patch passed |
| +1 | javadoc | 2m 36s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 4m 25s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 114m 41s | hadoop-hdfs in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 92m 42s | hadoop-hdfs in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 24s | Patch does not generate ASF License warnings. |
|   |   | 256m 29s |   |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestLocalDFS |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | 

[jira] [Updated] (HDFS-9928) Make HDFS commands guide up to date

2016-03-15 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9928:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: 2.9.0
Target Version/s: 2.9.0
  Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks, [~jojochuang]!

> Make HDFS commands guide up to date
> ---
>
> Key: HDFS-9928
> URL: https://issues.apache.org/jira/browse/HDFS-9928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: documentation, supportability
> Fix For: 2.9.0
>
> Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, 
> HDFS-9928.001.patch
>
>
> A few HDFS subcommands and options are missing in the documentation.
> # envvars: display computed Hadoop environment variables
> I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be 
> looking for other missing options as well.
> Filing this JIRA to fix them all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-15 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195148#comment-15195148
 ] 

Masatake Iwasaki commented on HDFS-9928:


+1

> Make HDFS commands guide up to date
> ---
>
> Key: HDFS-9928
> URL: https://issues.apache.org/jira/browse/HDFS-9928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: documentation, supportability
> Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, 
> HDFS-9928.001.patch
>
>
> A few HDFS subcommands and options are missing in the documentation.
> # envvars: display computed Hadoop environment variables
> I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be 
> looking for other missing options as well.
> Filing this JIRA to fix them all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9960) OzoneHandler : Add localstorage support for keys

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195096#comment-15195096
 ] 

Hadoop QA commented on HDFS-9960:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
50s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s 
{color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s 
{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
35s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s 
{color} | {color:green} HDFS-7240 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 15s 
{color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
2 unchanged - 0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 57s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 10 new + 0 
unchanged - 0 fixed = 10 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 58s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 30s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 122m 43s 
{color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
53s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 251m 24s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Found reliance on default encoding in 
org.apache.hadoop.ozone.web.client.OzoneBucket.getKey(String):in 
org.apache.hadoop.ozone.web.client.OzoneBucket.getKey(String): 
java.io.ByteArrayOutputStream.toString()  At OzoneBucket.java:[line 343] |
|  |  Found reliance on default encoding in 
org.apache.hadoop.ozone.web.client.OzoneBucket.putKey(String, String):in 
org.apache.hadoop.ozone.web.client.OzoneBucket.putKey(String, String): 
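
The first FindBugs entry above names {{java.io.ByteArrayOutputStream.toString()}}, which decodes bytes with the platform default charset. A minimal sketch of the usual remedy, passing an explicit charset, is below. The class and method names ({{ExplicitCharsetExample}}, {{readAsUtf8}}) are hypothetical and not taken from the Ozone code, and UTF-8 is an assumption about the intended encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetExample {

  /** Decodes the buffered bytes as UTF-8 rather than the platform default. */
  static String readAsUtf8(ByteArrayOutputStream out) {
    // toString() with no arguments depends on the JVM's default charset,
    // so the same bytes can decode differently on different hosts.
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Encode with an explicit charset as well, so both directions agree.
    byte[] encoded = "key-data".getBytes(StandardCharsets.UTF_8);
    out.write(encoded, 0, encoded.length);
    System.out.println(readAsUtf8(out));
  }
}
```

Naming the charset on both the encode and decode side keeps the round trip independent of the host's locale settings.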

[jira] [Commented] (HDFS-9951) Use string constants for XML tags in OfflineImageReconstructor

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195069#comment-15195069
 ] 

Hadoop QA commented on HDFS-9951:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 14 new + 
21 unchanged - 3 fixed = 35 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 8s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 37s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 139m 3s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| JDK v1.8.0_74 Timed out junit tests | 
org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| JDK v1.7.0_95 Timed out junit tests | 

[jira] [Updated] (HDFS-9694) Make existing DFSClient#getFileChecksum() work for striped blocks

2016-03-15 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-9694:

Attachment: HDFS-9694-v6.patch

Thanks [~umamaheswararao] for the review and the nice suggestions. The renaming 
makes a lot of sense given the follow-up work on the real striped approach 
for striped files and blocks. I took your ideas and adapted them a bit, 
resulting in this updated patch. 
Change summary:
* Javadoc update for {{getFileChecksum}}, done;
* Renamed: StripedFileChecksumComputer => 
StripedFileNonStripedChecksumComputer; StripedBlockChecksumComputer => 
NonStripedBlockGroupChecksumComputer.

Please let me know if this works for you or not, thanks.

> Make existing DFSClient#getFileChecksum() work for striped blocks
> -
>
> Key: HDFS-9694
> URL: https://issues.apache.org/jira/browse/HDFS-9694
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Kai Zheng
> Attachments: HDFS-9694-v1.patch, HDFS-9694-v2.patch, 
> HDFS-9694-v3.patch, HDFS-9694-v4.patch, HDFS-9694-v5.patch, HDFS-9694-v6.patch
>
>
> This is a sub-task of HDFS-8430 and will make the existing API 
> {{FileSystem#getFileChecksum(path)}} work for striped files. It will also 
> refactor existing code and lay the groundwork for subsequent tasks, such as 
> support for the new API proposed there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9928) Make HDFS commands guide up to date

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195015#comment-15195015
 ] 

Hadoop QA commented on HDFS-9928:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 21s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793482/HDFS-9928-trunk.003.patch
 |
| JIRA Issue | HDFS-9928 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 74037850fcfc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / eba66a6 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14823/artifact/patchprocess/whitespace-eol.txt
 |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/14823/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> Make HDFS commands guide up to date
> ---
>
> Key: HDFS-9928
> URL: https://issues.apache.org/jira/browse/HDFS-9928
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.9.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>  Labels: documentation, supportability
> Attachments: HDFS-9928-branch-2.002.patch, HDFS-9928-trunk.003.patch, 
> HDFS-9928.001.patch
>
>
> A few HDFS subcommands and options are missing in the documentation.
> # envvars: display computed Hadoop environment variables
> I also noticed (in HDFS-9927) that a few OIV options are missing, and I'll be 
> looking for other missing options as well.
> Filing this JIRA to fix them all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10167) CLONE - Erasure Coding: when recovering lost blocks, logs can be too verbose and hurt performance

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10167:
-


> CLONE - Erasure Coding: when recovering lost blocks, logs can be too verbose 
> and hurt performance
> -
>
> Key: HDFS-10167
> URL: https://issues.apache.org/jira/browse/HDFS-10167
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Rui Li
>
> When we test reading data with datanodes killed, 
> {{DFSInputStream::getBestNodeDNAddrPair}} becomes a hot spot method and 
> effectively blocks the client JVM. This log seems too verbose:
> {code}
> if (chosenNode == null) {
>   DFSClient.LOG.warn("No live nodes contain block " + block.getBlock() +
>   " after checking nodes = " + Arrays.toString(nodes) +
>   ", ignoredNodes = " + ignoredNodes);
>   return null;
> }
> {code}
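
One common way to keep a warning like the one quoted above out of a hot path is to throttle it, so that the message string is built at most once per interval. The sketch below illustrates that idea only. It is not the fix applied in HDFS, and {{ThrottledWarning}}, {{shouldLog}} and the 60-second interval are hypothetical names and values chosen for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThrottledWarning {
  private final long minIntervalNanos;
  private final AtomicLong lastLogNanos;

  ThrottledWarning(long minIntervalMillis) {
    this.minIntervalNanos = minIntervalMillis * 1_000_000L;
    // Start one interval in the past so the very first call is allowed.
    this.lastLogNanos = new AtomicLong(System.nanoTime() - this.minIntervalNanos);
  }

  /** Returns true at most once per interval, even with concurrent callers. */
  boolean shouldLog() {
    long now = System.nanoTime();
    long last = lastLogNanos.get();
    // compareAndSet lets exactly one thread win when several race here.
    return now - last >= minIntervalNanos && lastLogNanos.compareAndSet(last, now);
  }

  public static void main(String[] args) {
    ThrottledWarning throttle = new ThrottledWarning(60_000);
    int emitted = 0;
    for (int i = 0; i < 1000; i++) {
      if (throttle.shouldLog()) {
        // The expensive string concatenation happens only on this branch.
        System.err.println("No live nodes contain block (attempt " + i + ")");
        emitted++;
      }
    }
    System.out.println("warnings emitted: " + emitted);
  }
}
```

In the quoted snippet the cost comes from {{Arrays.toString(nodes)}} and the string concatenation on every call, so placing the check before the message is built is what saves the work.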



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10130) CLONE - Erasure Coding: handle missing internal block locations in DFSStripedInputStream

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10130:
-


> CLONE - Erasure Coding: handle missing internal block locations in 
> DFSStripedInputStream
> 
>
> Key: HDFS-10130
> URL: https://issues.apache.org/jira/browse/HDFS-10130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Jing Zhao
>
> Currently DFSStripedInputStream assumes we always have complete internal 
> block location information, i.e., we can always get all the DataNodes for a 
> striped block group. In a lot of scenarios the client cannot get complete 
> block location info, e.g., some internal blocks are missing and the NameNode 
> has not finished the recovery yet. We should add functionality to handle 
> missing block locations in DFSStripedInputStream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10165) CLONE - Erasure coding: update EC command "-s" flag to "-p" when specifying policy

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10165:
-


> CLONE - Erasure coding: update EC command "-s" flag to "-p" when specifying 
> policy
> --
>
> Key: HDFS-10165
> URL: https://issues.apache.org/jira/browse/HDFS-10165
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Zhe Zhang
>
> HDFS-8833 missed this update.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10160) CLONE - Erasure coding: fix 2 failed tests of DFSStripedOutputStream

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10160:
-


> CLONE - Erasure coding: fix 2 failed tests of DFSStripedOutputStream
> 
>
> Key: HDFS-10160
> URL: https://issues.apache.org/jira/browse/HDFS-10160
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Walter Su
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10139) CLONE - Erasure Coding: the number of chunks in packet is not updated when writing parity data

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10139:
-


> CLONE - Erasure Coding: the number of chunks in packet is not updated when 
> writing parity data
> --
>
> Key: HDFS-10139
> URL: https://issues.apache.org/jira/browse/HDFS-10139
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Li Bo
>
> The member {{numChunks}} in {{DFSPacket}} is always zero if this packet 
> contains parity data. Calling {{getNumChunks}} may therefore cause 
> errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Deleted] (HDFS-10147) CLONE - Erasure Coding: add test for namenode process over replicated striped block

2016-03-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B deleted HDFS-10147:
-


> CLONE - Erasure Coding: add test for namenode process over replicated striped 
> block
> ---
>
> Key: HDFS-10147
> URL: https://issues.apache.org/jira/browse/HDFS-10147
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: dragon
>Assignee: Takuya Fukudome
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

